Category: work

The Trouble With Socket Timeout

Hi. We’re currently upgrading a Ruby driver on our platform at work. At the socket level, the old version of this driver uses IO.select, which boils down to the OS’s select system call. It’s a tried and true solution that works as expected in every scenario: it waits for a certain time, and if the time runs out it simply returns nil and execution resumes. So if a client connects to a server that stops responding but doesn’t close the connection, the client can decide what to do about it. Here’s an example:

#server.rb
require 'socket'

delay = 5

server = TCPServer.new 2000

loop do
  client = server.accept
  puts "#{Time.now} > Client arrived. Sleeping for #{delay}s."
  sleep delay
  puts "#{Time.now} > Done, replying."
  client.puts "Done. Bye!"
  client.close
end

#client-io-select.rb
require 'socket'

host = '127.0.0.1'
port = 2000
timeout = 2

s = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
s.connect(Socket.pack_sockaddr_in(port, host))

rs, = IO.select([s], [], [], timeout)
if rs
  puts rs[0].read(1000)
else
  puts 'Timeout'
end

s.close

Run the server, and then run client-io-select.rb. As expected, it will time out after 2s while the server is deliberately sleeping for 5s. Change the client timeout to 6s and it will print the server reply.

The new version of the driver dropped that implementation in favour of setting the timeout as a socket option, as described in the socket man page and other places. So instead of using IO.select, it calls Socket’s setsockopt method before connecting to set both SO_RCVTIMEO and SO_SNDTIMEO, which map to the OS’s socket options of the same names. After connecting it calls the socket’s read method directly, trusting Ruby and the OS to handle timeouts, which sounds nice. However, we found that support for those options is somewhat inconsistent across Ruby MRI versions – I didn’t test other Ruby implementations – and across operating systems. Here’s an example of a client using this approach:

#client-socket-options.rb
require 'socket'

host = '127.0.0.1'
port = 2000
timeout = 2

tv = [ timeout, 0 ].pack 'l_2'

s = Socket.new Socket::AF_INET, Socket::SOCK_STREAM, 0
s.setsockopt Socket::SOL_SOCKET, Socket::SO_RCVTIMEO, tv
s.setsockopt Socket::SOL_SOCKET, Socket::SO_SNDTIMEO, tv
s.connect Socket.pack_sockaddr_in port, host

begin
  while data = s.read(1000)
    puts data
  end
rescue => e
  puts e
end

s.close

We ran that client on Ruby 1.8.7-p374, 1.9.3-p545 and 2.1.2 on Mac OS X 10.9.4, all of them installed via rvm. The server is the same as in the first example. On the old Ruby 1.8 the client timed out as expected. On the other Ruby versions it waited for the server response instead. Before reaching that conclusion, we also ran some tests in C, because we thought that different operating systems might or might not honour those socket options. Here is the C client we wrote to test it:

//client.c
#include <stdio.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>

int main(int argc, char *argv[])
{
    char *host = "127.0.0.1";
    int port = 2000;
    int timeout = 2;

    int sockfd, n;

    char buffer[256];

    struct sockaddr_in serv_addr;
    struct hostent *server;
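    struct timeval tv;

    /* Set both fields: an uninitialized tv_usec may hold garbage, which can
       make setsockopt fail or set an unexpected timeout. */
    tv.tv_sec = timeout;
    tv.tv_usec = 0;

    memset(&serv_addr, 0, sizeof(serv_addr));
    server = gethostbyname(host);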
    bcopy((char *)server->h_addr, (char *)&serv_addr.sin_addr.s_addr, server->h_length);
    serv_addr.sin_port = htons(port);
    serv_addr.sin_family = AF_INET;

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
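    /* Report failures instead of silently continuing without a timeout. */
    if (setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(struct timeval)) < 0)
        perror("setsockopt SO_RCVTIMEO");
    if (setsockopt(sockfd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(struct timeval)) < 0)
        perror("setsockopt SO_SNDTIMEO");
    if (connect(sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0) {
        perror("error connecting");
        return 1;
    }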

    n = read(sockfd, buffer, sizeof(buffer) - 1);

    if (n < 0) {
        perror("error reading from socket");
        return 1;
    }

    buffer[n] = '\0'; /* read() does not null-terminate the data */
    printf("%s\n", buffer);
    return 0;
}

We ran that client on Mac OS X 10.9.4 with LLVM 5.1, on Ubuntu 14.04 with GCC 4.8.2 and on CentOS 5.8 with GCC 4.1.2, using the same server as in the first example. On OS X the client timed out as expected, but on Ubuntu and CentOS it didn’t.

Don’t forget to test it yourself, especially with newer Ruby versions: one of the posts we found while investigating this described a different behaviour because it was based on Ruby 1.8 five years ago. I couldn’t find the reason behind the difference between Ruby versions – it might be a build option that used to have a different default, but I can’t pinpoint why without better knowledge of the Ruby codebase. The same applies to the different operating systems. But the lesson is: setting timeout options on sockets in those Ruby builds currently does not produce the expected behaviour.
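If you need a read timeout that does not depend on how the socket options are honoured, one option is to stay with the IO.select approach and wrap it in a small helper. The following is only a minimal sketch, not what either version of the driver does: read_with_timeout is a hypothetical name, it assumes Ruby 2.1 or later (for Process.clock_gettime), and the demo uses a throwaway local server that stays silent for one second.

```ruby
require 'socket'

# Hypothetical helper (not part of any driver): read up to maxlen bytes from
# io, giving up after `timeout` seconds. Returns the data read, nil on EOF,
# or :timeout if nothing arrived in time.
def read_with_timeout(io, maxlen, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    begin
      return io.read_nonblock(maxlen)
    rescue IO::WaitReadable
      # Nothing to read yet: wait until the socket is readable or time runs out.
      remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
      return :timeout if remaining <= 0
      return :timeout unless IO.select([io], nil, nil, remaining)
    rescue EOFError
      return nil
    end
  end
end

# Demo: a local server that stays silent for 1s before replying.
server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
port = server.addr[1]
Thread.new do
  client = server.accept
  sleep 1
  client.puts 'Done. Bye!'
  client.close
end

s = TCPSocket.new('127.0.0.1', port)
first = read_with_timeout(s, 1000, 0.2)  # gives up before the reply is sent
second = read_with_timeout(s, 1000, 3)   # long enough to receive the reply
s.close

puts first.inspect
puts second.inspect
```

The first call gives up before the server replies and the second one gets the reply, so the timeout is enforced by the client code itself rather than by SO_RCVTIMEO.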

Manually Migrate From CVS to Git

Hey. Recently at work we had to deal with a project still hanging around on the company’s CVS. The team quickly figured “what the hell, let’s move that to Git”. The process was painless, but involved dealing with long-forgotten CVS commands. So here’s how I did it using just the command line.

First I had to set up the CVSROOT environment variable and create the .cvspass file used to log in to CVS:

export CVSROOT=:pserver:my_user_name@cvs.company.com:2401/opt/cvs/data
touch ~/.cvspass
cvs login

There were actually three modules to that project. I checked out all three, but I’ll show only one here:

cd my-dev-projects
cvs checkout ThatOldProjectModule
cd ThatOldProjectModule

Now the magic. Git comes with a CVS import utility – make sure it’s installed and that the cvsps binary is in your path. All I had to do was call it from the CVS working copy and point it at a destination folder where the new Git repository would sit:

git cvsimport -C ../that-old-project-module .
cd ../that-old-project-module
git log

Yes. Everything was there. The repository was local, so I created the project on the company’s Git server and pushed:

git remote add origin git@ngit.company.com:that-old-project/that-old-project-module.git
git push origin master

After checking that everything is fine on Git, the CVS repository can be deleted. As I didn’t have permission to do so on our server, I just cleaned everything up and left a note:

cd ../ThatOldProjectModule
cvs remove -f -R *
echo "This project was migrated to Git" > find-me-on-git.txt
cvs add find-me-on-git.txt
cvs commit -m "Moved project to Git - you can find me on http://ngit.company.com/that-old-project"

One last thing: I noticed at this point that the folders weren’t removed. I really didn’t care, but a quick search showed that folders can’t be removed from CVS, which I never knew. Empty folders can, however, be pruned on update or checkout:

cvs update -P

Not that it really mattered: everything was ready for some heavy coding on Git.
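Since there were three modules, the same steps could also be scripted instead of typed three times. Here is a rough sketch in Ruby that only prints the commands (a dry run). Everything in it is an assumption for illustration: the helper names are made up, the second and third module names are hypothetical, and the Git folder name is simply derived from the CVS module name the way I named mine by hand.

```ruby
# Turn a CamelCase CVS module name into a kebab-case Git folder name,
# e.g. "ThatOldProjectModule" becomes "that-old-project-module".
def git_dir_for(cvs_module)
  cvs_module.gsub(/([a-z0-9])([A-Z])/, '\1-\2').downcase
end

# Build the steps for one module without running anything. Each step has the
# folder to run in (relative to the projects folder) and the command itself.
def migration_steps(cvs_module, remote_base)
  git_dir = git_dir_for(cvs_module)
  [
    { dir: '.',        cmd: ['cvs', 'checkout', cvs_module] },
    { dir: cvs_module, cmd: ['git', 'cvsimport', '-C', "../#{git_dir}", '.'] },
    { dir: git_dir,    cmd: ['git', 'remote', 'add', 'origin', "#{remote_base}/#{git_dir}.git"] },
    { dir: git_dir,    cmd: ['git', 'push', 'origin', 'master'] }
  ]
end

# Only the first module name comes from this post; the other two are made up.
modules = ['ThatOldProjectModule', 'ThatOldProjectCore', 'ThatOldProjectWeb']
remote_base = 'git@ngit.company.com:that-old-project'

modules.each do |cvs_module|
  migration_steps(cvs_module, remote_base).each do |step|
    # Dry run: print what would be executed.
    puts "(cd #{step[:dir]} && #{step[:cmd].join(' ')})"
    # To really run it: system(*step[:cmd], chdir: step[:dir]) or abort('step failed')
  end
end
```

Printing first and running later makes it easy to review the commands before touching the CVS server, and the Git project still has to exist on the server before the push.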