The Trouble With Socket Timeout

Hi. We’re currently upgrading a Ruby driver on our platform at work. At the socket level, the old version of this driver uses IO.select, which boils down to the OS’s select system call. It’s a tried and true solution that behaves as expected in every scenario: it waits for up to a given time and, if the time runs out, it simply returns nothing and execution resumes. So if a client connects to a server that stops responding but doesn’t close the connection, the client can decide what to do about it. Here’s an example of that:

#server.rb
require 'socket'

delay = 5

server = TCPServer.new 2000

loop do
  client = server.accept
  puts "#{Time.now} > Client arrived. Sleeping for #{delay}s."
  sleep delay
  puts "#{Time.now} > Done, replying."
  client.puts "Done. Bye!"
  client.close
end
#client-io-select.rb
require 'socket'

host = '127.0.0.1'
port = 2000
timeout = 2

s = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
s.connect(Socket.pack_sockaddr_in(port, host))

rs, = IO.select([s], [], [], timeout) # wait up to timeout seconds for the socket to become readable
if rs
  puts rs[0].read(1000)
else
  puts 'Timeout'
end

s.close

Run the server, then run client-io-select.rb. As expected, the client times out after 2s while the server deliberately sleeps for 5s. Change the client timeout to 6s and it will print the server reply.

The new version of the driver changed that implementation in favour of setting the timeout value as an option on the socket, as specified in the socket man page and other places. So instead of using IO.select, it calls Socket’s setsockopt method before connecting to set both SO_RCVTIMEO and SO_SNDTIMEO, which translate to the OS’s socket options. After connecting it uses the socket’s read method directly, trusting Ruby and the OS to handle the timeouts, which sounds nice. However, we found that support for those options is somewhat inconsistent across Ruby MRI versions – I didn’t test other Ruby implementations – and across operating systems. An example of a client using this approach:

#client-socket-options.rb
require 'socket'

host = '127.0.0.1'
port = 2000
timeout = 2

tv = [ timeout, 0 ].pack 'l_2' # struct timeval: seconds and microseconds packed as two native longs

s = Socket.new Socket::AF_INET, Socket::SOCK_STREAM, 0
s.setsockopt Socket::SOL_SOCKET, Socket::SO_RCVTIMEO, tv
s.setsockopt Socket::SOL_SOCKET, Socket::SO_SNDTIMEO, tv
s.connect Socket.pack_sockaddr_in port, host

begin
  while data = s.read(1000)
    puts data
  end
rescue => e
  puts e
end

s.close

We ran that client on Ruby 1.8.7-p374, 1.9.3-p545 and 2.1.2 on Mac OS X 10.9.4, all of them installed via rvm, against the same server from the first example. On old Ruby 1.8 the client timed out as expected. On the other Ruby versions it waited for the server response instead. Before reaching that conclusion, we also ran some tests in C, because we suspected different operating systems might or might not honour those socket options. Here is the C client we wrote to test it:

//client.c
#include <stdio.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <netdb.h>

int main(int argc, char *argv[])
{
    char *host = "127.0.0.1";
    int port = 2000;
    int timeout = 2;

    int sockfd, n;

    char buffer[256];

    struct sockaddr_in serv_addr;
    struct hostent *server;
    struct timeval tv;

    /* timeout as a struct timeval: seconds and microseconds */
    tv.tv_sec = timeout;
    tv.tv_usec = 0;

    server = gethostbyname(host);
    memset(&serv_addr, 0, sizeof(serv_addr));
    bcopy((char *)server->h_addr, (char *)&serv_addr.sin_addr.s_addr, server->h_length);
    serv_addr.sin_port = htons(port);
    serv_addr.sin_family = AF_INET;

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    /* set the receive and send timeouts before connecting, like the driver does */
    setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(struct timeval));
    setsockopt(sockfd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(struct timeval));
    connect(sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr));

    n = read(sockfd, buffer, 255);

    if (n < 0) {
        perror("error reading from socket");
        return 1;
    }

    buffer[n] = '\0';
    printf("%s\n", buffer);
    return 0;
}

We ran that client on Mac OS X 10.9.4 with LLVM 5.1, on Ubuntu 14.04 with GCC 4.8.2 and on CentOS 5.8 with GCC 4.1.2, against the same server from the first example. On OS X the client timed out as expected, but on Ubuntu and CentOS it didn’t. Don’t forget to test it yourself, especially with newer Ruby versions: one of the posts we found while investigating this described a different behaviour because it was based on Ruby 1.8 five years ago. I couldn’t find the reason behind the difference between Ruby versions – it might be a build option that used to have a different default – but I can’t pinpoint it without better knowledge of the Ruby codebase. The same applies to the different operating systems. The lesson is: setting timeout options on sockets in those Ruby builds does not currently produce the expected behaviour.
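
If you do need a timeout you can count on across those Ruby builds, the safest route for now is still to wrap the blocking read with IO.select, the way the old driver did. Here is a minimal sketch of that pattern against the example server above; read_with_timeout is just an illustrative helper, not part of any driver API:

#client-select-wrapper.rb
require 'socket'

timeout = 2

# Waits up to timeout seconds for the socket to become readable, then reads
# whatever is available, instead of relying on SO_RCVTIMEO.
def read_with_timeout(socket, max_bytes, timeout)
  ready, = IO.select([socket], [], [], timeout)
  ready ? socket.readpartial(max_bytes) : nil
end

s = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
s.connect(Socket.pack_sockaddr_in(2000, '127.0.0.1'))

data = read_with_timeout(s, 1000, timeout)
puts(data || 'Timeout')

s.close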

store numbers compactly in readable strings

Hey. While working on my master’s project with a friend, we stumbled upon a minor puzzle. The storage we were going to use was designed to store only string values, but we wanted to store triples of large integers, so just writing them out as decimal strings would use more space than necessary. A number up to (2^32)-1 (i.e., 4294967295) fits in a mere 32 bits, but written as a decimal string in UTF-8 it takes 80 bits.

Well, reducing the space needed to store those triples by half could help the project, so I looked around for tools that could encode numbers as readable strings. Something like Base64 encoding, but without the padding that leaves the resulting string barely smaller than the decimal one. I also took the chance to write it in Ruby, as I’m working with it now, and to publish my first gem.

Decimal numbers are our way of representing values in base 10, that is, using 10 symbols. Base64, as the name says, uses 64 symbols. The chosen symbols are readable characters – all 26 alphabet letters in upper and lower case, the digits 0 to 9, plus “+” and “/”. To represent a number, all I had to do was change its base from 10 to 64. This way the number 0 becomes “A”, 1 becomes “B”, 50 is “y”, 64 is “BA”, and so on.

After experimenting a bit, mostly taking care with string building, I noticed that the same code could be used with any set of symbols. That opened a nice possibility: there are actually 95 printable characters in the good old ASCII table. So besides the 64 characters of Base64, I also kept a 95-character set around for even better use of space.
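
The core of it is plain positional base conversion over an arbitrary alphabet. Here is a rough sketch of the idea, not the gem’s actual code; it uses Base64’s alphabet, but any ordered symbol set works the same way:

#base_conversion_sketch.rb
ALPHABET = ('A'..'Z').to_a + ('a'..'z').to_a + ('0'..'9').to_a + ['+', '/']

# Converts a non-negative integer to a string of symbols in the given alphabet.
def encode(number, alphabet = ALPHABET)
  return alphabet[0] if number.zero?
  base = alphabet.size
  digits = ''
  while number > 0
    digits = alphabet[number % base] + digits
    number /= base
  end
  digits
end

# Converts a string of symbols back into the integer it represents.
def decode(string, alphabet = ALPHABET)
  string.chars.inject(0) { |number, char| number * alphabet.size + alphabet.index(char) }
end

puts encode(0)                  # => A
puts encode(64)                 # => BA
puts decode(encode(4294967295)) # => 4294967295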

Code written, I got to the task of setting it up as a gem. It was actually simple, pretty much as straightforward as in the guide. You keep your code in the lib folder, add gemspec to the Gemfile and create a gemspec file. After that, you create an account on RubyGems, get the API key and then gem build, gem push, and that’s it: gem published.
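
For reference, the gemspec itself doesn’t need much. Something along these lines works, with the field values here being illustrative rather than copied from the published gem:

#num_coder.gemspec
Gem::Specification.new do |s|
  s.name    = 'num_coder'
  s.version = '0.0.1' # illustrative version number
  s.summary = 'Encode numbers compactly as readable strings'
  s.authors = ['Your Name Here']
  s.files   = Dir['lib/**/*.rb']
end
#Gemfile
source 'https://rubygems.org'
gemspec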

Really sweet. Now anyone who wishes to use it just has to run “gem install num_coder” and try any of the examples described in the project README. For instance, you can take a list of numbers in the billions, encode it into a single string and decode it back:

> NumCoder.fixed_encode95 [1234567890,1876543290,6758493021], 5
=> "/.y5M7#c69r|iNl"
> NumCoder.fixed_decode95 "/.y5M7#c69r|iNl", 5
=> [1234567890, 1876543290, 6758493021]

Each number uses five characters instead of the expected ten. The representation could be made even more compact with an even larger symbol set, reaching into the rest of the UTF-8 range, but then it wouldn’t be that readable, depending on the platform. Halving the space will do for now!

(1000) Days with Elle, and counting

Hey. Back after a long time. Today I made a quick mental calculation of how long I’ve been with my wife, and it seemed like we were approaching a round number of days. The mental calculation was not enough to satisfy my curiosity, so I looked for a site that could tell me how many days had passed since a certain date. I found some, but none were good looking enough to show to my wife.
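
For the record, the calculation itself is only a couple of lines of Ruby; the start date below is a placeholder, not our actual one:

require 'date'

# Days elapsed since an arbitrary start date (placeholder, not ours).
start_date = Date.new(2011, 1, 1)
days = (Date.today - start_date).to_i
puts days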

Then the idea hit me: make a simple static website and show the dates in style. That way I could also try two things I hadn’t before: AWS S3 web hosting and Twitter Bootstrap. And later slap some AdSense on it, of course.

Twitter Bootstrap was simpler than I thought. First you download their zip from the website. After you link their CSS and JS files – which is not made clear on the website – you can follow their recipes. For the datepicker I used Andrew Rowls’ adaptation. I still miss a way to have the date always show in an internationalized format, but I can add that later.

I wrote everything with HTML and JS only, so I didn’t need an EC2 instance this time. To host the files on S3 you have to create a bucket named exactly after your site, and it must be a subdomain. For example, I had to create a bucket called www.getdays.info instead of just getdays.info. The files in the bucket must be publicly readable, and the easiest way to do that is to set a policy on the bucket. Here’s mine:

{
  "Version": "2008-10-17",
  "Statement": [
  {
    "Sid": "AllowPublicRead",
    "Effect": "Allow",
    "Principal": {
        "AWS": "*"
    },
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::www.getdays.info/*"
  }
]}

After that I set the bucket up as a website and configured my DNS – I use no-ip – to redirect the bare domain to www, with www as a CNAME pointing to the hostname S3 gave me.

That was it. Next month my wife and I will celebrate 1000 days together – a lot longer than just some days of summer.

Links:

Count the days
Twitter Bootstrap
Andrew’s modified Twitter Bootstrap datepicker
(500) Days of Summer

start streaming live – it’s simpler than it looks

Hey. At first, people searching for streaming solutions might be scared off by the sheer number of technologies around, both old and new. From VLC plugins to Helix servers, from RTMP to pseudostreaming, the options are plenty. And streaming can be used both for live and for on-demand – recorded – video, which makes it seem even more confusing.

But when it comes down to it, the most used approaches for live video are RTMP streaming to Flash players – which works on any desktop browser and some mobile devices – and HTTP streaming to Safari/iOS. Lots of servers can do both; Adobe Flash Media Server – FMS –, Wowza and Red5 are the most common.

Here I’ll focus on a live transmission using FMS. What’s needed before starting is:

  • A server somewhere running a Red Hat-like Linux (e.g. CentOS) or Windows;
  • A webcam or a video capturing device;
  • An Adobe account – you can create one for free at https://www.adobe.com/cfusion/membership/.

You will setup three things:

  1. An encoder on the computer with a webcam. This will take the video signal and convert it to one of the compatible encoding formats – preferably H.264 for video and MP3 for audio. Then it will publish, i.e., stream it to the streaming server that will broadcast it;
  2. A streaming server. This will receive a published stream and allow lots of viewers to come and watch the video. These servers are configurable and can run applications that talk to connecting players or clients, allowing for other functionality like DVR or access control;
  3. A web page with a flash player or an HTML5 player.

First, download and install Flash Media Live Encoder – FMLE. When you run it you will be able to see if your webcam is working. Let’s begin with one of the custom presets – choose the Medium Bandwidth preset for H.264. We’ll fill in the output options later.

Now download Flash Media Server 4.5. FMS is somewhat picky when it comes to its environment, so you will probably want to install it on a virtual machine or, even better, on a hosted server. You can try Amazon’s default instance on AWS EC2, or fire up a server on a service such as Rackspace or Linode. Don’t forget to choose a Red Hat-like Linux. Installation is pretty straightforward: just untar and run installFMS with all the default options.

The installation process will also start FMS for you if you let it. Once it’s up you can check that it’s working by opening fms_adminConsole.swf, a copy of which sits inside the FMS installation at webroot/swfs – you can run it standalone on your computer or from a webpage if you’re running Apache. In the AdminConsole, point to your server installation using the username and password you provided during installation. If it connects, everything is fine.

Some theory here: FMS is a middleman for your transmission. It’ll sit between your encoder system – your own PC with a webcam in this case – and the audience. Lots of FMS instances can share the burden of a large audience while using one single encoder, and FMS runs applications to customize the video delivery – e.g., you could require every connection to carry a passkey, or measure the user’s bandwidth and choose to stream a smaller video. An example application that ships with it is the “live” application – confusing name, huh? – which simply accepts publishing streams and client requests.

With FMS up and running you can stream to it. You’ll have to point FMLE to an RTMP url. The url is built like this:

rtmp://server_name/application/instance/mp4:streamName

As we’ve seen above, the application is “live”. If you don’t choose an instance name – which is just a way of letting the same application code run as independent instances, with separate restarts and so on – it will use a default name, but let’s specify one. In FMLE you have to split the url into two parts. For example:

FMS URL: rtmp://my.web.server/live/tv

Stream: mp4:abcnews
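
If it helps to see the mapping, here is a tiny Ruby sketch, just to make the split explicit; the server name is a placeholder:

# Split a full RTMP url into the two fields FMLE asks for.
full_url = 'rtmp://my.web.server/live/tv/mp4:abcnews'

*fms_parts, stream = full_url.split('/')

puts "FMS URL: #{fms_parts.join('/')}"  # => rtmp://my.web.server/live/tv
puts "Stream: #{stream}"                # => mp4:abcnews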

That should be enough. Hit the green Start button and check the AdminConsole. A new instance will show up, and if you choose Streams instead of Log you’ll be able to see your stream. You’re publishing; now you need to play it back.

There are a lot of Flash players you can embed on your page. Two well-known ones are JW Player and Flowplayer. Let’s pick JW Player. You can just follow their setup instructions, or copy from my example repository on GitHub. The page is pretty simple; you won’t have trouble understanding it: https://github.com/moret/jw-player-basic-rtmp

So that’s it! You can now post your webpage somewhere and show it to someone. The only limit on your audience will be your streaming server license.

manually migrate from cvs to git

Hey. Recently at work we had to pick up a project still hanging around on the company’s CVS. The team quickly figured “what the hell, let’s move that to Git”. The process was painless, but it involved dealing with long-forgotten CVS commands. So here’s how I did it using just the command line.

First I had to set the CVSROOT environment variable and create the cvspass file used to log in to CVS:

export CVSROOT=:pserver:my_user_name@cvs.company.com:2401/opt/cvs/data
touch ~/.cvspass
cvs login

There were actually three modules to that project. I checked out all three, but I’ll show only one here:

cd my-dev-projects
cvs checkout ThatOldProjectModule
cd ThatOldProjectModule

Now the magic. Git comes with a CVS import utility – make sure it’s installed and the cvsps binary is in your path. All I had to do was call it from the CVS working copy and point it at a destination folder where the new Git repository would sit:

git cvsimport -C ../that-old-project-module .
cd ../that-old-project-module
git log

Yes. Everything was there. The repository was local, so I created the project on the company’s Git server and pushed:

git remote add origin git@ngit.company.com:that-old-project/that-old-project-module.git
git push origin master

After checking that everything was fine on Git, the CVS repository could be deleted. As I didn’t have permission to do that on our server, I just cleaned everything up and left a note.

cd ../ThatOldProjectModule
cvs remove -f -R *
echo "This project was migrated to Git" > find-me-on-git.txt
cvs add find-me-on-git.txt
cvs commit -m "Moved project to Git - you can find me on http://ngit.company.com/that-old-project"

One last thing: I noticed at this point that the folders weren’t removed. I didn’t really care, but a quick search showed that folders can’t be removed from CVS – I never knew that. Empty folders can, however, be pruned on update or checkout:

cvs update -P

Not that it really mattered: everything was ready for some heavy coding on Git.