Compressing CloudFront Assets and

Amazon’s web services have made rebuilding so much easier. We’re gradually moving more and more static assets to CloudFront (although most visitors are in the UK responses have much lower latencies than direct from S3 or even our own nginx servers). CloudFront doesn't support serving gzip'ed content direct from S3 out of the box.

Because of this, up until last week we were serving uncompressed assets, at least anything that wasn’t already compressed (such as images). Last week we put together a simple static assets nginx server to help compress things.

Whilst doing the work for I realised it would be trivial to write an application that would let any CloudFront user compress to any S3 bucket by using an equivalent URL structure. So I knocked up a quick node.js app that’s hosted on Heroku for all to use:

S3 assets can be referenced through a pretty simple URL structure. By creating an app that behaves in the same way, and proxies (whilst compressing) the response, it would be easy to create a compressible S3 for everyone.

For example, the URL references the S3 bucket pingles-example and the object we want to retrieve is identified by the name /sample.css.

The same resource can be accessed through and will be gzip compressed. CloudFront now lets you specify custom origins so for the above you’d add to setup a CloudFront distribution for the pingles-example S3 bucket.

At the moment it will only proxy public resources. Response latency also seems quite high at the moment but given the aim is to get content into the highly-cached and optimised CloudFront I’m not too fussed by it.

Introducing node-hdfs: node.js client for Hadoop HDFS

I’m very happy to announce a very early cut of a node.js library for accessing Hadoop’s filesystem: node-hdfs. This is down to a lot of work from a colleague of mine: Horaci Cuevas.

A few months ago I was tinkering with the idea of building a Syslog to HDFS bridge: I wanted an easy way to forward web log (and other interesting data) straight out to HDFS. Given I’d not done much with node.js I thought it might be a fun exercise.

During about a week of very late nights and early mornings I followed CloudKick’s example to wrap Hadoop’s libhdfs and got as far as it reading and writing files. Horaci has picked the ball up and run far and wide with it.

After you’ve run node-waf configure && node-waf build you can write directly to HDFS:

There’s some more information in the project’s README.

Once again, massive thanks to Horaci for putting so much into the library; forks and patches most certainly welcome, I’m pretty sure the v8 C++ I wrote is wrong somehow!