Accelerating Amazon Simple Storage Service (S3)
Posted by Max Robbins on October 22nd, 2011
Our customers and our internal development group use the Amazon Web Services S3 service. It's a great resource for web-based storage: amazing uptime, and very straightforward. Our only complaint is that it is not fast.
We generally specialize in sitting between client browsers and the web tier. With a little digging into the AWS docs, we began to look at S3 as one great big web server.
We wanted to set up an environment that would simulate a busy web site grabbing files off S3 and see how aiScaler would affect the results. The details of the test are linked at the bottom of this post on our wiki.
For a cached object, the time to load a 50 KB file averaged about 60 milliseconds directly from S3, versus about 4 milliseconds from aiScaler once the file had been cached. In short, the first user still endures the 60 ms, but the second user and every subsequent user within the specified TTL gets it in about 4 ms. These figures include network latency.
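The aggregate benefit is simple arithmetic: one cache miss at the S3 latency, then hits at the cached latency for the rest of the TTL window. A quick sketch (the 60 ms and 4 ms figures are our measurements; the request count is illustrative):

```python
# Effective average latency when the first request in a TTL window is a
# cache miss and all subsequent requests are cache hits.
def effective_avg_ms(miss_ms, hit_ms, requests_in_ttl):
    total = miss_ms + hit_ms * (requests_in_ttl - 1)
    return total / requests_in_ttl

# With our measured numbers: one 60 ms miss, then 4 ms hits.
print(round(effective_avg_ms(60, 4, 100), 2))  # 4.56
```

With 100 requests per TTL window, the average is already within about half a millisecond of the pure cache-hit time.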
We did find an interesting fact: files that were not cached loaded about 5% faster through aiScaler. We believe this is due to more efficient session handling. We also found that load times for files coming straight off S3, without aiCache, ranged widely from 35 milliseconds to over 200. The same files coming from aiScaler were a uniform 4 to 6 milliseconds.
The rule of thumb in dynamic caching is that a good percentage of your content needs to be accessed multiple times within the time-to-live (TTL) for caching to make sense.
In a specific environment you would need to look at your log files and see whether some percentage of files is repeatedly requested from S3 within short periods of time, i.e. shorter than the interval before the cached copy is replaced. Even if this is only 1% of your files, if they are being accessed tens, hundreds, or thousands of times within a reasonable TTL (settable by pattern), aiScaler can have an enormous impact on your site speed.
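One way to estimate this from your own logs is to count, for each request, whether the same path was already requested within the TTL. A minimal sketch in Python (the TTL value and the sample entries are assumptions; adapt the parsing to your access-log format):

```python
def repeat_hit_ratio(entries, ttl=300):
    """entries: list of (timestamp_seconds, path) tuples from an access log.
    Returns the fraction of requests that would have been cache hits,
    i.e. repeat requests for the same path within ttl seconds."""
    last_seen = {}
    hits = 0
    for ts, path in sorted(entries):
        if path in last_seen and ts - last_seen[path] <= ttl:
            hits += 1
        last_seen[path] = ts
    return hits / len(entries) if entries else 0.0

# Illustrative sample: /a.jpg is re-requested both inside and outside
# a 300-second TTL window.
sample = [(0, "/a.jpg"), (10, "/a.jpg"), (20, "/b.jpg"),
          (400, "/a.jpg"), (410, "/a.jpg")]
print(repeat_hit_ratio(sample))  # 0.4
```

A ratio anywhere near this on real traffic suggests a cache in front of S3 will pay off.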
We found that Micro instances were basically worthless, as they have no throughput. aiScaler running on a Large instance or bigger significantly outperformed S3, both in the time to serve a specific file and in total throughput.
There are scenarios where this makes sense, as described above. There are also scenarios where it makes no sense. As a RAM-based cache, aiScaler must consume the entire file before serving it; trying this with large files, say 100 MB, will actually slow down the initial request. You also need to be mindful of the memory footprint. Additionally, not much can be done for streaming protocols, with the exception of HTTP-based streaming.
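A rough way to size that memory footprint before enabling caching is to sum object size times object count per content class. All numbers below are illustrative placeholders, not figures from our test:

```python
# Rough cache-memory estimate: sum of (object size * count) per class.
# The object classes and counts here are made-up examples.
objects = [
    (50 * 1024, 10_000),   # 10,000 thumbnails at ~50 KB each
    (500 * 1024, 1_000),   # 1,000 medium images at ~500 KB each
]
total_bytes = sum(size * count for size, count in objects)
print(f"{total_bytes / 1024**2:.0f} MB")  # 977 MB
```

If the total approaches your instance's RAM, cache by pattern so only the hot, small objects are kept in memory.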
The synopsis is that aiScaler can make a big impact as a caching tier between S3 and your web app, or when serving directly to your end users' clients.
Our entire test environment is shown in detail here: