Offloading web traffic using Amazon’s S3 service

We have a couple of fairly high-traffic sites that serve large images intended as desktop backgrounds. To save some bandwidth, we decided to give Amazon’s S3 web service a spin.

Signing up was fairly painless. Amazon required a credit card so it could bill us $0.15 per GB of storage and $0.20 per GB of transfer. Shortly after signing up, I received an email with a link to my public and secret keys.

This is a fairly new service, and the client tools are still immature. To upload our images, I decided to use jSh3ll to ‘browse’ my S3 storage and a custom Ruby script to upload a large number of files.

After downloading and installing jSh3ll, I created my first bucket:

jSh3ll> bucket www.hpnotiq.com
Bucket set to 'www.hpnotiq.com'
jSh3ll> createbucket
Created bucket 'www.hpnotiq.com'

I then hacked out a quick Ruby script, using the example Ruby S3 library, to upload all the image files I wanted to store on S3.

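#!/usr/bin/env ruby

require 'S3'
require 'pp'

# The third argument is the sample library's is_secure flag; false disables SSL.
conn = S3::AWSAuthConnection.new("<YOUR PUBLIC KEY>", "<YOUR PRIVATE KEY>", false)

Dir["images/*.jpg"].each do |filename|
  basename = File.basename(filename)
  # Read in binary mode; the block form closes the file handle afterwards.
  data = File.open(filename, "rb") { |f| f.read }
  # Print the response so failed uploads are visible.
  pp conn.put('www.hpnotiq.com',
              "images/wallpapers/#{basename}",
              data,
              {
                "x-amz-acl" => "public-read",
                "Content-Type" => "image/jpeg"
              })
end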

A few notes about the code:

  • We decided to use the domain name as the bucket name and the directory plus file name as the object key.
  • The S3 Ruby library doesn’t stream, so each file is loaded into memory in full before it is sent to the server.
  • The hash at the end sets some headers for the ‘put’. The first tells S3 that the file can be read publicly. The second sets the content type, which S3 returns in the Content-Type header when the file is downloaded.
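
The key naming and header choices above can be sketched as two small helpers. This is only an illustration: `s3_key_for`, `put_headers_for`, and the `CONTENT_TYPES` table are hypothetical names that are not part of the S3 library, and the list of extensions is an assumption.

```ruby
# Hypothetical helpers (not part of the S3 sample library) that build the
# object key and the headers passed to conn.put.

# Assumed extension-to-MIME table; extend it for other file types.
CONTENT_TYPES = {
  ".jpg"  => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".png"  => "image/png",
  ".gif"  => "image/gif"
}.freeze

# Object key = directory + file name, mirroring the path on the web server.
def s3_key_for(local_path, prefix = "images/wallpapers")
  "#{prefix}/#{File.basename(local_path)}"
end

# Headers for a publicly readable object with a matching Content-Type.
# Unknown extensions fall back to a generic binary type.
def put_headers_for(local_path)
  ext = File.extname(local_path).downcase
  {
    "x-amz-acl"    => "public-read",
    "Content-Type" => CONTENT_TYPES.fetch(ext, "application/octet-stream")
  }
end
```

With these in place, the upload call in the script could become `conn.put('www.hpnotiq.com', s3_key_for(filename), data, put_headers_for(filename))`, which would also cover non-JPEG images.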

Back to jSh3ll to see what’s there:

jSh3ll> list
Item list for bucket 'www.hpnotiq.com'
key=images/wallpapers/blackdrink_1024.jpg, owner=steveny, size=75625 bytes, last modified=Tue Mar 28 11:54:57 EST 2006
key=images/wallpapers/blackdrink_800.jpg, owner=steveny, size=47251 bytes, last modified=Tue Mar 28 11:54:59 EST 2006
key=images/wallpapers/bottle_1024.jpg, owner=steveny, size=56590 bytes, last modified=Tue Mar 28 11:55:00 EST 2006
  

With all the files on the service, I just needed an Apache rewrite rule to redirect visitors to the images’ new location:

RewriteEngine on
RewriteRule ^/images/wallpapers/(.*)$ http://s3.amazonaws.com/www.hpnotiq.com/images/wallpapers/$1 [R,L]
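
To check which request paths the rule would redirect, the same pattern can be mirrored in a few lines of Ruby. This is a rough sketch for experimenting outside Apache, not something the server runs; `S3_BASE`, `WALLPAPER_PATTERN`, and `s3_redirect_for` are illustrative names.

```ruby
# Ruby mirror of the RewriteRule above, for testing paths by hand.
S3_BASE = "http://s3.amazonaws.com/www.hpnotiq.com"
WALLPAPER_PATTERN = %r{\A/images/wallpapers/(.*)\z}

# Returns the redirect target for a request path, or nil when the rule
# would not match and Apache would serve the request locally.
def s3_redirect_for(path)
  match = WALLPAPER_PATTERN.match(path)
  return nil unless match
  "#{S3_BASE}/images/wallpapers/#{match[1]}"
end
```

Calling `s3_redirect_for("/images/wallpapers/blackdrink_1024.jpg")` yields the S3 URL for that image, while a path outside `/images/wallpapers/` returns `nil`.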
  

Bringing up my browser and looking at the headers, we can see that the request gets redirected to s3.amazonaws.com and that the Content-Type is set correctly:

GET /images/wallpapers/blackdrink_1024.jpg HTTP/1.1
Host: www.hpnotiq.com
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20051010 Firefox/1.0.7 (Ubuntu package 1.0.7)
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache

HTTP/1.x 302 Found
Date: Tue, 28 Mar 2006 17:49:27 GMT
Server: Apache
Location: http://s3.amazonaws.com/www.hpnotiq.com/images/wallpapers/blackdrink_1024.jpg
Content-Length: 326
Keep-Alive: timeout=2, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

----------------------------------------------------------

http://s3.amazonaws.com/www.hpnotiq.com/images/wallpapers/blackdrink_1024.jpg

GET /www.hpnotiq.com/images/wallpapers/blackdrink_1024.jpg HTTP/1.1
Host: s3.amazonaws.com
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20051010 Firefox/1.0.7 (Ubuntu package 1.0.7)
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache

HTTP/1.x 200 OK
x-amz-id-2: GxsekKOL2jdljb5K6/RWPlswvBhbNjYrP8klHs2IXGqwZcjzRQ3FIsEhPo/L/Gfe
x-amz-request-id: 6E6912FF565D7B04
Date: Tue, 28 Mar 2006 17:49:29 GMT
Last-Modified: Tue, 28 Mar 2006 16:54:57 GMT
Etag: "a58c016d36905ba89cf17ea99a574cb3"
Content-Type: image/jpeg
Content-Length: 75625
Server: AmazonS3
  

3 Responses to “Offloading web traffic using Amazon’s S3 service”

  1. Vlad Stesin Says:

    Is it working well for you? Have there been any outages or problems at all? We’re also considering doing something similar but it’s a little scary.

  2. steveny Says:

    It’s working fairly well. S3 had an outage over the weekend for a few hours that caused us some problems. I am working on a PHP script that will look at the headers from S3 and compare the MD5 sum they contain with the MD5 of the local file. If the hashes match, I will redirect to S3. This should handle both changed files and S3 downtime.

    Carson is also said to be working on an Apache module that will do all of what I am describing transparently. Look for more posts soon!
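
A rough sketch of the check described above, written in Ruby rather than PHP to match the upload script. It assumes the ETag header S3 returns is the hex MD5 of the object wrapped in double quotes, as it appears to be in the response shown in the post. `etag_to_md5` and `safe_to_redirect?` are hypothetical names, and fetching the ETag from S3 is left out.

```ruby
require 'digest/md5'

# S3 wraps the ETag value in double quotes; strip them before comparing.
def etag_to_md5(etag)
  etag.to_s.delete('"').downcase
end

# True when the local file's MD5 matches the ETag S3 reported.
# Pass nil for etag when S3 could not be reached or the object is missing,
# so the caller falls back to serving the local copy.
def safe_to_redirect?(local_path, etag)
  return false if etag.nil? || !File.file?(local_path)
  Digest::MD5.file(local_path).hexdigest == etag_to_md5(etag)
end
```

A request handler would redirect to S3 only when `safe_to_redirect?` returns true and serve the local file otherwise.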
