Amazon EC2 persistent storage
14 Apr 2008
Werner Vogels, CTO of Amazon, has just written about Amazon's latest feature addition: persistent storage for Amazon EC2. The announcement comes a few weeks after Amazon announced static IPs (Elastic IPs) and Availability Zones, the ability to specify the location of an instance on creation.
Persistent storage for Amazon EC2 will be offered in the form of storage volumes which you can mount into your EC2 instance as a raw block storage device. It basically looks like an unformatted hard disk. Once you have the volume mounted for the first time you can format it with any file system you want or if you have advanced applications such as high-end database engines, you could use it directly.
Reading the post, the technology sounds like some sort of home-grown SAN. There are some limitations, however: a volume can only be mounted by one instance at a time and, more annoyingly, it is only available within one Availability Zone. One nice and unexpected feature is the ability to store snapshots of the volume to S3 and then create volumes in other Availability Zones from that snapshot.
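To make those constraints concrete, here is a minimal toy model in Python. It is only a sketch of the behaviour as I read it from the post, not Amazon's API: the class names (`Volume`, `Snapshot`, `Instance`), the method names and the zone labels are all made up for illustration.

```python
class Instance:
    """Hypothetical stand-in for a running EC2 instance."""

    def __init__(self, name, zone):
        self.name = name
        self.zone = zone


class Snapshot:
    """Hypothetical point-in-time copy of a volume (stored in S3 in the real service)."""

    def __init__(self, blocks):
        # Copy the data so later writes to the source volume do not leak in.
        self.blocks = dict(blocks)

    def create_volume(self, zone):
        # A snapshot is not tied to a zone, so the new volume can live anywhere.
        return Volume(zone, self.blocks)


class Volume:
    """Hypothetical persistent block volume, tied to a single zone."""

    def __init__(self, zone, blocks=None):
        self.zone = zone
        self.blocks = dict(blocks or {})
        self.attached_to = None

    def attach(self, instance):
        # Constraint 1: only one instance may have the volume at a time.
        if self.attached_to is not None:
            raise RuntimeError("volume is already attached to " + self.attached_to.name)
        # Constraint 2: the volume is only usable from its own zone.
        if instance.zone != self.zone:
            raise RuntimeError("volume and instance are in different zones")
        self.attached_to = instance

    def detach(self):
        self.attached_to = None

    def snapshot(self):
        return Snapshot(self.blocks)
```

In this model, getting data into another zone means calling `volume.snapshot().create_volume("zone-b")` and attaching the resulting copy to an instance there. The two volumes are independent from that point on, which is exactly why this is not a shared file system.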
I think it is an important step forward, and the pace of development at Amazon is impressive, but I'm really getting annoyed by features that are missing something or come with constraints. What gets to me the most is that only one instance can mount a given volume at any given time. A truly distributed file system that allowed multiple running instances to use it concurrently would really blow me away.
I guess the intended use case is really a database server: you would have one storage device per zone, corresponding to the MySQL slave in that zone. Another scenario would be an Apache Solr master instance that uses the volume as persistent storage for the Lucene index and replicates out to the slaves, which would just store their copy on the transient EC2 drive.