For those who missed it, here’s the text of my talk at
CloudCamp London yesterday. CloudCamp was great fun, thanks Chris!
Slide 1
Hi, I’m Luke from Hybrid Logic and I’m going to talk about filesystem snapshots and how they are useful in cloud computing.
Slide 2
A snapshot is an instantaneous point-in-time copy of your filesystem. The blocks that haven’t changed aren’t needlessly copied so you can store lots of snapshots with less disk space than you’d expect.
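To make the "unchanged blocks aren't copied" idea concrete, here is a toy sketch in Python. It is not how ZFS or any real filesystem is implemented; the class and method names (ToyFilesystem, shared_blocks and so on) are invented for illustration. A snapshot here is just a copy of the block map, so it holds references to existing blocks rather than duplicating their contents.

```python
class ToyFilesystem:
    """Toy copy-on-write store (illustrative only, not a real filesystem)."""

    def __init__(self):
        self.blocks = {}      # block number -> bytes object (the live view)
        self.snapshots = {}   # snapshot name -> block map at snapshot time

    def write(self, block_no, data):
        # A write rebinds the live entry to a new bytes object. Any snapshot
        # still references the old object, so the old data is preserved.
        self.blocks[block_no] = bytes(data)

    def snapshot(self, name):
        # Copy only the block map (references), never the block contents.
        self.snapshots[name] = dict(self.blocks)

    def rollback(self, name):
        # Restore the live view to what the snapshot recorded.
        self.blocks = dict(self.snapshots[name])

    def shared_blocks(self, name):
        # Blocks whose storage is still shared between snapshot and live view.
        snap = self.snapshots[name]
        return [n for n, data in snap.items() if self.blocks.get(n) is data]


if __name__ == "__main__":
    fs = ToyFilesystem()
    fs.write(0, b"hello")
    fs.write(1, b"world")
    fs.snapshot("monday")
    fs.write(1, b"WORLD")              # only block 1 diverges
    print(fs.shared_blocks("monday"))  # block 0 is still shared
    fs.rollback("monday")
    print(fs.blocks[1])                # the old contents are back
```

Only the block that changed after the snapshot costs extra space, which is why keeping many snapshots is cheaper than keeping many full copies.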
What are snapshots good for? Well, have you ever deleted important files by accident? Keeping snapshots lets you quickly “roll back time”.
Also, if you can copy your snapshots onto a different server, they can act as a great backup which you can recover very quickly from.
Cloud instances aren’t perfect, and data loss and instance failure are not unheard of in public clouds. Whole industries have grown up around dealing with the transient, ephemeral nature of cloud instances.
Being able to take a snapshot of your server and clone it brings a new level of manageability as well. If you’ve ever started up an EC2 instance, then you have – perhaps unwittingly – cloned a snapshot of a disk image.
Slide 3

Infrastructure is the underlying compute hardware, whether real or virtualised. With respect to storage, the infrastructure corresponds to the block device exposed by, say, EBS on EC2, or the physical hard disk in a non-cloud data centre.
The platform includes the operating system and, crucially, the filesystem which you choose to install on your cloud instances.
My claim is that it’s better to have the snapshotting done at the filesystem level, than to rely on the underlying infrastructure’s snapshotting capabilities, if they exist at all.
Slide 4
The primary benefit of doing this is the removal of vendor lock-in. By having snapshots at the platform level you can replicate data between servers in entirely different cloud infrastructures. For example, you can move data from EC2 to ElasticHosts and back again. Plus you can move snapshots in and out of the cloud entirely, allowing you to build hybrid clouds without expensive, complex virtualisation in your own data centre. In total, this reduces your dependence on any one provider, which reduces your risk of downtime.
Slide 5
Relying on infrastructure for your snapshots brings some other problems too. When you take a snapshot with something like EBS, because the infrastructure can’t communicate “up” to the platform, it has no way of telling the filesystem that the snapshot is about to happen. If the filesystem is mid-way through a write when the snapshot takes place, you’ll end up with a corrupt snapshot.
One solution is to use a “pausable” filesystem, such as XFS, so you can flush it to disk and block the flow of writes during a snapshot. But because you require interaction between the two different layers, the process of pausing the filesystem and taking the snapshot can take a long time, which has been known to crash MySQL.
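The pause-then-snapshot dance looks roughly like the Python sketch below. xfs_freeze -f and xfs_freeze -u are the real commands for suspending and resuming writes on an XFS mount point; everything else is an assumption for illustration: the helper name frozen, the injectable run parameter, and the take_infrastructure_snapshot placeholder, which stands in for whatever snapshot call your provider offers.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def frozen(mountpoint, run=subprocess.check_call):
    """Suspend writes to an XFS filesystem for the duration of the block.

    `run` is injectable so the sketch can be exercised without root
    privileges or a real XFS mount (hypothetical design choice).
    """
    run(["xfs_freeze", "-f", mountpoint])      # flush and block new writes
    try:
        yield
    finally:
        # Always unfreeze, even if the snapshot call fails, otherwise
        # every writer on the box stays blocked.
        run(["xfs_freeze", "-u", mountpoint])


if __name__ == "__main__":
    log = []

    def take_infrastructure_snapshot():
        # Placeholder for the provider's snapshot API call (assumption).
        log.append("snapshot")

    # Record the commands instead of executing them.
    with frozen("/data", run=log.append):
        take_infrastructure_snapshot()
    print(log)
```

Everything that writes to the filesystem is stalled between the freeze and the unfreeze, so the longer the infrastructure takes to acknowledge the snapshot, the longer applications such as MySQL sit blocked.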
ZFS allows the unification of these layers. Some Linux kernel hackers have described this as a “rampant layering violation”, but I prefer to think of it as an elegant refactoring, because in fusing these two layers together ZFS becomes faster and smarter, guaranteeing O(1), consistent filesystem snapshots.
Slide 6

XFS on EBS gives you vendor lock-in, and so do any other infrastructure-based solutions. You also can’t use it to do live migration of snapshots from one server to another, known as send/recv replication.
Btrfs is the Linux answer to the next-gen filesystem but it’s immature and not yet production ready.
Veritas does snapshots, but while it’s mature and stable, it’s very expensive.
This leaves ZFS, which is mature, stable and fast, and which allows you to send incremental changes between snapshots from one server to another. The only thing holding it back from mass adoption is the lack of a performant Linux kernel port. But ZFS for Linux is coming in December. I’ve tested the beta, and it’s promising.
Here’s an example of how to do an incremental send and receive of a snapshot with ZFS to keep a slave up-to-date with the filesystem on a master.
Slide 7

We create a ZFS filesystem called “bucket1”. We put some data into that filesystem and then we snapshot it.
Then we send the first snapshot in full over to the slave which receives it and saves it to disk.
Then we change some bytes in the data on the master, snapshot the filesystem again, and send an incremental diff over to the slave.
This means that only the blocks that have changed get sent from one machine to another, so it’s very efficient.
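The steps above can be sketched as a small Python helper that builds the command lines rather than running them. zfs snapshot, zfs send, zfs send -i and zfs recv are the real subcommands; the pool name tank, the snapshot names 1 and 2, the host name slave and the use of ssh as the transport are all assumptions made for illustration.

```python
import shlex


def snapshot_cmd(fs, name):
    """Command to snapshot filesystem `fs` as fs@name."""
    return shlex.join(["zfs", "snapshot", f"{fs}@{name}"])


def send_pipeline(fs, new, slave, old=None):
    """Shell pipeline that replicates fs@new to `slave`.

    With `old` set, only the changes between fs@old and fs@new are sent
    (zfs send -i); without it, the whole snapshot is sent.
    """
    send = ["zfs", "send"]
    if old is not None:
        send += ["-i", f"{fs}@{old}"]
    send.append(f"{fs}@{new}")
    recv = ["ssh", slave, "zfs", "recv", fs]   # ssh transport is an assumption
    return shlex.join(send) + " | " + shlex.join(recv)


if __name__ == "__main__":
    fs = "tank/bucket1"                        # hypothetical pool/filesystem
    print(snapshot_cmd(fs, "1"))
    print(send_pipeline(fs, "1", "slave"))             # first snapshot, in full
    print(snapshot_cmd(fs, "2"))
    print(send_pipeline(fs, "2", "slave", old="1"))    # incremental diff
```

The only difference between the full and the incremental transfer is the -i flag naming the snapshot the slave already has.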
Slide 8
We’re doing some cool stuff with this incremental ZFS replication. We’ve built an asynchronously replicated cluster filesystem on top of it, and we’re using that to build web clusters which have some nice properties. You can kill any machine, safe in the knowledge that a 10-second-old backup of all its data is stored across the cluster. By mounting many snapshots read-only, you can get horizontal scalability for read-heavy loads. And by picking the latest snapshot and stashing any others after a netsplit, you gain partition tolerance.
Furthermore, the incremental snapshots trick lets us automatically bring offline machines up to date from any timestamp, efficiently sending only the data which has changed between the time the machine went offline to when it came back.
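One way to picture the catch-up step is the Python sketch below. It is not our actual implementation: the function name catch_up_send and the idea of comparing two plain lists of snapshot names are assumptions for illustration. It finds the newest snapshot both machines hold and builds a zfs send -I command, which sends every intermediate snapshot from that point up to the latest one.

```python
def catch_up_send(fs, master_snaps, slave_snaps):
    """Build the `zfs send` argument list that brings a slave up to date.

    master_snaps: snapshot names on the master, ordered oldest to newest.
    slave_snaps:  snapshot names the returning machine still has.
    Returns None when the slave already holds the newest snapshot.
    """
    if not master_snaps:
        raise ValueError("master has no snapshots to send")
    newest = master_snaps[-1]
    have = set(slave_snaps)
    if newest in have:
        return None                            # nothing to do
    # Walk backwards to find the most recent snapshot both sides share.
    for snap in reversed(master_snaps):
        if snap in have:
            return ["zfs", "send", "-I", f"{fs}@{snap}", f"{fs}@{newest}"]
    # No common snapshot: fall back to sending the newest one in full.
    return ["zfs", "send", f"{fs}@{newest}"]


if __name__ == "__main__":
    # Hypothetical timestamps; the slave went offline after t2.
    print(catch_up_send("tank/bucket1", ["t1", "t2", "t3", "t4"], ["t1", "t2"]))
```

The amount of data transferred depends on how much changed while the machine was away, not on the total size of the filesystem.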
In conclusion, ZFS lets you do all this. It already runs on FreeBSD (our primary platform) and it’s coming to Linux in December, so check it out.
Slide 9
Thanks!
Follow us on Twitter: @hybridcluster / @lmarsden
Native ZFS on Linux, GA in December 2010: zfs.kqinfotech.com