Stability, performance and rendering improvements

November 19th, 2010

It’s been a busy first week of the beta here at Hybrid Logic HQ, and we’re very pleased by the response we’ve had to the start of the beta — thank you! There’s been a buzz of activity on the forums and we love it when you give us feedback, so please carry on experimenting with our software and tell us what you think.

Along with the awesome feedback from yourselves, which we are taking careful note of, we’ve also been making some improvements of our own. Here’s a quick breakdown of the fixes and improvements which have now been deployed across all your clusters:

  • At the deployment level, we can now add new instances to an existing cluster. This means we can unintrusively upgrade a cluster to include new or better spec machines (irrespective of physical location) so that you can scale your hosting operation seamlessly.
  • The “God Pod” has had some significant responsiveness improvements. When we first launched, it wasn’t the most responsive user experience in the world. It’s much quicker and more accurate now, so give it a go!
  • We’ve made significant improvements to the stability of the core web hosting platform. We’ve solved several problems which were causing “Default site on X” error messages where your websites should have been. Another bug was causing databases to sometimes become inaccessible, and we’ve solved that too. Stability is looking a lot better.
  • We’ve improved the intelligence of the core load balancing algorithms, meaning that the decisions to move a site from one server to another (due to load) are now a fair bit smarter, and you should see fewer unnecessary load balancing events. As ever, there’s still room for improvement.
  • We’ve enabled swap on all your machines, so that if your 1.4GB memory does ever get fully used up, your instances will just become slow for a few minutes as they recover, rather than falling over or crashing completely.
  • When a site is about to be moved from one server to another, what happens internally is that requests for that site get “paused” by the distributed proxying layer which runs on top of the web and database servers. This pausing happens so that during the transfer of the site or database from one server to another, none of the requests return error messages — rather, the user just experiences a slow page load. The Load Balancing Diagram in the God Pod now shows a dotted line around a site when it is paused. This gives you a better insight into what’s happening within the cluster during the process of moving sites from one server to another to keep your servers healthy and balanced.
  • Performance has been improved massively. Previously, load balancing events caused sites to be blocked for up to 20 seconds. We’ve managed to get this down to 3-6 seconds in most cases, resulting in fewer requests building up. We’ve also made some code changes which have made everything feel a lot snappier. We will be continuing to optimise for performance over the coming weeks and months — this is only the start!
  • Numerous tweaks and improvements to functionality in the Control Panel have also been deployed (more details on this will be posted to our forum in due course).
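
The request “pausing” described in the list above can be pictured with a small sketch. This is not the actual proxy code: the class name, the backend callables and the site name are all made up for illustration, and the real proxying layer works on live network connections rather than in-memory queues. The idea it shows is simply that requests arriving during a move are held and then replayed against the new server, so visitors see a slow page load instead of an error.

```python
from collections import defaultdict, deque


class PausingProxy:
    """Toy model of a proxy that holds requests while a site is being moved."""

    def __init__(self, backends):
        # backends maps a site name to a callable(request) -> response.
        # Both are hypothetical stand-ins for real servers.
        self.backends = dict(backends)
        self.paused = set()
        self.held = defaultdict(deque)

    def pause(self, site):
        """Start holding requests for this site (the move is about to begin)."""
        self.paused.add(site)

    def handle(self, site, request):
        """Return a response, or None if the request was held for later."""
        if site in self.paused:
            self.held[site].append(request)
            return None
        return self.backends[site](request)

    def resume(self, site, new_backend):
        """Point the site at its new server and replay held requests in order."""
        self.backends[site] = new_backend
        self.paused.discard(site)
        responses = []
        while self.held[site]:
            responses.append(new_backend(self.held[site].popleft()))
        return responses


proxy = PausingProxy({"example.com": lambda req: f"server-a:{req}"})
first = proxy.handle("example.com", "/index.php")    # served by server A
proxy.pause("example.com")                            # move begins
held = proxy.handle("example.com", "/about.php")      # held, no error returned
replayed = proxy.resume("example.com", lambda req: f"server-b:{req}")
```

In this toy version the held request is answered by the new server as soon as the move completes, which is the behaviour the dotted line in the Load Balancing Diagram is meant to indicate.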

We can’t wait to see how much better we can make it next week!

Beta launch press release

November 17th, 2010

HEADLINE: Beta Testing Begins On A New Cloud Web Hosting Platform

SUMMARY: Hybrid Logic Ltd announce the start of beta testing for their PaaS (platform as a service) web cluster. Hybrid Web Cluster offers standard LAMP web hosting with the redundancy, fault tolerance and scalability of the cloud.

November 17, 2010 – Hybrid Logic Ltd, a United Kingdom-based company, today announced the commencement of the first round of beta testing of their PaaS (platform-as-a-service) software product, Hybrid Web Cluster, in partnership with CloudSigma AG of Zurich, Switzerland.

Hybrid Web Cluster is a cloud web hosting platform designed to run over any number of real physical servers, cloud server instances or a combination of the two. Due to recent advances in file system technology (ZFS), in combination with advances made by the software developers at Hybrid Logic Ltd, it is now possible to offer standard LAMP web hosting with a revolutionary level of redundancy, fault tolerance and scalability, at a price compatible with the commodity web hosting market.

ZFS is one of the key pieces of technology that enables Hybrid Web Cluster to offer near-instant data replication. This means that if any one of the nodes in your cluster goes down, some other nodes will always have a copy of every website hosted on the failed server no more than a few seconds old. The web cluster will automatically and instantly reorganise itself so that your websites never experience downtime – a failed server results in a slightly slow page load for a few seconds rather than hours of downtime.

Because ZFS support is crucial to the implementation of Hybrid Web Cluster, a cloud infrastructure provider willing to support the latest version of FreeBSD was essential. ZFS support is also available in Linux and Solaris, but both options have significant drawbacks: Solaris is too far from Linux to feel comfortable for most users, and Linux’s ZFS support is currently too slow to be usable in a production environment. The CloudSigma product, with its support for FreeBSD 8.1 and ZFS, provided the ideal infrastructure choice for Hybrid Web Cluster, and after talking to the friendly team at CloudSigma, a partnership agreement was reached which sees CloudSigma sponsoring the Hybrid Web Cluster beta testing programme.

Patrick Baillie, CEO of CloudSigma, commented: “Hybrid Web Cluster is an exciting product and use of our cloud. It provides the potential to generate an additional revenue stream from our existing infrastructure investment. We see this product expanding our customer base by offering a more managed approach from our core offering. The white label support and sophisticated integrated billing and accounting system gives us the flexibility we require.”

Today the first round of beta testing began; 15 clusters have been provisioned and the first beta testers have each received login credentials for their very own test web cluster. Beta testers have full administrative control over their own cluster running on CloudSigma’s infrastructure: it is possible to set up real websites (including WordPress blogs), watch a live graphical visualisation of the load balancing and replication algorithms at work, and pull the plug on a server and watch how the sites that are hosted on it stay live. It is also possible to generate load on individual websites and watch how the cluster’s load balancing algorithms respond. Beta testers can also explore our next-generation web hosting control panel, which includes advanced ticketing and billing systems, automated domain registration, white label support and full internationalization.

If you would like to learn more about Hybrid Web Cluster, watch video demonstrations, or sign up for the next round of beta testing to try out your own web cluster, please visit hybrid-cluster.com.

About Hybrid Web Cluster
Hybrid Web Cluster is a Platform-as-a-Service (PaaS) software product developed by Hybrid Logic Ltd, a company based in London, United Kingdom. The software provides commodity LAMP web hosting in a distributed and fault-tolerant manner across a cluster of servers. You can run a web cluster across multiple physical locations, using a mix of virtualised cloud infrastructure and physical hardware to build a true Hybrid Cloud.

About CloudSigma AG
CloudSigma AG, based in Zürich, Switzerland provides a pure Infrastructure-as-a-Service (IaaS) platform offering high security, flexible cloud servers. Our innovative web console as well as API are designed to make cloud computing and cloud hosting straightforward. High availability redundant infrastructure is backed up by a generous Service Level Agreement that covers not only availability but also performance.

CloudSigma’s unique approach extends completely open software and networking layers to customers, allowing them to run any operating system and applications they choose and to implement their own customised networking policies. CloudSigma bills by each raw resource (CPU, RAM, storage etc.) individually in a transparent, unbundled manner.

The God Pod is ready…

November 12th, 2010

Watch this space.

Beta Programme starting this week!

November 8th, 2010

Just sent this out to our elite team of beta testers:

We are very pleased to announce that we will be spinning up your beta cluster during the course of this week!

We are staggering the release of the beta clusters so that we can give everyone some personal attention. If you haven’t received your login details by the end of the week, don’t worry, we will be working on it. We expect to have all the clusters up and running by next Wednesday at the latest.

We are also putting together a video walkthrough which we’ll be launching this Wednesday. This will give you a good overview of the whitelabel cloud deployment technology that we’ve been building as well as what you can do with our web hosting platform.

It’s a beta preview, not the final version

As this is a beta preview, not everything is finished and fully working, but we do have more than the basics in place. You will be able to set up WordPress blogs with one click and upload your own PHP/MySQL websites via FTP. You can set up fully-replicated databases through our Control Panel. You will also have your own set of nameservers so that you can try pointing any real live domains you wish at your web cluster — although we’d recommend not deploying your company’s live website to your test cluster just yet!

We also have a helpdesk and billing system which you’ll have a chance to get to grips with. You’ll also be able to add web hosting reseller users with their own logins and change the colour scheme and header image for complete white-label brandability, either globally for your whole cluster or on a per-user basis.

And here’s the exciting bit: you’ll get a chance to play with our distributed load-balancing and failure tolerance algorithms in real-time. We have a really slick web interface in the works for viewing the live state of your cluster. You’ll also be able to drag up and down sliders to adjust the load (requests per second) on the different websites you’ve set up. And you’ll be able to pull the plug on a server and see that within seconds, the cluster reconfigures itself so that all your websites stay online. You can then turn the “failed” server back on and watch how the cluster redistributes load to it when it recovers.

But it’s not completely finished: it won’t do email (yet), it won’t do SSL (yet), and it might not always look gorgeous. It will definitely be a bit rough around the edges. This is where we need your help.

How you can help us during the beta

During the beta programme, we’ll be working flat-out to react quickly to any problems or issues you report. We’ll be pushing a lot of code updates and adding features in response to your feedback from one day to the next. This is a crucial part of our development process and we’re excited to have you on board.

Every element in our Control Panel has a flag icon next to it, which you can click to tell us what you think of it. If anything breaks please tell us by clicking on the flag and typing a quick description of what happened. This will automatically raise a ticket in your cluster’s helpdesk, which will be configured to notify us here at Hybrid Logic HQ.

We’ll also be launching a public forum this Wednesday. We really want to foster a community around the beta so please do sign up when we send you the link. During the beta programme, you’ll be able to contact us by clicking the flags, by logging on to the forum (our preferred way for us to discuss feature requests), or we can provide email, phone or Skype support. We’ll send you all the contact details at the same time that we send round your initial Control Panel logins.

It comes in two parts

You’ll actually get login credentials for two systems: your cluster control panel (the CP), and our master control panel (the Metapanel). If you’ve ever bought a dedicated server, you’ll be familiar with the concept: your cluster control panel is hosted on your cluster and lets you add users, websites and databases. The master control panel is where you log in to manage the cluster itself. The way I like to think of it is that the Control Panel is looking at the cluster from within, while the Metapanel is looking at the cluster from above.

We’ll give you full administrative access to your CP and a regular user account on the Metapanel. Both will be fully skinnable and support setting up reseller accounts. This should give you some idea of the possibilities for reselling entire clusters to your customers as well as reselling cloud web hosting on your own cluster. This may be of particular interest to you IaaS guys out there.

We’ll email you again mid-week with an update on our progress and the video for you to watch. If you’ve got any immediate questions, feel free to hit reply to this email.

Thank you for getting involved. We couldn’t do it without you.

Lightning talk at CloudCamp London

October 21st, 2010
For those who missed it, here’s the text of my talk at CloudCamp London yesterday. CloudCamp was great fun, thanks Chris!

Slide 1

Hi, I’m Luke from Hybrid Logic and I’m going to talk about filesystem snapshots and how they are useful in cloud computing.

Slide 2

A snapshot is an instantaneous point-in-time copy of your filesystem. The blocks that haven’t changed aren’t needlessly copied so you can store lots of snapshots with less disk space than you’d expect.

What are snapshots good for? Well, have you ever deleted important files by accident? Keeping snapshots lets you quickly “roll back time”.
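
To make the “unchanged blocks aren’t copied” point concrete, here is a toy model in Python. It is only a sketch of the copy-on-write idea, not how ZFS is implemented: the class and method names are invented, and a real filesystem tracks blocks on disk rather than in dictionaries. A snapshot here copies the table of block references, never the block data, and rolling back just swaps the table back in.

```python
class ToyFilesystem:
    """Illustrative copy-on-write store: snapshots share unchanged blocks."""

    def __init__(self):
        self.blocks = {}      # block number -> immutable bytes
        self.snapshots = {}   # snapshot name -> {block number -> bytes}

    def write(self, block_no, data):
        # Writing replaces the reference for one block only; any snapshot
        # still points at the old bytes object.
        self.blocks[block_no] = bytes(data)

    def snapshot(self, name):
        # Copy the block table (references), not the data itself.
        self.snapshots[name] = dict(self.blocks)

    def rollback(self, name):
        # "Roll back time" by restoring the block table from the snapshot.
        self.blocks = dict(self.snapshots[name])


fs = ToyFilesystem()
fs.write(0, b"config")
fs.write(1, b"important data")
fs.snapshot("monday")

fs.write(1, b"oops, overwritten")   # the accident
shared = fs.snapshots["monday"][0] is fs.blocks[0]   # block 0 is still shared
fs.rollback("monday")
recovered = fs.blocks[1]
```

Only block 1 ever exists in two versions; block 0 is stored once no matter how many snapshots refer to it.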

Also, if you can copy your snapshots onto a different server, they can act as a great backup which you can recover very quickly from.

Cloud instances aren’t perfect, and data loss or instance failure is not unheard of in public clouds. Whole industries have grown up around dealing with the transient, ephemeral nature of cloud instances.

Being able to take a snapshot of your server and clone it brings a new level of manageability as well. If you’ve ever started up an EC2 instance, then you have – perhaps unwittingly – cloned a snapshot of a disk image.

Slide 3

The cloud storage model

Infrastructure is the underlying compute hardware, whether real or virtualised. With respect to storage, the infrastructure corresponds to the block device exposed by, say, EBS on EC2, or the physical hard disk in a non-cloud data centre.

The platform includes the Operating System and, crucially, the Filesystem which you choose to install on your cloud instances.

My claim is that it’s better to have the snapshotting done at the filesystem level, than to rely on the underlying infrastructure’s snapshotting capabilities, if they exist at all.

Slide 4

The primary benefit of doing this is the removal of vendor lock-in. By having snapshots at the platform level you can replicate data between servers in entirely different cloud infrastructures. For example, you can move data from EC2 to ElasticHosts and back again. Plus you can move snapshots in and out of the cloud entirely, allowing you to build hybrid clouds without expensive, complex virtualisation in your own data centre. In total, this reduces your dependence on any one provider, which reduces your risk of downtime.

Slide 5

Relying on infrastructure for your snapshots brings some other problems too. When you take a snapshot with something like EBS, because the infrastructure can’t communicate “up” to the platform, it has no way of telling the filesystem that the snapshot is about to happen. If the filesystem is mid-way through a write when the snapshot takes place, you’ll end up with a corrupt snapshot.

One solution is to use a “pausable” filesystem, such as XFS, so you can flush it to disk and block the flow of writes during a snapshot. But because you require interaction between the two different layers, the process of pausing the filesystem and taking the snapshot can take a long time, which has been known to crash MySQL.
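
As a rough sketch of that freeze-then-snapshot dance: `xfs_freeze -f` and `xfs_freeze -u` are the real commands for suspending and resuming writes on an XFS mount, but everything else below is assumed for illustration. The `/data` mount point is made up, and `take_snapshot` stands in for whatever infrastructure call actually triggers the block-level snapshot. The commands are passed to a `run` callable so the example can be exercised without touching a real filesystem.

```python
from contextlib import contextmanager
import subprocess


@contextmanager
def frozen(mountpoint, run=subprocess.check_call):
    """Suspend writes to an XFS mount, and always resume them afterwards."""
    run(["xfs_freeze", "-f", mountpoint])      # flush and block new writes
    try:
        yield
    finally:
        run(["xfs_freeze", "-u", mountpoint])  # unfreeze even if the snapshot fails


def consistent_snapshot(mountpoint, take_snapshot, run=subprocess.check_call):
    """Take an infrastructure-level snapshot while the filesystem is quiescent."""
    with frozen(mountpoint, run=run):
        return take_snapshot()


# Dry run: record what would happen instead of executing anything.
log = []


def fake_snapshot():
    # Hypothetical placeholder for the provider's snapshot API call.
    log.append("take infrastructure snapshot")
    return "snap-0001"


snapshot_id = consistent_snapshot("/data", fake_snapshot, run=log.append)
```

Every write to the mount is blocked for as long as the body of the `with` statement takes, which is why a slow snapshot call at this point can hurt anything that writes constantly, such as a database.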

ZFS allows the unification of these layers. Some Linux kernel hackers have described this as a “rampant layering violation”, but I prefer to think of it as an elegant refactoring, because in fusing these two layers together ZFS becomes faster and smarter, guaranteeing O(1), consistent filesystem snapshots.

Slide 6

Comparison: filesystems with snapshots

XFS on EBS gives you vendor lock-in and so do any other infrastructure-based solutions. You also can’t use it to do live migration of snapshots from one server to another (known as send/recv replication).

Btrfs is the Linux answer to the next-gen filesystem but it’s immature and not yet production ready.

Veritas does snapshots, but while it’s mature and stable, it’s very expensive.

This leaves ZFS, which is mature, stable and fast, and which allows you to send incremental changes between snapshots from one server to another. The only thing holding it back from mass adoption is the lack of a performant Linux kernel port. But ZFS for Linux is coming in December. I’ve tested the beta, and it’s promising.

Here’s an example of how to do an incremental send and receive of a snapshot with ZFS to keep a slave up-to-date with the filesystem on a master.

Slide 7

Worked example of incremental ZFS replication

We create a zfs filesystem called “bucket1”. We put some data into that filesystem and then we snapshot it.

Then we send the first snapshot in full over to the slave which receives it and saves it to disk.

Then we change some bytes in the data on the master, snapshot the filesystem again, and send an incremental diff over to the slave.

This means that only the blocks that have changed get sent from one machine to another, so it’s very efficient.
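
The steps above might look roughly like the following. The `zfs snapshot`, `zfs send`, `zfs send -i` and `zfs receive` subcommands are standard ZFS, but the pool name `tank`, the snapshot names and the host name `slave` are assumptions for this sketch, not taken from the slide. The Python only builds the command lines as strings; nothing is executed.

```python
import shlex

FILESYSTEM = "tank/bucket1"   # assumed pool name "tank"
SLAVE_HOST = "slave"          # assumed SSH host name for the slave


def snapshot_cmd(fs, snap):
    return ["zfs", "snapshot", f"{fs}@{snap}"]


def full_send_cmd(fs, snap):
    return ["zfs", "send", f"{fs}@{snap}"]


def incremental_send_cmd(fs, old_snap, new_snap):
    # -i sends only the blocks that changed between the two snapshots.
    return ["zfs", "send", "-i", f"{fs}@{old_snap}", f"{fs}@{new_snap}"]


def receive_cmd(host, fs):
    return ["ssh", host, "zfs", "receive", fs]


def pipeline(send, recv):
    """Render 'send | receive' as a shell command line."""
    return f"{shlex.join(send)} | {shlex.join(recv)}"


steps = [
    shlex.join(["zfs", "create", FILESYSTEM]),
    shlex.join(snapshot_cmd(FILESYSTEM, "snap1")),
    pipeline(full_send_cmd(FILESYSTEM, "snap1"), receive_cmd(SLAVE_HOST, FILESYSTEM)),
    # ... change some bytes on the master here ...
    shlex.join(snapshot_cmd(FILESYSTEM, "snap2")),
    pipeline(incremental_send_cmd(FILESYSTEM, "snap1", "snap2"),
             receive_cmd(SLAVE_HOST, FILESYSTEM)),
]

for step in steps:
    print(step)
```

One caveat worth knowing: the incremental receive expects the slave’s copy to be unmodified since `snap1`. If it has drifted, `zfs receive -F` rolls it back to the most recent snapshot before applying the stream.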

Slide 8

We’re doing some cool stuff with this incremental zfs replication. We’ve built an asynchronously replicated cluster filesystem on top of it and we’re using that to build web clusters which have these nice properties. You can kill any machine safely in the knowledge that a 10-second old backup of all its data will be stored safely across the cluster. By mounting many snapshots read-only, you can get horizontal scalability for read-heavy loads. And by picking the latest snapshot and stashing any others after a netsplit, you gain partition tolerance.

Furthermore, the incremental snapshots trick lets us automatically bring offline machines up to date from any timestamp, efficiently sending only the data which has changed between the time the machine went offline to when it came back.
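
One way to picture the catch-up step is a function that compares the snapshot lists on the two machines and decides what to send. This is a guess at the general shape, not our cluster code: the function name is invented, and it assumes both lists are ordered oldest to newest and that snapshots with the same name are identical.

```python
def catch_up_plan(master_snaps, slave_snaps):
    """Decide what to send to bring a returning machine up to date.

    Returns None if nothing needs sending, otherwise (base, target):
    an incremental send from `base` to `target`, or a full send of
    `target` when `base` is None (no snapshot in common).
    """
    if not master_snaps:
        return None
    target = master_snaps[-1]
    on_slave = set(slave_snaps)
    # Walk back from the newest master snapshot to the latest one the
    # slave already holds.
    base = next((s for s in reversed(master_snaps) if s in on_slave), None)
    if base == target:
        return None   # already up to date
    return (base, target)


# The slave went offline after 12:00:10 and has just come back.
master = ["12:00:00", "12:00:10", "12:00:20", "12:00:30"]
slave = ["12:00:00", "12:00:10"]
plan = catch_up_plan(master, slave)
```

Here the plan is an incremental send from the 12:00:10 snapshot to the 12:00:30 one, so only the blocks changed while the machine was away cross the network.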

In conclusion, ZFS lets you do all this, it already runs on FreeBSD (our primary platform) and it’s coming to Linux in December, so check it out.

Slide 9

Thanks!

Follow us on Twitter: @hybridcluster / @lmarsden

Native ZFS on Linux, GA in December 2010: zfs.kqinfotech.com

Parallel spin-up

October 8th, 2010

Here’s a little taste of where we’re going with automatically spinning up web clusters on our shiny new cloud infrastructure:

It will be yours to play with soon :-)

No longer a critical event

August 6th, 2010

[Image: web cluster redundancy]

Does this scare the hell out of you?

It used to scare me too, until I started using Hybrid Web Cluster. Now this isn’t a critical event any more. I can be developing on three virtual machines, one of them crashes, and I don’t even notice! All my websites and databases just carry on running as normal.

Find out more…

Press release: Beta testing due to begin in October

August 1st, 2010

HEADLINE: New cloud web hosting platform looking for beta testers

Hybrid Web Cluster is a cloud web hosting platform designed to run either on real servers, cloud server instances or a combination of the two. Due to some key enabling technologies becoming available (particularly the ZFS filesystem) combined with technology advances made by the cluster development team, this new product is able to offer a number of features not previously seen in products of this type:

  • A user-configurable level of replication redundancy – Near-live backups can be stored on any number of nodes in the cluster and, in the event of a node failure, service is automatically and instantly restored from a backup no more than 10 seconds old. In the event of an accidental deletion, files can be quickly and easily recovered by “rolling back time”, a feature provided in the web hosting control panel.
  • Complete fault tolerance and no single point of failure – Any node (or several nodes) can fail and the cluster will automatically repair itself. Hardware failures are no longer critical; replacements can be carried out as part of a maintenance schedule rather than as an emergency event.
  • A high degree of scalability – Standard LAMP web applications can run unmodified and scale from zero resource usage to requiring two dedicated servers (one for database and one for web). This scaling happens automatically and instantly to cope with variations in demand. With minor modifications to the application code, next generation multi-master database technology allows the cluster to scale even beyond the two-server-per-site limitation and handle extremely high traffic loads.

After several years in development this new web cluster system is due to begin the first round of beta testing in October 2010 and Hybrid Logic Ltd. is seeking interested parties to try the beta version for free, initially on cloud infrastructure, but later stand-alone distributions will be available. Beta testers will be offered a discount on the full price of the system after its launch date.

Awesome profile visualisation

June 26th, 2010

This is a call profile graph of my latest invention, AwesomeProxy. This lets us move sites and databases between servers without a single failed HTTP request. Working on some optimisations, I wanted to see how much time was spent in each function call.

Gprof2Dot outputs pretty awesome graphs. This is what you get when you run AwesomeProxy for three minutes at 10 requests per second. I really like how you can see the structure of the code :)
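
If you want to try something similar, here is a minimal sketch assuming a Python program profiled with the standard library’s cProfile. That is an assumption for the example; the post doesn’t say how the AwesomeProxy profile was collected. The `serve` and `handle_request` functions and the stats file name are placeholders.

```python
import cProfile
import os
import tempfile


def handle_request(n):
    """Placeholder for real per-request work."""
    return sum(i * i for i in range(n))


def serve(requests):
    """Placeholder for the program being profiled."""
    return [handle_request(1000) for _ in range(requests)]


stats_path = os.path.join(tempfile.gettempdir(), "awesomeproxy.pstats")

profiler = cProfile.Profile()
profiler.enable()
results = serve(50)
profiler.disable()
profiler.dump_stats(stats_path)   # binary stats file that gprof2dot can read

# Then, from a shell (requires gprof2dot and Graphviz to be installed):
#   gprof2dot -f pstats awesomeproxy.pstats | dot -Tpng -o profile.png
```

Each box in the resulting graph is a function, and the edges show who calls whom and how much of the total time flows along each call.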

Variations on a logo

June 1st, 2010

Well, I’ve just put the new logo live. I’ve also done some smaller images which won’t look out of place on CloudBook.

What do you think?