Editor’s Note: As of January 2022, iland is now 11:11 Systems, a managed infrastructure solutions provider at the forefront of cloud, connectivity, and security. As a legacy iland.com blog post, this article likely contains information that is no longer relevant. For the most up-to-date product information and resources, or if you have further questions, please refer to the 11:11 Systems Success Center or contact us directly.
In this post, we want to focus on the availability aspects of migrating a typical application to the cloud.
One of the advantages of virtualization is the ability to encapsulate the contents of a physical server into a number of files on shared storage. In the event of the physical virtualization host failing, the VM can automatically restart on another available host in a crash-consistent manner. This is known as HA (High Availability) and protects VMs in the case of unplanned downtime; it just requires a reboot.
Capabilities such as vMotion or Live Migration allow running VMs to be migrated from one host to another without loss of service. This is great for planned downtime when VMs can be migrated away from a host to allow maintenance to be performed on the host without affecting production applications.
There is also the concept of dynamic resource scheduling, where the initial optimal placement of VMs is supported through a cluster of hosts. Those VMs can then be moved around using vMotion to balance the load across the cluster from a CPU and RAM perspective. The least active VMs are usually the ones that get moved around in this scenario.
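To make the balancing idea concrete, here is a minimal sketch of a greedy rebalancer. It is not how any particular hypervisor implements resource scheduling. The `VM`, `Host`, and `rebalance` names, the single `cpu_load` figure, and the `tolerance` parameter are all assumptions made for illustration. The sketch repeatedly moves the least active VM from the busiest host to the least busy one until the hosts are roughly level.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    cpu_load: float  # hypothetical load figure, e.g. GHz in use


@dataclass
class Host:
    name: str
    vms: list = field(default_factory=list)

    @property
    def load(self) -> float:
        return sum(vm.cpu_load for vm in self.vms)


def rebalance(hosts, tolerance=1.0):
    """Greedy sketch: move the least active VM off the busiest host
    until the busiest and least busy hosts are within `tolerance`."""
    moves = []
    while True:
        busiest = max(hosts, key=lambda h: h.load)
        idlest = min(hosts, key=lambda h: h.load)
        gap = busiest.load - idlest.load
        if gap <= tolerance or not busiest.vms:
            break
        candidate = min(busiest.vms, key=lambda vm: vm.cpu_load)
        # Stop if moving the candidate would not narrow the gap.
        if not 0 < candidate.cpu_load < gap:
            break
        busiest.vms.remove(candidate)
        idlest.vms.append(candidate)
        moves.append((candidate.name, busiest.name, idlest.name))
    return moves
```

Each tuple in the returned list stands for one live migration (VM name, source host, destination host). A real scheduler would also weigh RAM, migration cost, and affinity rules, which this sketch ignores.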
Many IT teams have become accustomed to these capabilities for their traditional applications over the past few years and, understandably, expect the same facilities when migrating apps to the cloud.
However, hyperscale clouds, such as Amazon Web Services and Microsoft Azure, have used a different methodology from the outset: that of “design for failure.” When designing applications for the cloud (typically new builds), the idea is to assume that your VMs are going to fail regularly and to design around that using availability zones or availability sets. This ensures that there is always more than one VM carrying out a particular function, because at least one of them may be down at any given time.
Indeed, single-VM applications will not qualify for an SLA guarantee, and Microsoft recommends not using single VMs for this very reason.
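The arithmetic behind running more than one VM per function can be sketched as follows. This assumes that VM failures are independent and that the service is up whenever at least one replica is up, which is a simplification. The 99% figure in the usage note is purely illustrative and is not taken from any provider's SLA, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 8760


def combined_availability(per_vm_availability: float, replicas: int) -> float:
    """Availability of a function served by `replicas` VMs, assuming
    independent failures and that one healthy VM is enough."""
    if not 0.0 <= per_vm_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    # The function is down only when every replica is down at once.
    return 1.0 - (1.0 - per_vm_availability) ** replicas


def downtime_hours_per_year(availability: float) -> float:
    """Expected yearly downtime implied by an availability figure."""
    return (1.0 - availability) * HOURS_PER_YEAR
```

Under these assumptions, `combined_availability(0.99, 1)` implies roughly 87.6 hours of downtime a year, while `combined_availability(0.99, 2)` comes to about 0.9999, or under an hour a year. That improvement is bought by paying for a second VM.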
In the case of Azure, availability sets are designed using the concept of fault and update domains. Fault domains define the group of virtual machines that share a common power source and network switch, while update domains indicate groups of virtual machines and underlying physical hardware that can be rebooted at the same time.
In this way, VMs are grouped together to try to protect against unplanned failure in the event of host issues or reboots following patching of the virtualization software. As the underlying hosts are running Hyper-V on Windows Server, they are subject to the usual “Patch Tuesday” updates.
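The grouping described above can be illustrated with a small sketch. The round-robin assignment, the default counts of 2 fault domains and 5 update domains, and the `place_vms` and `survivors` helpers are assumptions for illustration only. They are not Azure's actual placement algorithm or API.

```python
from collections import namedtuple

Placement = namedtuple("Placement", ["vm", "fault_domain", "update_domain"])


def place_vms(vm_names, fault_domains=2, update_domains=5):
    """Spread VMs round-robin across fault and update domains
    (hypothetical placement policy, for illustration)."""
    return [
        Placement(name, i % fault_domains, i % update_domains)
        for i, name in enumerate(vm_names)
    ]


def survivors(placements, failed_fault_domain=None, rebooting_update_domain=None):
    """Return the VMs still running when one fault domain fails
    and/or one update domain is rebooted for patching."""
    return [
        p.vm
        for p in placements
        if p.fault_domain != failed_fault_domain
        and p.update_domain != rebooting_update_domain
    ]
```

With three VMs placed this way, losing fault domain 0 still leaves one VM running, and rebooting any single update domain takes down only one VM. A single VM has no such cushion, which is why the availability set only helps if the workload is doubled up.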
While Hyper-V supports live migration, Microsoft Azure does not, hence the need for availability sets.
This, of course, does not play very nicely with the traditional workloads that we’ve discussed and often leads to customers having to double up on VMs just to ensure that they can maintain service. This can add significantly to cost.
This has led to many of the “pets vs. cattle” analogies for traditional versus cloud-native apps. Administrators cared for their traditional VMs, whereas the VMs associated with cloud-native apps can be created and destroyed with no real feeling of ownership, and they might only exist for minutes or hours rather than months or years.
For customers wanting to migrate existing near-legacy on-premises applications to the cloud, the 11:11 Cloud offers all the features that customers are used to in their own facilities: HA, vMotion, and DRS, as well as host affinity and anti-affinity rules. With this architecture, 11:11 Systems is able to offer customers a 100% availability SLA and keep the cost of running legacy applications in the 11:11 Cloud to a minimum, without the need to use multiple VMs to ensure availability.
Learn more about the 11:11 Cloud Platform.