Editor’s Note: As of January 2022, iland is now 11:11 Systems, a managed infrastructure solutions provider at the forefront of cloud, connectivity, and security. As a legacy iland.com blog post, this article likely contains information that is no longer relevant. For the most up-to-date product information and resources, or if you have further questions, please refer to the 11:11 Systems Success Center or contact us directly.
Disaster Recovery (DR) has become a vital part of both a company’s IT strategy and its overall business strategy. The primary objective is to recover from an outage or disaster as quickly as possible, thereby minimizing downtime. Combining Zerto replication software with the 11:11 Cloud offers a reliable and secure way to recover your production site in a cloud-based environment quickly and easily. Zerto provides an easily accessible interface that puts you in complete control of your DR strategy. As a service provider and recovery site, 11:11 Systems offers full support and assistance so that you are confident, comfortable, and protected in the event of a disaster. Let’s take a look at the basics; in upcoming blogs, I’ll go into more detail on other areas.
The Basics of Zerto Replication
To replicate with Zerto, you need a Zerto Virtual Manager (ZVM) and Virtual Replication Appliances (VRAs). These servers are deployed at the customer’s production site. The ZVM is the brains of the Zerto functionality and can be installed on a server running Windows Server 2008 R2 or higher. From the ZVM, you connect to your vCenter, install VRAs, and create Virtual Protection Groups (VPGs). After you install Zerto, the ZVM pushes the VRAs out to your hosts. A VRA must be deployed on every host in the production environment that runs a VM you want to protect. The VRAs act as the proxy for replication: they capture the disk I/O between each protected VM and its host and replicate all changes to the 11:11 Cloud. Each VRA requires an IP address and must be on a network that can communicate with the ZVM server.
Once the ZVM and VRAs have been installed and configured, the last step is to create VPGs. A VPG is a group of VMs that replicate and fail over together. For example, you might put your Domain Controllers in one VPG and your Web Servers in another. During an outage that requires a failover, you could fail over the Domain Controller VPG first to bring the domain online, then fail over the Web Servers VPG shortly after.
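The ordering idea can be sketched in a few lines of Python. This is only an illustration of grouping and sequencing: the `VPG` class, its `priority` field, and the `failover_order` function are hypothetical names invented for this example and are not part of Zerto's interface or API.

```python
from dataclasses import dataclass, field


@dataclass
class VPG:
    """Hypothetical model of a Virtual Protection Group (not a Zerto API object)."""
    name: str
    priority: int  # assumption for this sketch: lower number fails over earlier
    vms: list[str] = field(default_factory=list)


def failover_order(vpgs: list[VPG]) -> list[str]:
    """Return VPG names in the order they should be failed over."""
    return [vpg.name for vpg in sorted(vpgs, key=lambda v: v.priority)]


# Example groups from the text; the VM names are made up.
vpgs = [
    VPG("Web Servers", priority=2, vms=["web01", "web02"]),
    VPG("Domain Controllers", priority=1, vms=["dc01", "dc02"]),
]

for name in failover_order(vpgs):
    print(f"Fail over VPG: {name}")
```

In practice the order is a decision you make when you start each failover; the sketch just shows why it helps to group dependent servers so they can be sequenced as a unit.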
Replication occurs from your production site to 11:11 over a Replication VPN Tunnel. 11:11 will assist with creating this VPN and with resolving any issues that come up over time. In your 11:11 environment, you have two firewalls: one is dedicated to the replication VPN, and the other is for your failover environment. These DR firewalls give you full control of your environment during test and live failover situations. They also allow you to fully test your recovery environment using a Zerto Test Failover while replication continues during the testing period.
Understanding RPO and RTO Requirements
Zerto replicates data in real time from your production environment to 11:11. This means that changes made on one of your production servers are automatically replicated to the recovery site, enabling a low Recovery Point Objective (RPO). The RPO is the amount of data, measured in time, that your business can tolerate losing during a DR scenario. For instance, suppose your production data center were to go fully offline due to a tornado or hurricane. If you could only recover to a point in time five minutes before the outage, two hours before, or even a day before, would your business still be able to operate? What would be the customer impact and the potential business losses? The goal should be to minimize data loss and downtime as much as possible for your critical systems. With real-time replication in Zerto, you should expect minimal RPO times, averaging around 15 to 20 seconds. Other factors, such as bandwidth and data change, also affect the RPO in Zerto. More on that later.
Another aspect to look into when researching a DR service is the Recovery Time Objective (RTO). The RTO is the amount of time you can allow for the actual recovery of your production environment while still ensuring business continuity. Where your RPO might be to always be able to recover to a point no more than five minutes old, your RTO might be to complete that recovery within 30 minutes or an hour. A failover in Zerto can be initiated in just a few clicks, which allows you to quickly fail over the servers needed to restore business functionality. Once the failover starts, your servers are automatically imported and powered on at the 11:11 Cloud recovery site. There is no need to revert from snapshots or wait on any extra processes for recovery: just a simple import and power on. With the 11:11 Cloud Console, you can initiate a failover even if all of your production servers, including your Zerto infrastructure, are offline. Other factors also affect the RTO, such as the number of servers and any boot delays you require.
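To make the two objectives concrete, here is a minimal Python sketch that measures an achieved RPO and RTO against targets. The timestamps, targets, and function names are all assumed for illustration; nothing here comes from Zerto or the 11:11 Cloud Console.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_replicated_point: datetime, outage: datetime) -> timedelta:
    """Data-loss window: time between the last replicated point and the outage."""
    return outage - last_replicated_point


def achieved_rto(outage: datetime, service_restored: datetime) -> timedelta:
    """Downtime window: time between the outage and service being restored."""
    return service_restored - outage


# Example values (assumed): a 20-second replication lag and a 25-minute recovery.
outage = datetime(2022, 1, 10, 14, 0, 0)
last_replicated_point = outage - timedelta(seconds=20)
service_restored = outage + timedelta(minutes=25)

# Example targets (assumed), matching the figures used in the text.
rpo_target = timedelta(minutes=5)
rto_target = timedelta(minutes=30)

rpo = achieved_rpo(last_replicated_point, outage)
rto = achieved_rto(outage, service_restored)
print(f"RPO {rpo} within target: {rpo <= rpo_target}")
print(f"RTO {rto} within target: {rto <= rto_target}")
```

The RPO looks backward from the outage (how much data is lost), while the RTO looks forward from it (how long until you are running again).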
Bandwidth and Data Change Considerations
Zerto’s real-time replication and low RPOs do come with some challenges for companies with limited bandwidth, high rates of data change, or both. Because changes are replicated in real time, large processes that generate a lot of data change can saturate the network. Large SQL queries or an application backup, for example, may cause a high amount of data change on your servers. If more change is happening at a point in time than the bandwidth can handle, the link saturates and the RPOs for your VPGs grow. This may also happen if you are trying to protect a lot of data over limited bandwidth. For instance, if you have 30 VMs that total 10 TB but only a 50 Mbps connection, you may see issues. Even if no single large workload generates a lot of change, the available bandwidth may not be able to keep up with the daily rate of change across all servers. You must also consider that the bandwidth consumed by Zerto might reduce the bandwidth available to your actual production network. If your production servers or end users share a network with replication, the two may contend for bandwidth, again impacting replication performance.
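A rough back-of-the-envelope calculation shows why these numbers matter. The Python sketch below assumes decimal units, a link fully dedicated to replication, and no compression or throttling, so treat the results as rough upper bounds rather than what Zerto would actually report. The function names and the 20 MB/s burst are made up for the example.

```python
def initial_sync_days(data_tb: float, link_mbps: float) -> float:
    """Days needed to copy data_tb terabytes over a link_mbps (megabits/s) link."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6)
    return seconds / 86_400


def backlog_after_burst_mb(change_mb_per_s: float, link_mbps: float,
                           burst_seconds: float) -> float:
    """Megabytes of changes still waiting to replicate when a burst ends."""
    link_mb_per_s = link_mbps / 8  # megabits/s to megabytes/s
    return max(0.0, (change_mb_per_s - link_mb_per_s) * burst_seconds)


# The example from the text: 10 TB protected over a 50 Mbps connection.
print(f"Initial sync: {initial_sync_days(10, 50):.1f} days")  # 18.5 days

# Assumed burst: a job writing 20 MB/s for 10 minutes over the same link.
print(f"Backlog: {backlog_after_burst_mb(20, 50, 600):.0f} MB")  # 8250 MB
```

Under these assumptions, the first full copy alone would take more than two weeks, and a ten-minute burst would leave roughly 8 GB queued. Until that backlog drains, the RPO for the affected VPGs keeps growing.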
These bandwidth and data change issues are addressed by 11:11 Systems during the DR solution design process – along with defining the required RTOs and RPOs for different VPGs. In my next blog, I’ll be exploring best practices for Zerto installation and configuration – stay tuned.