Update: Migrate VM from Hyper-V to vSphere with Pre-Installed VMware Tools (vSphere 7 and 8 Edition)

I had previously written a post in response to a problem a customer was facing with migrating from Microsoft Hyper-V to VMware vSphere.

You can find that previous post here: Migrate VM from Hyper-V to vSphere with Pre-Installed VMware Tools

I am writing this as a follow-up because, while the workaround I documented still works (for vSphere 6.x VMware Tools), something in VMware Tools changed when vSphere 7 went GA.  Several attempts to manipulate the new .msi file failed, and in the flurry of life, I hadn’t had a chance to really sit down and figure it out.  So, the workaround for “now” was to install the working 6.x version, get migrated, and then upgrade VMware Tools; and that still works, by the way.

Then one day, I was going through my blog comments and saw that someone had responded, saying they’d figured it out.  @Chris, thank you very much for sharing your find!

So, since vSphere 8 recently went GA, I figured I’d also test this procedure on VMware Tools 12, and I’m happy to say, it also works.  So here’s what’s changed from the previous post when you’re trying to do the same using VMware Tools 11 (vSphere 7) or VMware Tools 12 (vSphere 8).

What You Will Need

Before you can get started, you’ll need to get a few things.  For details on how to get these requirements, refer to the original post mentioned above. 

  • Microsoft Orca (allows you to edit .msi files) – This is part of the Windows SDK, so if you don’t have it, see the post referenced above for the link to download as well as the procedure to only install Orca.
  • VMware Tools 11 or 12
  • Visual C++ 2017 Redistributable (if you’re following the procedure to get the VMware Tools from your own system, be sure to grab the vcredist_x64.exe)

If you would like to skip editing the VMware Tools MSI, you can download already “jailbroken” versions below. 

Note: These worked in the testing I performed, and I will not be making any changes to them, supporting them, or be responsible for what you download off of the Internet.  To be absolutely sure you have complete control over what you install in your environment (ESPECIALLY IN PRODUCTION), download from trusted sources and perform the edit to the MSI yourself.

Edit VMware Tools MSI with Orca (for VMware Tools 11 and VMware Tools 12)

  1. Launch Orca
  2. Click Open, and browse to where you saved VMware Tools64.msi, select it, and click Open.

    Launch Orca and Open VMware Tools MSI

  3. In the left window pane labeled Tables, scroll down and click on CustomAction.
  4. In the right window pane, look for the line that says VM_LogStart, right-click it, and select Drop Row.
  5. When prompted, click OK to confirm.


  6. In the left window pane labeled Tables, scroll down and click on InstallUISequence.
  7. In the right window pane, look for the line that says VM_CheckRequirements. Right-click on this entry, and select Drop Row.
  8. When prompted, click OK to confirm.

    InstallUISequence > VM_CheckRequirements > Drop Row

  9. Click Save on the toolbar, and close the MSI file. You can also exit Orca now.
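
If you would rather script the two row deletions than click through Orca, the same changes can be expressed as Windows Installer SQL. The sketch below only builds the statements. The helper name build_drop_statements is made up for illustration, and it assumes both tables are keyed on a column named Action, which matches the standard MSI schema. You would still need a tool to run the statements against a copy of the MSI (for example, the WiRunSQL.vbs sample script from the Windows SDK), and that path is untested here.

```python
# Hypothetical helper: builds Windows Installer SQL statements that mirror
# the two "Drop Row" actions performed in Orca in the steps above.
ROWS_TO_DROP = [
    ("CustomAction", "VM_LogStart"),
    ("InstallUISequence", "VM_CheckRequirements"),
]


def build_drop_statements(rows=ROWS_TO_DROP):
    """Return one DELETE statement per (table, action) pair."""
    statements = []
    for table, action in rows:
        # Assumption: the key column in both tables is named "Action".
        statements.append(
            "DELETE FROM `{0}` WHERE `Action` = '{1}'".format(table, action)
        )
    return statements


if __name__ == "__main__":
    for statement in build_drop_statements():
        print(statement)
```

Always work on a copy of the MSI so the untouched original is still available if something goes wrong.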

Next Steps

Now that you’ve edited the MSI file so it can be installed on your Hyper-V Windows VMs, copy the installers (don’t forget vcredist_x64.exe) to each VM and install them.  When the installer asks for a reboot, you can safely ignore it, because the VM’s first boot in vSphere will take care of that for you.  (One less disruption to your production Hyper-V virtual machine.)

Thanks for reading! GLHF

If you found this useful and know of any others looking to do the same, please share and comment.  I’d like to hear if/how it’s helped you out! If you’d like to reach me on social media, you can also follow me and DM me on Twitter @eugenejtorres


ESXi 6.0 U2 Host Isolation Following Storage Rescan

Following an upgrade to ESXi 6.0 U2, this particular issue has popped up a few times, and while we still have a case open with VMware support in an attempt to understand root cause, we have found a successful workaround that doesn’t require any downtime for the running workloads or the host in question.  This issue doesn’t discriminate between iSCSI or Fibre Channel storage, as we’ve seen it in both instances (SolidFire – iSCSI, IBM SVC – FC).  One common theme with where we are seeing this problem is that it is happening in clusters with 10 or more hosts, and many datastores.  It may also be helpful to know that we have two datastores that are shared between multiple clusters.  These datastores are for syslogs and ISOs/Templates.

 

Note: In order to perform the steps in this how-to, you will need to already
have SSH running and available on the host, or access to the DCUI.

Observations

  • Following a host or cluster storage rescan, an ESXi host(s) stops responding in vCenter and still has running VMs on it (host isolation)
  • Attempts to reconnect the host via vCenter don’t work
  • Direct client connection (thick client) to the host doesn’t work
  • Attempts to run services.sh from the CLI cause the script to hang after “running sfcbd-watchdog stop”.  The last thing on the screen is “Exclusive access granted.”
  • The /var/log/vmkernel.log displays the following at this point: “Alert: hostd detected to be non-responsive”
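
To confirm the last observation without scrolling through the whole log, a small script can pull out only the matching lines. This is a minimal sketch: the function names are made up for illustration, the search string is the alert text quoted above, and the default path is the vmkernel.log location mentioned in the observations.

```python
# Text of the alert quoted in the observations above.
ALERT_TEXT = "hostd detected to be non-responsive"


def find_hostd_alerts(lines):
    """Return the log lines that contain the hostd non-responsive alert."""
    return [line.rstrip("\n") for line in lines if ALERT_TEXT in line]


def scan_log_file(path="/var/log/vmkernel.log"):
    """Read the log at `path` and return the matching alert lines."""
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(path, "r", errors="replace") as handle:
        return find_hostd_alerts(handle)
```

Calling scan_log_file() on the host (or on a copy of the log pulled from a support bundle) returns an empty list when the alert never fired.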

Troubleshooting

The following troubleshooting steps were obtained from VMware KB Article 1003409

  1. Verify the host is powered on.
  2. Attempt to reconnect the host in vCenter
  3. Verify that the ESXi host is able to respond back to vCenter at the correct IP address and vice versa.
  4. Verify that network connectivity exists from vCenter to the ESXi host’s management IP or FQDN
  5. Verify that port 902 TCP/UDP is open between the vCenter and the ESXi host
  6. Try to restart the ESXi management agents via DCUI or SSH to see if it resolves the issue
  7. Verify if the hostd process has stopped responding on the affected host.
  8. Verify if the vpxa agent has stopped responding on the affected host.
  9. Verify if the host has experienced a PSOD (Purple Screen of Death).
  10. Verify if there is an underlying storage connectivity (or other storage-related) issue.
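
The connectivity checks in steps 4 and 5 can be scripted from the vCenter side. Below is a minimal sketch that only tests TCP reachability; the function name and the timeout default are my own choices, and UDP cannot be verified this way because it is connectionless. Pass in the host's management IP or FQDN and the port named in step 5.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A False result only tells you the handshake failed; it does not say whether a firewall, routing, or the host itself is to blame.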

Following these troubleshooting steps left me at step 7, where I was able to determine that hostd was not responding on the host.  The vmkernel.log further supports this observation.

Resolution/Workaround Steps

These are the steps I’ve taken to remedy the problem without having to take the VMs down or reboot the host:

  1. Since the hostd service is not responding, the first thing to do is run /etc/init.d/hostd restart from a second SSH session window (leaving the first one with the hung services.sh restart script process).
  2. While running the hostd restart command, the hung session will update, and produce the following:

  3. When you see that message, press enter to be returned to the shell prompt.
  4. Now run /etc/init.d/vpxa restart, which is the vCenter Agent on the host.
  5. After that completes, re-run services.sh restart and this time it should run all the way through successfully.
  6. Once services are all restarted, return to the vSphere Web Client and refresh the screen.  You should now see the host is back to being managed, and is no longer disconnected.
  7. At this point, you can either leave the host running as-is, or put it into maintenance mode (vMotion all VMs off).  Export the log bundle if you’d like VMware support to help analyze root cause.
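
For reference, the command order in steps 1 through 5 can be captured in a short script. Treat this as a sketch rather than something to run blindly: the function name and the injectable runner are made up for illustration, it assumes a Python interpreter is available in the shell where you run it, and it does not handle the hung first session, which still needs the Enter key press from step 3.

```python
import subprocess

# Order taken from the workaround steps above.
RECOVERY_COMMANDS = [
    "/etc/init.d/hostd restart",
    "/etc/init.d/vpxa restart",
    "services.sh restart",
]


def run_recovery(runner=None, commands=RECOVERY_COMMANDS):
    """Run each command in order and stop at the first non-zero exit code.

    `runner` takes a command string and returns its exit code. It defaults
    to running the command through the shell, and can be swapped for a fake
    when rehearsing the sequence.
    Returns a list of (command, exit_code) tuples.
    """
    if runner is None:
        runner = lambda command: subprocess.call(command, shell=True)
    results = []
    for command in commands:
        exit_code = runner(command)
        results.append((command, exit_code))
        if exit_code != 0:
            break  # do not continue past a failed restart
    return results
```

Passing a fake runner that only prints each command is a safe way to rehearse the sequence before touching a production host.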

 

I hope you find this useful, and if you do, please comment and share!


Zerto: Dual NIC ZVM

Something I recently ran into with Zerto (and this can happen with any other product) was the dilemma of protecting remote sites whose IP addresses happen to be identical in both the protected and recovery sites, which doesn’t happen often.  And no, this wasn’t planned for; it was just discovered during my Zerto deployment in what we’ll call the protected sites.

Luckily, our network team had provisioned two new networks that are isolated, and connected to these protected sites via MPLS.  Those two new networks do not have the ability to talk back to our existing enterprise network without firewalls getting involved, and this is by design since we are basically consolidating data centers while absorbing assets and virtual workloads from a recently acquired company.

When I originally installed the ZVM in my site (which we’ll call the recovery site), I had used IP addresses for the ZVM and VRAs that were part of our production network, and not the isolated network set aside for this consolidation.  Note: I installed the Zerto infrastructure in the recovery site ahead of time, before discussions about the isolated networks came up.  So, because I needed to get this onto the isolated network in order to be able to replicate data from the protected sites to the recovery site, I set out to re-IP the ZVM, and re-IP the VRAs.  Before I could do that, I needed to provide justification for firewall exceptions in order for the ZVM in the recovery site to link to the vCenter, communicate with ESXi hosts for VRA deployment, and also to be able to authenticate the computer, users, and service accounts in use on the ZVM.  Oh, and I also needed DNS and time services.

The network and security teams asked if they could NAT the traffic, and my answer was “no” because Zerto doesn’t support replication using NAT.  That was easy, and now the network team had to create firewall exceptions for the ports I needed.

Well,  as expected, they delivered what I needed.  To make a long story short, it all worked, and then about 12 hours before we were scheduled to perform our first VPG move, it all stopped working, and no one knew why.  At this point, it was getting really close to us pulling the plug on the migration the following day, but I was determined to get this going and prevent another delay in the project.

When looking for answers, I contacted my Zerto SE, reached out on Twitter, and also contacted Zerto Support.  Well, at the time I was on the phone with support, we couldn’t do anything because communication to the resources I needed was not working.  We couldn’t perform a Zerto re-configure to re-connect to the vCenter, and at this point, I had about 24 VPGs that were reporting they were in sync (lucky!), but ZVM to ZVM communication wasn’t working, and the recovery site ZVM was not able to communicate with vCenter, so I wouldn’t have been able to perform the cutover.  So since support couldn’t help me out in that instance, I scoured the Zerto KB looking for an alternate way of configuring this where I could get the best of both worlds, and still be able to stay isolated as needed.

I eventually found this KB article that explained that not only is it supported, but it’s also considered a best practice in CSP or large environments to dual-NIC the ZVM to separate management from replication traffic.  I figured, I’m all out of ideas, and the back-and-forth with firewall admins wasn’t getting us anywhere; I might as well give this a go.  While the KB article offers the solution, it doesn’t tell you exactly how to do it, outside of adding a second vNIC to the ZVM.  There were some steps missing, which I figured out within a few minutes of completing the configuration.  Oh, and part of this required me to re-IP the original NIC back to the original IP I used, which was on our production network.  Doing this re-opened the lines of communication to vCenter, ESXi hosts, AD, DNS, SMTP, etc.  Now I had to focus on the vNIC that was to be used for all ZVM to ZVM as well as replication traffic.  In a few short minutes, I was able to get communication going the way I needed it, so the final thing I needed to do was re-configure Zerto to use the new vNIC for its replication-related activities.  I did that, and while I was able to re-establish the production network communications I needed, now I wasn’t able to access the remote sites (ZVM to ZVM) or access the recovery site VRAs.

It turns out, what I needed here were some static, persistent routes to the remote networks, configured to use the specific interface I created for it.

Here’s how:

The steps I took are below the image.  If the image is too small, consider downloading the PDF here.

zerto_dual_nic_diagram

 

On the ZVM:

  1. Power it down, add a 2nd vNIC and set its network to the isolated network.  Set the primary vNIC to the production network.
  2. Power it on.  When it’s booted up, log in to Windows, and re-configure the IP address for the primary vNIC.  Reboot to make sure everything comes up successfully now that it is on the correct production network.
  3. After the reboot, edit the IP configuration of the second vNIC (the one on the isolated network).  DO NOT configure a default gateway for it.
  4. Open the Zerto Diagnostics Utility on the ZVM. You’ll find this by opening the start menu and looking for the Zerto Diagnostics Utility.  If you’re on Windows Server 2008 or 2012, you can search for it by clicking the start menu and starting to type “Zerto.”
    zerto_dual_nic_1_4
  5. Once the Zerto Diagnostics Utility loads, select “Reconfigure Zerto Virtual Manager” and click Next.
    zerto_dual_nic_1_5
  6. On the vCenter Server Connectivity screen, make any necessary changes you need to and click Next.  (Note: We’re only after changing the IP address the ZVM uses for replication and ZVM-to-ZVM communication, so in most cases, you can just click Next on this screen.)
  7. On the vCloud Director (vCD) Connectivity screen, make any necessary changes you need to and click Next. (Note: same as note in step 6)
  8. On the Zerto Virtual Manager Site Details screen, make any necessary changes you need to  and click Next. (Note: same as note in step 6)
  9. On the Zerto Virtual Manager Communication screen, the only thing to change here is the “IP/Host Name Used by the Zerto User Interface.”  Change this to the IP Address of your vNIC on the isolated Network, then click Next.
  10. Continue to accept any defaults on following screens, and after validation completes, click Finish, and your changes will be saved.
  11. Once the above step has completed, you will now need to add a persistent, static route to the Windows routing table.  This will tell the ZVM that for any traffic destined for the protected site(s), it will need to send that traffic over the vNIC that is configured for the isolated network.
  12. Use the following route statement from the Windows CLI to create those static routes:
    route ADD [Destination IP] MASK [SubnetMask] [LocalGatewayIP] IF [InterfaceNumberforIsolatedNetworkNIC] -p
    Example:
    route ADD 192.168.100.0 MASK 255.255.255.0 10.10.10.1 IF 2 -p
    route ADD 192.168.200.0 MASK 255.255.255.0 10.10.10.1 IF 2 -p
    
    Note: To find out what the interface number is for your isolated network vNIC, run route print from the Windows CLI.  It will be listed at the top of what is returned.
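
If you have more than a couple of protected-site networks, generating the route statements from a list helps avoid typos in the addresses and masks. The sketch below is only a helper for building the command text in the same form as step 12. The function name is made up for illustration, and the networks, gateway, and interface number you pass in are whatever applies to your environment.

```python
import ipaddress


def build_route_command(network_cidr, gateway, interface_index):
    """Build a persistent 'route ADD' statement for one remote network.

    network_cidr is written like "192.168.100.0/24"; a ValueError is raised
    if it has host bits set or if the gateway is not a valid IP address.
    """
    network = ipaddress.ip_network(network_cidr, strict=True)
    gateway_ip = ipaddress.ip_address(gateway)  # validates the gateway
    return "route ADD {0} MASK {1} {2} IF {3} -p".format(
        str(network.network_address),
        str(network.netmask),
        str(gateway_ip),
        int(interface_index),
    )


if __name__ == "__main__":
    # Example values only; substitute your own networks and interface number.
    for cidr in ["192.168.100.0/24", "192.168.200.0/24"]:
        print(build_route_command(cidr, "10.10.10.1", 2))
```

The script only prints the statements; you still paste them into an elevated command prompt on the ZVM yourself.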
    

 

zerto_dual_nic_1_10

Once you’ve configured your route(s), you can test by sending pings to remote site IP addresses that you would normally not be able to see.

After performing all of these steps, my ZVMs are now communicating without issue and replications are all taking place.  A huge difference from hours before when everything looked like it was broken.  The next day, we were able to successfully move our VPGs from protected sites to recovery sites without issue, and reverse protect (which we’re doing for now as a failback option until we can guarantee everything is working as expected).

If this is helpful or you have any questions/suggestions, please comment, and please share! Thanks for reading!

 


Protecting a VM with vSphere Replication

Continuing on from the previous blog about configuring array-based replication with SRM, in this blog post we’ll be going through configuring protection of a VM using vSphere Replication.  The reason I’m doing this instead of jumping right into creating the protection groups and recovery plans is because vSphere Replication can function on its own without SRM.  That said, we’ll go through the steps to protect a virtual workload using vSphere Replication, and follow this up with creating protection groups and recovery plans, which come into play in either situation (ABR vs vR) when we get to the orchestration functionality that SRM brings to the table.

vSphere Replication is included with VMware Essentials Plus and above, so chances are you have this feature available should you decide to use it to protect VMs using hypervisor-based replication.  In my experience, vSphere Replication works great and can be used to either migrate or protect virtual workloads; however, as stated above, it can be limited on its own.  See this previous post for the details of what vSphere Replication can and can’t do without Site Recovery Manager.

 

Procedure

In this walkthrough for protecting a VM using vSphere Replication, I will be performing the steps using a decently sized Windows VM as the asset that needs protection.  This VM is a plain installation of Windows; however, I use fsutil to generate files of different sizes to simulate data change.
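
If you want to generate similar test data with a script instead of fsutil, something along these lines would work. This is a minimal sketch and not what was used in this walkthrough: the function name, file naming scheme, and temp-directory default are made up for illustration, and it writes random bytes so the files are not trivially compressible.

```python
import os
import tempfile

CHUNK_SIZE = 1024 * 1024  # write at most 1 MiB at a time


def create_test_files(directory, sizes_in_bytes):
    """Create one file of random data per requested size; return the paths.

    If directory is None, a new temporary directory is created.
    """
    if directory is None:
        directory = tempfile.mkdtemp(prefix="vr_test_")
    os.makedirs(directory, exist_ok=True)
    paths = []
    for index, size in enumerate(sizes_in_bytes):
        path = os.path.join(directory, "testfile_{0}_{1}.bin".format(index, size))
        remaining = size
        with open(path, "wb") as handle:
            while remaining > 0:
                block = min(CHUNK_SIZE, remaining)
                handle.write(os.urandom(block))
                remaining -= block
        paths.append(path)
    return paths
```

Re-running it with different sizes between replication cycles gives the replication engine fresh changed blocks to ship.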

    1. In your vSphere Web Client, locate a VM that you wish to protect via hypervisor-based replication.
    2. Right-click on the VM and go to All vSphere Replication Actions > Configure Replication.
    3. When the wizard loads, the first screen asks for the replication type.  Select Replicate to a vCenter Server, and click Next.
    4. Select the Target Site and click Next.
    5. Select the remote vSphere Replication server (or if you only have 1, then select auto-assign), wait for validation, then click Next.
    6. On the target location screen, there are several options to configure, so we’ll go through each one by one:
      - Expand the settings by clicking the arrow next to the VM, or click the info link.
      - Click edit in the area labeled Target VM Location, select the target datastore and location for the recovery VM, then click OK to be returned to the previous screen.
      - Typically, the previous step would be enough; however, if you want to place VMDKs in specific datastores, edit their format (thick vs. thin provisioned), or assign a policy, use the edit links beside each hard disk.  Once all your settings are how you want them, click Next.

      how-to_vspherereplication_1_6_c

    7. Specify your replication options, then click Next.
      Notes:
      - Enable quiescing if your guest OS supports it, however, keep in mind
        that quiescing may affect your RPO times.
      - Enable network compression to reduce required bandwidth and free up
        buffer memory on the vSphere Replication server, however, higher CPU
        usage may result, so it is best to test with both options to see what
        works best in your environment.
      

      how-to_vspherereplication_1_7

    8. Configure RPO to meet customer requirements, enable point in time instances (snapshots in time as recovery points – maximum of 24) if needed, then click Next.
    9. Review your configuration summary, make changes if necessary, but when you’re done, click Finish.  As soon as you finish, a full sync will be initiated.
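
Before walking through the wizard for a batch of VMs, it can help to sanity-check the planned settings against the limits quoted on this blog for vSphere Replication 6.x (an RPO between 15 minutes and 24 hours, and at most 24 point-in-time instances). The sketch below does only that arithmetic; the function and constant names are made up for illustration, and it does not talk to vCenter.

```python
# Limits as quoted in this post series for vSphere Replication 6.x.
MIN_RPO_MINUTES = 15
MAX_RPO_MINUTES = 24 * 60
MAX_PIT_INSTANCES = 24


def validate_replication_settings(rpo_minutes, pit_instances=0):
    """Return a list of problems; an empty list means the settings fit."""
    problems = []
    if not MIN_RPO_MINUTES <= rpo_minutes <= MAX_RPO_MINUTES:
        problems.append(
            "RPO of {0} minutes is outside {1}-{2} minutes".format(
                rpo_minutes, MIN_RPO_MINUTES, MAX_RPO_MINUTES
            )
        )
    if not 0 <= pit_instances <= MAX_PIT_INSTANCES:
        problems.append(
            "{0} point-in-time instances is outside 0-{1}".format(
                pit_instances, MAX_PIT_INSTANCES
            )
        )
    return problems
```

For example, validate_replication_settings(240, 12) returns an empty list, while an RPO of 5 minutes is flagged.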

There you go: vSphere Replication is now configured for a VM.  The next post will cover creating protection groups and recovery plans, which we will then tie into what we’ve just performed here and with the array-based replication post.


VMware SRM 6.1 – Configure Array-Based Replication

Introduction

 

This how-to will walk through the installation and configuration of array-based replication features for VMware Site Recovery Manager 6.1.

Before configuring array-based replication for use with VMware SRM, there are some pre-requisites.  First of all, you’re going to need to visit the VMware Compatibility Guide, which will help you determine if your specific array vendor is supported for use with SRM.  Second, there are steps to take to configure array-based replication on the storage side, and that portion is out-of-scope for this blog, as I did not have access to do so.

vmware_hcl_example

There are several ways to search the compatibility guide, but to be specific, you can select entries from the areas highlighted above.  The bottom section that is highlighted will be your results once you click “Update and View Results.”  The reason why I wanted to point this step out is because if you assume your array vendor is supported, and don’t verify first, you could end up wasting your time planning and designing.

For this example, we are using SRM 6.1 with the Fibre Channel protocol on IBM SVC-fronted DS8K’s in both sites. I wanted to point that out because when I first set out to find the SRAs for use with our solution, I attempted to use the “IBM DS8000 Storage Replication Adapter”, later to find out it wasn’t the correct one.   The correct SRA for use with my environment is the “IBM Storwize Family Storage Replication Adapter”, so there may be a little bit of trial and error with this; however, if you do it up front during testing, you’ll save yourself some time later when deploying to production.

That all said, once you’ve verified your storage is supported, and what version of the SRA to download, you can get it by visiting the VMware downloads (you will need to log in).  Be sure to also verify that the version of the SRA you are downloading is compatible with the version of array manager code you’re running.

 

Installing the SRA

Before you Begin – Prior to installing the SRA on the SRM server in each site (protected and recovery), you should have already paired the sites successfully.  Also, if you haven’t installed SRM yet, you will need to, otherwise the SRA installer will fail once it discovers that SRM is not installed.

Installing the SRA should be straightforward and painless, as there are not many options to configure during installation.  Once the installation is completed on both the protected and recovery SRM servers, proceed.

 

Verify That SRM Has Registered the SRAs

  1. Once you’ve installed the SRA on each site’s SRM server, log into the vSphere Web Client, and go to Site Recovery > Sites and select a site.
    From this view, you can see what SRA has been installed, its status, and compatibility information.
  2. Click the rescan button to ensure the connection is valid and there are no errors.

Configure Array Managers

After pairing the protected and recovery sites, you will need to configure the respective array managers so SRM can discover replicated devices, compute datastore groups, and initiate storage operations.  You typically only need to do this once, however, if array access credentials change, or you want to use a different set of arrays, you can edit the connections to update accordingly.

Pre-Requisites

  • Sites have been paired and are connected
  • SRAs have been installed at both sites and verified

Procedure

  1. In the vSphere Web Client, go to Site Recovery > Array Based Replication.
  2. On the Objects tab in the right window pane, click the icon to add an array manager.
  3. Select from one of two options for adding array managers (pair or single), then click Next.
  4. Select a pair of sites for the array manager(s), and click Next.
  5. Enter a name for the array in the Display Name field, and click Next.
  6. Provide the required information for the type of SRA you selected, and click Next.
  7. If you chose to add a pair of array managers, enter the paired array manager information, then click Next.
  8. Click-to-enable the checkbox beside the array pair you just configured, and click Next.
  9. Review your configuration, then click Finish when ready.

 

Rescan Arrays to Detect Configuration Changes

SRM performs an automatic rescan every 24 hours by default to detect any changes made to the array configurations.  It is recommended to perform a manual rescan following any changes to either site by way of reconfiguration or adding/removing devices to recompute the datastore groups.  If you need to change the default interval at which SRM performs a rescan, you can do this in the advanced settings for each site, editing the storage.minDsGroupComputationInterval advanced setting:

srm_abr_settings_1_11

To perform a manual rescan after making any configuration changes:

  1. Go to Site Recovery  > Array Based Replication
  2. Select an array for either site
  3. On the Manage tab of the selected array, click the Array Pairs sub tab
  4. Click the rescan button to perform a manual rescan.

 

Once you’ve got all of the above configured, you can begin setting up your protection groups and recovery plans.


Product Comparison: VMware SRM & Zerto Virtual Replication

Introduction

As my previous blog posts suggest, I’ve spent the past few months testing VMware Site Recovery Manager and Zerto Virtual Replication to see which product best meets our business continuity and disaster recovery requirements.  My task was to compare the two products feature for feature, based on our use cases, which are primarily protection, recovery, re-protection, and workload migration.

Get comfortable, this could take a while…

Blue vs. Red

As of today, SRM and Zerto have been tested in a sandbox environment, consisting of 2 sites (Seattle and Denver), 2 vCenters, 2 physical hosts in a cluster in each site, and 1 test workload which consisted of a Windows Server VM with auto-generated files of different sizes.  The two sites, being geographically separated, are joined by a dual 20 Gb/s connection.  There are no bandwidth throttling mechanisms in place outside of what’s available in the software, and that is only used to throttle down during business hours.  The physical networking at the host level in both sites is 10GbE.

VMware’s Site Recovery Manager is the only one of the two products that has the array-based replication feature, so to make this more of an “apples-to-apples” comparison, that feature isn’t heavily reported on here, but has been tested, and it works well, so I’m happy.

Both hypervisor-based product tests that were performed have been completed in each direction, in terms of recovery testing, failover, re-protection, and migration.  The results of both solutions are similar, however, based on results, we are leaning more toward one product in terms of simplicity, flexibility, scalability, monitoring capabilities, and user experience.

Below are images of what the topology for both test environments looks like, with SRM on the left, and Zerto on the right.

If you are interested in seeing these diagrams up close, you can download the PDFs for each here:

topology_showdown_generic

^^ Not pictured in the Zerto Diagram: External PSCs for vCenter, vCenter SQL Servers, and all port communication native to vCenter components.

Product Comparison

While VMware Site Recovery Manager creates a complete solution with vSphere Replication (which can also be used without SRM), Zerto also protects using hypervisor replication.  But to compare the two, we must first compare the capabilities of each solution by comparing vSphere Replication (without SRM) to Zerto Virtual Replication.  Note that without SRM, vSphere Replication can be rather limited when it comes to several features.  The tables will lay out the use cases for either product, and their features.

Use Cases

VMware vSphere Replication Use Cases
  • Data protection and disaster recovery within the same site and across sites
  • Data center migration
  • Replication engine for VMware vCloud Air Disaster Recovery
  • Replication Engine for VMware vCenter Site Recovery Manager

Zerto Virtual Replication Use Cases
  • Replication & Disaster Recovery
  • Offsite Backup and Data Protection
  • Data Migrations & Workload Mobility
  • Automated Failover, Failback & Testing
  • Reduce RTO/RPO
  • Complete BC/DR solution: Business Continuity and Disaster Recovery
  • Storage Savings
  • AWS Migrations: Cloud migration to Amazon Web Services (ZVR 5.0 introduces DRaaS to Azure)
  • Cross-Hypervisor Replication: MS Hyper-V to VMware vSphere/VMware vSphere to MS Hyper-V

 

Feature Comparison: vSphere Replication (Without SRM) and Zerto Virtual Replication

| Feature | VMware vSphere Replication Features & Benefits | Zerto Virtual Replication Features & Benefits |
|---|---|---|
| Licensing Requirement | VMware Essentials Plus and Above | VMware Essentials |
| Automation/Orchestration of Disaster Recovery | Manual, PowerCLI to get basic automation (add to inventory, power on/off); otherwise, use SRM with vSphere Replication | Full automation/orchestration features |
| Version Compatibility | vSphere Replication version must match vCenter version | Zerto can be used with vSphere 4.0 and later, no ties to having every component match versions in respect to hypervisor/vCenter. |
| Automated Recovery Capabilities | Each VM in the recovery site will need to manually be powered on. | Fully automated recovery capabilities. |
| Automated Connection to correct network(s) | Manually done when recovering with vSphere Replication. For automation of post-recovery tasks, use SRM. | Fully automated |
| WAN Compression | Network compression capable with 6.1 at the cost of vSphere Replication appliance CPU resources. Note: 1 vR appliance per vCenter instance is supported for a maximum of 2000 VMs protected per appliance. | Built-in, often seeing a 50% compression ratio. Replication appliances are assigned a 1:1 ratio (host to VRA) with automated resource reservations to ensure best performance of replication appliances. |
| IP Re-Addressing | Manual process. For automated re-IP, use SRM | Built in to failover plan (assigned in VPG) |
| Non-Disruptive Testing | Not available since you cannot power on the replica VM if the original VM is still running and reachable. Use SRM with vSphere Replication to allow for recovery testing. | Real or bubble networks can be used for recovery testing and isolation. |
| Cloning Capability | None | Allows for recovery site clones. This allows for full long-term archival backups of the VMs or file-level recovery from a point-in-time clone. |
| Failback Option | None - SRM required. | Automated failback workflow capability |
| Point-in-Time Recovery | Available with vSphere Replication 6.x - maximum of 24 PIT instances. Uses VMware Snapshots. | Configurable, however, when using Offsite Backup Feature, up to 1 year. Does not use VMware Snapshots. |
| RDM (Raw Device Mapping) Support | No physical RDM support, but virtual RDMs are supported. | Both physical and virtual mode RDMs are supported. |
| Bandwidth Control | None | Throttling and priorities are available in Zerto to reduce bandwidth consumption during certain times, and unlimited at others, via schedule. |
| vApp Support | Not Supported | Zerto leverages vApps to make administration easier. If a vApp is configured for protection with a VPG, then any VM added to the vApp is automatically protected. |
| Storage DRS Support | Not supported, SRM is required. | Storage DRS is supported and works with Zerto. |
| RPO Range | 15 minutes to 24 hours | Seconds |
| How VMs are Chosen | Selected individually or through multi-selecting in the interface, but protection grouping is not available. | VMs can be organized into Virtual Protection Groups. |

 

Feature Comparison: vSphere Replication (with SRM) and Zerto Virtual Replication

| | VMware vSphere Replication with SRM | Zerto Virtual Replication |
|---|---|---|
| Provides planning, testing, and execution of disaster recovery for vSphere: | Yes | Yes |
| Designed for: | SRM was designed for disaster recovery orchestration only | Designed for hypervisor-based replication AND disaster recovery orchestration |
| Licensed: | Per-VM | Per-VM |
| Replication granularity: | Per-VM or multi-select but virtual protection grouping is not available | Per-VM and/or Per-Virtual Protection Group |
| Configure consistency groups (virtual protection groups) | No | Yes |
| Replication recovery points: | Yes, up to 24 snapshots | Yes, up to 14 days with standard recovery, up to 1 year with extended recovery using the Offsite Backup feature. |
| Compatibility: | vSphere Replication works with ESX 5.x and above. SRM requires the same version of vCenter and SRM be installed at both sites. | Zerto works with ESXi 4.0 U1 and above. Zerto can replicate between different versions of vCenter. Zerto can also protect and recover from vSphere to Hyper-V, Hyper-V to vSphere, and either virtualization platform to the cloud (AWS, Azure (Zerto v5.0)). |
| Managed with: | vSphere Client Plugin | vSphere Client Plugin and standalone browser UI |
| Replication is performed with: | vSphere Replication | Zerto hypervisor-based replication through VRAs deployed to each host with protected VMs |

 

Feature Comparison: VMware Site Recovery Manager & Zerto Virtual Replication API Availability

The following table displays the availability, use cases, and capabilities of both the VMware Site Recovery Manager and Zerto Virtual Replication APIs for access, integration, and automation.

VMware Site Recovery Manager | Zerto Virtual Replication
Availability
  VMware Site Recovery Manager:
  • Similar to vSphere API, uses a web service that allows access to the API in Java, C#, or any language that supports WSDL (Web Services Description Language).
  Zerto Virtual Replication:
  • REST APIs are available to automate virtual infrastructure, allowing for benefits of software defined replication and recovery.
Use Cases
  VMware Site Recovery Manager:
  • Automation of protection operations
  Zerto Virtual Replication:
  • Automation of protection operations
  • Automation of product deployment
  • Querying and Reporting
Capabilities
  VMware Site Recovery Manager:
  • Create protection groups
  • Initiate testing
  • Initiate recovery
  • Re-protection
  • Revert Operations
  • Collect Results
  Zerto Virtual Replication:
  • Bulk automated VRA deployment
  • Bulk automated VPG creation
  • Automating VM protection by vSphere Folder
  • Automating VM protection with vRealize Orchestrator
  • Listing unprotected VMs
  • Listing protected VMs & VPGs
  • Long Term RPO & Storage Reporting to CSV
  • Resource reports
  • VPG, VM, VMNIC & Re-IP settings report
  • Emailing Reports
Programming Environments/Supported Languages
  VMware Site Recovery Manager:
  • Java JAX-WS Framework
  • C# and Visual Studio
  • Java Axis Framework
  • Managed Objects as WSDL
  • All require SDK installation for each environment
  Zerto Virtual Replication:
  • PowerShell
  • cURL
  • Python
  • C#
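
As a rough illustration of the “Listing unprotected VMs” capability in the Zerto column, the sketch below compares a vCenter inventory list against the VMs reported as protected. The port (9669), the `/v1/vpgs` path, and the `x-zerto-session` header reflect my understanding of the Zerto REST API and should be treated as assumptions to verify against the API documentation for your version. The function names and sample VM names are hypothetical, and nothing here contacts a server.

```python
from urllib.request import Request


def build_vpg_list_request(zvm_host, session_token, port=9669):
    """Build (but do not send) a GET request for the VPG list.

    Assumed, not confirmed by this post: port 9669, the /v1/vpgs path,
    and the x-zerto-session header carrying the session token.
    """
    url = f"https://{zvm_host}:{port}/v1/vpgs"
    headers = {"x-zerto-session": session_token, "Accept": "application/json"}
    return Request(url, headers=headers, method="GET")


def unprotected_vms(inventory_names, protected_names):
    """Return inventory VMs that do not appear in the protected list.

    Names are compared case-insensitively; the result is sorted.
    """
    protected = {name.lower() for name in protected_names}
    return sorted(n for n in inventory_names if n.lower() not in protected)


if __name__ == "__main__":
    # Hypothetical host name and token, for illustration only.
    req = build_vpg_list_request("zvm.example.local", "session-token-here")
    print(req.full_url)
    print(unprotected_vms(["web01", "db01", "app01"], ["DB01"]))
```

Keeping the comparison separate from the HTTP call means the same function works whether the protected list comes from the REST API, a CSV export, or a PowerShell script.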

 

System Requirements

The tables below outline the system requirements for both VMware Site Recovery Manager and Zerto Virtual Replication.

VMware Site Recovery Manager 6.1 | Zerto Virtual Replication 4.5 U3
Virtualization Management
  VMware Site Recovery Manager:
  • VMware vCenter 6.0 U2 in both protected and recovery sites.
  Zerto Virtual Replication:
  • VMware vCenter 4.0 U1
  • Microsoft SCVMM 2012 R2
  • As long as protected and recovery sites meet minimum versions, cross-version protection and recovery is supported.
Hypervisor
  VMware Site Recovery Manager:
  • Minimum VMware vSphere ESXi 5.0
  Zerto Virtual Replication:
  • Minimum VMware vSphere ESXi 4.0 U1
  • Microsoft Windows Server 2012 R2 and Server Core
vSphere Replication Appliance
  VMware Site Recovery Manager:
  • Minimum vSphere Replication 6.0
  Zerto Virtual Replication:
  • Not Required
Storage Replication Adapter
  VMware Site Recovery Manager:
  • Depends on SAN vendor and code level, availability, and support.
  Zerto Virtual Replication:
  • Not Required
Client
  VMware Site Recovery Manager:
  • vSphere Web Client - by default will match currently installed version that matches vCenter requirement for SRM.
  Zerto Virtual Replication:
  • vSphere Client Console (Thick Client) 4.0 and higher
  • vSphere Web Client 5.0 - 5.0 U3 - Not supported
  • vSphere Web Client 5.1 and up - Supported
  • Zerto Standalone Web UI
vSphere Replication Appliance Resource Requirements (per site)
  VMware Site Recovery Manager:
  • 2 vCPU
  • 4 GB RAM
  • 18 GB Storage
  • According to VMware, CPU and memory resources consumed by vSphere Replication on a host or guest OS is negligible.
  • The numbers seen above are how the appliance is configured by default.
  Zerto Virtual Replication:
  • N/A
Zerto Virtual Replication Appliance (VRA)
  VMware Site Recovery Manager:
  • N/A
  Zerto Virtual Replication:
  • 1 vCPU
  • 2GB RAM (minimum)
  • 12.5GB Storage
  • 1 of these appliances needs to be deployed (via Zerto UI) to each host that will be protecting VMs in VPGs.
  • DRS Affinity rules are created automatically by Zerto during the deployment process, so VRAs always stay on the hosts they are installed to.
Recovery Orchestration Provided By
  VMware Site Recovery Manager:
  • Site Recovery Manager 6.1 (see versions above for compatibility) or review VMware's product interoperability matrix for all version information.
  Zerto Virtual Replication:
  • Zerto Virtual Replication (required before VRAs can be deployed)
SRM 6.1/ZVM 4.5 U3 Server Requirements (1 per site)
  VMware Site Recovery Manager:
  • At least 2 CPUs, 4 for large environments
  • 2 GB RAM minimum - at least 6 GB if including OS requirements
  • 5 GB storage (in addition to OS requirements)
  • At least 1Gb/s NIC
    • Windows Server 2008 R2 (64-bit)
    • Windows Server 2012 R2 (64-bit)
  Zerto Virtual Replication:
  • Protecting up to 750 VMs and up to 5 peer sites:
    • 2 CPU (reserved)
    • 4GB RAM (reserved)
  • Protecting 751-2000 VMs and up to 15 peer sites:
    • 4 CPU (reserved)
    • 4GB RAM (reserved)
  • Protecting over 2000 VMs and over 15 peer sites:
    • 8 CPUs (reserved)
    • 8GB RAM (reserved)
  • 2GB Storage space for binaries
Supported Databases
  VMware Site Recovery Manager:
  • Microsoft SQL Server
    • 2008 Express R2 SP2, SP3 (32-bit and 64-bit)
    • 2008 Standard/Enterprise R2 SP3 (32-bit and 64-bit)
    • 2008 Standard/Enterprise/Datacenter R2 SP2 (32-bit and 64-bit)
    • 2008 Standard/Enterprise R2 SP1 (32-bit and 64-bit)
    • 2012 Express SP2 (32-bit and 64-bit)
    • 2012 Standard/Enterprise SP2 (32-bit and 64-bit)
    • 2012 Standard/Enterprise SP1 (32-bit and 64-bit)
    • 2012 Enterprise (64-bit)
    • 2014 Standard/Enterprise (32-bit and 64-bit)
  • Oracle
    • 11g Standard ONE Edition, R2 (32-bit and 64-bit)
    • 11g Standard/Enterprise Edition, R2 (32-bit and 64-bit)
    • 12C Standard ONE Edition, R1 (32-bit and 64-bit)
    • 12C Standard/Enterprise Edition (32-bit and 64-bit)
  Zerto Virtual Replication:
  • Embedded SQL database for protecting up to 4 sites, 40 hosts, and 400 VMs
  • Microsoft SQL Server Standard & Enterprise Editions for anything more than the above
  • Microsoft SQL Server Express
  • Supported MSSQL Database versions:
    • 2008
    • 2008 R2
    • 2012
    • 2014
Bandwidth Requirements
  VMware Site Recovery Manager:
  • > 10Mb/s (dedicated to move 40GB in an hour)
  Zerto Virtual Replication:
  • > 5Mb/s
Number of Firewall Ports for Cross-site Communication, Replication, and Recovery
  VMware Site Recovery Manager:
  • WAN - 7 (in addition to all vCenter related ports) - See topology diagram for port listings.
  Zerto Virtual Replication:
  • WAN - 3 (in addition to all vCenter related ports) - See topology diagram for port listings.
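
The “40GB in an hour” note on the SRM bandwidth figure can be sanity-checked with simple arithmetic. The sketch below is only that: it assumes decimal units (1 GB = 1,000,000,000 bytes), a fully dedicated link, and no compression or protocol overhead, and the function name is hypothetical. Which unit the table intends is not certain. Read as megabits per second, 10 Mb/s moves 40 GB in roughly nine hours, while read as megabytes per second it takes just over one hour.

```python
def transfer_hours(size_gb, rate, unit="Mb/s"):
    """Hours needed to move size_gb (decimal gigabytes) at a sustained rate.

    unit is "Mb/s" (megabits per second) or "MB/s" (megabytes per second).
    Ignores compression, protocol overhead, and link sharing.
    """
    if unit == "Mb/s":
        bits_per_second = rate * 1_000_000
    elif unit == "MB/s":
        bits_per_second = rate * 8 * 1_000_000
    else:
        raise ValueError("unit must be 'Mb/s' or 'MB/s'")
    total_bits = size_gb * 1_000_000_000 * 8
    return total_bits / bits_per_second / 3600


if __name__ == "__main__":
    print(round(transfer_hours(40, 10, "Mb/s"), 2))  # megabit reading
    print(round(transfer_hours(40, 10, "MB/s"), 2))  # megabyte reading
```

When sizing a WAN link for replication, it is worth running your own daily change rate through a calculation like this rather than relying on the minimums alone.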

     

    Steps from Installation to Protection

    The following table compares the high-level installation tasks/steps for VMware Site Recovery Manager and Zerto Virtual Replication.  These steps assume necessary pre-requisites such as vCenter installation and firewall rules have been created.

    Please note that SRM appears to have many more steps because it supports array-based replication in addition to vSphere Replication. If you use only one of the two, the number of steps decreases dramatically.  In my test environment, both features were tested, and because of that, SRM has more steps.

    VMware Site Recovery Manager:
    1. Build Windows VMs to host SRM in each site
    2. Build SQL Server/leverage existing, or use the embedded vPostgres database.
    3. Install SRM in Protected and Recovery Sites and license
    4. Connect SRM instances in Protected and Recovery Sites
      Note: This requires a functional, error-free vCenter/PSC infrastructure. PSCs should be in sync with no errors.
    5. Pair SRM instances
    6. Install & configure Storage Replication Adapters (SRA)
    7. Pair Array Managers
    8. Configure inventory mappings
    9. Create Protection Groups and Recovery Plans
    10. Test, validate, protect, test recovery, monitor, and alert.
    11. If using vSphere Replication - Install, configure, & pair vSphere Replication Appliances in each site

    Zerto Virtual Replication:
    1. Build Windows VMs to host Zerto in each site
    2. Install Zerto on each ZVM and apply license on login
    3. Optional: Build/leverage existing SQL Server, or use the embedded database
      • See Database requirements in the above table for explanation on sizing the DB and when to use an external SQL server.
    4. Pair the Zerto instances
    5. Edit site settings, schedule throttling if using a shared WAN connection, and configure alerts, thresholds, etc...
    6. Deploy VRAs (Virtual Replication Appliances - one per host that will be protecting VMs)
    7. Build Virtual Protection Groups (the VPG configuration also includes recovery options such as re-IP or pre/post scripts).
    8. Test, validate, protect, test recovery, monitor, and alert.
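
Step 3 in the Zerto list points back to the sizing guidance in the system requirements table. A minimal sketch of that guidance as code follows, using only the thresholds quoted in this post for ZVM 4.5 U3. The function names are hypothetical, and one rule is my own assumption: when the VM count and peer-site count fall into different tiers, the larger tier is used.

```python
def zvm_reservation(protected_vms, peer_sites):
    """Suggested reserved CPU/RAM for a ZVM, per the tiers quoted in this post.

    Assumption: if VM count and peer-site count point at different tiers,
    the larger tier wins.
    """
    if protected_vms <= 750 and peer_sites <= 5:
        return {"cpus": 2, "ram_gb": 4}
    if protected_vms <= 2000 and peer_sites <= 15:
        return {"cpus": 4, "ram_gb": 4}
    return {"cpus": 8, "ram_gb": 8}


def needs_external_sql(sites, hosts, vms):
    """True when the environment exceeds the embedded-database limits
    quoted in this post (up to 4 sites, 40 hosts, and 400 VMs)."""
    return sites > 4 or hosts > 40 or vms > 400


if __name__ == "__main__":
    print(zvm_reservation(protected_vms=1200, peer_sites=3))
    print(needs_external_sql(sites=2, hosts=20, vms=401))
```

Check the numbers against the documentation for the version you are deploying, since these limits change between releases.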

     

    Protection Workflow

    The following workflows have been created to illustrate the process involved in protecting virtual workloads using VMware Site Recovery Manager with vSphere Replication, and Zerto Virtual Replication.
    Individual files for each protection workflow in full-size view are here:

    srm_zerto_protection_workflows

    In the above images, SRM is on the left and Zerto is on the right. Visually, you can see that SRM clearly has many more steps, performed in multiple places, compared to Zerto. The majority of the additional steps in the SRM protection workflow deal with the multiple layers where protection is configured via the vSphere Web Client for a single VM using vSphere Replication. On the right side (Zerto), you can see that most (if not all) of the steps for protecting virtual workloads take place at the top layer, which is the Zerto Virtual Manager UI.

    In SRM, protecting a single VM using vSphere Replication involves selecting the VM, enabling vSphere Replication, going into Site Recovery, building a protection group and configuring it, and then creating a recovery plan and configuring it. The recovery plan is where customizations such as boot priority and IP address changes are made.

    In Zerto, protecting a single VM is as easy as logging into the ZVM UI, creating a VPG, and providing protection and recovery settings all within one wizard.

     

    Recovery Workflow

    The following workflows have been created to illustrate the process involved in recovering from a site failure using VMware Site Recovery Manager with vSphere Replication, and Zerto Virtual Replication.

    Individual files for each recovery workflow in full-size view are here:

    srm_zerto_recovery_workflows

     

    In the above images, SRM is on the left and Zerto is on the right. Visually, you can see that the steps to recovery are fairly similar, with the exception that recovery in SRM is performed via the vSphere Web Client, while recovery from Zerto is performed from the ZVM UI (recovery is performed at the recovery site in both scenarios). The most complex part about recovering in any scenario is the organization of admins/engineers/business stakeholders to recover, re-configure, and validate the recovery process. Of course, if routine recovery testing has been taking place, a real failure should basically mimic a recovery test, except that at this point it is a commitment rather than an exercise.

    In SRM, there really is one place to take care of a recovery, and that is in Site Recovery > Recovery Plans. Locate the recovery plan for the application(s) you want to recover, and click the red button – it’s a no-brainer!

    In the Zerto UI home screen, toggle the failover type from test to “live”, and click the recover button. When you click the button, you will be presented with a 3-step wizard, where you will select the VPG(s) to recover; select the checkpoint to recover from, the commit policy, and re-protection; and click the “start failover” button. Recovery and re-protection are all in one place.  The re-protection process in either product is straightforward; however, if there isn’t already a site built to re-protect to, there will be some work to do (in either case).

     

    Implementation Time and Complexity

    Planning, designing, and implementing either of these two products shouldn’t be difficult for anyone, but there are several pre-requisites that take time: change management processes and schedules to follow, and firewall rules to create and verify. With SRM, I’ve found that since this product ties so closely into vSphere and version matching is a requirement, this could delay anyone who doesn’t have a version-aligned environment or doesn’t have experience with vSphere or SRM. The biggest requirement for SRM? vSphere – you will have to have a vSphere deployment fully functional, and at an exact minimum version in both sites, in order to deploy SRM successfully.  Zerto doesn’t care if the vCenter/ESXi versions on both sites match, as long as the minimum supported version is in use.

    Granular requirements can make for administrative overhead and total team collaboration in the case of upgrades, maintenance, recovery, etc… because SRM relies heavily on version compatibility (as do other VMware products). In cases like this, there are specific orders of operations required for upgrades or power-on operations. These requirements are out of scope, but it pays to understand that they exist; so be sure to do some research, and if you can, test it before performing in production.

    When installing Zerto, what took the most time was building the Windows VMs (a few hours x 2) to house the ZVM in each site… that and firewall rules (about 2 weeks in my case, following approval, change management, and implementation). Once the VMs were built and the firewall rules were in place, the actual time taken to install Zerto was about 10-15 minutes per ZVM, and approximately 10 minutes to deploy each VRA, which can also be bulk scripted. Zerto works as long as the hypervisor and vCenter are at a minimum version supported by Zerto, but it can protect across versions, or even hypervisors (VMware vSphere & Microsoft Hyper-V)! VPG creation time can vary, depending on how many VMs per VPG you want to protect and how much you customize the options, with recovery and test IP settings being among the more time-consuming items. That’s it. Once you have a VPG created, initial synchronization starts, and as soon as the sites are in sync, you’ll be ready to test, recover, or migrate and re-protect.

     

    Monitoring and Reporting

     

    Monitoring and Reporting with VMware Site Recovery Manager

    VMware Site Recovery Manager provides monitoring and reporting; however, what you see is limited depending on where you are in the object hierarchy (but the data is there!):

    • number of replicated VMs per host
    • amount of data transferred
    • number of RPO violations
    • replication count
    • number of sites successfully connected

    These reports can also be expanded to show more detail, and the date range can be modified. In my experience during testing, monitoring replication status and information isn’t as intuitive and centrally located as you would expect. There are several different places to monitor protection status and get additional information.

    Some of this is at the VM level, where you will see replication status, last sync point, target site, quiescing (enabled/disabled), network compression (enabled/disabled), RPO, Points in time recovery (enabled/disabled), disk status.

     

    Monitoring at the VM Object

    vm_replication_status

    At the VM (protected VM) level, you can monitor replication performance, however, it is limited to 2 counters, which are:

    • Replication Data Receive Rate (Average in KBps)
    • Replication Data Transmit Rate (Average in KBps)

    srm_vm_counters

     

    Monitoring at the Site Recovery > Sites Level

    At the site level, you can monitor things like issues, recovery plan history, and also get basic protection group and recovery plan information for Array Based Replication, Protection Groups, and Recovery Plans:

    srm_site_monitors

     

    Monitoring at the Protection Group Level

    At the protection group level, the summary tab will give you information such as status, number of VMs that are in the protection group, configuration status of those VMs, and any replication warnings (not clickable for more detail):

    srm_pg_summary

    Selecting a protection group gives you a list of recovery plans, and VMs, and general protection information, but no logging or reporting.

    srm_pg_monitors

     

    Monitoring at the Recovery Plan Level

    At the recovery plan level, when you select a recovery plan you see the plan status, VM status, and recent history if the recovery plan has been run for testing or failover:

    srm_rp_summary

     

    Digging deeper into a recovery plan, you have the ability to see recovery plan steps, history, protection group general protection information, and virtual machine general protection information:

    srm_rp_monitors

     

    Monitoring vSphere Replication at the vCenter Level

    One more place that I was able to find monitoring and reporting is at the vSphere Replication level.  Go to vSphere Replication in the vSphere Web Client, then click on a vCenter.  From there, go to the Monitor tab and click on vSphere Replication; this will take you to the screen in the image below, where you can monitor Outgoing Replications, Incoming Replications, View Reports, and Cloud Recovery Settings.  The reports section looks to contain the most information; however, there isn’t a way in the UI to export reports if a customer requests a report to show the history of their replication jobs.

    Monitoring Outgoing Replications (per vCenter)

    This section displays any Point in Time snapshots that can be recovered to if it has been configured, and replication information (although very general) such as:

    • Status
    • VM
    • Target Site
    • vR Server used
    • Configured Disks
    • Last Instance Sync Point
    • Last Sync Duration
    • Last Sync Size
    • RPO
    • Quiescing (enabled/disabled)
    • Network Compression (enabled/disabled)

    monitoring_vsphere_replication_outgoing_rep

     

    Monitoring Incoming Replications (per vCenter)

    This section displays Point in Time Snapshots, Recovery history, and Replication information (again all general) such as:

    • Status
    • VM (when a VM is selected above)
    • Target Site
    • vR Server
    • Configured Disks
    • What manages the incoming replications (in this case, it’s SRM)
    • Last instance sync point
    • Last sync duration
    • Last sync size
    • RPO
    • Quiescing (enabled/disabled)
    • Network Compression (enabled/disabled)

    monitoring_vsphere_replication_incoming_rep

     

    Reporting for vSphere Replication (per vCenter)

    This section contains statistical information that can be filtered by date range.  This section is a little more detailed (my favorite view), and actually contains numbers on graphs. It contains information such as:

    • Count of replicated vs non-replicated VMs
    • Replicated VMs by host(s)
    • Transferred bytes
    • RPO violations
    • Replications Count
    • Site connectivity status
    • vR Server Connectivity (not pictured)

    While this is great information, there is no way from the interface to export the reports if needed.

    monitoring_vsphere_replication

     

    Cloud Recovery Settings

    This section contains general information on any replications to the cloud.  Since we are not replicating to the public cloud, this section is empty, but I have shown it to display what detail it contains.

    monitoring_vsphere_replication_cloud_settings

    Based on the findings for monitoring vSphere Replication and SRM, as shown above, there are multiple places to look for information, statistics, and reports.  The problem here is that monitoring any ongoing replication jobs and/or recoveries and performance is a multi-tiered approach, and there is no centralization of information that is exportable for review.  There are too many places to look for information, and it would be too tedious to effectively monitor protection jobs, recoveries, and performance out-of-the-box.

     

    Monitoring and Reporting in Zerto Virtual Replication

    Monitoring protection status in Zerto has been intuitive, detailed, and centralized. Zerto has decided to separate the two functions into “tabs” within the UI: one tab for monitoring (includes tasks and alerts), and one tab for reporting. The ability to set Zerto up to alert via e-mail and send reports at a regular, scheduled interval is natively built into the product. The product doesn’t stop at 1 destination e-mail address, as it also allows for multiple recipients, separated by commas or semicolons, in the site settings. In the resource reports, you can set up the sampling rate and the sampling time interval. In terms of BC/DR solutions, it is far preferable to receive more information than necessary, rather than waiting for a problem to surface. Nothing is more embarrassing or resume-generating than finding out at the point of a failure that your replication product hasn’t been replicating much or hasn’t been able to meet your RPO/RTO.

    In the Zerto UI, monitoring alerts, events, and tasks is as simple as clicking on the “monitoring” tab. You can search for specific events or alerts (or both), and also modify the timeframe that you are targeting. In the reporting tab, you can get reports for the following items, and you can select any of them per VPG, or for all VPGs (and customize the reporting dates).

    • VPG Performance (RPO in seconds, IOPs, Throughput (MB/s), and WAN traffic (MB/s))
    • Outbound Protection Over Time (data in GB) – for each recovery site
    • Protection Over Time by Site (Journal Usage in GB, VMs protected by count)
    • Recovery Reports by VPG, type, and/or status
    • Resource Report – shows resources used by protected VMs, which is required by Zerto to ensure recovery capability. (Exports to Excel)
    • Usage – exports to CSV, PDF, or ZIP

    zerto_monitoring_tab

    zerto_reports_tab

     

    Conclusion

    In conclusion, both products work as advertised, and deciding which product to go with may come down to trust, flexibility, simplicity, scalability, monitoring & reporting, re-protection capabilities, and of course, cost. When considering the cost of either solution, be sure to also include the cost of human hours required to successfully deploy and support either one. Both products have their benefits and quirks, but the bottom line is that THEY BOTH WORK GREAT!

    Since I also went through the entire process from design to implementation, to protection, testing, and recovery – it took a considerable amount of time for VMware Site Recovery Manager to become usable due to some external problems we were having, so that sort of left a bad taste in my mouth (it was frustrating – but that was specific to my environment). Because Zerto wasn’t affected by those existing problems in terms of being prevented from working, it felt much simpler, but don’t get me wrong, you still have to plan for your deployment.  The time that it took to deploy and have both products functioning varied considerably, with Zerto coming in as the winner in terms of time to protection versus Site Recovery Manager in my experience (again related to the underlying problems in my environment).

    Array-based replication is an optional feature of SRM, and once we figured out what was needed on the SAN side for this to work properly, it actually runs nicely. This method has historically been an expensive route to go due to the requirement of having the same storage (vendor at least) in each site (protected and recovery). This also introduces another layer of complexity in configuration, administration, maintenance, and support alignment, which will involve SAN administrators.  vSphere Replication, on the other hand, is easy to set up and you can be replicating VMs using this method in a short period of time.

    Scalability is another area I researched, and I determined that both products can protect up to 5000 VMs per vCenter instance (refer to comparison tables).

    vSphere Replication (without Site Recovery Manager) has a limitation of 1 vSphere Replication Appliance per vCenter instance.  When adding the maximum of 9 additional vSphere Replication Servers per vSphere Replication appliance, you can protect up to 2000 VMs – see here for details.  When pairing vSphere Replication with Site Recovery Manager and array-based replication, you can achieve protection of up to 5000 VMs per vCenter instance. (SRM Operation Limits)

    Zerto can scale out to take advantage of cluster resources by deploying a VRA (virtual replication appliance) to each host in a cluster where you are protecting VMs. The VRAs come at no additional cost (both products are licensed per VM being protected) and can be sized as needed for best performance. When deploying Zerto VRAs, you will need IP addresses, so that’s one downside to having one per host, especially in large environments.  On the plus side, you can deploy all those VRAs from one screen and their deployments can be automated, so that saves time.
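
To get a feel for what one VRA per host means in a larger cluster, here is a small back-of-the-envelope sketch. It uses the default VRA size from the requirements table (1 vCPU, 2GB RAM minimum, 12.5GB storage) and assumes one IP address per VRA, which is my assumption for illustration. The function name is hypothetical.

```python
def vra_footprint(protecting_hosts, ram_gb_per_vra=2.0):
    """Total resources for one VRA per protecting host.

    Uses the per-VRA figures quoted in this post (1 vCPU, 12.5 GB storage,
    2 GB RAM minimum). One IP address per VRA is an assumption.
    """
    if protecting_hosts < 0:
        raise ValueError("protecting_hosts must not be negative")
    return {
        "vras": protecting_hosts,
        "ip_addresses": protecting_hosts,
        "vcpus": protecting_hosts * 1,
        "ram_gb": protecting_hosts * ram_gb_per_vra,
        "storage_gb": protecting_hosts * 12.5,
    }


if __name__ == "__main__":
    # Example: a 16-host cluster where every host protects VMs.
    print(vra_footprint(16))
```

If you size VRAs above the minimum RAM, pass the larger value as `ram_gb_per_vra` so the total reflects what you will actually reserve.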

    Compatibility of each product and their requirements vary as well, with SRM having more requirements in both sites (protected and recovery). Since Zerto is basically deployed on top of a virtualization infrastructure, it is not tightly integrated into the base vSphere product nor does it rely on the same version requirements as SRM.  Zerto is very flexible in versioning for both protected and recovery sites, and it also can protect and recover to/from vSphere and Microsoft Hyper-V, or cloud providers.

    Lastly, while I’m not a seasoned programmer or script guru, at a high level both products can be programmatically managed, and both support PowerShell (with SRM requiring the PowerCLI add-on from VMware). Both products can also leverage vRealize Orchestrator, allowing workflow automation for protection tasks. Both products include support for multiple scripting/programming languages and have their APIs documented; however, in the case of SRM, the creation of recovery plans and forced failovers cannot be automated (per the API documentation). Zerto can be managed through a feature-rich RESTful API that allows management of pretty much every aspect of the product and its capabilities, and their documentation is clear and full of example scripts in each of their supported languages for everyday tasks.

    I hope this information has been helpful for those who are trying to decide which product to go with, and as always, comments or questions are welcome!  And if you find this to be useful information, please share it!


    Zerto: Perform a VPG Move (VM Migration)

    In a situation where a workload needs to be migrated from a protected site to a recovery site (or site A to site B) in an effort to change where the production workload runs from, you can perform a VPG move.

    From what I’ve seen, the difference between a VPG move and a Failover is that the Failover option assumes the protected site has failed, so systems may not automatically be cleaned up on the protected site.  When performing a move, the protected site is cleaned up as soon as the move is completed and committed, unless you select to re-protect the workload in the other direction (the commit can be automatic or manual; the maximum time you have to commit is 24 hours, and that is configurable).

    One recommendation I have here is that before you perform these steps, perform a recovery test on the VPG you’d like to move to ensure that recovery steps are completed as expected, and that the system is usable at least in a testing capacity.

    1. Log in to the Zerto UI
    2. From the dashboard screen, go to Actions > Move VPG.

      zerto_perform_vpg_move_1_2

    3. Select (tick the checkbox for) the VPG you want to move, and click Next.

      zerto_perform_vpg_move_1_3

    4. Select your options for the Execution Parameters, and click Next.  For this example, I will select “none” for the commit policy, to demonstrate where to commit the migration task when you are ready to.

      zerto_perform_vpg_move_1_4

      > Commit Policy: Auto-Commit - you can delay up to 24 hours (specified in minutes), or select 0 
      to automatically commit immediately when the migration process is completed.
      > Commit Policy: Auto-Rollback - You can delay up to 24 hours (specified in minutes), default 
      delay is 10 minutes
      > Commit Policy: None - You must manually select whether or not to commit or rollback, based 
      on your results.
      > Force Shutdown - Use this in the event VMware Tools isn't running and the VM therefore 
      cannot be shut down automatically. Force shutdown will first attempt to gracefully shut the VM down, and if that doesn't work, 
      it will power off the VM on the protected site.
      > Reverse Protection - This will automatically sync changes from the recovery site back to the 
      protected site in case you want to be able to re-protect a system after a migration. This eliminates the need 
      to have to re-initialize synchronization in the other direction. If reverse protection is selected, a delta 
      sync will take place to re-protect after the migration is completed. Caveat - You cannot 
      re-protect if you select "NONE" as the commit policy.
      > Boot Order -(Defined in VPG Configuration, but displayed here)
      > Scripts - (Defined in VPG configuration, but displayed here)
      
    5. Review the summary, and when ready, click Start Move.
      During promotion of data, you cannot move a VM to another host.  If the host is rebooted
      during promotion, make sure the VRA on the host is running and communicating with the ZVM before 
      starting up the recovered VMs.

      zerto_perform_vpg_move_1_5

    6. Since we have selected a commit policy of “none”, once the migration is ready for completion, the Zerto UI will alert you that there is a task awaiting input.  Click on the area highlighted below.

      zerto_perform_vpg_move_1_6_a

      Select either Commit (checkmark) or Rollback (undo arrow):

      zerto_perform_vpg_move_1_6_b

    7. At this point, you can also choose whether or not to reverse-protect.  Make your selection and click Commit.

      zerto_perform_vpg_move_1_7_a

      The task will update as seen below:

      zerto_perform_vpg_move_1_7_b

      Once you commit the move, the data in the protected site is then deleted, thus completing the migration.
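
The execution parameters described in step 4 interact with each other, so it can help to see them written down as rules. The sketch below models only what this post states: a delay of 0 to 24 hours expressed in minutes, a 10-minute default delay for auto-rollback, and the caveat that reverse protection cannot be selected in the wizard when the commit policy is “none”. The class and field names are hypothetical and are not Zerto API names.

```python
from dataclasses import dataclass

MAX_DELAY_MINUTES = 24 * 60  # the post gives 24 hours as the maximum delay
VALID_POLICIES = ("auto-commit", "auto-rollback", "none")


@dataclass
class MoveSettings:
    """Hypothetical model of the Move VPG execution parameters."""
    commit_policy: str
    delay_minutes: int = 10          # default delay quoted for auto-rollback
    force_shutdown: bool = False
    reverse_protection: bool = False

    def problems(self):
        """Return a list of rule violations (empty when settings are consistent)."""
        found = []
        if self.commit_policy not in VALID_POLICIES:
            found.append(f"unknown commit policy: {self.commit_policy!r}")
        elif self.commit_policy != "none":
            if not 0 <= self.delay_minutes <= MAX_DELAY_MINUTES:
                found.append("delay must be between 0 and 1440 minutes")
        if self.commit_policy == "none" and self.reverse_protection:
            found.append("reverse protection is not available with commit policy 'none'")
        return found


if __name__ == "__main__":
    print(MoveSettings("auto-commit", delay_minutes=0).problems())
    print(MoveSettings("none", reverse_protection=True).problems())
```

Writing the options down this way before a migration window makes it easier to review the plan with the rest of the team.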


    Zerto: Create a Virtual Protection Group (VPG)

    This blog is the next step following the creation/deployment of the VRAs.

    To begin protecting virtual machines, you will need to configure virtual protection groups (VPGs).  A virtual protection group is an affinity grouping of VMs that make up an application.  VPGs can contain 1 or more virtual machines, and contain all the protection settings required, which include:

    • Boot Order
    • re-IP settings for testing and recovery
    • Resource mappings
    • Offsite backup
    • Journaling
    • Re-protection settings

    Once a VPG is configured, initial synchronization of the protected virtual machines begins, and once synced, the virtual machines are continuously protected.

    Important:
    
    When performing failover, ALL VMs in the VPG will be failed over, and you are not able to select 
    specific VMs within the group to be recovered.

    Tips

    • For granular protection and failover capabilities, VPGs can be set up containing single VMs, if your migration/failover plan requires being able to pick and choose systems to recover in an order you specify, when not all involved VMs need to be migrated or failed over.
    • Do not group ALL virtual machines into 1 VPG, as performing a recovery will attempt to recover everything contained within the VPG and in some cases, that’s not the best idea.
    • Whenever possible, group servers that depend on each other or make up an application together. This will allow you to make use of boot options, order, or delay to bring them up in the correct order. This will also prevent missing crucial application servers during recovery or migration.
    • Make use of the test feature for DR testing by setting up an isolated VLAN/portgroup which will allow live testing without impacting production.
    • Make use of the re-IP feature to automate any IP address change that needs to happen either on the test network or recovery network.
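
    The tip about grouping dependent servers and using boot order can be planned on paper before touching the Zerto UI.  The following is a minimal sketch, not anything Zerto provides: it assumes a hypothetical three-tier application (the VM names are made up) and uses Python's standard graphlib module to work out which VMs could be started together and in what order.

```python
from graphlib import TopologicalSorter

# Hypothetical application: each VM maps to the VMs it depends on.
depends_on = {
    "web01": {"app01"},
    "app01": {"db01"},
    "db01": set(),
}

def boot_groups(dependencies):
    """Return lists of VMs that can boot together, in dependency order."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    groups = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # VMs whose dependencies are all up
        groups.append(ready)
        sorter.done(*ready)
    return groups

for position, group in enumerate(boot_groups(depends_on), start=1):
    print(position, group)
```

    Each printed group is a candidate for one boot-order group in the VPG, with the database tier first and the web tier last.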

    VPG Creation

    1. Log in to the Zerto UI
    2. Go to the VPGs tab, and click New VPG.

      create_vpg_1_2

    3. Specify a name for the VPG and set the priority, then click Next.
      In VPGs with different priorities, updates for the VPG(s) with the highest 
      priorities are transferred over the WAN before others.
      
      

      create_vpg_1_3

    4. Select the VM(s) you want to include in this VPG, press the right-arrow to move to selected VMs, then click Next.
      Using the search box in the "Available VMs" window will help you minimize the 
      number of VMs listed and focus only on the one(s) you're looking for.
      Zerto uses the SCSI protocol, so only VMs with disks that are configured/support 
      SCSI can be selected to be part of a VPG.
      
      

      create_vpg_1_4_a

      create_vpg_1_4_b

    5. Specify the recovery site and values to use for replication to the site, then click Next.

      create_vpg_1_5

    6. Specify the storage requirements for this VM and click Next.
      If you have pre-seeded the volumes, check the box beside the disks 
      and click the Edit Selected link.  Select Preseeded Volume, then browse to the VMDK 
      for that volume.  Repeat for any additional disks that you have pre-seeded.  This 
      is recommended if your VM is large, and has a high rate of change, or the WAN link 
      is shared and bandwidth is limited.

      create_vpg_1_6

    7. Specify the failover/move network (the network that the recovered VM will run on), the recovery folder, any scripts, and click Next.
      Failover Test Network is optional, but recommended if you will be testing 
      failover prior to committing.
      

      create_vpg_1_7

    8. Enter the NIC details to use for the recovered VM, and click Next.
      In some cases, if you're replicating within the same vCenter or cluster, you 
      may end up with a duplicate MAC address warning when recovering, so to avoid this, you 
      can create a new MAC address on the recovery VM during recovery.  In any case, you 
      can also re-IP the VMs as part of the recovery procedure.  To view these 
      settings, check the box beside the VM(s) and click the Edit Selected link.

      create_vpg_1_8

    9. Select whether or not you want to create an offsite backup that can be stored for up to a year, then click Next.  If you don’t need to create a backup, leave this screen at the defaults, then click Next.
      For more information on backups with Zerto, refer to the help file 
      (click the ? button at the top right of this window), or see the Zerto Virtual 
      Manager Administration Guide.

      create_vpg_1_9

    10. Review the VPG settings summary, and if you don’t need to go back and make any changes, click Done.

      create_vpg_1_10

     

    Zerto: Deploy Virtual Replication Appliances

    If you’ve followed along with Zerto: ZVM Installation, this entry is a continuation, and provides the steps for deploying the Zerto Virtual Replication Appliances (VRAs).

    After installation has succeeded, open a browser, and connect to https://ZVMFQDN:9669/zvm.

    Notes:

    • If this VM lives in a protected network for management/utility servers, you might need to allow port 9669 from your local network to the network the ZVM lives in.  The Zerto Standalone UI, vCenter Web Client, and vCenter C# client all use port 9669 to access the ZVM.
    • Be sure to use a supported browser.  Chrome, Firefox, and IE 11+ are recommended by Zerto.

    1. Log on using your vCenter credentials.

      zerto_vra_deploy_1_1

    2. Enter a license key and click Start.

    After entering the license key and clicking Start, you’re taken to the dashboard.  Before you can start protecting VMs, however, you will need to install the VRAs on the hosts in each site and pair the protected and recovery sites.
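
    If the browser cannot reach the ZVM, it helps to rule out a blocked port first.  The following is a minimal sketch of a TCP reachability check for port 9669 using only Python's standard library.  The host name zvm.example.local is a placeholder and not a real system.  A successful connection only shows that the port is open from where you run the script, not that the ZVM service itself is healthy.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

if __name__ == "__main__":
    zvm_host = "zvm.example.local"  # hypothetical ZVM FQDN; replace with your own
    state = "reachable" if port_is_open(zvm_host, 9669) else "blocked or closed"
    print(f"{zvm_host}:9669 is {state}")
```

    The same function could be pointed at ports 22 and 443 on each ESXi host before deploying the VRAs.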

    Install the VRAs

    The Zerto installation includes the OVF template for VRAs.  A VRA must be installed on every host that manages protected VMs in the protected site, and on every host that will manage VMs in the recovery site.

    The VRA compresses data that is passed across the WAN from the protected to recovery site, and automatically adjusts the compression level according to the CPU usage, totally disabling it if required.

    A VRA can manage a maximum of 1500 volumes, whether they are protected or not.

    VRA Requirements

    Each VRA must have:

    • 12.5GB datastore space
    • at least 1GB of reserved memory
    • A host running at least ESX/ESXi 4.0 U1, with ports 22 and 443 enabled for the duration of the installation.

    If you are installing to ESXi 5.5 or higher, the VRA connects to the host using a VIB and user credentials, so it is not necessary to enter the root password.  On earlier versions, the password for the host root account is required.

    Have IP addresses reserved before you deploy the VRAs, as using DHCP is not recommended.  You will also need the subnet mask and default gateway for each address.

    If you do not have SSH enabled on your hosts, the ZVM will attempt to enable and disable it during the installation of the VRA.
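
    Since each VRA needs a static IP address, subnet mask, and default gateway, it can save a failed deployment to sanity-check the plan first.  The following is a minimal sketch using Python's standard ipaddress module.  The host names and addresses are made up for illustration, and the checks are only the ones I would look at, not a list taken from Zerto.

```python
import ipaddress

def validate_vra_network(ip, netmask, gateway):
    """Return a list of problems found in one VRA's static IP settings."""
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    address = ipaddress.ip_address(ip)
    gw = ipaddress.ip_address(gateway)
    problems = []
    if gw not in network:
        problems.append(f"gateway {gw} is outside {network}")
    if address in (network.network_address, network.broadcast_address):
        problems.append(f"{address} is the network or broadcast address")
    if address == gw:
        problems.append("VRA address equals the gateway address")
    return problems

# Hypothetical plan: one reserved address per host that will receive a VRA.
plan = {
    "esx01": ("10.0.10.21", "255.255.255.0", "10.0.10.1"),
    "esx02": ("10.0.10.22", "255.255.255.0", "10.0.20.1"),  # wrong gateway
}
for host, settings in plan.items():
    issues = validate_vra_network(*settings)
    print(host, "OK" if not issues else "; ".join(issues))
```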

    Important: Do not snapshot a VRA, as it will cause problems with replication!  I actually
    forgot to exclude the VRAs from backups, and CommVault attempted to back them up after I had
    configured my first VPG, and I ended up having to re-deploy the VRAs.  My advice is to create a
    folder for the VRAs in your vCenter folder structure and have that folder excluded from backups
    altogether.  Don't forget to move the VRAs into the folder as soon as they're deployed.
    

    Installation

    1. Log in to the Zerto Manager UI
    2. Click on the Setup tab.

      zerto_vra_deploy_2_2

    3. Locate the host you want to deploy the VRA to, and check the box beside it.  Once you have selected the host, click New VRA.
      Note:  If you select multiple hosts, clicking the New VRA link
      will only install on the first host that you have selected.

      zerto_vra_deploy_2_3

    4. Specify the host, datastore, network, RAM, group, and enter the network details, then click Install.  Repeat the steps for each additional VRA you need to deploy (one per host).
      Note: When you deploy a VRA, Zerto will automatically reserve the amount of
      memory equal to what you specify in the VRA RAM settings.  This amount of RAM is the maximum buffer
      size for the VRA that is used to buffer IOs written by the protected virtual machines before the
      writes are sent over the network to the recovery VRA.  The recovery VRA also buffers incoming IOs
      until they are written to the journal.  If a buffer becomes full, a Bitmap Sync is performed after
      space is freed up in the buffer.
      The protecting VRA can use up to 90% of its buffer for IOs to send to the recovery VRA, which can
      use up to 75% of its buffer before it is full and requires a bitmap sync.

      zerto_vra_deploy_2_4

    5. After all VRA installations are completed, the Setup tab will contain more information for each host that has a VRA installed.

      zerto_vra_deploy_2_5

    Once you’ve completed these steps for each host requiring a VRA, you can create Virtual Protection Groups and start protecting your workloads.
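
    To put the buffer note from step 4 into numbers, here is a rough sketch that applies the 90% and 75% figures to the RAM you assign to a VRA.  It assumes the whole RAM value counts as buffer, as the note describes, and the function name is my own.  Treat the result as a back-of-the-envelope estimate rather than something Zerto reports.

```python
PROTECTING_PERCENT = 90  # share of the buffer a protecting VRA can use for outgoing IOs
RECOVERY_PERCENT = 75    # share a recovery VRA can fill before a bitmap sync is needed

def usable_buffer_mb(vra_ram_gb):
    """Estimate usable buffer (in MB) for a VRA with the given RAM setting."""
    ram_mb = vra_ram_gb * 1024
    return {
        "protecting_mb": ram_mb * PROTECTING_PERCENT / 100,
        "recovery_mb": ram_mb * RECOVERY_PERCENT / 100,
    }

for ram_gb in (1, 2, 4):
    estimate = usable_buffer_mb(ram_gb)
    print(f"{ram_gb}GB VRA: protecting ~{estimate['protecting_mb']:.0f}MB, "
          f"recovery ~{estimate['recovery_mb']:.0f}MB")
```

    With the minimum 1GB of RAM, that works out to roughly 922MB on the protecting side and 768MB on the recovery side.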

    Zerto: ZVM Installation

    Here we go!  The following procedure is a step-by-step installation of Zerto Virtual Replication 4.5 U3.  Before starting, you should have built 2 Windows VMs per the Zerto system requirements.  If this is being done in production, be sure to size the servers as needed for the number of VMs you will be protecting.

    System Requirements

    Note: Be aware of OS limitations when dealing with 32-bit vs 64-bit.  In a 32-bit Windows Server installation, the maximum amount of RAM you can give the system (that it can actually use) is 4GB.  If you’re using Windows Server 2008 R2 or Windows Server 2012, they’re only available in 64-bit, so you won’t need to worry about this limitation.  For more information on Windows memory limitations, please see this.

    Now that we’ve got that out of the way, here are the system requirements for the Zerto Virtual Manager as of version 4.5 U3:

    For the ZVM at Each Site

    • VMware vCenter 4.0 U1 or later with at least 1 ESXi host
    • The account you log into the ZVM with and use to run the service will need to have administrative privileges in vCenter.
    • Supported Windows Operating Systems:
      • Windows Server 2003 SP2 or higher
      • Windows Server 2008
      • Windows Server 2008 R2
      • Windows Server 2012
      • Windows Server 2012 R2
    • Resource Reservations in vSphere
      • CPU: Reserve at least 2 vCPUs
      • Memory: Reserve at least 4GB
    • Resource Requirements for ZVMs:
      • Up to 750 protected VMs and up to 5 peer sites:
        • 2 vCPU, 4GB RAM
      • 751-2000 protected VMs and up to 15 peer sites:
        • 4 vCPU, 4GB RAM
      • More than 2000 protected VMs and more than 15 peer sites:
        • 8 vCPU, 8GB RAM
    • Time/NTP Requirements:
      • Zerto VMs must be synchronized with UTC (you can set actual timezones)
      • It is recommended to use an NTP server for clock synchronization.
    • Microsoft .NET Framework 4 (included with the Zerto installation package)
    • Storage: At least 2GB, plus 1.8GB if you need to install the .NET Framework
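
    The ZVM sizing tiers above can be turned into a small lookup when you are planning several sites.  The following is a minimal sketch with names of my own choosing.  The list does not say what to do when the VM count and the peer site count fall into different tiers, so the sketch assumes you move up to the first tier that covers both.

```python
from typing import NamedTuple

class ZvmSize(NamedTuple):
    vcpus: int
    ram_gb: int

# (max protected VMs, max peer sites, size), taken from the list above.
TIERS = [
    (750, 5, ZvmSize(2, 4)),
    (2000, 15, ZvmSize(4, 4)),
]
LARGEST = ZvmSize(8, 8)  # more than 2000 VMs or more than 15 peer sites

def zvm_size(protected_vms, peer_sites):
    """Pick the smallest tier that covers both the VM and peer site counts."""
    for max_vms, max_sites, size in TIERS:
        if protected_vms <= max_vms and peer_sites <= max_sites:
            return size
    return LARGEST

print(zvm_size(300, 2))
print(zvm_size(300, 10))
print(zvm_size(2500, 20))
```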

    For the VRA on Each Host

    One VRA should be installed per host in a participating cluster.  By doing this, you are accounting for any vMotion or DRS activity related to any protected VM in the cluster(s).  VRAs are deployed from within the Zerto UI; when this is done, DRS affinity rules and any required reservations are created automatically for the VRAs.

    Important:  After deployment of the VRAs, be sure to add the ZVM and VRAs into a folder in vCenter that can be excluded from any snapshots.  In other words, if you’re using VADP for backups, be sure to exclude this folder, or each VRA/ZVM, from within your backup software.  Failing to do so will cause corruption and you will have to re-deploy the VRAs.  Furthermore, this will prevent any performance degradation that is a result of snapshot cleanup/consolidation jobs.

    VRAs require the following resources:

    • 12.5GB of datastore space (per VRA)
    • At least 1GB RAM (reserved automatically through deployment process)
    • ESX/ESXi 4.0 U1 or higher
    • Ports 22 and 443 open on each host during installation of the VRA (during VRA deployment, Zerto will also attempt to enable the SSH service on each host; if it fails, you will need to manually enable/disable it).
    • A datastore to install the VRA to.
    • Static IP address for each VRA (recommended)
      • IP Address (for each VRA)
      • Subnet Mask
      • Default Gateway

    VRAs will automatically be named by Zerto during deployment, and clearly indicate what host they are running on.

    Network Requirements

    • > 5MB/s is required for Zerto

    ZVM Installation

    Once you’ve built your Windows VMs to house the ZVM, the steps below will guide you through the installation.  This will need to be done in both sites, although, if you only have 1 site, you can still protect and recover within the same site.  Please note that if you are installing in 2 geographically separated sites, you may need to open some firewall ports before pairing sites and initiating replication.  For firewall requirements, see this document.

    1. Browse to the directory where you downloaded the installation files and run the installer (Zerto Virtual Replication VMware Installer).

      zerto_installation_files

    2. Click Next on the welcome screen.

      zerto_installation_1_2

    3. Accept the License Agreement, and click Next.

      zerto_installation_1_3

    4. Select the installation directory, and click Next.

      zerto_installation_1_4

    5. Select the installation type, and click Next.

      zerto_installation_1_5

    6. Select either “Local System Account” or “This Account” if you have a dedicated service account.  Whichever you choose, the account will require unrestricted access to the local resources on the ZVM.  After you have made your selection, click Next.

      zerto_installation_1_6

    7. In the Database Type dialog box, select your database type, and click Next.
      Notes:  It is recommended to use an external SQL server when a site has more than 40 hosts that have VMs that need to be protected, and the site has more than 400 VMs that need to be protected.  If you use Windows Authentication for the SQL server (external), then the credentials in step 6 will be used.

      zerto_installation_1_7

    8. Enter the name of the vCenter along with the admin credentials that will be used to connect, then click Next.

      zerto_installation_1_8

    9. Optional: If you have vCloud Director and want to protect it using Zerto Virtual Replication, enter the information necessary to connect to it, and click Next; otherwise, leave the “Enable vCD BC/DR” box unchecked, and click Next.

      zerto_installation_1_9

    10. Enter the Zerto Virtual Manager settings to identify this installation, and click Next.

      zerto_installation_1_10

    11. Enter the required information for ZVM communication, and click Next.  The ports listed below are defaults, and if you recover to a site managed by a Cloud Service Provider, be sure you do not change the default ports.

      zerto_installation_1_11

    12. As soon as you click Next on the screen in the previous step, the installer will auto-validate ZVM communication to ensure the ports to this ZVM are open, verify the vCenter credentials that you specified, and register the vCenter plug-in.  If the validation for each item results in “OK”, click Run; otherwise, resolve any errors, and click Recheck.

      zerto_installation_1_12

    13. If you clicked Run, Zerto Virtual Replication will begin the installation and configuration of components.

      zerto_installation_1_13