FlexPod® MetroCluster™ IP (Internet Protocol) solutions combine the proven FlexPod converged infrastructure with the capabilities of the NetApp® ONTAP® MetroCluster IP solution to help companies maintain continuous availability of business-critical data services. This blog post highlights the latest FlexPod Datacenter solution Cisco Validated Design (CVD). It also describes the compliant switches deployment architecture for MetroCluster IP, which is supported beginning with ONTAP 9.7 and reduces solution complexity and cost. Finally, it discusses the use of ONTAP Mediator to monitor the storage clusters and automate an unplanned switchover so that data services resume quickly when a site disaster occurs.
This document provides examples of the FlexPod MetroCluster IP solution at two different scales to demonstrate how various supported components can be used to build solutions that meet compute, storage, and performance requirements. Deploying these new solution configurations with compliant switches and ONTAP Mediator can help reduce solution complexity, save costs, and speed up disaster recovery. The FlexPod MetroCluster IP solutions offer zero RPO and low RTO. They protect companies against site-wide disasters and many other single-point-of-failure scenarios and help them achieve their business continuity objectives.
For more than a decade, FlexPod converged infrastructures have helped companies deploy mission-critical workloads with confidence. These infrastructures are powered by Cisco UCS servers, Cisco Nexus switches, and NetApp storage arrays, and they are backed by a large portfolio of Cisco Validated Designs (CVDs) and NetApp Verified Architectures (NVAs). These CVDs and NVAs cover all major data center workloads and are the result of continued collaboration and innovation by NetApp and Cisco on FlexPod platform solutions. Because extensive testing and validation go into their creation, these CVDs and NVAs provide reference solution architecture designs and step-by-step deployment guides to help partners and customers deploy and adopt FlexPod solutions. By using these CVDs and NVAs as guides for design and implementation, businesses can reduce risk, reduce solution downtime, and increase the availability, scalability, flexibility, and security of the FlexPod solutions they deploy.
The latest CVD is FlexPod Datacenter with VMware vSphere 7.0. This design consists of Cisco fourth-generation fabric interconnects, UCS B-Series blade servers, UCS C-Series rack servers, UCS C4200 rack server chassis with C125 server nodes, Nexus switches, and NetApp AFF A400 storage controllers.
Figure 1) FlexPod Datacenter with VMware vSphere 7.0 CVD validation topology
FlexPod MetroCluster IP solutions combine the proven FlexPod converged infrastructure with the capabilities of ONTAP MetroCluster IP to synchronously replicate data between sites. This means protection for your enterprise applications and business-critical workloads such as databases, virtual desktop infrastructures, artificial intelligence, and machine learning against a disaster that results in a complete site outage.
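To make the idea of synchronous replication concrete, the following is a minimal toy model, not ONTAP code. The `Site` class and `synchronous_write` function are hypothetical names introduced only for illustration. The sketch assumes a write is acknowledged to the application only after both sites have persisted it, which is why every acknowledged write is still available at the surviving site after a complete site outage.

```python
class Site:
    """Toy stand-in for the storage at one site (hypothetical, not an ONTAP API)."""

    def __init__(self, name):
        self.name = name
        self.log = []        # records persisted at this site
        self.online = True

    def persist(self, record):
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        self.log.append(record)


def synchronous_write(local, remote, record):
    """Acknowledge a write only after BOTH sites have persisted it."""
    local.persist(record)
    remote.persist(record)
    return "ack"


site_a, site_b = Site("site-a"), Site("site-b")
acked = []
for record in ["txn-1", "txn-2", "txn-3"]:
    if synchronous_write(site_a, site_b, record) == "ack":
        acked.append(record)

# Simulate a complete outage of site A, then check the surviving site.
site_a.online = False
lost = [r for r in acked if r not in site_b.log]
print(lost)  # [] (no acknowledged write is missing at site B)
```

The toy deliberately ignores partial failures, ordering, and performance; it only illustrates why acknowledging after both copies exist leaves no acknowledged data behind.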
Here are highlights of the benefits to customers of adopting FlexPod MetroCluster IP solutions:
For an overview of the FlexPod Datacenter platform, the NetApp ONTAP MetroCluster IP solution configuration, features, and capabilities, and an example solution topology for a small-scale deployment, refer to FlexPod MetroCluster IP Solutions.
With ONTAP 9.6 and earlier releases, the MetroCluster IP solution requires dedicated switches that are validated and provided by NetApp. Beginning with ONTAP 9.7, MetroCluster IP solutions for some platforms can support switches that are not validated by NetApp if they are compliant with NetApp specifications. Using the Nexus switches (which are already part of a FlexPod solution) as compliant switches reduces the cost and complexity of the solution and increases the usage of the switches.
Figure 2 shows an AFF A700 cluster for one of the sites of a MetroCluster IP solution deployed without dedicated switches. In this deployment configuration, the two storage controller nodes (HA pair) at one site are connected back to back for the intracluster traffic, and the MetroCluster IP interfaces are connected to the compliant switches (not shown in the figure). The MetroCluster replication data travels from the controller nodes through the compliant switches and across the intersite links to reach the cluster and storage at the other site.
Figure 2) Intracluster and MetroCluster IP fabric of an AFF A700 MetroCluster IP solution with compliant switches at one site
The compliant switches deployment architecture can use the Nexus switches, which are already part of a typical FlexPod design, to carry both client data and MetroCluster IP storage traffic. Here are the general requirements for deploying a MetroCluster IP solution with compliant switches:
For full information about the supported hardware platforms and the installation and configuration procedures for creating a MetroCluster IP solution, see NetApp Hardware Universe and the ONTAP documentation.
New ONTAP Mediator software is included with ONTAP 9.7 for the MetroCluster IP solution. The ONTAP Mediator enables the solution to perform an automated unplanned switchover (AUSO). The best practice is to deploy the Mediator software at a third site, as shown in Figure 3.
Figure 3) ONTAP Mediator deployed at a third site provides support for AUSO
The ONTAP Mediator also allows the AUSO to be disabled when the two sites encounter a failure in mirroring data between them. Preventing an automatic switchover when the intersite links are down allows the administrator to decide if it is appropriate to switch over.
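The behavior described above can be pictured as a small decision table. The sketch below is a conceptual model only and does not represent the actual ONTAP Mediator algorithm or any ONTAP API. The `Observation` fields, the `auso_decision` function, and the returned strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical snapshot of the two sites as seen from a third-site monitor."""
    site_a_reachable: bool
    site_b_reachable: bool
    mirroring_healthy_before_failure: bool


def auso_decision(obs: Observation) -> str:
    """Return an illustrative action for the observed state."""
    both_up = obs.site_a_reachable and obs.site_b_reachable
    both_down = not obs.site_a_reachable and not obs.site_b_reachable

    if both_up:
        # A mirroring failure with both sites alive (for example, intersite
        # links down) disables the automatic switchover.
        return "no-action" if obs.mirroring_healthy_before_failure else "auso-disabled"
    if both_down or not obs.mirroring_healthy_before_failure:
        # Without a healthy mirror or a surviving site, leave it to the administrator.
        return "manual-decision"
    # Exactly one site is down and the mirror was healthy: switch over automatically.
    return "switchover-to-site-b" if not obs.site_a_reachable else "switchover-to-site-a"


print(auso_decision(Observation(False, True, True)))   # switchover-to-site-b
print(auso_decision(Observation(True, True, False)))   # auso-disabled
```

The point of the model is the asymmetry: a site outage with a healthy mirror is handled automatically, while a mirroring failure hands the decision to the administrator.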
To install the ONTAP Mediator service in a MetroCluster configuration, make sure that the following network requirements are met:
The following two solution architectures illustrate how FlexPod platforms and MetroCluster IP solutions using compliant switches can be implemented together at different scales for sites with various compute, storage, and performance requirements to protect data. These FlexPod MetroCluster IP solutions provide compute resources and storage for each site. Moreover, the MetroCluster IP solutions use the network infrastructure within each site and between sites to synchronously replicate data from one site to the other. The solution architectures ensure no data loss, zero RPO, low RTO, and fast restoration of services to achieve business continuity objectives despite a single-site outage scenario.
For a small site with limited compute and storage requirements, a FlexPod MetroCluster IP solution built with UCS C-Series rack servers, Nexus 9K switches, and AFF A300 storage arrays provides a small but scalable platform.
Figure 4 shows two identical FlexPod configurations, one at each site, that are connected by the intersite links (ISLs). The ONTAP MetroCluster IP sites can be separated by up to 700 km if the latency and other network requirements detailed in the ONTAP documentation are met. There are two types of connections between the Nexus switches and the storage controllers: one carries data traffic, and the other carries MetroCluster data replication between the two ONTAP clusters.
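To get a feel for why distance and latency are linked, the following back-of-the-envelope sketch estimates the propagation delay of light in optical fiber over a given distance. It assumes the common rule of thumb of about 5 microseconds per kilometer of fiber; the constant and function name are illustrative, and the result covers propagation only. Real intersite links add switch and equipment latency, and the fiber route is usually longer than the geographic distance, so the actual limits in the ONTAP documentation remain authoritative.

```python
# Rule-of-thumb assumption: light in optical fiber takes roughly 5 µs per km.
FIBER_DELAY_US_PER_KM = 5.0


def propagation_delay_ms(distance_km, round_trip=False):
    """Estimate fiber propagation delay in milliseconds (propagation only)."""
    one_way_ms = distance_km * FIBER_DELAY_US_PER_KM / 1000.0
    return 2 * one_way_ms if round_trip else one_way_ms


print(propagation_delay_ms(700))                   # 3.5
print(propagation_delay_ms(700, round_trip=True))  # 7.0
```

Because a synchronous write waits for the remote site, every additional kilometer of fiber adds directly to write latency, which is why the supported distance is tied to network requirements.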
Figure 4) FlexPod MetroCluster IP solution architecture for a small site with compliant switches and AFF A300
The FlexPod Datacenter platform offers the scalability and performance needed for a data center that supports many applications and workloads. The FlexPod MetroCluster IP solution architecture example illustrated in Figure 5 takes advantage of the compliant switches deployment architecture and a FlexPod Datacenter configuration. The configuration includes Cisco UCS B-Series and C-Series servers, fourth-generation UCS fabric interconnect 6454, Nexus 9K switches, and NetApp AFF A700 storage controllers at each site.
Figure 5) FlexPod MetroCluster IP solution architecture for a large site with compliant switches and AFF A700
The ONTAP Mediator shown in the examples is deployed at a third site to monitor the ONTAP storage clusters at the two MetroCluster sites. It also provides the AUSO capability to automatically perform a switchover operation when a site experiences an outage so that data services can quickly resume from the storage at the surviving site. To scale and grow the solution, additional servers and SSD shelves can be added as needed. Finally, NetApp recommends routine monitoring and testing of the disaster scenarios to verify that the solution has been properly configured and can survive simulated and real disaster scenarios.
There are many supported FlexPod and ONTAP MetroCluster IP configurations that can be implemented together to create a variety of FlexPod MetroCluster IP solutions. Different companies and different sites will have different solution requirements for compute and storage capacities or performance. By using the example FlexPod MetroCluster IP solution architectures presented in this blog post as a guide, companies can adapt these configurations to meet their requirements.
For a small site, deploying a combination of AFF A300, Nexus 9K switches, and UCS C-Series servers offers a small yet scalable solution architecture. For a large site, deploying the FlexPod Datacenter architecture with the latest-generation fabric interconnect, UCS B-Series and C-Series servers, Nexus 9K switches, and AFF A700 provides the required scale and performance. Over time, additional servers and SSD storage shelves can be added to grow the solution to accommodate additional applications, workloads, and data.
In summary, companies can deploy the NetApp ONTAP MetroCluster IP solution using the simplified compliant switches configuration in a proven FlexPod converged infrastructure and automate the unplanned switchover operation with the ONTAP Mediator. With this configuration, companies can maintain the continued availability of data services and mitigate a site-wide disaster and many other single-point-of-failure scenarios to achieve their business continuity objectives.
For more information about the FlexPod architectures, MetroCluster IP installation and configuration details, and the supported hardware platforms, firmware releases, and other related information, refer to the following websites and documents.
Jyh-shing Chen is a Senior Technical Marketing Engineer with NetApp. His current focus is on Converged Infrastructure solution enablement, validation, and deployment and management simplification with Ansible automation. Jyh-shing joined NetApp in 2006 and previously worked in several other areas, including storage interoperability with Solaris and VMware vSphere operating systems, qualification of ONTAP MetroCluster solutions, and Cloud Volumes data services. Before joining NetApp, Jyh-shing's engineering experience included software and firmware development on a cardiology health imaging system, a mass spectrometer system, and a Fibre Channel virtual tape library, as well as the research and development of microfluidic devices. Jyh-shing holds B.S. and M.S. degrees from National Taiwan University, a PhD from Massachusetts Institute of Technology, and an MBA from Meredith College.