Book: Mastering VMware® Infrastructure 3
Understanding HA
The VMware HA feature automatically restarts the virtual machines that were running on an ESX Server host at the time it became unavailable, as shown in Figure 10.25.
Figure 10.25 VMware HA provides an automatic restart of virtual machines that were running on an ESX Server host when it failed.
With VMware HA, a server failure still causes a period of downtime. The duration of that downtime cannot be calculated ahead of time, because it is not known how long a series of virtual machines will take to boot. It follows that, at this point in time, VMware HA does not provide the same level of high availability as a Microsoft server cluster solution. When virtual machines fail over between ESX Server hosts as a result of the HA feature, there is also potential for data loss, because each virtual machine was powered off abruptly when the server failed and is brought back up minutes later on another server.
HA: Within, but Not Between, Sites
A requirement of HA is that each node in the HA cluster must have access to the same SAN LUNs. This requirement prevents HA from failing over between ESX Server hosts in different locations unless both locations have been configured with access to the same storage devices. It is not enough for the LUNs to hold the same data by way of SAN-replication software. Mirroring data from a LUN on a SAN in one location to a LUN on a SAN in a hot site does not satisfy the shared-storage requirement for HA (or for VMotion or DRS).
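The shared-storage requirement can be illustrated with a small Python sketch. It is not a VMware tool: the host names, the LUN identifiers, and both function names are hypothetical, and the inventory is assumed to have been collected by some other means. The point is only that the check compares which LUNs each host can see, not whether two different LUNs happen to hold the same data.

```python
from typing import Dict, Set


def luns_missing_per_host(visible: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, for each host, the LUNs seen by some other host but not by it."""
    all_luns = set().union(*visible.values())
    return {
        host: all_luns - luns
        for host, luns in visible.items()
        if all_luns - luns
    }


def cluster_shares_storage(visible: Dict[str, Set[str]]) -> bool:
    """True when every host sees exactly the same set of LUNs."""
    return not luns_missing_per_host(visible)


# Hypothetical inventory: host name -> LUN identifiers the host can see.
inventory = {
    "esx01.example.com": {"lun-10", "lun-11"},
    "esx02.example.com": {"lun-10", "lun-11"},
    "esx03.example.com": {"lun-10"},  # cannot see lun-11
}
```

With this inventory, `cluster_shares_storage(inventory)` is false, and `luns_missing_per_host(inventory)` points at `esx03.example.com` as the host that lacks access to `lun-11`.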
In the VMware HA scenario, two or more ESX Server hosts are configured in a cluster. Remember, a VMware cluster represents a logical aggregation of CPU and memory resources, as shown in Figure 10.26. By editing the cluster settings, the VMware HA feature can be enabled for a cluster. The HA cluster then determines the number of host failures it must support.
Figure 10.26 A VMware ESX Server cluster logically aggregates the CPU and memory resources from all nodes in the cluster.
When ESX Server hosts are configured into a VMware HA cluster, they receive all the cluster information. VirtualCenter informs each node in the HA cluster about the cluster configuration.
HA and VirtualCenter
While VirtualCenter is most certainly required to enable and manage VMware HA, it is not required to execute HA. VirtualCenter is a tool that notifies each VMware HA-cluster node about the HA configuration. Once the nodes have been updated with the information about the cluster, VirtualCenter no longer maintains a persistent connection with each node. Each node continues to function as a member of the HA cluster independent of its communication status with VirtualCenter.
When an ESX Server host is added to a VMware HA cluster, a set of HA-specific components is installed on the ESX Server. These components, shown in Figure 10.27, include:
• Automatic Availability Manager (AAM)
• VMap
• vpxa
Figure 10.27 Adding an ESX Server host to an HA cluster automatically installs the AAM, VMap, and possibly the vpxa components on the host.
The AAM, effectively the engine for HA, is a Legato-based component that keeps an internal database of the other nodes in the cluster. The AAM is responsible for the intracluster heartbeat used to identify available and unavailable nodes. Each node in the cluster establishes a heartbeat with each of the other nodes over the Service Console network. As a best practice, you should provide redundancy to the AAM heartbeat by establishing the Service Console port group on a virtual switch with an underlying NIC team. Though the Service Console could be multihomed and have an AAM heartbeat over two different networks, this configuration is not as reliable as the NIC team. The AAM is extremely sensitive to hostname resolution; the inability to resolve names will most certainly result in an inability to execute HA. When problems arise with HA functionality, look first at hostname resolution. With that in mind, during HA troubleshooting you should answer questions like:
• Is the DNS server configuration correct?
• Is the DNS server available?
• If DNS is on a remote subnet, is the default gateway correct and functional?
• Does the /etc/hosts file have bad entries in it?
• Does the /etc/resolv.conf have the right search suffix?
• Does the /etc/resolv.conf have the right DNS server?
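Some of the questions above can be checked with a short script. The following Python sketch is an illustration only, not a VMware utility. It assumes a machine with Python available that uses the same resolver settings as the hosts being checked, and all function names and host names are hypothetical. It reads the `nameserver` and `search` entries from resolv.conf-style text and reports which cluster hostnames fail to resolve.

```python
import socket
from typing import Callable, Dict, Iterable, List, Optional


def parse_resolv_conf(text: str) -> Dict[str, List[str]]:
    """Collect the nameserver and search entries from resolv.conf-style text."""
    result: Dict[str, List[str]] = {"nameserver": [], "search": []}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] in result:
            result[parts[0]].extend(parts[1:])
    return result


def resolve(name: str) -> Optional[str]:
    """Resolve a hostname to an IPv4 address, or return None on failure."""
    try:
        return socket.gethostbyname(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def unresolved_hosts(
    names: Iterable[str],
    resolver: Callable[[str], Optional[str]] = resolve,
) -> List[str]:
    """Return the names the resolver could not turn into an address."""
    return [name for name in names if resolver(name) is None]
```

A call such as `unresolved_hosts(["esx01.example.com", "esx02.example.com"])` would then list any cluster node that cannot be resolved by name. The `resolver` parameter exists so the check can be exercised without a live DNS server.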
Adding a Host to VirtualCenter
When a new host is added to the VirtualCenter inventory, it must be added by its hostname or HA will not function properly. As just noted, HA is heavily reliant on successful name resolution. ESX Server hosts should not be added to the VirtualCenter inventory using IP addresses.
The AAM on each ESX Server host keeps an internal database of the other hosts belonging to the cluster. Every host in a cluster is considered either a primary host or a secondary host. However, only one ESX Server in the cluster is considered the primary host at a given time, with all others considered secondary hosts. The primary host functions as the source of information for all new hosts and defaults to the first host added to the cluster. If the primary host fails, the HA cluster continues to function: one of the secondary hosts is promoted to the status of primary host. The process of promoting secondary hosts to primary is limited to four other hosts, so only five hosts in total can assume the role of primary host in an HA cluster.
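The role handling described above can be sketched as a toy model in Python. This is not how the AAM is implemented. The class and its names are hypothetical, and the rule for choosing which secondary is promoted (the earliest surviving host in join order) is an assumption, since the text does not say how the candidate is picked. The sketch only encodes three statements from the text: the first host added becomes primary, a secondary is promoted when the primary fails, and at most five hosts ever hold the primary role.

```python
from typing import List, Optional, Set

MAX_PRIMARY_ROLES = 5  # the first primary plus four possible promotions


class HaCluster:
    """Toy model of primary/secondary roles; not VMware's implementation."""

    def __init__(self) -> None:
        self.hosts: List[str] = []            # hosts in the order they joined
        self.failed: Set[str] = set()         # hosts marked as failed
        self.primary_history: List[str] = []  # every host that has been primary

    @property
    def primary(self) -> Optional[str]:
        """Current primary, or None if the last primary has failed."""
        if self.primary_history and self.primary_history[-1] not in self.failed:
            return self.primary_history[-1]
        return None

    def add_host(self, name: str) -> None:
        self.hosts.append(name)
        if not self.primary_history:  # first host added defaults to primary
            self.primary_history.append(name)

    def fail_host(self, name: str) -> None:
        self.failed.add(name)
        if self.primary is None:
            self._promote()

    def _promote(self) -> None:
        """Promote a surviving secondary, unless five hosts have already served."""
        if len(self.primary_history) >= MAX_PRIMARY_ROLES:
            return
        for host in self.hosts:  # assumption: pick by join order
            if host not in self.failed and host not in self.primary_history:
                self.primary_history.append(host)
                return
```

In this model, failing a secondary host leaves the primary unchanged, while failing the primary promotes the next surviving host until the limit of five is reached.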
While the AAM is busy managing the internode communications, the vpxa service manages the HA components. The vpxa service communicates with the AAM through a third component called the VMap.