I typically recommend that administrators establish a policy for using VMware Update Manager to patch and update their ESXi hosts, and I frequently help them write one. The policy tends to vary greatly from one environment to the next. Sometimes it even varies from one ESXi cluster to the next within a single environment, because it depends on many factors. Several of my customers are required to install new operating system patches (including ESXi patches) within 14 days of their release. Several of my larger customers have one or more clusters dedicated to development and test, where they are free to install and test new patches immediately without concern about impacting production services. Other customers have only three or four ESXi hosts, all of which run critical VMs. Some customers patch aggressively because they fear vulnerabilities. Others have little interest in patching and ask, “If it works, why risk breaking it?” Some customers seldom patch, except immediately after installing an update or performing an upgrade.
Typically, my goal is to help the customer create a patch policy that suits them well and to help them develop the specific procedure for implementing it. Here is a sample of a policy and procedure that I recently helped develop for a customer. The customer uses two vSphere clusters to run an application whose SLA requires availability of 99.99% or better. The application uses active and passive sets of virtual machines. The active set of VMs runs in Cluster A and the passive set runs in Cluster B. The administrators can instantly fail the application over from Cluster A to Cluster B using a simple user interface provided by the application. They view vSphere simply as a solid, resilient platform for running this application. They make very few changes to the environment, because they are concerned that any change may disrupt the application or introduce new risk. Each cluster is composed of multiple blades spread across multiple blade chassis.
In this particular use case, we developed the following policy and procedure:
- Policy: Plan to patch once per quarter, and install only those missing Critical patches that are at least 30 days old. On the first day, apply the new patches to a single ESXi host in Cluster B. On the second day, apply them to a second host in the same chassis. On the third day, apply them to the remaining hosts in that chassis. On the fourth day, apply them to the remaining hosts in Cluster B. On the fifth day, apply them to all the hosts in one chassis in Cluster A. On the final day, apply them to the remaining hosts in Cluster A.
- Download all available patches from VMware’s website and manually copy the zip file to a location that is accessible from the vCenter Server.
- Use the Import Patches link on the Update Manager configuration tab to import all patches from the zip file.
- Create a new Dynamic baseline. Set the Severity to Critical, select On or Before, and set the Release Date to the date that is 30 days prior to the current date.
- Attach the Baseline to Cluster B and Scan the entire cluster for compliance with the baseline.
- Select one non-compliant ESXi host to patch first. Select Enter Maintenance Mode on that host.
- Edit the DRS Settings in the Cluster and change the Automation Level to Manual.
- Remediate the host to install the missing patches.
- Restart the host. Examine its Events and logs and verify no issues exist.
- Migrate a single, non-critical VM to the host. Test various administration functions, such as console interaction, power on, and vMotion.
- Select the cluster and the DRS tab. Use the Run DRS link to generate recommendations immediately. If any recommendations appear, use the Apply Recommendations button to start the migrations.
- Following the order and schedule established in the policy, continue patching the remaining hosts in Cluster B.
- After all hosts in Cluster B are patched, change the DRS Automation Level back to Fully Automated.
- Update Cluster A by repeating the previous steps, beginning with attaching the baseline to Cluster A.
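
The two pieces of arithmetic in this policy, the 30-day baseline cutoff and the six-day staged rollout, can be sketched in a few lines of Python. This is only an illustration of the logic under stated assumptions: it does not call Update Manager or any VMware API, the names `Patch`, `baseline_cutoff`, `matches_baseline`, and `rollout_plan` are hypothetical, and it assumes each cluster is described as a list of chassis, with the first chassis in Cluster B holding at least two hosts.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Patch:
    """Hypothetical record for one imported patch (not a VMware object)."""
    name: str
    severity: str      # for example "Critical" or "Important"
    released: date


def baseline_cutoff(today: date, min_age_days: int = 30) -> date:
    """Date to enter as the Release Date with the On or Before option."""
    return today - timedelta(days=min_age_days)


def matches_baseline(patch: Patch, cutoff: date) -> bool:
    """True if the patch is Critical and was released on or before the cutoff."""
    return patch.severity == "Critical" and patch.released <= cutoff


def rollout_plan(cluster_b, cluster_a):
    """Return (day, hosts) stages in the order given by the policy.

    Each cluster is a list of chassis; each chassis is a list of host names.
    Assumes the first chassis of Cluster B contains at least two hosts.
    """
    first_b, *rest_b = cluster_b
    first_a, *rest_a = cluster_a
    stages = [
        [first_b[0]],                                # day 1: one host in Cluster B
        [first_b[1]],                                # day 2: second host, same chassis
        list(first_b[2:]),                           # day 3: rest of that chassis
        [h for chassis in rest_b for h in chassis],  # day 4: rest of Cluster B
        list(first_a),                               # day 5: one chassis in Cluster A
        [h for chassis in rest_a for h in chassis],  # day 6: rest of Cluster A
    ]
    return list(enumerate(stages, start=1))


if __name__ == "__main__":
    # Host names below are made up for illustration.
    print(baseline_cutoff(date(2024, 3, 31)))  # 2024-03-01
    for day, hosts in rollout_plan(
        [["b1", "b2", "b3"], ["b4", "b5"]],
        [["a1", "a2"], ["a3"]],
    ):
        print(f"Day {day}: {', '.join(hosts)}")
```

A sketch like this can be handy for generating the change-control schedule each quarter, but the actual baseline is still created and attached in Update Manager as described in the steps above.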