Eric Sloof’s quick guide to using the VMware VIX API with vCenter
Eric Sloof from NTPro.NL has posted an excellent short video showing how easy it is to create a VB application to do some simple operations on vSphere virtual machines.
Online Training – Automating vSphere with the VIX API from Eric Sloof NTPRO.NL on Vimeo.
I can’t wait to try this out, although I think I’m going to have to do a little Visual Basic study first.
Linked Clones
We’ve started our first “proper” implementation of Linked Clones in our vSphere 4 environment. While we’ve done some limited proof-of-concept work, this is the first project to be entirely deployed using Linked Clones. The objective is to reduce the space used by our training machines on our new environment.
Linked clones allow multiple machines to share a common read-only “base” VMDK file, with each machine generating its own delta (REDO). Under normal usage, the REDO would continue to grow throughout the life of the machine; however, as our machines have non-persistent hard drives, they reset to a clean state when powered down. This makes our environment ideally suited to taking advantage of the functionality offered by Linked Clones. They can either be created manually (by moving and renaming files on the datastore) or via the APIs. You can get more information on them in this White Paper from VMware.
Our training machines are functionally identical to our production machines, and similarly consist of three types – Capture, Packaging and Verification. These are 11, 8, and 8 GB respectively. The usage patterns are slightly different: unlike “live” projects, which have a steady stream of work, trainees tend to arrive in large batches. This means that the training environment either needs to be continuously large but mostly idle, or it needs to be regularly redeployed then stripped back.
Linked Clones gave us roughly the same ratio of space saving in this project as in our proof-of-concepts, but as the number of machines involved was greater, the differences are more pronounced. This is also the first time we’ve exceeded 8 machines sharing the same VMDK, which is a notable milestone as it is only possible if we limit the number of hosts that the machines can run on (there is a VMFS limitation of 8 hosts accessing a VMDK concurrently). As we have DRS enabled, this meant reducing the number of hosts in each cluster to 8 or fewer.
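As a rough way of checking the host limit described above, the following PowerCLI sketch lists any cluster with more than 8 hosts. It assumes you are already connected with Connect-VIServer; the variable names and the constant are illustrative rather than part of the deployment script.

```powershell
# Sketch only: assumes an existing Connect-VIServer session
# Maximum number of hosts that should share a linked-clone base VMDK (from the VMFS limit described above)
$intMaxHosts = 8

ForEach ($objCluster in Get-Cluster) {
    # Wrap in @() so that a cluster with a single host still returns a countable array
    $intHosts = @($objCluster | Get-VMHost).Count
    If ($intHosts -gt $intMaxHosts) {
        Write-Host "Cluster" $objCluster.Name "has" $intHosts "hosts, which is more than" $intMaxHosts
    }
}
```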
We deployed twenty-five machines of each of the three builds used in a project. All were Windows XP virtual machines.
Each of the three machines being used as parents had the slack space on its drives cleaned using SDelete, and was then converted to Thin Provisioned using Storage vMotion. It was switched off, and a snapshot was created. This snapshot forms the base for the parent’s clones.
The machines were deployed using a script similar to the one at the bottom of this post, and it took just over an hour to deploy and customize all 75 machines. This was considerably faster than deploying 75 machines using the normal “Deploy from template” method.
Here are the data:-
- Estimated space used if deployed traditionally: 675 GB
  - Capture: 11 GB per machine
  - Packaging: 8 GB per machine
  - Verification: 8 GB per machine
- Estimated space if deployed as Thin Provisioned, but not Linked Clones: 238.75 GB
  - Capture: 3.34 GB per machine
  - Packaging: 3.72 GB per machine
  - Verification: 2.49 GB per machine
- Space used in current configuration: 19.3 GB
  - Capture: 3.34 GB for parent, plus 0.13 GB per Linked Clone
  - Packaging: 3.72 GB for parent, plus 0.13 GB per Linked Clone
  - Verification: 2.49 GB for parent, plus 0.13 GB per Linked Clone
And in graph-format, for extra impact:-
All size estimates are based on the machines in a powered-down state. When powered on, a swap file (equal to the size of the assigned RAM) is created, and (assuming the machines are non-persistent) REDO files are created on all types of machines.
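For anyone wanting to check the arithmetic, this short PowerShell sketch reproduces the three totals from the per-machine figures quoted above (25 machines of each build, powered down). The variable names are my own and only exist for this illustration.

```powershell
# Per-machine sizes in GB, taken from the figures quoted in this post
$arrBuilds = @(
    @{ Name = "Capture";      Thick = 11.0; Thin = 3.34 },
    @{ Name = "Packaging";    Thick = 8.0;  Thin = 3.72 },
    @{ Name = "Verification"; Thick = 8.0;  Thin = 2.49 }
)
$intClonesPerBuild = 25     # machines deployed per build
$dblDeltaPerClone  = 0.13   # GB used by each powered-down linked clone

$dblThickTotal  = 0.0
$dblThinTotal   = 0.0
$dblLinkedTotal = 0.0
ForEach ($objBuild in $arrBuilds) {
    # Traditional deployment: every machine carries a full-size disk
    $dblThickTotal  += $objBuild.Thick * $intClonesPerBuild
    # Thin provisioned: every machine carries its own thin disk
    $dblThinTotal   += $objBuild.Thin * $intClonesPerBuild
    # Linked clones: one thin parent per build, plus a small delta per clone
    $dblLinkedTotal += $objBuild.Thin + ($dblDeltaPerClone * $intClonesPerBuild)
}

# Round to two decimal places to hide floating-point noise
$dblThickTotal  = [math]::Round($dblThickTotal, 2)
$dblThinTotal   = [math]::Round($dblThinTotal, 2)
$dblLinkedTotal = [math]::Round($dblLinkedTotal, 2)

Write-Host "Traditional:      $dblThickTotal GB"
Write-Host "Thin provisioned: $dblThinTotal GB"
Write-Host "Linked clones:    $dblLinkedTotal GB"
```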
I’ve been on a few of the machines and they don’t appear to suffer from any noticeable performance degradation, although the true test won’t come until we get considerable concurrent use.
I’m tentatively declaring this a huge success. Rather than the training environment using 240 GB between training engagements, it’s now down to a svelte 20 GB, with no reduction in functionality.
Below is a script similar to the one I used to deploy the linked clones. The actual “meat”, which deploys the machines, was based on Hal Rottenberg’s New-LinkedClone.ps1 script. As far as possible, I’ve tried to strip out stuff that’s specific to our environment (we use the Custom Attributes as an asset management database and to track which machines were deployed from which templates). There’s probably going to be stuff in there that doesn’t make much sense, but if you’ve got a bit of an understanding of PowerShell, you should be able to cut and keep the bits you want.
# Script to deploy linked clones
# List of custom attributes which you're wanting to copy from the template or parent to the newly created machine
# (Machines deployed from templates no longer inherit CAs in vSphere 4.0)
# These help us track provenance, and provide information to the user
$arrStrAttributesToCopy = @(
"AD Object Location",
"Customisation",
"Infrastructure Consultant",
"Logon Administrator Name",
"Logon Administrator Password",
"Logon User Name",
"Logon User Password",
"Mobilisation Consultant",
"Project",
"Role"
)
# Name of the Custom Attribute on the parent which contains the name of the customisation to use
$CustomFieldName = "Customisation"
Function DeployLinkedClone ($strSourceVM, $intToBeDeployed, $intStartDeployingAtNumber, $CustomFieldName){
# Bases the name of the machine on the second part of the string split by spaces. This assumes that the template follows the standard naming convention of "Tmpl [Name] x.x"
$strMachinePrefix = ($strSourceVM.split(' ')[1])
$objVM = Get-VM $strSourceVM
$viewVM = $objVM | Get-View
$objCustomization = Get-OSCustomizationSpec ($objVM.CustomFields.Item($CustomFieldName))
# Warn if the parent machine has a non-persistent HD (the parent should be persistent)
If ($objVM | Get-HardDisk | Where-Object {$_.Persistence -like "IndependentNonPersistent"}){
Write-Host "$strSourceVM has a non-persistent HD!"
}
# If the customisation, as specified in the parent's custom attribute does not exist, then quit.
If (!$objCustomization){
Write-Host "Customisation" ($objVM.CustomFields.Item($CustomFieldName)) "not found. Exiting."
# Return leaves the function cleanly (Break is only meant for use inside a loop)
Return
}
$i = 1
Do {
# Convert the single digit integer (i.e., "1") into a double digit (i.e., "01")
$strMachineNumber = ("{0:0#}" -f $intStartDeployingAtNumber)
# Concatenate the machine name prefix (from the template name) with the double-digit integer, which is incrememted on each loop
$strMachineBeingDeployed = $strMachinePrefix+$strMachineNumber
# Check that the machine doesn't already exist
If ((Get-VM -Name $strMachineBeingDeployed -ErrorAction SilentlyContinue)){
Write-Host "Machine $strMachineBeingDeployed already exists!"
Break
}
# Let the user know what's going on
Write-Host ""
Write-Host "Deploying new linked-clone " -NoNewline
Write-Host $strMachineBeingDeployed -ForegroundColor Blue -NoNewline
Write-Host ", from template " -NoNewline
Write-Host $strSourceVM -ForegroundColor Blue -NoNewline
Write-Host ", using customisation " -NoNewline
Write-Host $objCustomization -ForegroundColor Blue -NoNewline
Write-Host ", on the same Host as the parent" -NoNewline
Write-Host ""
# Create the new machine using all these variables
$objFolder = $viewVM.parent
$specClone = New-Object Vmware.Vim.VirtualMachineCloneSpec
# Get the most recent snapshot attached to the machine
$specClone.Snapshot = $viewVM.Snapshot.CurrentSnapshot
# Create an object to represent the location of the clone
$specClone.Location = New-Object Vmware.Vim.VirtualMachineRelocateSpec
# This is the move-type that specifies the new disk backing (which is the bit that makes a linked clone)
$specClone.Location.DiskMoveType = "createNewChildDiskBacking"
# Run the task with the specified parameters
$task = $viewVM.CloneVM_Task($objFolder, $strMachineBeingDeployed, $specClone)
Get-VIObjectByVIView $task | Wait-Task | Out-Null
# Get the object for the machine which was just deployed
$objTargetVM = Get-VM $strMachineBeingDeployed
# Apply the customisation specification to the newly created clone
Set-VM -VM $objTargetVM -OSCustomizationSpec $objCustomization -Confirm:$false
# Start the clone
Start-VM -VM $objTargetVM
# Get the view (needed for writing custom attributes)
$viewTarget = $objTargetVM | Get-View
# Loop through each of the custom attributes which are to be copied
ForEach ($arrStrAttributeToCopy in $arrStrAttributesToCopy){
# Read the attribute from the source template
$objAttribute = $objVM.CustomFields.Item($arrStrAttributeToCopy)
# Apply the attribute to the machine object
$viewTarget.setCustomValue($arrStrAttributeToCopy,$objAttribute)
}
# Set the "Template" custom attribute to the name of the parent
$arrStrAttributeToCopy = "Template"
$viewTarget.setCustomValue($arrStrAttributeToCopy,$strSourceVM)
# Increment the number used for naming the machines
$intStartDeployingAtNumber++
# Increment the number used to count the number of machines deployed
$i++
}
# Continue to loop while the number of machines deployed is less than the number required
While ($i -le $intToBeDeployed)
}
# Get the current time (for timing how long the script took to run)
$dteStart = Get-Date
# Name of source VM, should be persistent, should have a snapshot and the customisation specified in the nominated custom attribute
$strSourceVM = "Tmpl Capture 1.0"
# Number to be deployed
$intToBeDeployed = 25
# Number to start deploying from
$intStartDeployingAtNumber = 1
DeployLinkedClone $strSourceVM $intToBeDeployed $intStartDeployingAtNumber $CustomFieldName
# Name of source VM, should be persistent, should have a snapshot and the customisation specified in the nominated custom attribute
$strSourceVM = "Tmpl Packaging 1.0"
# Number to be deployed
$intToBeDeployed = 25
# Number to start deploying from
$intStartDeployingAtNumber = 1
DeployLinkedClone $strSourceVM $intToBeDeployed $intStartDeployingAtNumber $CustomFieldName
# Name of source VM, should be persistent, should have a snapshot and the customisation specified in the nominated custom attribute
$strSourceVM = "Tmpl Verification 1.0"
# Number to be deployed
$intToBeDeployed = 25
# Number to start deploying from
$intStartDeployingAtNumber = 1
DeployLinkedClone $strSourceVM $intToBeDeployed $intStartDeployingAtNumber $CustomFieldName
$dteEnd = Get-Date
$dteDiff = New-TimeSpan $dteStart $dteEnd
$timeTaken = [math]::round($dteDiff.totalMinutes, 2)
Write-Host ""
Write-Host "It took" $timeTaken "minutes for these machines to deploy"
# End of script
Happy New Year
Happy New Year to everyone!
I passed the VCP on vSphere 4 just before Christmas with 463/500. At the time, the deadline for taking the exam without having to attend the What’s New course was 31st December 2009, but that’s now been extended to 31st January. It was nice to have it out of the way before Christmas though.
The exam seemed no more difficult than the VCP on VI 3; I think they even reused a few of the questions. I found the following useful:-
- Simon Long’s VCP vSphere 4 Practice Exams
- vSphere 4 Configuration Maximums
- Barry Coombs’ VMware vSphere cue-cards
Those sites also contain numerous links to other resources, so I’m sure you’ll find something which will suit your revision style.
Next on my to-do list is the Microsoft 70-431 Microsoft SQL Server 2005 – Implementation and Maintenance.
vSphere bug with DRS, StandBy and non-persistent hard drives
We’ve been in touch with VMware recently about an issue we were experiencing in vSphere 4, where machines in standby could not be powered on. VMware have now confirmed that this is a bug, and that there will be a fix in R2.
While it’s fairly specific to our use-case, I thought I’d share the details in case anyone else runs into this.
First of all, this bug will only affect you if the following conditions are met:
- You are using VMware vSphere 4.0 (or 4.0 Update 1)
- Guest OS power-saving settings cause the virtual machine to enter standby
- One or more of the guest’s hard drives are set to “independent non-persistent”
- DRS is enabled on the virtual machine’s cluster
The machine enters standby as normal. The issue arises when you try to power the virtual machine back on: if DRS has allocated the machine to another host based on load, the machine will not resume, and gives an error similar to the following:-
“Virtual Machine is configured to use a device that prevents the operation: Device ‘Hard disk 1’ is disk which is not in persistent mode. Device ‘Hard disk 1’ which is not in persistent mode”.
You cannot manually migrate the machine (even back to the original host). You cannot change the power-state on the machine, edit the virtual machine settings, or delete the machine.
If this has happened to you, the only way we’ve found to get the machine back up-and-running seems to be to remove the machine from inventory, then create a new virtual machine with the same specifications, and add the old machine’s VMDK.
Fortunately, there are a couple of workarounds. You can either disable power-saving settings in the guest OS or change the guest power management settings from “Suspend the virtual machine” to “Put the guest OS into standby mode and leave the virtual machine powered on” (you can automate this as described in my previous post).
Be aware that changing the guest power-management settings has a side effect: when the guest enters standby, vSphere shows the machine as “powered-on” but VMware Tools is not running, which can cause problems (e.g., when trying to gracefully shut down a batch of machines).
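To get a feel for how many machines might be exposed, a PowerCLI sketch along the following lines lists virtual machines that have at least one independent non-persistent disk and sit in a DRS-enabled cluster. It assumes an existing Connect-VIServer session, does not check the guest’s own power-saving settings, and is an illustration rather than anything VMware supplied.

```powershell
# Sketch only: assumes an existing Connect-VIServer session
# Only clusters with DRS enabled meet the conditions listed above
ForEach ($objCluster in (Get-Cluster | Where-Object { $_.DrsEnabled })) {
    ForEach ($objVM in ($objCluster | Get-VM)) {
        # Look for any disk set to independent non-persistent
        $objNonPersistentDisks = $objVM | Get-HardDisk | Where-Object { $_.Persistence -like "IndependentNonPersistent" }
        If ($objNonPersistentDisks) {
            Write-Host $objVM.Name "in cluster" $objCluster.Name "has a non-persistent disk"
        }
    }
}
```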
This was also my first time working with the VMware vSphere support and I was impressed. They quickly replicated the problem and confirmed that it was indeed a bug. As most people nowadays tend to use snapshots rather than non-persistent drives, and few users virtualise desktop operating systems (which are more likely to have power-saving settings on by default) I can understand why this particular set of circumstances went untested.
Changing StandByAction using PowerShell script created with help from Onyx
We’re currently having some issues caused by the convergence of vSphere 4.0, IndependentNonPersistent drives, StandBy and DRS (I’ll post more on that later). As a workaround, we needed to modify 228 machines so that they were not suspended when the guest entered standby. You can do this through the vSphere Client by right-clicking the virtual machine, clicking Edit Settings, going to the Options tab, selecting Power Management, and changing the radio button. We wanted to change from “Suspend the virtual machine” to “Put the guest OS into standby mode and leave the virtual machine powered on”.
To do this, the machines need to be powered down. We had an imminent maintenance window, but it wouldn’t allow us the time to make this change manually (even if we wanted to), so some automation was necessary. Unfortunately I had no idea how to go about editing this setting using the PowerCLI, even after a little search through the VMware PowerCLI community.
This seemed like the perfect opportunity to try out Project Onyx.
Carter Shanklin’s video does a good job of explaining how to get Onyx up and running, and it worked exactly as described (even on my Windows 7 machine).
- Download the Onyx files and extract to a folder
- Run the Onyx executable
- Click the Connect button, and connect to your VirtualCenter server.
- Once that’s launched, start vSphere client, but instead of connecting to your VirtualCenter server, connect to http://localhost:1545 (Carter actually says 1445 in the video, but you can see on screen that he’s using 1545). Use your normal credentials.
- Ignore the warning about unencrypted traffic (as Carter explains, the unencrypted traffic is local-only, the network traffic is still encrypted)
- Click the Start button on Onyx
- In vSphere client make whatever changes it is that you’re wanting to record.
- Click the Pause button on Onyx, and you’ll see in the window a script has been created.
- Copy this into your favourite PowerShell editor, and modify until it’s suitable for your purposes.
The original capture from the Onyx Window
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.changeVersion = "2009-11-27T09:16:04.570821Z"
$spec.powerOpInfo = New-Object VMware.Vim.VirtualMachineDefaultPowerOpInfo
$spec.powerOpInfo.defaultPowerOffType = "soft"
$spec.powerOpInfo.defaultSuspendType = "hard"
$spec.powerOpInfo.defaultResetType = "soft"
$spec.powerOpInfo.standbyAction = "checkpoint"
$_this = Get-View -Id 'VirtualMachine-vm-1074'
$_this.ReconfigVM_Task($spec)
A second capture changing the setting back to isolate the exact line that makes the changes
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.changeVersion = "2009-11-27T09:16:33.872017Z"
$spec.powerOpInfo = New-Object VMware.Vim.VirtualMachineDefaultPowerOpInfo
$spec.powerOpInfo.defaultPowerOffType = "soft"
$spec.powerOpInfo.defaultSuspendType = "hard"
$spec.powerOpInfo.defaultResetType = "soft"
$spec.powerOpInfo.standbyAction = "powerOnSuspend"
$_this = Get-View -Id 'VirtualMachine-vm-1074'
$_this.ReconfigVM_Task($spec)
And a finished script, which will run against all machines in a specified blue folder. Comment/uncomment one of the $specVM.powerOpInfo.standbyAction lines to choose which option you want.
$objVMs = Get-Folder "Folder Name" | Get-VM
ForEach ($objVM in $objVMs){
$specVM = New-Object VMware.Vim.VirtualMachineConfigSpec
$specVM.powerOpInfo = New-Object VMware.Vim.VirtualMachineDefaultPowerOpInfo
$specVM.powerOpInfo.standbyAction = "checkpoint" # Put the guest OS into StandBy Mode and leave the Virtual Machine powered On
#$specVM.powerOpInfo.standbyAction = "powerOnSuspend" # Suspend the Virtual Machine
$viewVM = Get-View -Id $objVM.Id
$viewVM.ReconfigVM_Task($specVM)
}
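After running a change like this, it can be reassuring to read the setting back. The sketch below reports the current standbyAction for each machine in a folder by reading Config.DefaultPowerOps from the view object. It assumes an existing Connect-VIServer session, and "Folder Name" is a placeholder, as in the script above.

```powershell
# Sketch only: assumes an existing Connect-VIServer session; "Folder Name" is a placeholder
ForEach ($objVM in (Get-Folder "Folder Name" | Get-VM)) {
    # The view object exposes the machine's current default power operations
    $viewVM = Get-View -Id $objVM.Id
    $strStandbyAction = $viewVM.Config.DefaultPowerOps.StandbyAction
    Write-Host $objVM.Name "standbyAction is" $strStandbyAction
}
```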
I was actually surprised at how easy this was; and I think it’s going to make me a bit more adventurous with what I attempt to do via the PowerCLI.
vSphere 4.0 Update 1 Released
VMware have released Update 1 for vSphere 4.0.
The following enhancements have been made to ESX (from the release notes):-
VMware View 4.0 support – This release adds support for VMware View 4.0, a solution built specifically for delivering desktops as a managed service from the protocol to the platform.
Windows 7 and Windows 2008 R2 support – This release adds support for 32-bit and 64-bit versions of Windows 7 as well as 64-bit Windows 2008 R2 as guest OS platforms. In addition, the vSphere Client is now supported and can be installed on a Windows 7 platform. For a complete list of supported guest operating systems with this release, see the VMware Compatibility Guide.
Enhanced Clustering Support for Microsoft Windows – Microsoft Cluster Server (MSCS) for Windows 2000 and 2003 and Windows Server 2008 Failover Clustering is now supported on a VMware High Availability (HA) and Dynamic Resource Scheduler (DRS) cluster in a limited configuration. HA and DRS functionality can be effectively disabled for individual MSCS virtual machines as opposed to disabling HA and DRS on the entire ESX/ESXi host. Refer to the Setup for Failover Clustering and Microsoft Cluster Service guide for additional configuration guidelines.
Enhanced VMware Paravirtualized SCSI Support – Support for boot disk devices attached to a Paravirtualized SCSI ( PVSCSI) adapter has been added for Windows 2003 and 2008 guest operating systems. Floppy disk images are also available containing the driver for use during the Windows installation by selecting F6 to install additional drivers during setup. Floppy images can be found in the /vmimages/floppies/ folder.
Improved vNetwork Distributed Switch Performance – Several performance and usability issues have been resolved resulting in the following:
- Improved performance when making configuration changes to a vNetwork Distributed Switch (vDS) instance when the ESX/ESXi host is under a heavy load
- Improved performance when adding or removing an ESX/ESXi host to or from a vDS instance
Increase in vCPU per Core Limit – The limit on vCPUs per core has been increased from 20 to 25. This change raises the supported limit only. It does not include any additional performance optimizations. Raising the limit allows users more flexibility to configure systems based on specific workloads and to get the most advantage from increasingly faster processors. The achievable number of vCPUs per core depends on the workload and specifics of the hardware. For more information see the Performance Best Practices for VMware vSphere 4.0 guide.
Enablement of Intel Xeon Processor 3400 Series – Support for the Xeon processor 3400 series has been added. For a complete list of supported third party hardware and devices, see the VMware Compatibility Guide.
vCenter 4.0 has also been updated, and now has full compatibility with Windows 7 x86 and x64 versions, which saves the various hacks that were previously necessary to get it working.
Also, the PowerCLI has been updated, and can be found here. There are 68 new cmdlets, which Alan Renouf does a great job of explaining. I’m especially looking forward to trying out Get/Set-CustomAttribute (no more manipulation of the View object), Move-Template (no more converting templates to machines), and Get/Set-VMQuestion (for those times when the datastores run out of space for the REDO files necessitated by Non-Persistent disks).
I’m looking forward to investigating the new PowerCLI functionality, and I’m also looking forward to not needing to manually customise the dozen or so Windows 7 guests I’m deploying next week!
Alpha build of Project Onyx
Carter Shanklin has announced that VMware have released an Alpha build of the long-anticipated Project Onyx. This is a script recorder for vSphere Client, which is designed to allow scripting of things which are awkward or difficult to achieve using the VMware PowerCLI APIs alone.
I’m downloading this at the moment, although I don’t expect I’ll have time to look at it for a while.
Using SDelete to maximise the amount of disk space reclaimed during conversion to thin-provisioned disks
We’re neck-deep in migration at the moment, but despite the workload, it’s always worth considering what we can do now that might save us some time and effort later on.
One of the reasons we were moving to vSphere was the ability to thin-provision (TP) our disks, which we’re hoping will allow us to increase the number of machines that we can provision without needing to allocate more storage (currently 18 TB). I found an article by Duncan Epping over at Yellow Bricks suggesting the use of the Sysinternals SDelete utility before the conversion to TP.
Essentially, deleted files are not zeroed in Windows, and VMware looks at the raw disk when “deallocating” space during conversion to thin-provisioned disks, so deleted data are not reclaimed. Running SDelete in the Windows guest before converting the disk to thin-provisioned format zeroes the deleted data and should allow the maximum amount of space to be reclaimed.
While I don’t doubt that this is all correct, I wasn’t sure how much extra space it would allow us to reclaim. The majority of our guests are relatively small windows clients, almost all of which have non-persistent hard drives; of course – once they’ve been made non-persistent, the drive is effectively frozen, and subsequent use won’t increase the amount of non-zeroed slack space.
What’s the best way to see whether this is worthwhile? Run an experiment of course!
Method
I took one of our standard Windows XP guests which had been migrated to the new vSphere infrastructure. It had a 10 GB hard drive, currently persistent, but which had been non-persistent for the majority of its 9-month existence. I examined how much disk space it was using; this was the pre-TP “Control”. I then cloned it without customization. One of the clones was converted to TP during a Storage vMotion operation. The other had its slack space zeroed using SDelete (this process took around 3 minutes). It was then converted to Thin Provisioned disk format using Storage vMotion in the same way as the first machine.
Results
Here’s what happened:
Normal (Thick) Disk
- Provisioned Storage: 10.50 GB
- Not-shared Storage: 10.00 GB
- Used Storage: 10.00 GB
Converted to Thin Provisioned
- Provisioned Storage: 10.50 GB
- Not-shared Storage: 5.05 GB
- Used Storage: 5.05 GB
Converted to Thin Provisioned, after running SDelete
- Provisioned Storage: 10.50 GB
- Not-shared Storage: 4.21 GB
- Used Storage: 4.21 GB
Conclusion
Zeroing slack space on this typical machine saved me 0.84 GB (8.4%). For the minimal effort involved, I think this is worthwhile (I have about 500 more machines almost exactly the same as this).
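The arithmetic behind that conclusion can be sketched in a few lines of PowerShell. The three sizes are the measured figures above; the extrapolation to 500 machines is only an estimate, and assumes every machine would behave like this one (the variable names are mine).

```powershell
# Measured figures from the results above, in GB
$dblThickGB            = 10.00
$dblThinGB             = 5.05
$dblThinAfterSDeleteGB = 4.21

# Extra space reclaimed by zeroing slack space before the conversion
$dblSavedGB = [math]::Round($dblThinGB - $dblThinAfterSDeleteGB, 2)

# The saving expressed against the original 10 GB disk, and against the thin disk without SDelete
$dblPercentOfDisk = [math]::Round(($dblSavedGB / $dblThickGB) * 100, 1)
$dblPercentOfThin = [math]::Round(($dblSavedGB / $dblThinGB) * 100, 1)

# Rough extrapolation: assumes all 500 machines reclaim the same amount as this one
$intMachines    = 500
$dblEstimatedGB = [math]::Round($dblSavedGB * $intMachines, 2)

Write-Host "Saved per machine: $dblSavedGB GB ($dblPercentOfDisk% of the disk, $dblPercentOfThin% of the thin size)"
Write-Host "Estimated saving across $intMachines machines: $dblEstimatedGB GB"
```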
The percentage of free space reclaimed would likely be higher on persistent machines, or larger machines which see frequent creation and deletion of files.
PowerShell script to add a hash table full of virtual port groups to vSphere hosts
As part of the migration I’m working on, we needed to add a whole bunch of Virtual Port Groups with associated VLANs to the servers. The following script could do this in a few minutes (although Host Profiles would accomplish much the same thing, we’re not running Enterprise)
# Sets up virtual port groups on all hosts connected to a specific vCenter Server
# Name of vCenter Server
$strVCenterServer = "your.vCenter.Server"
# VLANs and associated VPGs
$ArrVLANs = @{
    "123" = "Admin";
    "456" = "GPO";
    "789" = "NAG";
}
# Connect to the vCenter Server
Connect-VIserver $strVCenterServer
# Loop through the VLAN/VPG pairs
ForEach($objVLAN in ($ArrVLANs.Keys | Sort-Object)){
    # Loop through the hosts
    ForEach ($objHost in (Get-VMHost | Sort-Object)){
        # Create the VPG with the VLAN as specified in the hash table above, on the switch called "VMSwitch" on the current host
        # Remove the "-WhatIf" tag from the end of the following line to "arm" the script
        New-VirtualPortGroup -Name $ArrVLANs[$objVLAN] -VirtualSwitch (Get-Virtualswitch -VMHost $objHost | Where-Object { $_.Name -match "VMswitch" }) -VLanId $objVLAN -WhatIf
        # Write what we've just done to screen
        Write-Host "Adding Virtual Port Group" $ArrVLANs[$objVLAN] "with VLAN Tag" $objVLAN "to" $objHost
    }
}
# Disconnect the session from the host
Disconnect-VIServer -Confirm:$False
Although this isn’t a complicated script, it was the first time I’ve used hash tables (thanks to PowerShell.Com’s excellent page), so I thought I’d share.
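For anyone else new to hash tables, here is a minimal stand-alone sketch of the pattern the script relies on: loop over the keys, then use each key to look up its value. It needs no vCenter connection, and the VLAN numbers and names are the same placeholders as above. One thing to note is that the keys are strings, so a plain Sort-Object would sort them as text; the sketch casts them to integers so that they sort numerically.

```powershell
# VLAN IDs (keys) and their port group names (values); placeholder data
$ArrVLANs = @{
    "123" = "Admin";
    "456" = "GPO";
    "789" = "NAG"
}

# Collect one line of output per VLAN/port group pair
$arrLines = @()
# Cast each key to an integer for sorting, so that "1000" would sort after "789"
ForEach ($strVLAN in ($ArrVLANs.Keys | Sort-Object { [int]$_ })) {
    # Look up the value that belongs to the current key
    $strPortGroup = $ArrVLANs[$strVLAN]
    $arrLines += "VLAN $strVLAN maps to port group $strPortGroup"
}

$arrLines | ForEach-Object { Write-Host $_ }
```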
Understanding Memory Resource Management in VMware ESX Server
VMware have published an excellent white paper on memory management [pdf].
The document is technically detailed, but makes interesting reading. The authors do a good job of describing the methods ESX uses to manage and allocate virtual memory, and of explaining why memory that a guest deallocates is not necessarily freed up for reuse by other guests. On its own, this would prevent you from allocating more memory to guests than is physically available on the host (overcommitting); however, the hypervisor uses three memory reclamation strategies which allow overcommitment:-
- Transparent Page Sharing (TPS). Where ESX detects that multiple guests are using identical memory pages (such as those used by common OS components), it presents one shared copy to the guests. By default, this is active all the time. If the guests need to write to the memory, a copy needs to be made, which incurs a slight performance penalty.
- Ballooning. VMware Tools installs a balloon driver in the guest. When the hypervisor needs memory, the driver “inflates”, prompting the guest operating system to give up memory it isn’t actively using, which the hypervisor can then reclaim. This typically occurs when ESX drops to less than 4% free memory (the Soft threshold). It has more of an overhead than TPS, but is still preferable to the alternative.
- Hypervisor swapping. This is used as a last-resort when TPS or Ballooning cannot provide enough memory (or cannot provide it quickly enough). Swapping tends to affect the guest more than the other two methods.
In the unlikely event that Hypervisor Swapping is unable to provide enough memory to meet the requirement, the hypervisor blocks the execution of all virtual machines which are consuming more memory than their target allocation.
The whitepaper details the results of various benchmarks to evaluate the performance overhead of each of the reclamation strategies. While I’d certainly heard that the performance impact of TPS was negligible, I had always been slightly sceptical, but the data provided by VMware would appear to back it up.
The whitepaper also includes some best practices for memory management, some of which have had me thinking about our memory allocation strategy:-
- Do not disable page sharing or the balloon driver. These two techniques are enabled by default in ESX4 and I can’t imagine that anyone would disable them unless they had specific reason to. It’s also another reason to make sure you have VMware tools installed on all your guests.
- Carefully specify the memory limit and memory reservation. Our environment is pretty fluid, with a large number of small guests with 10-15% of them being . For this to be useful for us, these values would need to be constantly checked and reconfigured.
- Host memory size should be larger than guest memory usage. I generally try to limit our hosts to a 20% potential overcommit on RAM allocated to guests, and as there are usually only about 80% of our machines switched on at any one time, the hosts are normally pretty comfortable. However, this conservative approach means our RAM allocations need to be carefully managed, and kept as low as possible; which may be hitting us elsewhere (see below)
- Use shares to adjust relative priorities when memory is overcommitted. Our environment is pretty unique in that the vast majority of the machines have equal priority, so there’s little need for us to add another management overhead.
- Set appropriate Virtual Machine memory size. The virtual machine memory size should be a little larger than the average used by the guest. I think this is an area we need to look at in our environment. Our default RAM allocations for guests are probably a little on the low side, for historical reasons and because our current hosts are configured with a rather paltry 16 GB of physical RAM (a problem that will be resolved in the next couple of months). We may be keeping our memory usage in check, but the resultant disk-swapping might be stressing our storage infrastructure.
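To put some numbers on the overcommit rule of thumb mentioned above, here is a small PowerShell sketch using the figures from this post: a 16 GB host, a 20% potential overcommit, and roughly 80% of machines switched on at once. It is a back-of-envelope calculation only; it assumes the powered-on machines are a representative mix and ignores hypervisor overhead and page sharing.

```powershell
# Figures from this post; the calculation itself is only a rough illustration
$dblHostRamGB        = 16.0   # physical RAM in a current host
$dblOvercommitFactor = 1.2    # 20% potential overcommit on RAM allocated to guests
$dblFractionOn       = 0.8    # roughly 80% of machines are switched on at any one time

# Maximum RAM that can be allocated to guests on one host under the 20% rule
$dblMaxAllocatedGB = [math]::Round($dblHostRamGB * $dblOvercommitFactor, 2)

# RAM allocated to the guests that are actually powered on
$dblTypicalActiveGB = [math]::Round($dblMaxAllocatedGB * $dblFractionOn, 2)

Write-Host "Maximum allocated to guests: $dblMaxAllocatedGB GB"
Write-Host "Allocated to powered-on guests: $dblTypicalActiveGB GB of $dblHostRamGB GB physical"
```

On these assumptions, the RAM allocated to powered-on guests stays just under the physical RAM in the host, which fits with the hosts normally being comfortable.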
The white paper is definitely worth reading; it’s certainly going to help me plan a memory management strategy for implementing our infrastructure on the new hardware.

