Posts Tagged Memory Overcommitment
The statement “there is no technical benefit to memory overcommitment” is usually met with universal scorn, righteous indignation and a healthy dose of boisterous laughter. On second glance, however, the statement is not quite so absurd. Overcommitting memory does not make your VMs faster, it doesn’t make them more highly available and it doesn’t make them more crash-resistant. So what is memory overcommitment good for? Its sole goal is to let you run more VMs per host. That saves you money. The benefit memory overcommitment provides, then, is a financial one.
Are there other ways to attain this financial benefit?
Memory overcommitment is one of the main reasons people choose the ESX hypervisor. If the goal of memory overcommitment is to save money and there are other ways to attain these cost savings on other hypervisors without utilizing memory overcommitment, does that change the hypervisor landscape at all? Before delving into that question, let’s first see if there is a way to save as much money without using memory overcommitment.
One way around this I’ve heard suggested is simply to increase the memory in your hosts. So if you had an ESX host with 10GB of memory and were 40% overcommitted, you could use XenServer or Hyper-V with the same number of VMs, but with 14GB of memory per host. That doesn’t seem like a fair comparison to me, since you could also add 4GB more to your ESX host and achieve even greater cost savings. However, you can only add so much memory before becoming CPU-bound, right? I’m not referring to CPU utilization but to the number of vCPUs you can overcommit before running into high CPU Ready times. Let’s use my earlier example. You have 14 1 vCPU/1GB VMs on a 4 pCPU/10GB ESX host. You want to put more VMs on each host, so you increase your host memory to 20GB and try running 28 1 vCPU/1GB VMs. That’s now twice as many vCPUs against the same number of pCPUs, and let’s say your CPU Ready times are around 5%-10%. Adding more VMs to this host, regardless of how many memory slots you have free, would adversely impact performance, so you have a ceiling of around 28 VMs per host.
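The sizing logic above can be sketched as a quick calculation. This is a hypothetical back-of-the-envelope helper, not any vendor’s formula; the 7:1 vCPU:pCPU ceiling is simply inferred from the example (28 single-vCPU VMs on 4 pCPUs), so treat it as an assumption:

```python
# Toy sizing sketch: the VM-per-host ceiling is whichever runs out first,
# CPU overcommit headroom or memory. The 7:1 vCPU:pCPU ratio is an
# assumption taken from the example in the text, not a general rule.

def vm_ceiling(pcpus, host_mem_gb, vcpus_per_vm=1, mem_per_vm_gb=1,
               max_vcpu_ratio=7.0):
    cpu_limit = int(pcpus * max_vcpu_ratio / vcpus_per_vm)   # CPU Ready ceiling
    mem_limit = int(host_mem_gb / mem_per_vm_gb)             # memory ceiling
    return min(cpu_limit, mem_limit)

# With 4 pCPUs, adding memory beyond ~28GB buys nothing for 1GB VMs:
print(vm_ceiling(4, 20))   # memory-bound
print(vm_ceiling(4, 30))   # CPU-bound at 28
```

Under these assumptions, the 20GB host is still memory-bound at 20 VMs, while the 30GB host hits the CPU Ready ceiling at 28 — which is the point of sizing the non-overcommitting host at roughly 30GB.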
Knowing this number, couldn’t you then size your hosts at 4 CPUs and 30GB of RAM on XenServer or Hyper-V and save just as much money as with ESX? And this is only one way to recoup the financial benefit overcommitment provides. If you already have Windows or Citrix products, you may already own these hypervisors from a licensing perspective, and it might not save you money to go out and buy another one. Also, some hypervisors (like XenServer) are licensed by server count rather than socket count (like ESX), so you could potentially save a lot of money by using them. In any of these cases, an in-depth analysis of your specific environment will have to be done to ensure you’re making the most cost-effective decision.
Of course, memory overcommitment is not the only reason you choose a hypervisor; there are many other factors to consider. But given this discussion, should memory overcommitment be one of them? I think once you realize what memory overcommitment is really good for, it becomes less of a factor in your overall decision-making process. Does this realization change the hypervisor landscape at all? As I mentioned earlier, memory overcommitment is a major selling point for ESX. If you can attain the financial benefit of memory overcommitment without overcommitting memory, then I think this does take a bite out of the VMware marketing machine. That said, ESX is still the #1 hypervisor out there in my opinion and I would recommend it for the vast majority of workloads, just not necessarily because of memory overcommitment. There is room for other hypervisors in the datacenter, and once people realize what memory overcommitment “is” and what it “isn’t” and really start analyzing the financial impact of their hypervisor choices, I think you’ll see some of these other hypervisors grabbing more market share.
Thoughts? Rebuttals? Counterpoints?
With the release of XenServer 5.6, Citrix included memory overcommitment as a feature of its hypervisor. Memory overcommitment has long been the staple of the ESX hypervisor so I thought I’d do an overview of the memory reclamation techniques employed by each hypervisor to determine if the field of play has been leveled when it comes to this feature.
Transparent Page Sharing (TPS) is essentially virtual machine memory de-duplication. A service running on each ESX host scans the contents of guest-physical memory and identifies candidate pages for de-duplication. Candidates are then compared bit by bit to ensure the pages really are identical. On a successful match, the guest-physical to host-physical mapping is changed to point at the shared host-physical page, and the redundant copies are reclaimed by the host. If a VM attempts to write to a shared host-physical page, a “copy-on-write” procedure is performed: as the name implies, the hypervisor makes a private copy of that page and re-maps the VM’s guest-physical memory to the new copy before the write actually happens.
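The scan/compare/remap cycle can be illustrated with a small sketch. This is a toy model, not VMware’s implementation: the class, hashing scheme and page representation are all invented for illustration, but the shape — hash to find candidates, confirm bit for bit, remap duplicates, copy on write — follows the description above:

```python
# Toy Transparent Page Sharing model (illustrative names only).
import hashlib

class PageSharer:
    def __init__(self):
        self.shared = {}    # content hash -> the one shared copy of a page
        self.mapping = {}   # (vm, guest_page) -> backing page bytes

    def scan(self, vm, guest_page, content):
        h = hashlib.sha256(content).hexdigest()
        # Hash match is only a candidate; confirm byte for byte.
        if h in self.shared and self.shared[h] == content:
            self.mapping[(vm, guest_page)] = self.shared[h]  # remap to shared copy
        else:
            self.shared[h] = content
            self.mapping[(vm, guest_page)] = content

    def write(self, vm, guest_page, new_content):
        # Copy-on-write: the VM gets a private copy before the write lands.
        self.mapping[(vm, guest_page)] = bytes(new_content)
```

After two VMs scan identical pages, both mappings point at a single backing copy; the first write breaks the sharing for that VM only.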
One interesting characteristic of this feature is that, unlike all the other memory reclamation techniques, it’s always on. Even well before you run into any low-memory conditions, the hypervisor is de-duplicating memory for you. This is the hallmark memory overcommitment technique for VMware: with it, you can power on more VMs before contention starts to occur. All the other techniques only kick in when available memory is running low or after there is actual contention.
Ballooning is achieved via a device driver (the balloon driver) included by default with every installation of VMware Tools. When the hypervisor is running low on memory, it sets a balloon target and the balloon driver “inflates”, creating artificial memory pressure within the VM and causing the guest operating system to either pin memory pages or push them to its page file. Pinned pages are pages the operating system has identified as “free” or no longer in use; the driver pins them so they can’t be paged out, and their backing host-physical memory can then be reclaimed. If any of these pages is accessed again, the host simply treats it like any other VM memory allocation and allocates a new page for the VM. If the host is running particularly low on memory, the balloon driver may need to inflate even further, causing the guest OS to start paging memory.
If TPS and ballooning can’t free up enough memory, memory compression kicks in. Basically, any page that can be compressed by at least 50% is compressed and placed in a “compression cache” located within VM memory; the next time the page is accessed, it is decompressed. Memory compression was introduced in ESX 4.1, and without this feature in place these pages would have been swapped out to disk. Of course, decompressing a page that’s still in RAM is much faster than reading that page back from disk!
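The 50% rule is easy to demonstrate with a real compressor. A minimal sketch, with zlib standing in for ESX’s compression algorithm and a plain dict standing in for the compression cache (both are assumptions for illustration):

```python
# Sketch of the 50% compression rule: a page stays in the in-RAM compression
# cache only if it shrinks by at least half; otherwise it would fall through
# to host swapping. zlib and the dict cache are stand-ins, not ESX internals.
import zlib

PAGE_SIZE = 4096

def try_compress(page, cache, page_id):
    compressed = zlib.compress(page)
    if len(compressed) <= len(page) // 2:   # at least 50% savings
        cache[page_id] = compressed
        return True                          # kept in RAM, compressed
    return False                             # candidate for swapping instead

def read_page(cache, page_id):
    # Next access pays a decompression cost -- still far cheaper than disk.
    return zlib.decompress(cache[page_id])
```

A page of repeated bytes compresses easily and stays cached; a page of random bytes won’t hit the 50% threshold and would have gone to the swap path.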
Memory swapping – the last resort! When memory contention is really bad and TPS, ballooning and compression haven’t freed up enough memory, the hypervisor starts actively swapping memory pages out to disk. With ballooning, the guest OS decided what was pinned and what was swapped to disk; what makes hypervisor swapping so potentially devastating to performance is that the hypervisor has no insight into which pages are the “best” ones to swap. At this point memory needs to be reclaimed fast, so the hypervisor may very well swap out active pages, and accessing those pages from disk is going to cause a noticeable performance hit. Needless to say, you want to size your environment so that swapping is rare.
As of version 5.6, XenServer has a mechanism that allows for memory overcommitment on XenServer hosts. XenServer Dynamic Memory Control (DMC) works by proportionally adjusting the memory available to running virtual machines based on pre-defined minimum and maximum memory values. The span between the dynamic minimum and dynamic maximum is known as the Dynamic Memory Range (DMR). The dynamic maximum represents the most memory available to your VM, and the dynamic minimum represents the least memory your VM could be left with when there is memory contention on the host. Running VMs always run at the dynamic maximum until there is contention on the host; when that happens, the hypervisor proportionally inflates a balloon driver in each VM where you’ve configured DMC until it has reclaimed enough memory to run effectively. DMC can be thought of, then, as a configurable balloon driver.
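The proportional adjustment can be sketched as follows. To be clear, this is my reading of “proportionally” — every DMC-enabled VM gives up the same fraction of its Dynamic Memory Range, never dropping below its dynamic minimum — and the exact XenServer algorithm may differ:

```python
# Toy DMC model (assumed behavior, not Citrix's actual algorithm): when the
# host needs to reclaim memory, each VM's balloon target moves the same
# fraction of the way from its dynamic maximum toward its dynamic minimum.

def dmc_targets(vms, host_shortfall_mb):
    # vms: list of (dmin_mb, dmax_mb) pairs; all VMs assumed running at dmax.
    total_range = sum(dmax - dmin for dmin, dmax in vms)
    if total_range == 0:
        return [dmax for _, dmax in vms]
    fraction = min(1.0, host_shortfall_mb / total_range)   # cap at dmin floor
    return [dmax - fraction * (dmax - dmin) for dmin, dmax in vms]

# Two VMs, DMRs of 512MB and 1024MB; a 768MB shortfall is half the total
# range, so each VM gives up half of its own range:
print(dmc_targets([(512, 1024), (1024, 2048)], 768))
```

Note how the VM with the larger DMR absorbs more of the shortfall in absolute terms — that is what makes the adjustment proportional rather than equal.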
As the diagram clearly shows, ESX employs a much broader and more diverse set of mechanisms to achieve memory overcommitment than XenServer. So while it’s technically possible to overcommit memory on both ESX and XenServer, I think it’s clear that ESX is the hypervisor of choice where memory overcommitment is concerned. The “Virtual Reality” blog over at VMware recently posted an interesting article about this that also compares Hyper-V; you can read it here. For further reading I’d recommend Xen.org’s article on DMC, the XenServer admin guide or the GenerationV blog for XenServer DMC. For ESX, Hypervizor.com has many excellent VMware-related diagrams, and their diagram on VMware memory management is no exception. In addition, the link I directed you to earlier is also very informative and contains other links for further reading.