Posts Tagged VMware
Over the past few years there has been no shortage of excellent blog posts detailing how to properly configure resource pools in a vSphere environment. Despite the abundance, quality and availability of this information, resource pools still seem to be the most commonly misconfigured item on every VMware health check I’m involved with. Even though this is well-trodden territory, I wanted to offer my own way of explaining this issue, if for nothing else than to have a place to direct people for information on resource pools.
What follows below is a simple diagram I usually draw on a whiteboard to help explain to customers how resource pools work.
There’s not much to say that the pictures don’t already show. Just remember to keep adjusting your pool share values as new VMs are added to the pool. Also note that while I assigned 8000:4000:2000 to the VMs in the High:Normal:Low pools above, I could just as easily have assigned 8:4:2 to the same VMs and achieved the same results. It’s the ratio between VMs that counts. In either example, a VM in the “High” pool gets twice as many resources under contention as a VM in the “Normal” pool and four times as many as a VM in the “Low” pool.
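The share arithmetic can be sketched in a few lines of Python. This is only an illustration, not anything pulled from vSphere: the VM counts are made up, `pool_shares` and `per_vm_fraction` are hypothetical helper names, and it assumes every VM in a pool is an equal sibling and that all VMs are contending at once.

```python
from fractions import Fraction

# Per-VM share targets from the post (High:Normal:Low).
PER_VM_SHARES = {"High": 8000, "Normal": 4000, "Low": 2000}


def pool_shares(per_vm_shares, vm_counts):
    """Share value each pool needs so every VM in it keeps its per-VM target."""
    return {pool: per_vm_shares[pool] * vm_counts[pool] for pool in per_vm_shares}


def per_vm_fraction(pool_share_values, vm_counts):
    """Fraction of the contended resource that one VM in each pool receives."""
    total = sum(pool_share_values.values())
    return {
        pool: Fraction(pool_share_values[pool], total) / vm_counts[pool]
        for pool in pool_share_values
    }


# Hypothetical VM counts, for illustration only.
vm_counts = {"High": 5, "Normal": 10, "Low": 20}

shares = pool_shares(PER_VM_SHARES, vm_counts)
fractions = per_vm_fraction(shares, vm_counts)

# Scaling every per-VM value by the same factor (8:4:2) changes nothing.
small = pool_shares({"High": 8, "Normal": 4, "Low": 2}, vm_counts)
small_fractions = per_vm_fraction(small, vm_counts)
```

With these counts all three pools end up with the same pool-level value, yet each “High” VM still gets twice the slice of a “Normal” VM. Add a sixth VM to the “High” pool without raising that pool’s value and every VM in it is diluted, which is why the pool values have to be revisited whenever VMs are added.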
Looking for more information on resource pools?
- Understanding Resource Pools in VMware vSphere – Chris Wahl
- Label Resource Pools with Per VM Shares Value – Chris Wahl
- The Resource Pool Priority-Pie Paradox – Duncan Epping
- Shares set on Resource Pools – Duncan Epping
- Custom Shares on a Resource Pool, scripted – Duncan Epping
- Don’t add resource pools for fun, they’re dangerous – Eric Sloof
- Resource pools memory reservations – Frank Denneman
Feel free to send me any other good resource pool links in the comments section and I’ll add them to my list.
VMware has a KB article detailing a bug present in ESXi 5.0 that has been known to cause a variety of networking issues in iSCSI environments. Until last week, I had not encountered this particular bug, so I thought I’d detail my troubleshooting experience for those still on 5.0 who may run into it.
The customer I was working with had originally called for assistance because their storage array was only reporting 2 out of 4 available paths “up” to each connected iSCSI host. All paths had originally been up/active until a recent power outage and since then, no manner of rebooting or disabling/re-enabling had been successful in bringing them all back up simultaneously. Their iSCSI configuration was fairly standard, with 2 iSCSI port groups connected to a single vSwitch per-server and each port group connected to separate iSCSI networks. Each port group in this configuration has a different NIC specified as an “Active Adapter” and the other is placed under the “Unused Adapters” heading.
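The teaming layout described above can be written down as a small sanity check. This is a sketch under assumptions, not a tool that talks to ESXi: the port group and vmnic names are invented, the dictionary layout is my own, and `check_iscsi_teaming` is a hypothetical helper that only encodes the rule from this post (one active adapter per iSCSI port group, the other adapter unused).

```python
# Hypothetical description of one host's iSCSI port groups (names are made up).
port_groups = {
    "iSCSI-A": {"active": ["vmnic2"], "standby": [], "unused": ["vmnic3"]},
    "iSCSI-B": {"active": ["vmnic3"], "standby": [], "unused": ["vmnic2"]},
}


def check_iscsi_teaming(port_groups):
    """Return a list of human-readable problems; an empty list means the layout matches."""
    problems = []
    active_owner = {}  # vmnic -> port group where it is the active adapter
    for name, teaming in port_groups.items():
        if len(teaming["active"]) != 1:
            problems.append(f"{name}: expected exactly one active adapter")
        if teaming["standby"]:
            problems.append(f"{name}: adapters should be unused, not standby")
        for nic in teaming["active"]:
            if nic in active_owner:
                problems.append(f"{name}: {nic} is already active in {active_owner[nic]}")
            else:
                active_owner[nic] = name
    return problems


print(check_iscsi_teaming(port_groups))  # an empty list for the layout above
```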
One of the first things that I wanted to rule out was a hardware issue related to the power outage. However, after a short time troubleshooting, I discovered that simply disabling and re-enabling a NIC port on the iSCSI switches would cause the “downed” paths to become active again within the storage array, while the path that was previously “up” would go down. As expected, a vmkping was never successful through a NIC that was not registering properly on the storage array. Everything appeared to be configured correctly within the array, the switches and the ESXi hosts, so at this point I had no clear culprit and needed to rule out potential causes. Luckily these systems had not been placed into production yet, so I was granted a lot of leeway in my troubleshooting process.
- Test #1. For my first test I wanted to rule out the storage array. I was working with this customer remotely, so I had them unplug the array from the iSCSI switches and plug it into a Linksys switch they had lying around. I then had them plug their laptop into this same switch and assign it an IP address on each of the iSCSI networks. All ping tests to each interface were successful, so I was fairly confident at this point that the array was not the cause of this issue.
- Test #2. For my second test I wanted to rule out the switches. I had the customer plug all array interfaces back into the original iSCSI switches. I then had them unplug a few ESXi hosts from the switches. Then they assigned their laptop the same IP addresses as the unplugged ESXi hosts’ iSCSI port groups and ran additional ping tests from the same ports the ESXi hosts were using. All ping tests on every interface were successful, so it appeared unlikely that the switches were the culprit.
At this point it appeared almost certain that the ESXi hosts were the cause of the problems here. They were the only component that appeared to be having any communication issues, as all other components taken in isolation communicated just fine. It was also evident that something with the NIC failover/failback wasn’t working correctly (given the behavior when we disabled/re-enabled ports), so I put the iSCSI port groups on separate vSwitches. BINGO! Within a few seconds of doing this I could vmkping on all ports and the storage array was showing all ports active again. Given that this is not a required configuration for iSCSI networking on ESXi, I immediately started googling for known bugs. Within a few minutes I ran across this excellent blog post by Josh Townsend and the KB article I linked to above. The bug causes ESXi to actually send traffic down the “unused” NIC during a failover scenario.
This is why my separating the iSCSI port groups “fixed” the issue. There was no unused NIC in the port group for ESXi to mistakenly send the traffic to. It also explained the behavior where disabling/re-enabling a downed port would cause it to become active again (and vice versa). In this case ESXi was sending traffic down the unused port, and my disable/re-enable triggered a failover that caused ESXi to send traffic down the active adapter again.
In my case, upgrading to 5.0 Update 1 completely fixed this issue. I’ll update this post if I run across this problem with any other version of ESXi. In the meantime, note the workaround I described above, which is also outlined in both links.
Both VMware View and Citrix XenDesktop require permissions within vCenter to provision and manage virtual desktops. VMware and Citrix both have documentation on the exact permissions required for this user account. Creating a service account with the minimal amount of permissions necessary, however, can be cumbersome and as a result, many businesses have elected to just create an account with “Administrator” permissions within vCenter. While much easier to create, this configuration will not win you any points with a security auditor.
To make this process a bit easier I’ve created a couple of quick scripts, one for XenDesktop and one for View, that create “roles” with the minimal permissions necessary for each VDI platform. For XenDesktop, the script will create a role called “Citrix XenDesktop” with the privileges specified here. For View, the script will create a role called “VMware View” with the privileges specified on pages 87-88 here. VMware mentions creating three roles in its documentation, but I just created one with all the permissions necessary for View Manager, Composer and local mode. Removing the “local mode” permissions is easy enough in the script if you don’t think you’re going to use it, and the vast majority of View deployments I’ve seen use Composer, so I didn’t see it as necessary to separate that into a different role either. You’ll also note that I used the privilege “Id” instead of “Name”. The problem I ran into there is that “Name” is not unique within privileges (e.g. there is a “Power On” under both “vApp” and “Virtual Machine”) while “Id” is unique. So, for consistency’s sake I just used “Id” to reference every privilege. The only thing that needs to be modified in these scripts is the vCenter IP/hostname entered after “Connect-VIServer”.
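The “Name” versus “Id” problem is easy to see with a toy example. The sketch below is plain Python rather than the PowerCLI used in the scripts, and the name/Id pairs are illustrative: they follow vCenter’s dotted naming style, but treat them as assumptions and check them against your own vCenter. `ambiguous_names` is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative (name, id) pairs; verify the Ids against your own vCenter.
PRIVILEGES = [
    ("Power On", "VirtualMachine.Interact.PowerOn"),
    ("Power On", "VApp.PowerOn"),
    ("Power Off", "VirtualMachine.Interact.PowerOff"),
    ("Power Off", "VApp.PowerOff"),
]


def ambiguous_names(privileges):
    """Return the display names that map to more than one privilege Id."""
    ids_by_name = defaultdict(set)
    for name, priv_id in privileges:
        ids_by_name[name].add(priv_id)
    return sorted(name for name, ids in ids_by_name.items() if len(ids) > 1)


all_ids = [priv_id for _, priv_id in PRIVILEGES]

print(ambiguous_names(PRIVILEGES))       # names that cannot identify a privilege alone
print(len(set(all_ids)) == len(all_ids))  # True when every Id is unique
```

Looking a privilege up by name here would silently match two entries, which is exactly the kind of surprise you don’t want when building a least-privilege role.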
Of course, these scripts could be expanded to automate more tasks, such as creating a user account and giving access to specific folders or clusters, etc., but I will let all the PowerCLI gurus out there handle that. Really, the only goal of these scripts is to automate the particular task that most people skip due to its tedious nature. Feel free to download, critique and expand as necessary.
After reading a bevy of excellent articles on multi-hypervisor datacenters, I thought I’d put pen to paper with my own thoughts on the subject. This article by Joe Onisick will serve as a primer to this discussion, not only because it was recently written, but because it does an excellent job of fairly laying out the arguments on both sides of the issue. The article mentions three justifications organizations often use for deploying multiple hypervisors in their datacenter: 1) cost, 2) leverage and 3) lock-in avoidance. I am in complete agreement that 2 and 3 are poor reasons to deploy multiple hypervisors; my disagreement on #1 is what I’d like to discuss in this post.
The discussion on the validity of multi-hypervisor environments has been going on for several years now. Steve Kaplan wrote an excellent article on this subject back in 2010 that mentions the ongoing debate at that time, and discussions on this subject pre-date even that post. The recent acquisition of DynamicOps by VMware has made this a popular topic again and a slew of articles have been written covering the subject. Most of these articles seem to agree on a few things. First, whether or not it is what’s best for them, multi-hypervisor environments are increasing across organizations and service providers. Second, cost is usually the deciding factor in deploying multiple hypervisors, but this is not a good reason because you’ll spend more money managing the environment and training your engineers than you saved on the cost of the alternative hypervisor. Third, deploying multiple hypervisors in this way doesn’t allow you to move to a truly “private cloud” infrastructure. You now have two hypervisors and need two DR plans, two different deployment methods and two different management models. Let’s take each of these arguments against cost in turn and see how they hold up.
OpEx outweighs CapEx
As alluded to above, there’s really no denying that an organization can save money buying alternative hypervisors that are cheaper than VMware ESXi. But do those cost savings outweigh potential increases in operational expenditures now that you’re managing two separate hypervisors? As the article by Onisick I linked to above suggests, this will vary from organization to organization. I’d like to suggest, however, that the increase in OpEx cited by many other sources as a reason to abandon multi-hypervisor deployments is often greatly exaggerated. Frequently cited is the increase in training costs: you have two hypervisors, so now you have to send your people to two different training classes. I don’t necessarily see that as the case. If you’ve been trained and have a good grasp of the ESXi hypervisor, learning and administering the nuances and feature sets of another hypervisor is really not that difficult, and formal training may not be necessary. Understanding the core mechanisms of what a hypervisor is and how it works will go a long way in allowing you to manage multiple hypervisors. And even if you did have to send your people to a one-time training class, is it really all that likely that the cost of the class will outweigh the ongoing hypervisor cost savings? If it does, then you probably aren’t saving enough money to justify multiple hypervisors in the first place. Doing a quick search, I’ve found week-long XenServer training available for $5,000. Evaluate your people, do the math and figure out the cost savings in your scenario. Just don’t rule out multi-hypervisor environments thinking training costs will necessarily be astronomical or even essential for all of your employees.
Similar to the OpEx discussion, another argument often presented against the cost-saving benefits of multi-hypervisor environments is that they are harder to administer, as you have to come up with separate management strategies for VMs residing on the different hypervisors. Managing things in two separate ways, it is argued, moves away from the type of Private Cloud infrastructure most organizations should strive for. The main problem with this argument is that it assumes you would manage all of your VMs the same way if they were on the same hypervisor. This is clearly false. A couple of clear examples of this are XenApp and VDI. The way you manage these types of environments, deploy VMs, or plan DR is often vastly different from the way you would handle the rest of your server infrastructure. And so, if there is a significant cost savings, it is these types of environments that are often good targets for alternate hypervisors. They are good candidates not only because they are managed differently, regardless of hypervisor, but because they often don’t require many of the advanced features only ESXi provides.
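“Do the math” can be as simple as a break-even calculation. The sketch below uses the $5,000 class price mentioned above; the head count and the monthly savings figure are placeholders I made up, and `months_to_break_even` is a hypothetical helper, so plug in your own numbers.

```python
from math import ceil


def months_to_break_even(training_cost_per_person, people, monthly_savings):
    """Months of savings needed to pay back a one-time training spend.

    Returns None when there are no ongoing savings to pay it back with.
    """
    if monthly_savings <= 0:
        return None
    return ceil(training_cost_per_person * people / monthly_savings)


# $5,000 per class is from the post; 3 admins and $2,500/month are placeholders.
months = months_to_break_even(5000, 3, 2500)
print(months)
```

If the answer comes back as “years” rather than “months”, that says more about the size of the savings than about the cost of the training.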
I’m in complete agreement that having test/dev and production on separate hypervisors is a bad idea. Testing things on a different platform than they run in production is never good. But if you can save significant amounts of money by moving some of these systems that are managed in ways unique to your environment onto an alternate hypervisor, I’m all for it. This may not be the best solution for every organization (or even most), but like all things, should be evaluated carefully before ruling it out or adopting it.
Which is better, Citrix XenDesktop or VMware View? XenServer or ESXi? HDX or PCoIP? While the answers to these questions are debated on numerous blogs, at tech conferences and in marketing literature, what is explored far less often is how Citrix and VMware technologies can actually work together. What follows is a brief overview of some different ways that these technologies can be combined, forming integrated virtual infrastructures.
1) Application and Desktop delivery with VMware View and XenApp
Many organizations deploying VMware View already have existing Citrix XenApp infrastructures in place. The View and XenApp infrastructures are usually managed by separate teams and not integrated to the degree they could be. Pictured above are some possible ways these two technologies can integrate. As you can see, there are many different options in terms of application delivery with both environments. The most obvious is publishing applications from XenApp to your View desktops. This can reduce the resource consumption on individual desktops and also provides the added benefit of accessing those same applications outside your View environment, since you can publish directly to remote endpoints as well. Existing Citrix infrastructures may also be utilizing Citrix application streaming technology. By simply installing some Citrix clients on your View desktops, applications can be streamed directly to View desktops, directly to endpoints, or even to XenApp servers and then published to View desktops or endpoints. Another option is to integrate ThinApp into this environment. Tina de Benedictis had a good write-up on this a while back. The options for this are similar to Citrix streaming. You can stream to a XenApp server and then publish the application from there, stream directly to your View desktops or stream directly to endpoints. As shown in the above picture, both Citrix Streaming and ThinApp can be used within the same environment. This might be an option if you’ve already packaged many of your applications with Citrix but either want to migrate to ThinApp over time or package and stream certain applications that Citrix streaming cannot (e.g. Internet Explorer). Whatever options you choose, it’s clear that both technologies can work together to form a very robust application and desktop delivery infrastructure.
2) Load Balancing VMware infrastructures with Citrix Netscaler
Some good articles have been written about this option as well. In fact, this option is becoming popular enough that VMware even has a KB dedicated to ensuring the correct configuration of Citrix Netscalers in View environments. VMware View and VMware vCloud Director have redundant components that should be load balanced for best performance and high availability. If you have either of these products and are using Citrix Netscaler to proxy HDX connections or load balance Citrix components or other portions of your infrastructure, why not use it for VMware as well? Pictured above is a high-level overview of load balancing some internal-facing View Connection servers. Users connect to a VIP defined on the Netscalers (1), which directs them to the least busy View Connection server (2), which then connects them to the appropriate desktop based on user entitlement (3). After the initial connection process, the user connects directly to their desktop over PCoIP.
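Step (2) above, picking the least busy Connection server, can be modelled conceptually like this. It is not how a Netscaler is configured or implemented; the server names and session counts are invented, and `ConnectionServer` and `pick_least_busy` are hypothetical names used only to illustrate a least-connections style decision that skips unhealthy servers.

```python
from dataclasses import dataclass


@dataclass
class ConnectionServer:
    name: str
    active_sessions: int
    healthy: bool = True


def pick_least_busy(servers):
    """Choose the healthy server with the fewest active sessions."""
    candidates = [s for s in servers if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy View Connection server available")
    return min(candidates, key=lambda s: s.active_sessions)


# Made-up pool sitting behind the VIP.
pool = [
    ConnectionServer("view-cs-01", active_sessions=120),
    ConnectionServer("view-cs-02", active_sessions=45, healthy=False),
    ConnectionServer("view-cs-03", active_sessions=80),
]

chosen = pick_least_busy(pool)
print(chosen.name)
```

Note that the server with the fewest sessions is skipped because it is marked unhealthy, which is the other half of why you put a load balancer in front of redundant components in the first place.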
3) XenApp and XenDesktop on ESXi
Running XenApp and XenDesktop on top of ESXi is actually an extremely popular combination, and the reasons are numerous and varied. You can have 32-host clusters (only 16 in XenServer and 8 with VMware View on ESXi), Storage vMotion and Storage DRS (XenServer doesn’t have these features and you can’t use them with VMware View), memory overcommitment (only ESXi has legitimate overcommit technology), Storage I/O Control, Network I/O Control, Multi-NIC vMotion, Auto Deploy, and many more features that you can only get from the ESXi hypervisor. Using XenApp and XenDesktop on top of ESXi gets you the most robust combination of hypervisor and application and desktop virtualization technology possible.
4) XenApp as a connection broker for VMware View
This option intrigues me from an architectural point of view, but I have yet to see it utilized in a production environment. With this option you would publish your View Client from a XenApp server. Users would connect to the XenApp server with HDX/ICA over external connections or the WAN, and the View Client on that XenApp server would then connect to the View desktop on the LAN over PCoIP. There may well be flaws in this method, but I can think of a couple of benefits off-hand. First, HDX generally performs better over high-latency connections, so there could be a user experience boost. Second, VMware View uses a “Security Server” to proxy external PCoIP connections. The Security Server software just resides on a Windows server OS; a hardened security appliance like Netscaler would be more secure. I’d be interested to see how things like printing and USB redirection would work in such an environment, but for me, it’s definitely something I’d like to explore more.
So, those are a few of the possibilities for integrating VMware and Citrix technologies. What are some other combinations you can think of? Any other benefits or flaws in the above-mentioned methods?
Those familiar with VMware certification exams will have experience studying for those exams with the excellent exam blueprints that accompany each test. I took the VCP5-DT (VMware View 5) test several weeks ago and used its exam blueprint to study from. While filling out the blueprint for my own study purposes, I thought it might be a useful tool for others, so I went ahead and filled out most of the rest of the blueprint too. I did, however, leave out certain portions for various reasons: a) the meaning of the particular section was unclear, b) portions of the blueprint were redundant or c) certain sections can only be known through real-world experience (e.g. troubleshooting). Despite these omissions, there is quite a bit of content here (30 pages). I got most of it from the resources listed in the exam blueprint and even copied and pasted tables as necessary. I did add my own commentary in several places where I felt the listed resources did not go far enough in their explanation.
Download the blueprint study guide here.