Posts Tagged VMware
There was a good discussion at VMworld this year between persistent and non-persistent VDI proponents. The debate grew out of discussions on Twitter surrounding a blog post by Andre Leibovici entitled “Open letter to non-persistent VDI fanboys…”. Representing the persistent side of the debate were Andre Leibovici and Shawn Bass. Non-persistent fanboys were represented by Jason Langone and Jason Mattox. Overall, this is a good discussion with both sides pointing out some strengths and weaknesses of each position.
So which is the better VDI management model, persistent or non-persistent? Personally, I think Andre nailed it near the end of the debate: it’s all about use case! I know that’s the typical IT answer to most questions, but it really is the best answer in many of these “best tech” debates. What matters to most customers is not which is the “best” but which is the “right fit”. A Ferrari may be the best car in the world, but it’s clearly not the right fit for a family of four on a budget. So while it may be fun and entertaining to discuss which is the best, in the real world the most relevant question is ‘which is the right fit given a particular use case?’. If you have a call center with a small application portfolio, then this is an obvious use case for non-persistent desktops (though certainly not the only use case). I agree with the persistence crowd with regard to larger environments that have extensive application portfolios. The time it takes to virtualize and package all these applications, and the sheer amount of software required to go non-persistent for all desktops in such an environment (UEM, app publishing, app streaming, etc.), makes persistence a much more viable option. This is why many VDI environments end up with a mixture of persistent and non-persistent desktops. These are extreme examples, but it’s clear that no one model is perfect for every situation.
Other random thoughts from this discussion:
—Throughout the debate and in most discussions surrounding persistent desktops, the persistent desktop crowd often points to new technology advances that make persistent desktops a viable option. Flash-based arrays, inline de-duplication, etc. are all cited as examples. The only problem with this is that while this technology exists today, many customers still don’t have it and aren’t willing to make the additional investment in a new array or other technology on top of the VDI software investment. So the technology exists and we can have very high-level, academic discussions on running persistent desktops with this technology but for many customers it’s still not a reality.
—Here again, like most times this discussion crops up, the non-persistent crowd makes a point of trumpeting the ease of managing non-persistent desktops while glossing over how difficult it can be to actually deploy this desktop type when organizations are seeking a high percentage of VDI users. Even if we ignore the technical challenges around application delivery, users still have to like the desktop…and most companies will have more users than they know that will require/demand persistent desktops.
—About midway through the debate there is talk about how non-persistence is limiting the user and installing apps is what users want, but earlier in the debate the panel all agreed that just allowing users to install whatever app they want is a security and support nightmare. I found this dichotomy interesting in that it illuminates this truth – whichever desktop model you choose the user is limited in some way. Whatever marketing you may hear to the contrary, remember that.
And last but certainly not least…
—In this debate Shawn delivers an argument I hear a lot in IT that I disagree with, and maybe this deserves a separate post. He talks about the “duality” of operational expense when you are managing non-persistent desktops using image-based management in an environment where you still have physical endpoints being managed by Altiris/SCCM. He says you actually “double” your operational expense by managing these desktops in different ways. The logic undergirding this argument is the assumption that ‘double the procedure equals double the operational cost’. To me this is not necessarily true and, for many environments, definitely false. The only way having two procedures “doubles” your operational cost is if both procedures require an equal amount of time/effort/training/etc. to implement and maintain. And for many customers (who implement VDI at least partly for easier desktop management) it’s clear that image-based management is viewed as the easier and faster way to maintain desktops. I see this same logic applied to multi-hypervisor environments as well, and I simply disagree that having multiple procedures is always going to mean you double or even increase your operational cost.
Any other thoughts, comments or disagreements are welcome in the comment section!
A couple months ago F5 came out with a very intriguing announcement when they released full proxy support for PCoIP on the latest Access Policy Manager code version, 11.4. Traditional Horizon View environments use “Security Servers” to proxy PCoIP connections from external users to desktops residing in the datacenter. Horizon View Security Servers will reside in the DMZ and the software is installed on Windows hosts. This new capability from F5 completely eliminates the need for Security Servers in a Horizon View architecture and greatly simplifies the solution in the process.
In addition to eliminating Security Servers and getting Windows hosts out of your DMZ, this feature simplifies Horizon View in other ways that aren’t being talked about as much. One caveat to using Security Servers is that they must be paired with Connection Servers in a 1:1 relationship. Any sessions brokered through these Connection Servers will then be proxied through the Security Servers they are paired with. Because Security Servers are located in the DMZ, this setup works fine for your external users. For internal users, a separate pair of Connection Servers is usually needed so users can connect directly to their virtual desktop after the brokering process without having to go through the DMZ. To learn more about this behavior see here and here.
Pictured below is a traditional Horizon View deployment with redundancy and load balancing for all the necessary components:
What does this architecture look like when eliminating the Security Servers altogether in favor of using F5’s ability to proxy PCoIP?
As you can see, this is a much simpler architecture. Note also that each Connection Server supports up to 2000 connections. I wouldn’t recommend pushing that limit, but the above servers could easily support around 1500 total users (accounting for the failure of one Connection Server). If you wanted full redundancy and automatic failover with Security Servers in the architecture, whether it was for 10 or 1500 external users, you would still need at least 2 Security Servers and 2 Connection Servers. A lot of the time they are there not so much for increased capacity as for redundancy for external users, so eliminating them can easily simplify your deployment.
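The sizing arithmetic above can be sketched roughly as follows. This is only an illustration under stated assumptions: the 2000-connection figure comes from the text, while the function name, the 75% headroom factor and the single spare server are hypothetical choices of mine, not values from VMware or F5 documentation.

```python
import math

MAX_CONNECTIONS_PER_SERVER = 2000  # per-server limit quoted in the post


def connection_servers_needed(users, headroom=0.75, spares=1):
    """Return how many Connection Servers to deploy for `users` sessions.

    headroom: fraction of the per-server limit we are willing to use
              (assumed value, so the 2000 limit is never pushed).
    spares:   extra servers kept so the pool survives that many failures.
    """
    usable_per_server = int(MAX_CONNECTIONS_PER_SERVER * headroom)
    # Always keep at least one active server, even for a handful of users.
    active = max(1, math.ceil(users / usable_per_server))
    return active + spares


for users in (10, 1500, 1501):
    print(users, "users ->", connection_servers_needed(users), "Connection Servers")
```

With these assumed settings, 10 users and 1500 users both come out at the same server count, which is the point made above: the second server is there for redundancy rather than capacity.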
But could this be simplified even further?
In this scenario the internal load balancers were removed in favor of the load balancers in the DMZ having an internal interface configured with an internal VIP for load balancing. Many organizations will not like this solution because it will be considered a security risk for the device in the DMZ to have interfaces physically outside the DMZ. ADC vendors and partners will claim their device is secure but most customers still aren’t comfortable with this solution. Another solution for small deployments with limited budget would be to just place that VIP in the above picture in the DMZ. Internal users will still connect directly to their virtual desktops on the internal network and the DMZ VIP is only accessed during the initial load balancing process for the Connection Servers. Regardless of whether you use an internal VIP or another set of load balancers, this solution greatly simplifies and secures a Horizon View architecture.
Overall, I’m really excited by this development and am interested in seeing if other ADC vendors offer this functionality for PCoIP in the near future or not. To learn more, see the following links:
Over the past few years there has been no shortage of excellent blog posts detailing how to properly configure resource pools in a vSphere environment. Despite the abundance, quality and availability of this information, resource pools still seem to be the #1 most commonly misconfigured item on every VMware health check I’m involved with. Even though this is well-trodden territory, I wanted to offer my own way of explaining this issue, if for nothing else than to have a place to direct people for information on resource pools.
What follows below is a simple diagram I usually draw on a whiteboard to help explain to customers how resource pools work.
There’s not much to say that the pictures don’t already show. Just remember to keep adjusting your pool share values as new VMs are added to the pool. Also note that while I assigned 8000:4000:2000 to the VMs in the High:Normal:Low pools above, I could have just as easily assigned 8:4:2 to the same VMs and achieved the same results. It’s the ratio between VMs that counts. In either example, a VM in the “High” pool gets twice as many resources under contention as a VM in the “Normal” pool and four times as many as a VM in the “Low” pool.
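For anyone who prefers numbers to whiteboard drawings, here is a minimal sketch of the same arithmetic. It assumes that sibling pools split contended resources in proportion to their share values, that every VM inside a pool has equal shares, and that all VMs are competing at once. The VM counts and function names are hypothetical; only the 8000:4000:2000 per-VM values come from the post.

```python
def pool_shares_for(per_vm_shares, vm_count):
    """Pool-level share value that gives each VM in the pool `per_vm_shares`."""
    return per_vm_shares * vm_count


def per_vm_fraction(pools):
    """Fraction of the contended resource each VM receives.

    `pools` maps pool name -> (pool_shares, vm_count). Sibling pools split the
    resource in proportion to their shares; each pool's slice is then divided
    evenly among its VMs.
    """
    total_shares = sum(shares for shares, _ in pools.values())
    return {
        name: (shares / total_shares) / vm_count
        for name, (shares, vm_count) in pools.items()
        if vm_count > 0
    }


# Hypothetical VM counts; the per-VM values are the ones used in the post.
vm_counts = {"High": 2, "Normal": 10, "Low": 5}
per_vm = {"High": 8000, "Normal": 4000, "Low": 2000}

pools = {
    name: (pool_shares_for(per_vm[name], count), count)
    for name, count in vm_counts.items()
}
fractions = per_vm_fraction(pools)
print(fractions["High"] / fractions["Normal"])  # High VM relative to a Normal VM
print(fractions["High"] / fractions["Low"])     # High VM relative to a Low VM
```

Swapping the per-VM values for 8, 4 and 2 yields the same fractions, and rerunning `pool_shares_for` after changing a VM count shows why the pool value has to be adjusted whenever a VM is added.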
Looking for more information on resource pools?
- Understanding Resource Pools in VMware vSphere – Chris Wahl
- Label Resource Pools with Per VM Shares Value – Chris Wahl
- The Resource Pool Priority-Pie Paradox – Duncan Epping
- Shares set on Resource Pools – Duncan Epping
- Custom Shares on a Resource Pool, scripted – Duncan Epping
- Don’t add resource pools for fun, they’re dangerous – Eric Sloof
- Resource pools memory reservations – Frank Denneman
Feel free to send me any other good resource pool links in the comments section and I’ll add them to my list.
VMware has a KB article detailing a bug present in ESXi 5.0 that has been known to cause a variety of networking issues in iSCSI environments. Until last week, I had not encountered this particular bug, so I thought I’d detail my experience troubleshooting it for those still on 5.0 who may run into it.
The customer I was working with had originally called for assistance because their storage array was only reporting 2 out of 4 available paths “up” to each connected iSCSI host. All paths had originally been up/active until a recent power outage and since then, no manner of rebooting or disabling/re-enabling had been successful in bringing them all back up simultaneously. Their iSCSI configuration was fairly standard, with 2 iSCSI port groups connected to a single vSwitch per-server and each port group connected to separate iSCSI networks. Each port group in this configuration has a different NIC specified as an “Active Adapter” and the other is placed under the “Unused Adapters” heading.
One of the first things that I wanted to rule out was a hardware issue related to the power outage. However, after a short time troubleshooting, I discovered that simply disabling and re-enabling NIC ports on the iSCSI switches would cause the “downed” paths to become active again within the storage array, while the path that was previously “up” would go down. As expected, a vmkping was never successful through a NIC that was not registering properly on the storage array. Everything appeared to be configured correctly within the array, the switches and the ESXi hosts, so at this point I had no clear culprit and needed to rule out potential causes. Luckily these systems had not been placed into production yet, so I was granted a lot of leeway in my troubleshooting process.
- Test #1. For my first test I wanted to rule out the storage array. I was working with this customer remotely, so I had them unplug the array from the iSCSI switches and plug it into a Linksys switch they had lying around. I then had them plug their laptop into this same switch and assign it an IP address on each of the iSCSI networks. All ping tests to each interface were successful, so I was fairly confident at this point that the array was not the cause of this issue.
- Test #2. For my second test I wanted to rule out the switches. I had the customer plug all array interfaces back into the original iSCSI switches. I then had them unplug a few ESXi hosts from the switches. Then they assigned their laptop the same IP addresses as the unplugged ESXi host iSCSI port groups and ran additional ping tests from the same ports the ESXi hosts were using. All ping tests on every interface were successful, so it appeared unlikely that the switches were the culprit.
At this point it appeared almost certain that the ESXi hosts were the cause of the problems here. They were the only component that appeared to be having any communication issues, as all other components taken in isolation communicated just fine. It was also evident that something with the NIC failover/failback wasn’t working correctly (given the behavior when we disabled/re-enabled ports), so I put the iSCSI port groups on separate vSwitches. BINGO! Within a few seconds of doing this I could vmkping on all ports and the storage array was showing all ports active again. Given that this is not a required configuration for iSCSI networking for ESXi, I immediately started googling for known bugs. Within a few minutes I ran across this excellent blog post by Josh Townsend and the KB article I linked to above. The issue caused by the bug is that ESXi will actually send traffic down the “unused” NIC during a failover scenario.
This is why my separating the iSCSI port groups “fixed” the issue. There was no unused NIC in the port group for ESXi to mistakenly send the traffic to. It also explained the behavior where disabling/re-enabling a downed port would cause it to become active again (and vice versa). In this case ESXi was sending traffic down the unused port, and my disable/re-enable triggered a failover that caused ESXi to send traffic down the active adapter again.
In my case, upgrading to 5.0 Update 1 completely fixed this issue. I’ll update this post if I run across this problem with any other version of ESXi. In the meantime, just note the workaround I described above, which is also outlined in both links.
Both VMware View and Citrix XenDesktop require permissions within vCenter to provision and manage virtual desktops. VMware and Citrix both have documentation on the exact permissions required for this user account. Creating a service account with the minimal amount of permissions necessary, however, can be cumbersome and as a result, many businesses have elected to just create an account with “Administrator” permissions within vCenter. While much easier to create, this configuration will not win you any points with a security auditor.
To make this process a bit easier I’ve created a couple of quick scripts, one for XenDesktop and one for View, that create “roles” with the minimal permissions necessary for each VDI platform. For XenDesktop, the script will create a role called “Citrix XenDesktop” with the privileges specified here. For View, the script will create a role called “VMware View” with the privileges specified on pages 87-88 here. VMware mentions creating three roles in its documentation, but I just created one with all the permissions necessary for View Manager, Composer and local mode. Removing the “local mode” permissions is easy enough in the script if you don’t think you’re going to use it, and the vast majority of View deployments I’ve seen use Composer, so I didn’t see it as necessary to separate that into a different role either. You’ll also note that I used the privilege “Id” instead of “Name”. The problem I ran into there is that “Name” is not unique within privileges (e.g. there is a “Power On” under both “vApp” and “Virtual Machine”) while “Id” is unique. So, for consistency’s sake, I just used “Id” to reference every privilege. The only thing that needs to be modified in these scripts is the vCenter IP/hostname entered after “Connect-VIServer”.
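To show why “Name” is a risky key, here is a small language-neutral sketch in Python rather than PowerCLI. The privilege list is a tiny hand-written sample, not output from a live vCenter, and the helper name is hypothetical; it is not part of the scripts described above.

```python
from collections import defaultdict

# Hand-written sample of privileges for illustration only.
PRIVILEGES = [
    {"id": "VApp.PowerOn", "name": "Power On"},
    {"id": "VirtualMachine.Interact.PowerOn", "name": "Power On"},
    {"id": "VirtualMachine.Interact.PowerOff", "name": "Power Off"},
]


def ambiguous_names(privileges):
    """Return display names that map to more than one privilege Id."""
    ids_by_name = defaultdict(list)
    for privilege in privileges:
        ids_by_name[privilege["name"]].append(privilege["id"])
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


for name, ids in ambiguous_names(PRIVILEGES).items():
    print(f"'{name}' is ambiguous: {', '.join(ids)}")
```

Selecting by “Power On” here would match two different privileges, whereas each Id matches exactly one, which is why referencing every privilege by Id keeps a role definition unambiguous.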
Of course, these scripts could be expanded to automate more tasks, such as creating a user account and giving access to specific folders or clusters, etc., but I will let all the PowerCLI gurus out there handle that. 🙂 Really, the only goal of these scripts is to automate the particular task that most people skip due to its tedious nature. Feel free to download, critique and expand as necessary.
After reading a bevy of excellent articles on multi-hypervisor datacenters, I thought I’d put pen to paper with my own thoughts on the subject. This article by Joe Onisick will serve as a primer to this discussion, not only because it was recently written, but because it does an excellent job of fairly laying out the arguments on both sides of the issue. The article mentions three justifications organizations often use for deploying multiple hypervisors in their datacenter. These are 1) cost, 2) leverage and 3) lock-in avoidance. I am in complete agreement that 2 and 3 are poor reasons to deploy multiple hypervisors; my disagreement on #1, however, is what I’d like to discuss in this post.
The discussion on the validity of multi-hypervisor environments has been going on for several years now. Steve Kaplan wrote an excellent article on this subject back in 2010 that mentions the ongoing debate at that time, and discussions on this subject pre-date even that post. The recent acquisition of DynamicOps by VMware has made this a popular topic again and a slew of articles have been written covering the subject. Most of these articles seem to agree on a few things. First, regardless of whether it’s what’s best for them, multi-hypervisor environments are increasing across organizations and service providers. Second, cost is usually the deciding factor in deploying multiple hypervisors, but this is not a good reason because you’ll spend more money managing the environment and training your engineers than you saved on the cost of the alternative hypervisor. Third, deploying multiple hypervisors in this way doesn’t allow you to move to a truly “private cloud” infrastructure. You now have two hypervisors and need two DR plans, two different deployment methods and two different management models. Let’s take each of these arguments against cost in turn and see how they hold up.
OpEx outweighs CapEx
As alluded to above, there’s really no denying that an organization can save money buying alternative hypervisors that are cheaper than VMware ESXi. But do those cost savings outweigh potential increases in operational expenditures now that you’re managing two separate hypervisors? As the article by Onisick I linked to above suggests, this will vary from organization to organization. I’d like to suggest, however, that the increase in OpEx cited by many other sources as a reason to abandon multi-hypervisor deployments is often greatly exaggerated. Frequently cited is the increase in training costs: you have two hypervisors, so now you have to send your people to two different training classes. I don’t necessarily see that as the case. If you’ve been trained and have a good grasp of the ESXi hypervisor, learning and administering the nuances and feature sets of another hypervisor is really not that difficult, and formal training may not be necessary. Understanding the core mechanisms of what a hypervisor is and how it works will go a long way in allowing you to manage multiple hypervisors. And even if you did have to send your people to a one-time training class, is it really all that likely that the cost of the class will outweigh the ongoing hypervisor cost savings? If it does, then you probably aren’t saving enough money to justify multiple hypervisors in the first place. Doing a quick search, I’ve found week-long XenServer training available for $5,000. Evaluate your people, do the math and figure out the cost savings in your scenario. Just don’t rule out multi-hypervisor environments on the assumption that training costs will necessarily be astronomical or even essential for all of your employees.
Similar to the OpEx discussion, another argument often presented against the cost-saving benefits of multi-hypervisor environments is that they are harder to administer, as you have to come up with separate management strategies for VMs residing on the different hypervisors. Managing things in two separate ways, it is argued, moves away from the type of Private Cloud infrastructure most organizations should strive for. The main problem with this argument is that it assumes you would manage all of your VMs the same way if they were all on the same hypervisor. This is clearly false. A couple of clear examples of this are XenApp and VDI. The way you manage these types of environments, deploy VMs, or plan DR is often vastly different from how you would handle the rest of your server infrastructure. And so, if there is a significant cost savings, it is these types of environments that are often good targets for alternate hypervisors. They are good candidates not only because they are managed differently, regardless of hypervisor, but because they often don’t require many of the advanced features only ESXi provides.
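“Do the math” can be as simple as a break-even calculation. The sketch below is a rough illustration: the $5,000 class price is the figure quoted above, while the number of engineers and the monthly licensing savings are made-up inputs you would replace with your own.

```python
def months_to_break_even(training_cost_per_person, people, monthly_license_savings):
    """Months of licensing savings needed to pay back a one-time training cost.

    Returns None when there are no savings, because the training would then
    never pay for itself.
    """
    if monthly_license_savings <= 0:
        return None
    return (training_cost_per_person * people) / monthly_license_savings


# $5,000 per class is from the post; 3 engineers and $2,500/month are hypothetical.
print(months_to_break_even(5000, 3, 2500))
```

If the payback period runs to years rather than months, the savings probably don’t justify a second hypervisor; if it is short, training cost alone is not a good reason to rule one out.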
I’m in complete agreement that having test/dev and production on separate hypervisors is a bad idea. Testing things on a different platform than they run in production is never good. But if you can save significant amounts of money by moving some of these systems that are managed in ways unique to your environment onto an alternate hypervisor, I’m all for it. This may not be the best solution for every organization (or even most), but like all things, should be evaluated carefully before ruling it out or adopting it.
Which is better, Citrix XenDesktop or VMware View? XenServer or ESXi? HDX or PCoIP? While the answers to these questions are debated on numerous blogs, at tech conferences and in marketing literature, what is explored far less often is how Citrix and VMware technologies can actually work together. What follows is a brief overview of some different ways that these technologies can be combined to form integrated virtual infrastructures.
1) Application and Desktop delivery with VMware View and XenApp
Many organizations deploying VMware View already have existing Citrix XenApp infrastructures in place. The View and XenApp infrastructures are usually managed by separate teams and not integrated to the degree they could be. Pictured above are some possible ways these two technologies can integrate. As you can see, there are many different options in terms of application delivery with both environments. The most obvious is publishing applications from XenApp to your View desktops. This can reduce the resource consumption on individual desktops and also provides the added benefit of making those same applications accessible outside your View environment, since you can publish directly to remote endpoints as well. Existing Citrix infrastructures may also be utilizing Citrix application streaming technology. By simply installing some Citrix clients on your View desktops, applications can be streamed directly to View desktops, directly to endpoints, or to XenApp servers and then published to View desktops or endpoints. Another option is to integrate ThinApp into this environment. Tina de Benedictis had a good write-up on this a while back. The options for this are similar to Citrix streaming. You can stream to a XenApp server and then publish the application from there, stream directly to your View desktops or stream directly to endpoints. As shown in the above picture, both Citrix Streaming and ThinApp can be used within the same environment. This might be an option if you’ve already packaged many of your applications with Citrix but either want to migrate to ThinApp over time or want to package and stream certain applications that Citrix streaming cannot (e.g. Internet Explorer). Whatever options you choose, it’s clear that both technologies can work together to form a very robust application and desktop delivery infrastructure.
2) Load Balancing VMware infrastructures with Citrix Netscaler
Some good articles have been written about this option as well. In fact, this option is becoming popular enough that VMware even has a KB dedicated to ensuring the correct configuration of Citrix Netscalers in View environments. VMware View and VMware vCloud Director have redundant components that should be load balanced for best performance and high availability. If you have either of these products and are using Citrix Netscaler to proxy HDX connections or load balance Citrix components or other portions of your infrastructure, why not use it for VMware as well? Pictured above is a high-level overview of load balancing some internal-facing View Connection Servers. Users connect to a VIP defined on the Netscalers (1), which directs them to the least busy View Connection Server (2), which then connects them to the appropriate desktop based on user entitlement (3). After the initial connection process, the user connects directly to their desktop over PCoIP.
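The “least busy” decision in step (2) can be pictured with a few lines of Python. This is purely a conceptual sketch: a real Netscaler makes this choice through its own configured load-balancing method, and the server names and session counts below are invented.

```python
def pick_connection_server(active_sessions):
    """Return the name of the server with the fewest active sessions.

    `active_sessions` maps server name -> current session count. Ties go to
    the name that sorts first, so the result is deterministic.
    """
    if not active_sessions:
        raise ValueError("no Connection Servers available")
    return min(sorted(active_sessions), key=active_sessions.get)


# Hypothetical pool of two internal-facing Connection Servers.
sessions = {"view-cs-01": 120, "view-cs-02": 85}
print(pick_connection_server(sessions))
```

Only the brokering request goes through this selection; as noted above, the PCoIP session itself then runs directly between the user and the desktop.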
3) XenApp and XenDesktop on VMware ESXi
This is actually an extremely popular combination and the reasons are numerous and varied. You can have 32-host clusters (only 16 in XenServer and 8 with VMware View on ESXi), Storage vMotion and Storage DRS (XenServer doesn’t have these features and you can’t use them with VMware View), memory overcommitment (only ESXi has legitimate overcommit technology), Storage I/O Control, Network I/O Control, Multi-NIC vMotion, Auto Deploy, and many more features that you can only get from the ESXi hypervisor. Using XenApp and XenDesktop on top of ESXi gets you the most robust combination of hypervisor and application and desktop virtualization technology possible.
4) XenApp as a connection broker for VMware View
This option intrigues me from an architectural point of view, but I have yet to see it utilized in a production environment. With this option you would publish your View Client from a XenApp server. Users could then utilize HDX/ICA over external connections or the WAN, and from the XenApp server they would connect to the View desktop on the LAN over PCoIP. What are the flaws in this method? I can think of a couple of benefits to this off-hand. First, HDX generally performs better over high-latency connections, so there could be a user experience boost. Second, VMware View uses a “Security Server” to proxy external PCoIP connections. The Security Server software just resides on a Windows server OS; a hardened security appliance like Netscaler would be more secure. I’d be interested to see how things like printing and USB redirection would work in such an environment, but for me, it’s definitely something I’d like to explore more.
So, those are a few of the possibilities for integrating VMware and Citrix technologies. What are some other combinations you can think of? Any other benefits or flaws in the above-mentioned methods?