Over the past few years there has been no shortage of excellent blog posts detailing how to properly configure resource pools in a vSphere environment. Despite the abundance, quality and availability of this information, resource pools still seem to be the #1 most commonly misconfigured item on every VMware health check I’m involved with. Even though this is well-trodden territory, I wanted to offer my own way of explaining the issue, if only to have a place to point people for information on resource pools.
What follows below is a simple diagram I usually draw on a whiteboard to help customers understand how resource pools work.
There’s not much to say that the pictures don’t already show. Just remember to keep adjusting your pool share values as new VMs are added to the pool. Also note that while I assigned 8000:4000:2000 to the VMs in the High:Normal:Low pools above, I could just as easily have assigned 8:4:2 to the same VMs and achieved the same results. It’s the ratio between VMs that counts. In either example, a VM in the “High” pool gets twice the resources under contention of a VM in the “Normal” pool and four times those of a VM in the “Low” pool.
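To make the ratio arithmetic concrete, here’s a quick sketch. The pool names, VM counts and the 28,000 MHz of contended CPU are illustrative figures, not tied to any real environment:

```python
# Sketch of how per-VM share values translate into entitlement under
# full contention. The 28,000 MHz cluster capacity is a made-up figure.

def per_vm_entitlement(pools, total_resources):
    """pools maps name -> (shares_per_vm, vm_count); returns the
    resources each VM in a pool receives when everything is contended."""
    total_shares = sum(s * n for s, n in pools.values())
    return {name: total_resources * s / total_shares
            for name, (s, n) in pools.items()}

high_shares = per_vm_entitlement(
    {"High": (8000, 2), "Normal": (4000, 2), "Low": (2000, 2)}, 28000)
small_shares = per_vm_entitlement(
    {"High": (8, 2), "Normal": (4, 2), "Low": (2, 2)}, 28000)

# 8000:4000:2000 and 8:4:2 yield identical entitlements -- only the
# ratio matters: "High" VMs get 2x "Normal" and 4x "Low".
```

Running both share schemes through the same math shows why only the ratio matters; the absolute share numbers cancel out.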
Looking for more information on resource pools?
- Understanding Resource Pools in VMware vSphere – Chris Wahl
- Label Resource Pools with Per VM Shares Value – Chris Wahl
- The Resource Pool Priority-Pie Paradox – Duncan Epping
- Shares set on Resource Pools – Duncan Epping
- Custom Shares on a Resource Pool, scripted – Duncan Epping
- Don’t add resource pools for fun, they’re dangerous – Eric Sloof
- Resource pools memory reservations – Frank Denneman
Feel free to send me any other good resource pool links in the comments section and I’ll add them to my list.
Below you’ll find step-by-step instructions for setting up a Cisco UCS environment for the first time. I wanted to post this as a general guideline for those new to UCS who may be setting up their first lab or production environment. It’s important to note that UCS is highly customizable and that configuration settings will differ between environments, so what you’ll see below is a fairly generic UCS configuration with an ESXi service profile template. Also note that since the purpose of this post is to help UCS newcomers set up UCS for the first time, I’ve done many of these steps manually. Most of the configuration below can be scripted, and pools and policies can be created in the service profile template wizard, but to really learn where everything lives the first time through, I recommend doing it this way.
This is a pretty lengthy blog post, so if you’d like it in .pdf format, click here.
There’s really not much more to say at a general level that the pictures don’t already show. Based on how your environment is set up and the type of connectivity you require, the cabling could differ considerably from what is pictured above. The important things to note, however, are that a given I/O Module always connects only to its associated Fabric Interconnect (as shown above), and that for Fibre Channel connections, “Fabric A” goes to “Switch A” and likewise for Fabric B. Each switch is then connected to each storage processor. Think of the Fabric Interconnects in this scenario as separate initiator ports on a single physical server (which is how we’ll configure them in our service profile) and the cabling will make much more sense.
Configuring the Fabric Interconnects
Connect to the console port of Fabric Interconnect (FI) “A”, which will be the primary member of the cluster. Power on FI-A and leave the secondary FI off for now. Verify that the console port parameters on the attached computer are: 9600 baud, 8 data bits, no parity, 1 stop bit. You will then be presented with the following prompts (user input follows each prompt):
Enter the configuration method. (console/gui) ? console
Enter the setup mode; setup newly or restore from backup. (setup/restore) ? setup
You have chosen to setup a new Fabric interconnect. Continue? (y/n): y
Enter the password for “admin”: password
Confirm the password for “admin”: password
Is this Fabric interconnect part of a cluster (select ‘no’ for standalone)? (yes/no) [n]: yes
Enter the switch fabric (A/B) : A
Enter the system name: NameOfSystem (NOTE: “-A” will be appended to the end of the name)
Physical Switch Mgmt0 IPv4 address : X.X.X.X
Physical Switch Mgmt0 IPv4 netmask : X.X.X.X
IPv4 address of the default gateway : X.X.X.X
Cluster IPv4 address : X.X.X.X (NOTE: This IP address will be used for Management)
Configure the DNS Server IPv4 address? (yes/no) [n]: y
DNS IPv4 address : X.X.X.X
Configure the default domain name? (yes/no) [n]: y
Default domain name: domain.com
Apply and save the configuration (select ‘no’ if you want to re-enter)? (yes/no): yes
Now connect to the console port of the secondary FI and power it on. Once again, you will be presented with the following prompts:
Enter the configuration method. (console/gui) ? console
Installer has detected the presence of a peer Fabric interconnect. This Fabric interconnect will be added to the cluster. Continue (y/n) ? y
Enter the admin password of the peer Fabric interconnect: password
Physical Switch Mgmt0 IPv4 address : X.X.X.X
Apply and save the configuration (select ‘no’ if you want to re-enter)? (yes/no): yes
Both Fabric Interconnects should now be configured with basic IP and cluster IP information. If, for whatever reason, you decide you’d like to erase the Fabric Interconnect configuration and start over from the initial configuration wizard, issue the following commands: “connect local-mgmt” and then “erase configuration”.
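For reference, the reset sequence looks like this from the CLI (the “UCS-A” prompt is a placeholder for your system name, and the confirmation prompt wording may vary by firmware release):

```
UCS-A# connect local-mgmt
UCS-A(local-mgmt)# erase configuration
```

Be aware that this erases the entire configuration and reboots the Fabric Interconnect, so use it only when you genuinely want to start over.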
After the initial configuration and cabling of Fabric Interconnect A and B is complete, open a browser and connect to the cluster IP address and launch UCS Manager:
Configuring Equipment Policy
Go to the “Equipment” tab and then “Equipment->Policies”:
The chassis discovery policy “Action:” dropdown should be set to the number of links connected between an individual IOM and Fabric Interconnect pair. For instance, in the drawing displayed earlier, each IOM had four connections to its associated Fabric Interconnect, so a “4 link” policy should be created. This policy could be left at the default value of “1 link”, but my personal preference is to set it to the actual number of connections that should exist between an IOM and FI pair. This policy essentially specifies how many connections must be present for a chassis to be discovered.
For environments with redundant power sources/PDUs, “Grid” should be specified for a power policy. If one source fails (which causes a loss of power to one or two power supplies), the surviving power supplies on the other power circuit continue to provide power to the chassis. Both grids in a power redundant system should have the same number of power supplies. Slots 1 and 2 are assigned to grid 1 and slots 3 and 4 are assigned to grid 2.
Go to the “Equipment” tab and then “Fabric Interconnects->Fabric Interconnect A/B” and expand any Fixed or Expansion modules as necessary. Configure the appropriate unconfigured ports as “Server” (connections between IOM and Fabric Interconnect) and “Uplink” (connection to network) as necessary:
For Storage ports, go to the “Equipment” tab and then “Fabric Interconnects->Fabric Interconnect A/B” and in the right-hand pane, select “Configure Unified Ports”. Click “Yes” in the ensuing dialog box to acknowledge that a reboot of the module will be necessary to make these changes. On the “Configure Fixed Module Ports” screen, drag the slider just past the ports you want to configure as storage ports and click “Finish”. Select “Yes” on the following screen to confirm that you want to make these changes:
Next, create port channels as necessary on each Fabric Interconnect for Uplink ports. Go to the “LAN” tab, then “LAN->LAN Cloud->FabricA/B->Port Channels->Right-Click and ‘Create Port Channel'”. Then give the port channel a name and select the appropriate ports and click “Finish”:
Select the Port Channel and ensure that it is enabled and is set for the appropriate speed:
Next, configure port channels for your SAN interfaces as necessary. Go to the “SAN” tab and then “SAN Cloud->Fabric A/B->FC Port Channels->Right Click and ‘Create Port Channel'”. Then give the port channel a name, select the appropriate ports and click “Finish”:
Select the SAN port channel and ensure that it is enabled and set for the appropriate speed:
What follows are instructions for manually updating firmware to the 2.1 release on a system that is being newly installed. Systems that are currently in production will follow a slightly different set of steps (e.g. “Set startup version only”). After the 2.1 release, firmware auto install can be used to automate some of these steps. Release notes should be read before upgrading to any firmware release as the order of these steps may change over time. With that disclaimer out of the way, the first step in updating the firmware is downloading the most recent firmware packages from cisco.com:
There are two files required for B-Series firmware upgrades. An “*.A.bin” file and a “*.B.bin” file. The “*.B.bin” file contains all of the firmware for the B-Series blades. The “*.A.bin” file contains all the firmware for the Fabric Interconnects, I/O Modules and UCS Manager.
After the files have been downloaded, launch UCS manager and go to the “Equipment” tab. From there navigate to “Firmware Management->Download Firmware”, and upload both .bin packages:
The newly downloaded packages should be visible under the “Equipment” tab “Firmware Management->Packages”.
The next step is to update the adapters, CIMC and IOMs. Do this under the “Equipment” tab “Firmware Management->Installed Firmware->Update Firmware”:
Next, activate the adapters, then UCS Manager and then the I/O Modules under the “Equipment” tab “Firmware Management->Installed Firmware->Activate Firmware”. Choose “Ignore Compatibility Check” anywhere applicable. Make sure to uncheck “Set startup version only”, since this is an initial setup and we aren’t concerned with rebooting running hosts:
Next, activate the subordinate Fabric Interconnect and then the primary Fabric Interconnect:
Creating a KVM IP Pool
Go to the “LAN” tab and then “Pools->root->IP Pools->IP Pool ext-mgmt”. Right-click and select “Create Block of IP addresses”. Next, specify your starting IP address and the total amount of IPs you require, as well as the default gateway and primary and secondary DNS servers:
Creating a Sub-Organization
Creating a sub-organization is optional; sub-organizations exist for granularity and organizational purposes and are meant to contain servers, pools and policies with different functions. To create a sub-organization, right-click any “root” directory and select “Create Organization”. Specify the name of the organization and any necessary description and select “OK”. The newly created sub-organization will now be visible in most tabs under “root->Sub-Organizations”:
Create a Server Pool
To create a server pool, go to “Servers” tab and then “Pools->Sub-Organization->Server Pools”. Right-Click “Server Pools” and select “Create Server Pool”. From there, give the Pool a name and select the servers that should be part of the pool:
Creating a UUID Suffix Pool
Go to the “Servers” tab and then “Pools->Sub-Organizations->UUID Suffix Pool”. Right-click and select “Create UUID Suffix Pool”. Give the pool a name and then create a block of UUID suffixes. I usually create a two-character letter/number code that aligns with my MAC/HBA templates and allows me to easily identify a server (e.g. “11” for production ESXi):
Creating MAC Pools
For each group of servers (i.e. “ESXi_Servers”, “Windows_Servers”, etc.), create two MAC pools: one that will go out the “A” fabric and another that will go out the “B” fabric. Go to the “LAN” tab, then “Pools->root->Sub-Organization”, right-click “MAC Pools” and select “Create MAC Pool”. From there, give each pool a name and MAC address range that will allow you to easily identify the type of server (e.g. “11” for production ESXi) and the fabric it should be going out (e.g. “A” or “B”):
Whole blog posts have been written on MAC pool naming conventions. To keep things simple for this initial configuration, I’ve chosen a fairly simple scheme where “11” denotes a production ESXi server and “A” or “B” denotes which FI traffic should be routed through. If you have multiple UCS pods and multiple sites, consider creating a slightly more complex convention that will allow you to identify exactly where traffic is coming from simply by reviewing the MAC address. The same goes for WWNN and WWPN pools as well.
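As a sketch of the idea (the 00:25:B5 prefix is the usual Cisco UCS default; the type and fabric codes here are hypothetical, so adapt them to your own convention):

```python
# Build the starting address of a MAC pool that encodes a server-type
# code and the fabric it egresses, so a MAC seen on the network can be
# traced back to its origin at a glance. Codes are illustrative only.

def mac_pool_start(type_code, fabric):
    """type_code: two hex chars, e.g. '11' for production ESXi.
    fabric: 'A' or 'B', matching the Fabric Interconnect."""
    fabric = fabric.upper()
    if fabric not in ("A", "B"):
        raise ValueError("fabric must be 'A' or 'B'")
    return f"00:25:B5:{type_code}:{fabric}0:00"

mac_pool_start("11", "A")  # '00:25:B5:11:A0:00'
mac_pool_start("11", "B")  # '00:25:B5:11:B0:00'
```

With a multi-site design you might extend the scheme with a site digit in the same octet, so long as every pool range stays unique.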
Creating WWNN Pools
To create a WWNN Pool, go to the “SAN” tab, then “Pools->root->Sub-Organization”. Right-click on “WWNN Pools” and select “Create WWNN Pool”. From there, give the pool a name and select a WWNN range. Each server is assigned a single WWNN (node name), while each of its two vHBAs receives its own WWPN, so the WWNN pool needs at least one address per server in the pool (sizing it larger does no harm):
Create WWPN Pools
Each group of servers should have two WWPN Pools, one for the “A” fabric and one for “B”. Go to the “SAN” tab, then “Pools->root->Sub-Organization”. Right-click on “WWPN Pools” and select “Create WWPN Pool”, from there, give the pool a name and WWPN range:
Creating a Network Control Policy
Go to the “LAN” tab, then “Policies->root->Sub-Organizations->Network Control Policies”, from there, right-click “Network Control Policies” and select “Create Network Control Policy”. Give the policy a name and enable CDP:
Go to the “LAN” tab and then “LAN->LAN Cloud->VLANS”. Right-click on “VLANs” and select “Create VLANs”. From there, create a VLAN name and ID:
Go to the “SAN” tab and then “SAN->SAN Cloud->VSANs”. Right-Click “VSANs” and select “Create VSAN”. From there, specify a VSAN name, select “Both Fabrics Configured Differently” and then specify the VSAN and FCoE ID for both fabrics:
After this has been done, go to each FC Port-Channel in “SAN” tab “SAN->SAN Cloud->Fabric A/B->FC Port Channels” and select the appropriate VSAN. Once the VSAN has been selected, “Save Changes”:
Creating vNIC Templates
Each group of servers should have two templates. One going out the “A” side of the fabric and one going out the “B” side. Go to the “LAN” tab, then “Policies->root->Sub-Organization->vNIC Templates”. Right-click on “vNIC Templates” and select “Create vNIC Template”. Give the template a name, specify the Fabric ID and select “Updating Template”. Also specify the appropriate VLANs, MAC Pool and Network Control Policy:
Creating vHBA Templates
Each group of servers should have two templates. One going out the “A” side of the fabric and one going out the “B” side. Go to the “SAN” tab, then “Policies->root->Sub-Organization->vHBA Templates”. Right-click on “vHBA Templates” and select “Create vHBA Template”. Give the template a name, specify the Fabric ID and select “Updating Template”. Also specify the appropriate WWPN Pool:
Creating a BIOS policy
For hypervisors, I always disable SpeedStep and Turbo Boost. Go to the “Servers” tab, then “Policies->root->Sub-Organizations->BIOS Policies”. From there, right-click on “BIOS Policies” and select “Create BIOS Policy”. Give the policy a name and under “Processor”, disable “Turbo Boost” and “Enhanced Intel SpeedStep”:
Creating a Host Firmware Policy
Go to the “Servers” tab, then “Policies->root->Sub-Organizations->Host Firmware Packages”. Right-click “Host Firmware Packages” and select “Create Host Firmware Package”. Give the policy a name and select the appropriate package:
Create Local Disk Configuration Policy
Go to the “Servers” tab, then “Policies->root->Sub-Organizations->Local Disk Config Policies”. Right-click “Local Disk Config Policies” and select “Create Local Disk Configuration Policy”. Give the policy a name and under “Mode:” select “No Local Storage” (assuming you are booting from SAN):
Create a Maintenance Policy
Go to the “Servers” tab, then “Policies->root->Sub-Organizations->Maintenance Policies”. Right-click “Maintenance Policies” and select “Create Maintenance Policy”. From there, give the policy a name and choose “User ack”. “User ack” just means that the user/admin has to acknowledge any maintenance tasks that require a reboot of the server:
Create a Boot Policy
Go to the “Servers” tab, then “Policies->root->Sub-Organizations->Boot Policy”. Right-click “Boot Policy” and select “Create Boot Policy”. Give the policy a name and add a CD-ROM as the first device in the boot order. Next, go to “vHBAs” and “Add SAN Boot”. Name the HBAs the same as your vHBA templates. Each “SAN Boot” vHBA will need two “SAN Boot Targets” added. The WWNs you enter should match the cabling configuration of your Fabric Interconnects. As an example, the following cabling configuration…:
Should have the following boot policy configuration:
Creating a Service Profile Template
Now that you have created all the appropriate policies, pools and interface templates, you are ready to build your service profile. Go to the “Servers” tab and then “Servers->Service Profile Templates->root->Sub-Organizations”. Right-click on the appropriate sub-organization and select “Create Service Profile Template”. Give the template a name, select “Updating Template” and specify the UUID pool created earlier. An updating template will allow you to modify the template at a later time and have those modifications propagate to any service profiles that were deployed using that template:
In the “Networking” section, select the “Expert” radio button and “Add” six NICs for ESXi hosts (two for management, two for VM traffic, two for vMotion). After clicking “Add” you will see the “Create vNIC” dialog box. Immediately select the “Use vNIC Template” checkbox, select vNIC template A/B and the “VMware” adapter policy. Alternate between the “A” and “B” templates on each vNIC:
In the “Storage” section, specify the local storage policy created earlier and select the “Expert” radio button. Next, “Add” two vHBAs. After you click “Add” and are in the “Create vHBA” dialog box, immediately select the “Use vHBA Template” checkbox and give the vHBA a name. Select the appropriate vHBA template (e.g. vHBA_A->ESXi_HBA_A, etc.) and adapter policy:
Skip the “Zoning” and “vNIC/vHBA Placement” sections by selecting “Next”. Then, in the “Server Boot Order” section, select the appropriate boot policy:
In the “Maintenance Policy” section, select the appropriate maintenance policy:
In the “Server Assignment” section, leave the “Pool Assignment” and power state options at their default. Select the “Firmware Management” dropdown and select the appropriate firmware management policy:
In “Operational Policies”, select the BIOS policy created earlier and then “Finish”:
Deploying a Service Profile
To deploy a service profile from a template, go to the “Servers” tab, then “Servers->Service Profile Templates->root->Sub-Organizations”. Right-click the appropriate service profile template and select “Create service profiles from template”. Select a naming prefix and the number of service profiles you’d like to create:
To associate a physical server with the newly created profile, right-click the service profile and select “Change service profile association”. In the “Associate Service Profile” dialog box, choose “Select existing server” from the “Server Assignment” drop down menu. Select the appropriate blade and click “OK”:
You can have UCS Manager automatically assign a service profile to a physical blade by associating the service profile template with a server pool. However, the way in which UCS automatically assigns profiles to blades is usually not what most people want, and the manual method allows you to assign profiles to specific slots for better organization.
Configuring Call Home
Go to the “Admin” tab and then “Communication Management->Call Home”. In the right-hand pane, turn the admin state to “On” and fill out all required fields:
In the “Profiles” tab, add firstname.lastname@example.org to the “Profile CiscoTAC-1”. Add the internal email address to the “Profile full_txt”:
Under “Call Home Policies”, add the following. More policies could be added but this is a good baseline that will alert you to any major equipment problems:
Under “System Inventory”, select “On” next to “Send Periodically” and change the interval as desired. Select “Save Changes”, then click the “Send System Inventory Now” button and an email should be sent to email@example.com:
In the “Admin” tab, select “Time Zone Management”. Click “Add NTP Server” in the right-hand pane to add an NTP server and select “Save Changes” at the bottom:
Backing up the Configuration
Go to the “Admin” tab and then “All”. In the right-hand pane, select “Backup Configuration”. From the “Backup Configuration” dialog box, choose “Create Backup Operation”. Change Admin states to “Enabled” and do a “Full State” and then an “All Configuration” backup. Make sure to check “Preserve Identities:” when doing an “All Configuration” backup and save both backups to the local computer and then to an easily accessible network location:
After backing up your configuration you can start your ESXi/Windows/Linux/etc. host configurations! Now that all the basic prep work has been done, deploying multiple servers from this template should be a breeze. Again, it’s important to note that what is shown above are some common settings typically seen in UCS environments, particularly when setting up ESXi service profile templates. Certainly, there could be much more tweaking (BIOS, QoS settings, MAC pool naming conventions, etc.), but these settings should give you a good idea of what is needed for a basic UCS config.
I’ve had a number of customers ask me about the steps needed in order to setup Windows boot from SAN in a Cisco UCS environment. There are a number of resources out there already, but I wanted to go ahead and create my own resource that I could consistently point people to when the question comes up. So, without further ado…
Assuming the service profile has already been built with a boot policy specifying CD-ROM and then SAN storage as boot targets, complete the following steps to install Microsoft Windows in a boot from SAN environment on Cisco UCS:
1. First, download the Cisco UCS drivers from Cisco.com. Use the driver .iso file that matches the level of firmware you are on:
2. Next, boot the server and launch the KVM console. From the “Virtual Media” tab, add the Windows server boot media as well as the drivers .iso file downloaded in the previous step, and map the Windows boot media. After the server is booted, zone only one path to your storage array (e.g. vHBA-A -> SPA-0). Once the path has been zoned, you can also register the server on the array and add it to the appropriate storage groups. Remember, it is very important that you present only one path to the storage array until multipathing can be configured in Windows after the installation. Failing to do so can result in LUN corruption.
3. Once the installation reaches the point where you select the disk to install Windows on, the installation process will notify you that drivers were not found for the storage device. Go back to the “Virtual Media” tab and map the drivers .iso file:
4. Select “Load driver” and browse to the mapped drivers .iso file.
5. Choose the storage driver that matches your adapter and firmware level and click “Next”.
6. After selecting the appropriate driver, the new drive should appear (you may have to select “Refresh” if it does not show up immediately). Re-map the Windows media and continue with the installation:
7. After Windows is fully installed, configure the desired multipathing software and zone and register the rest of the paths to the array.
That’s about it! This really is a very simple procedure; the most important things are to get the appropriate drivers and to zone only one path during installation.
VMware has a KB article detailing a bug present in ESXi 5.0 that has been known to cause a variety of networking issues in iSCSI environments. Until last week, I had not encountered this particular bug, and I thought I’d detail my experiences troubleshooting it for those still on 5.0 who may run into the same issue.
The customer I was working with had originally called for assistance because their storage array was only reporting 2 out of 4 available paths “up” to each connected iSCSI host. All paths had originally been up/active until a recent power outage and since then, no manner of rebooting or disabling/re-enabling had been successful in bringing them all back up simultaneously. Their iSCSI configuration was fairly standard, with 2 iSCSI port groups connected to a single vSwitch per-server and each port group connected to separate iSCSI networks. Each port group in this configuration has a different NIC specified as an “Active Adapter” and the other is placed under the “Unused Adapters” heading.
One of the first things that I wanted to rule out was a hardware issue related to the power outage. However, after not much time troubleshooting, I quickly discovered that simply doing some NIC disable/re-enable on the iSCSI switches would cause the “downed” paths to become active again within the storage array and the path that was previously “up” would go down. As expected, a vmkping was never successful through a NIC that was not registering properly on the storage array. Everything appeared to be configured correctly within the array, the switches and the ESXi hosts, so at this point I had no clear culprit and needed to rule out potential causes. Luckily these systems had not been placed into production yet, so I was granted a lot of leeway in my troubleshooting process.
- Test #1. For my first test I wanted to rule out the storage array. I was working with this customer remotely, so I had them unplug the array from the iSCSI switches and plug it into a spare Linksys switch they had lying around. I then had them plug their laptop into this same switch and assign it an IP address on each of the iSCSI networks. All ping tests to each interface were successful, so I was fairly confident at this point that the array was not the cause of the issue.
- Test #2. For my second test I wanted to rule out the switches. I had the customer plug all array interfaces back into the original iSCSI switches. I then had them unplug a few ESXi hosts from the switches. Then they assigned their laptop the same IP addresses as the unplugged ESXi host iSCSI port groups and ran additional ping tests from the same ports the ESXi hosts were using. All ping tests on every interface were successful, so it appeared unlikely that the switches were the culprit.
At this point it appeared almost certain that the ESXi hosts were the cause of the problem. They were the only component that appeared to be having any communication issues, as all other components taken in isolation communicated just fine. It was also evident that something with the NIC failover/failback wasn’t working correctly (given the behavior when we disabled/re-enabled ports), so I put the iSCSI port groups on separate vSwitches. BINGO! Within a few seconds of doing this I could vmkping on all ports and the storage array was showing all ports active again. Given that this is not a required configuration for iSCSI networking on ESXi, I immediately started googling for known bugs. Within a few minutes I ran across this excellent blog post by Josh Townsend and the KB article I linked to above. The issue caused by the bug is that ESXi will actually send traffic down the “unused” NIC during a failover scenario.
This is why separating the iSCSI port groups “fixed” the issue: there was no unused NIC in the port group for ESXi to mistakenly send the traffic to. It also explains the behavior where disabling/re-enabling a downed port would cause it to become active again (and vice versa). In this case ESXi was sending traffic down the unused port, and my disable/re-enable triggered a failover that caused ESXi to send traffic down the active adapter again.
In my case, upgrading to 5.0 Update 1 completely fixed this issue. I’ll update this post if I run across this problem with any other version of ESXi, just note the workaround I spoke of above and outlined in both links.
Both VMware View and Citrix XenDesktop require permissions within vCenter to provision and manage virtual desktops. VMware and Citrix both have documentation on the exact permissions required for this user account. Creating a service account with the minimum necessary permissions, however, can be cumbersome, and as a result many businesses have elected to just create an account with “Administrator” permissions within vCenter. While much easier to create, this configuration will not win you any points with a security auditor.
To make this process a bit easier I’ve created a couple of quick scripts, one for XenDesktop and one for View, that create “roles” with the minimum permissions necessary for each VDI platform. For XenDesktop, the script will create a role called “Citrix XenDesktop” with the privileges specified here. For View, the script will create a role called “VMware View” with the privileges specified on pages 87-88 here. VMware mentions creating three roles in its documentation, but I just created one with all the permissions necessary for View Manager, Composer and local mode. Removing the “local mode” permissions is easy enough in the script if you don’t think you’ll use it, and the vast majority of View deployments I’ve seen use Composer, so I didn’t see it as necessary to separate that into a different role either. You’ll also note that I used the privilege “Id” instead of “Name”. The problem I ran into is that “Name” is not unique among privileges (e.g. there is a “Power On” under both “vApp” and “Virtual Machine”) while “Id” is. So, for consistency’s sake, I used “Id” to reference every privilege. The only thing that needs to be modified in these scripts is to enter your vCenter IP/hostname after “Connect-VIServer”.
Of course, these scripts could be expanded to automate more tasks, such as creating a user account and giving access to specific folders or clusters, etc., but I will let all the PowerCLI gurus out there handle that. 🙂 Really, the only goal of these scripts is to automate the particular task that most people skip due to its tedious nature. Feel free to download, critique and expand as necessary.
Documentation for creating custom load evaluators in Citrix has existed for some time. Articles detailing the folly of using the “Default” load evaluator have been around for a while as well. Citrix even has an excellent whitepaper titled “Top 10 items found by Citrix Consulting on Assessments” that lists improper load management as the 2nd overall most common misconfigured item found by Citrix consulting and even gives an example baseline custom load evaluator. Despite all this, environments using the Default load evaluator are still prevalent and make up at least half the Citrix assessments I’m involved with. When words fail to make an impression, sometimes a visual can help:
The problem with the Default load evaluator is clear: it takes user distribution into account but not actual server resource consumption. Citrix load indexes are calculated on a 0-10,000 scale (you can see the value for each server with the “qfarm /load” command), with 10,000 being a “full” server. As you can see above, Server03 is the least busy from a Citrix perspective (since it has the fewest users logged on), despite being the busiest from a server perspective. Further, the Default load evaluator sets the maximum number of users per server at 100, while the environment above will not support more than 25-30. So from a load distribution and capacity perspective, the Default load evaluator is clearly ill-suited to any production environment.
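The difference is easy to see in a sketch. The thresholds and the 30-user ceiling below are hypothetical, and Citrix’s real evaluators combine rules differently; this only illustrates the 0-10,000 index arithmetic:

```python
# Contrast the Default load evaluator (user count only) with a
# resource-aware custom evaluator on Citrix's 0-10,000 load scale.

def default_load(users, max_users=100):
    # Default evaluator: "full" only at 100 users; resources ignored.
    return min(users / max_users, 1.0) * 10_000

def custom_load(cpu_pct, mem_pct, users, max_users=30):
    # Report the most constrained metric, so a CPU- or memory-bound
    # server reads as busy even with few sessions logged on.
    return max(cpu_pct / 100, mem_pct / 100, users / max_users) * 10_000

# A server like Server03: lightly loaded with users, heavily loaded
# on resources.
default_load(10)          # -> 1000.0: looks least busy to the farm
custom_load(90, 85, 10)   # -> 9000.0: correctly reads as nearly full
```

Under the Default rule new sessions keep landing on the resource-starved server; the custom rule steers them away long before it hits its user cap.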
A custom load evaluator that accounts for resource consumption takes less than five minutes to create and apply to the appropriate servers in your farm. As mentioned previously, the Citrix whitepaper I linked to above has a good baseline custom load evaluator that should get you started. So, take the time to make this simple farm optimization; your users will thank you!
After reading a bevy of excellent articles on multi-hypervisor datacenters, I thought I’d put pen to paper with my own thoughts on the subject. This article by Joe Onisick will serve as a primer for this discussion, not only because it was recently written, but because it does an excellent job of fairly laying out the arguments on both sides of the issue. The article mentions three justifications organizations often use for deploying multiple hypervisors in their datacenter: 1) cost, 2) leverage and 3) lock-in avoidance. I am in complete agreement that #2 and #3 are poor reasons to deploy multiple hypervisors; however, my disagreement on #1 is what I’d like to discuss in this post.
The discussion on the validity of multi-hypervisor environments has been going on for several years now. Steve Kaplan wrote an excellent article on this subject back in 2010 that mentions the ongoing debate at that time, and discussions on this subject pre-date even that post. The recent acquisition of DynamicOps by VMware has made this a popular topic again, and a slew of articles have been written covering the subject. Most of these articles seem to agree on a few points. First, whether or not it’s best for them, multi-hypervisor environments are on the rise across organizations and service providers. Second, cost is usually the deciding factor in deploying multiple hypervisors, but this is a poor reason because you’ll spend more money managing the environment and training your engineers than you saved on the cost of the alternative hypervisor. Third, deploying multiple hypervisors in this way doesn’t allow you to move to a truly “private cloud” infrastructure: you now have two hypervisors and need two DR plans, two different deployment methods and two different management models. Let’s take each of these arguments against cost in turn and see how they hold up.
OpEx outweighs CapEx
As alluded to above, there’s really no denying that an organization can save money buying alternative hypervisors that are cheaper than VMware ESXi. But do those cost savings outweigh potential increases in operational expenditures now that you’re managing two separate hypervisors? As the article by Onisick I linked to above suggests, this will vary from organization to organization. I’d like to suggest, however, that the increase in OpEx cited by many other sources as a reason to abandon multi-hypervisor deployments is often greatly exaggerated. Frequently cited is the increase in training costs: with two hypervisors, you now have to send your people to two different training classes. I don’t necessarily see that as the case. If you’ve been trained on and have a good grasp of the ESXi hypervisor, learning and administering the nuances and feature set of another hypervisor is really not that difficult, and formal training may not be necessary. Understanding the core mechanisms of what a hypervisor is and how it works will go a long way in allowing you to manage multiple hypervisors. And even if you did have to send your people to a one-time training class, is it really all that likely that the cost of the class will outweigh the ongoing hypervisor cost savings? If so, then you probably aren’t saving enough money to justify multiple hypervisors in the first place. Doing a quick search, I’ve found week-long XenServer training available for $5,000. Evaluate your people, do the math and figure out the cost savings in your scenario. Just don’t rule out multi-hypervisor environments on the assumption that training costs will necessarily be astronomical, or that formal training will be required for all of your employees.
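Doing the math can be as simple as a break-even calculation. The $5,000 class is the figure from my quick search above; every other number below is a placeholder you should replace with your own quotes:

```python
# Back-of-the-envelope break-even: one-time training cost vs. recurring
# license savings. Only the $5,000 class price comes from the text; the
# per-socket savings and counts are hypothetical placeholders.
training_cost = 5_000              # week-long class, per engineer
engineers_to_train = 2             # how many people actually need the class
savings_per_socket_per_year = 800  # hypothetical license + support delta
sockets_moved = 16                 # sockets moved to the alternate hypervisor

one_time = training_cost * engineers_to_train
annual = savings_per_socket_per_year * sockets_moved
breakeven_years = one_time / annual

print(f"One-time cost: ${one_time:,}; annual savings: ${annual:,}")
print(f"Break-even in about {breakeven_years:.1f} years")
```

With these placeholder numbers the training pays for itself in under a year; with a handful of sockets and a large team to train, it may never pay off. The point is simply that the answer depends on your numbers, not on a blanket rule.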
Similar to the OpEx discussion, another argument often presented against the cost-saving benefits of multi-hypervisor environments is that they are harder to administer, as you have to come up with separate management strategies for VMs residing on the different hypervisors. Managing things in two separate ways, it is argued, moves away from the type of private cloud infrastructure most organizations should strive for. The main problem with this argument is that it assumes you would manage all of your VMs the same way even if they were on the same hypervisor. This is clearly false. A couple of clear examples of this are XenApp and VDI. The way you manage these types of environments, deploy their VMs and plan their DR is often vastly different from the way you handle the rest of your server infrastructure. And so, if there is a significant cost savings, it is these types of environments that are often good targets for alternate hypervisors. They are good candidates not only because they are managed differently regardless of hypervisor, but because they often don’t require many of the advanced features only ESXi provides.
I’m in complete agreement that having test/dev and production on separate hypervisors is a bad idea. Testing things on a different platform than they run in production is never good. But if you can save significant amounts of money by moving some of these systems that are managed in ways unique to your environment onto an alternate hypervisor, I’m all for it. This may not be the best solution for every organization (or even most), but like all things, should be evaluated carefully before ruling it out or adopting it.