Archive for the ‘Troubleshooting Fridays’ Category

XenServer – Pool Master Recovery (The Missing Part 1 to XenServer Hosts in Halted Mode)

In July of 2012 I wrote a “part 2” regarding XenServer Hosts in halted mode. However, I seem to have misplaced part 1, which I’ve rewritten here after needing to reference these steps again recently.

There are several events which can cause a XenServer Pool to become corrupt. In a recent instance of mine, the pool master was unable to communicate with the HA storage repository (SR) and fenced itself. I also had another instance where several hosts shut down unexpectedly, and the pool master was among them. Here are the steps I performed to recover the Pool Master.

  1. Elect the server you want to become the new master, and on that box run “xe pool-emergency-transition-to-master”
  2. Once that is completed, on the newly elected/transitioned master, run “xe pool-recover-slaves”
  3. Once that is complete, you should be able to run “xe host-list” and see all of your hosts listed
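
The three steps above can be collected into a small script. This is only a sketch: the run helper, the recover_pool function and the DRY_RUN variable are illustrative names of my own, not part of XenServer, and by default the script prints each command instead of executing it.

```shell
#!/bin/sh
# Sketch: promote this host to pool master and recover the other hosts.
# DRY_RUN, run and recover_pool are illustrative helpers, not XenServer features.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        # Default: only show what would be executed.
        echo "would run: $*"
    else
        "$@"
    fi
}

recover_pool() {
    # Step 1: promote this host to pool master.
    run xe pool-emergency-transition-to-master || return 1
    # Step 2: point the remaining hosts at the new master.
    run xe pool-recover-slaves || return 1
    # Step 3: confirm that every host is listed again.
    run xe host-list
}

recover_pool
```

Run it on the host you have chosen as the new master. Setting DRY_RUN=0 makes it execute the commands for real, stopping at the first one that fails.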


Based in part on information from: XenServer System Recovery Guide


Hung VM, unable to force reboot/shutdown

I have been working with a few vendor-provided VMs which run Linux. For some reason this specific set of Linux VMs does not respond properly to reboot or shutdown commands when the VMs are hung. This is even true of force-shutdown. The following process works great for virtual servers that are non-responsive in a XenServer environment, after normal reboot/shutdown attempts have failed.

  1. “xe vm-list name-label={vm logical name}” to get the uuid of the VM that is hung
  2. “list_domains” to list the domain UUIDs, so you can determine the domain number of the VM by matching the uuid from the previous command against this output.
  3. “/opt/xensource/debug/destroy_domain -domid XX” where XX is the domain number from the previous command
  4. “xe vm-reboot name-label={vm logical name} --force”
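
Matching the UUIDs by eye in step 2 is error-prone, so a small filter can do it instead. The sketch below assumes that list_domains prints pipe-separated columns in the order id | uuid | state; check that against the output on your own host before relying on it. The function name domid_for_uuid is hypothetical.

```shell
#!/bin/sh
# Sketch: print the domain ID whose uuid column matches the first argument.
# Assumes list_domains-style input on stdin: "id | uuid | state".
domid_for_uuid() {
    awk -F'|' -v want="$1" '
        {
            id = $1; uuid = $2
            gsub(/[ \t]/, "", id)      # strip column padding
            gsub(/[ \t]/, "", uuid)
            if (uuid == want) { print id; exit }
        }
    '
}

# Usage on a XenServer host (vm name is a placeholder):
#   uuid=$(xe vm-list name-label=myvm params=uuid --minimal)
#   list_domains | domid_for_uuid "$uuid"
```

The function prints nothing when the uuid is not found, which usually means the VM has no running domain on that host.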



Based in part on information from:


Disabled Mailbox is not showing in Disconnected Mailbox Area

May 21, 2013

In Exchange, when you delete an Active Directory user account, the user’s mailbox is not deleted automatically. Instead, Exchange considers the mailbox to be in a “disconnected” state: the mailbox exists, but it is no longer associated with an Active Directory user account. There are several reasons why you might want to keep the mailbox around and perhaps eventually reconnect it. Today I was working on a very corrupt user account in AD, but the mailbox itself was fine. I simply deleted the user account from AD (after ensuring proper backups were taken), and then created a new user account. Even though the username is the same as the one I just deleted, the two accounts have different GUIDs, so they are, in fact, different users. After creating the AD user account, I went over to the Exchange Management Console and the user’s mailbox was missing from both the Mailbox list and the Disconnected list. The reason is that mailboxes are only moved to the Disconnected list during a mailbox maintenance process. However, you can speed this up.


Launch the Exchange PowerShell and run the following


After that is complete, go back to the Disconnected Mailbox list and refresh the page, and you will find your mailbox.



How to Remove a XenServer Slave when it No Longer Exists in the Pool

Citrix article CTX126382 describes how to remove a XenServer Slave from a pool; however, the process does not completely clean up after itself. While the host will be removed, its storage repositories, such as DVD and local storage, will be left behind.

To clean these up perform the following:

1) Click on the disconnected storage repository on the console

2) On the general tab, right-click on the UUID and select copy

3) On the Pool master console, type: xe sr-forget uuid= (and then right-click paste which will insert the UUID of the disconnected storage repository)

Repeat this process for all disconnected storage repositories, which are typically local storage, DVD, and removable storage.
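
If there are many leftover repositories, the copy-and-paste loop can be scripted. The following is a minimal sketch, assuming you have collected the UUIDs of the disconnected storage repositories into a text file, one per line. The function name forget_commands and the file name are hypothetical. It only prints the xe sr-forget commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch: turn a list of SR UUIDs (one per line on stdin) into sr-forget commands.
# Nothing is executed here; the commands are printed for review.
forget_commands() {
    while read -r uuid; do
        # Skip blank lines in the input.
        [ -n "$uuid" ] || continue
        echo "xe sr-forget uuid=$uuid"
    done
}

# Usage on the pool master (file name is a placeholder):
#   forget_commands < disconnected-srs.txt          # review the output
#   forget_commands < disconnected-srs.txt | sh     # then run it
```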




Xenserver hosts in halted mode (part 2)

Jul 29, 2012

I recently encountered a problem where one server in a pool had shut down unexpectedly in a way which caused the VMs running on that host to fail. We restarted the host and found that about half of the VMs returned to the pool and could be started on another pool member; however, a handful of VMs were unable to start. Using information I have previously posted, I checked the power-state for these VMs and they were in a halted state. However, they did not appear in the output of the list_domains command. Further attempts at recovery failed.

At that point we took a closer look at the system and, by running the command df from the console, discovered that the dom0 drive had zero free disk space. I connected using WinSCP, browsed to the log directory, and deleted a majority of the old and large log files, which freed up over 59% of the disk space. Another reboot later and the disk space issue was resolved.
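
The same check can be done from the dom0 shell without WinSCP. This is a sketch under assumptions: the post does not name the log directory, so LOG_DIR defaults to /var/log here as a guess, and the function names used_percent and largest_logs are my own.

```shell
#!/bin/sh
# Sketch: report how full a filesystem is and list the biggest log files.
# LOG_DIR is an assumption; the original post does not name the directory.
LOG_DIR="${LOG_DIR:-/var/log}"

used_percent() {
    # Reads `df -P` output on stdin and prints the Capacity column
    # of the first filesystem, without the % sign.
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

largest_logs() {
    # Ten largest files under LOG_DIR, sizes in KB, biggest first.
    find "$LOG_DIR" -type f -exec du -k {} + 2>/dev/null | sort -rn | head -n 10
}

# Usage:
#   df -P / | used_percent    # how full is the root filesystem?
#   largest_logs              # candidates for cleanup
```

Review the list before deleting anything, since some of those files may still be open by running services.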

However, in this case there was a second issue: the host that was in this state was hosting the Citrix license server, and this specific host was unable to contact the license server, so it couldn’t start VMs. But since this VM was halted instead of stopped, I couldn’t start it on a different host yet. Going into the license manager in XenCenter, I removed licensing on the host, which placed it into a 28-day grace period. Once this was completed I could restart the halted VMs, and then subsequently repoint the host back to the license server to retain the Enterprise license feature set.

XenServer: Changing management adapter in pool

After going through several rounds of problems moving a management adapter for a XenServer pool, I have found the following working process. However, it is because of this process that Citrix makes very clear that you should configure it properly in the first place, and if you need to make changes post-installation, to make them BEFORE you join the host to a pool. Also, you must change the subnet when changing interfaces, even if that means moving it to a temporary, nonexistent IP address space and then moving it back to the correct IP address space once you are on the correct network interface.

However, let’s say you have a pool in production and you need to make the change…

  1. Perform a metadata backup and back up your virtual machines before performing the rest of this procedure.
  2. Disable High Availability from XenCenter, if enabled.
  3. Disable external authentication (Active Directory)
  4. Log on to a pool member from the physical console and change the management interface IP address, using either step 5 or step 6.
  5. From the xsconsole, go to Network and Management Interface > Configure Management Interface.
    1. Note: xsconsole freezes when the change is applied. You can use the key sequence CTRL+Z to gain access to the command prompt to run step 8 below. Then, use the command fg %1 to return to xsconsole and exit cleanly.
  6. Alternatively, from the CLI, use the following command: xe pif-reconfigure-ip uuid= IP= gateway= netmask= DNS= mode=
  7. To locate the correct PIF uuid for the pif-reconfigure-ip command, use the following command: xe pif-list params=uuid,host-name-label,device,management
  8. From the CLI, run the following command: xe-toolstack-restart
  9. The server enters emergency mode. Verify that the server is using the new IP address: ping it from another host, try a Secure Shell connection to it, or use the ifconfig command. Verify that the server is in emergency mode by running xe host-is-in-emergency-mode from the CLI. You should get True as the output.
  10. Repeat steps 4 through 8 on each of the pool members.
  11. Change the management interface IP address on the pool master using steps 4 through 7 above.
  12. Run the following command on the pool master: xe-toolstack-restart
  13. From the CLI, on each of the pool members, run xe pool-emergency-reset-master master-address=IP_OF_THE_MASTER.
  14. Verify the correct status of the pool. Connect with XenCenter to the new master’s IP address and check everything from there.
  15. Re-enable High Availability and external authentication, if required
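
To make the CLI route in the procedure above more concrete, here is a sketch that prints the per-host commands with the blanks filled in. Every value passed in is a placeholder you supply, the function names are hypothetical, and mode=static is my assumption for a fixed management address. Nothing is executed.

```shell
#!/bin/sh
# Sketch: print the CLI commands for one host with the parameters filled in.
# Arguments: PIF uuid, new IP, netmask, gateway, DNS server (all placeholders).
reconfigure_commands() {
    echo "xe pif-reconfigure-ip uuid=$1 mode=static IP=$2 netmask=$3 gateway=$4 DNS=$5"
    echo "xe-toolstack-restart"
    echo "xe host-is-in-emergency-mode"
}

# Argument: the new IP address of the pool master.
rejoin_command() {
    echo "xe pool-emergency-reset-master master-address=$1"
}

# Usage (values are examples only):
#   xe pif-list params=uuid,host-name-label,device,management   # find the PIF uuid
#   reconfigure_commands <pif-uuid> 10.0.50.11 255.255.255.0 10.0.50.1 10.0.50.2
#   rejoin_command 10.0.50.10
```

Printing the commands first gives you a chance to check the addresses, since a typo in the management IP can leave the host reachable only from the physical console.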

If during this process, any of your pool-slave hosts reboot and show missing management interface, and no network cards, please see our post over at:

You can also view a video walk through of this process at:

Adapted from CTX123477

XenServer: Hung VM

I’ve experienced several instances where a VM appears to hang and is non-responsive, not only at the console level, but also to the XenServer Hypervisor and XenCenter. Attempts to force a shutdown or reboot using xe vm-reboot or xe vm-shutdown fail with the error “Another operation involving the object is currently in progress class: VM”.

This has worked consistently to recover this VM.

1 – “xe vm-list” to get the uuid of the VM that is hung
2 – “list_domains” to list the domain UUIDs, so you can determine the domain number of the VM by matching the uuid from the previous command against this output.
3 – “/opt/xensource/debug/destroy_domain -domid XX” where XX is the domain number from the previous command
4 – “xe vm-reboot uuid=XXXX --force” where XXXX is the uuid from the first vm-list command for your VM.