Browsed by
Month: January 2017

Mystery of the broken VM

Today my colleague (a VMware administrator) asked me a small favour: to help perform an RCA (root cause analysis) for a production VM that had recently had a problem. For some reason the VM was migrated (my colleague insists it happened without administrative intervention) to another ESXi host that did not have the proper network configuration for this VM, which caused an outage of the whole system. I asked when the issue occurred and we dug into the logs to find out what exactly had happened.
We started with the VM log, vmware.log:

vmx| I120: VigorTransportProcessClientPayload: opID=52A52E10-000000CE-7b-15-97b0 seq=26853: Receiving PowerState.InitiatePowerOff request.
vmx| I120: Vix: [36833 vmxCommands.c:556]: VMAutomation_InitiatePowerOff. Trying hard powerOff
vmx| I120: VigorTransport_ServerSendResponse opID=52A52E10-000000CE-7b-15-97b0 seq=26853: Completed PowerState request.
vmx| I120: Stopping VCPU threads...
vmx| I120: VMX exit (0).
vmx| I120: AIOMGR-S : stat o=29 r=50 w=25 i=63 br=795381 bw=237700
vmx| I120: OBJLIB-LIB: ObjLib cleanup done.
-> vmx| W110: VMX has left the building: 0.

According to VMware KB 2097159, “VMX has left the building: 0” is an informational message caused by powering off a virtual machine or performing a vMotion of the virtual machine to a new host. It can be safely ignored, so we had our first clue related to the VM migration.
Next we moved to fdm.log (using the timestamps from vmware.log) and, surprise, surprise: for some reason the VM had been powered off one hour before the issue occurred:

vmx local-host: local power state=powered off; assuming user power off; global power state=powered off
verbose fdm[FF9DE790] [Originator@6876 sub=Invt opID=SWI-95f1a3f] [VmStateChange::SaveToInventory] vm /vmfs/volumes/vmx from __localhost__ changed inventory cleanPwrOff=1

Now we needed to find out why the VM had been powered off, so we went through the hostd logs using grep with the VM name:
info hostd[3DE40B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes.vmx] State Transition (VM_STATE_POWERING_OFF -> VM_STATE_OFF)
info hostd[3DE40B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/.vmx] State Transition (VM_STATE_POWERING_OFF -> VM_STATE_OFF)
info hostd[3E9C1B70] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/.vmx opID=6197d370-76-97ee user=vpxuser:VSPHERE.LOCAL\Administrator] State Transition (VM_STATE_OFF -> VM_STATE_CREATE_SNAPSHOT)
info hostd[3E9C1B70] [Originator@6876 sub=Libs opID=6197d370-76-97ee user=vpxuser:VSPHERE.LOCAL\Administrator] SNAPSHOT: SnapshotConfigInfoReadEx: Creating new snapshot dictionary, ‘/vmfs/volumes/.vmsd.usd’.
info hostd[3E9C1B70] [Originator@6876 sub=Libs opID=6197d370-76-97ee user=vpxuser:VSPHERE.LOCAL\Administrator] SNAPSHOT: SnapshotCombineDisks: Consolidating from ‘/vmfs/volumes/ -000001.vmdk’ to ‘/vmfs/volumes/.vmdk’.

-> info hostd[3F180B70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 485016 : Virtual machine disks consolidated successfully on in cluster in ha-datacenter. 

So we confirmed that shortly before the issue occurred the VM had been powered off for snapshot consolidation. At this point we assumed this might be related to the main issue, but we decided to check the vobd log as well (it is less verbose than vmkernel and gives a good view of ESXi health with respect to storage and networking):

uplink.transition.down] Uplink: vmnic0 is down. Affected portgroup: VM Network VLAN130. 2 uplinks up. Failed criteria: 128
Uplink: vmnic0 is down. Affected portgroup: VM Network VLAN8 ESOK2. 2 uplinks up. Failed criteria: 128
Lost uplink redundancy on virtual switch “vSwitch0”. Physical NIC vmnic0 is down. Affected port groups: “ISCSI”, “Management Network”

This was it, a direct hit on the cause of the main issue: a network adapter problem. With the new timestamps, just for confirmation, we went back to fdm.log:

fdm[FFC67B70] [Originator@6876 sub=Notifications opID=SWI-69d87e37] [Notification::AddListener] Adding listener of type Csi::Notifications::VmStateChange: Csi::Policies::GlobalPolicy (listeners = 4)
--> Protected vms (0):
--> Unprotect request vms (0):
--> Locked datastores (0):
--> Events (4):
--> EventEx= vm= host=host-24 tag=host-24:1225982388:0
--> vm= host=host-24 tag=host-24:1225982388:1
--> vm= host=host-24 tag=host-24:1225982388:2

With the help of VMware KB 2036555 (Determining which virtual machines are restarted during a vSphere HA failover) we confirmed that HA reacted to the network outage and powered on the VM on a different, unaffected ESXi host.
Conclusion: always correlate the log of the problematic VM with the ESXi logs (hostd, vobd, vmkernel) to get the full picture of the issue.
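
To make this kind of correlation less manual, lines from several logs can be merged into one timeline around the incident. The sketch below is only an illustration and was not part of the original analysis. It assumes that every relevant line starts with an ISO 8601 timestamp (the excerpts above have their timestamps stripped); the function name timeline and the sample lines are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumption: each log line of interest starts with an ISO 8601 timestamp,
# e.g. "2017-01-10T08:01:00.123Z ...". Lines without one are skipped.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")


def timeline(logs, around, minutes=60):
    """Merge lines from several logs into one list sorted by time.

    logs    -- dict mapping a log name to an iterable of text lines
    around  -- datetime of the incident
    minutes -- half-width of the time window to keep
    """
    window = timedelta(minutes=minutes)
    events = []
    for name, lines in logs.items():
        for line in lines:
            match = TIMESTAMP.match(line)
            if not match:
                continue  # continuation line or line without a timestamp
            stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
            if abs(stamp - around) <= window:
                events.append((stamp, name, line.strip()))
    return sorted(events)


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the call.
    sample = {
        "vmware.log": ["2017-01-10T08:01:00.000Z| vmx| I120: VMX exit (0)."],
        "vobd.log": ["2017-01-10T09:00:05.000Z: Uplink: vmnic0 is down."],
    }
    for stamp, name, line in timeline(sample, datetime(2017, 1, 10, 9, 0, 0)):
        print(stamp, name, line)
```

In practice the dict values would be open file objects for vmware.log, hostd.log, fdm.log and vobd.log copied from the host.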


VMware Data Protection 6.1.3 backup of VMS with multiple disks

Recently I found a strange issue with my customer’s VDP backups. The environment runs VDP 6.1.3 and vSphere 6.0 Update 2.

The problem is that the backup jobs of a few virtual machines got stuck at 92%. On the first attempt the job stayed in this state for more than a week, until it was manually cancelled.


There is a way to check the actual status of a backup via the CLI, so I did.

Unfortunately, the command mccli activity show --active displayed the following information:


I went through lots of community threads and KBs and extended the memory as one of them suggested, but it didn’t help.

Then I started to analyse the logs more carefully (of course I had verified earlier that the snapshots were consolidated, etc.) and realised that the progress shown in the CLI goes up to 20 GB by the time the vSphere client task reaches about 45%. Then it stops and nothing more happens, even though the task progress goes up to 92%. The curious thing is that the progress stopped at 20 GB, which is the size of one of the disks in the VM. However, the VM has 2 disks of 20 GB each, so there should be either a value of 40 GB or 2 entries, one for each disk. That was a clue.

Then I ran a test backup of a freshly installed VM with 2 disks. Even though the second disk was completely empty, the backup job got stuck in the same way. The next step was to remove the second disk from the VM and voila, the backup job finished successfully.

That means the workaround is to back up only single-disk virtual machines. However, I did not find any restriction or constraint regarding the number of disks in the official VDP documentation.

Furthermore, I ran an additional test in my test environment, where an older version of VDP (6.0.3) is present, and there wasn’t any problem with backups of VMs with multiple disks. It worked completely fine there. I reckon it’s a new bug in version 6.1.3. I’ll try to check it with VMware Support and let you know.


If you have had a similar issue, I’ll be glad to hear about it.

PowerCLI – useful tools

VMware PowerCLI is a powerful tool for the daily tasks of every admin. Most commonly it is used from the plain console. However, there are a few alternatives to the plain console that can make PowerCLI even handier. I’ll describe them briefly below.

  1. PowerShell ISE – a script editor which provides a better user experience. It’s divided into two panes: the upper pane is for viewing and editing script files, and the lower pane is for running individual commands and displaying their output (an analog of the standard PowerShell console). You can execute PowerCLI commands in the lower pane of PowerShell ISE or in the PowerShell console. It is also useful during your first steps with PowerCLI, when analysing ready-made scripts downloaded from the Internet. The most convenient way to do this is to open a script in the upper pane of PowerShell ISE. You can then select each individual command and execute it by pressing F8 or the “Run Selection” button. When the command finishes, you will see a “Completed” message at the bottom of the console.
  2. PowerGUI – another script editor, made by Quest Software, which was acquired by Dell. IMHO it’s better organised than PowerShell ISE, while the functions are rather similar. However, instead of tabs with your scripts, here you can see the whole folder tree with different kinds of scripts. It’s really helpful when you work with more than a few scripts.
Infinio Accelerator – how it works?

In my last post about Infinio Accelerator we introduced the product and its basics. Now it is time to go deeper: how does this server-side cache work?

Infinio’s cache inserts server RAM (and optionally, flash devices) transparently into the I/O stream. By dynamically populating server-side media with the hottest data, Infinio’s software reduces storage requirements to a small fraction of the workload size. Infinio is built on VMware’s vSphere APIs for I/O Filtering (VAIO) framework. This enables administrators to use VMware’s Storage Policy Based Management to apply Infinio’s storage acceleration filter to VMs, VMDKs, or groups of VMs transparently.


An Infinio cluster seamlessly supports typical cluster-wide VMware operations, such as vMotion, HA, and DRS. Introduction of Infinio doesn’t require any changes to the environment. Datastore configuration, snapshot and replication setup, backup scripts, and integration with VMware features like VAAI and vMotion all remain the same.



Infinio’s core engine is a content-based memory cache that scales out to accommodate expanding workloads and additional nodes. Deduplication enables the memory-first design, which can be complemented with flash devices for large working sets. In tiered configuration such as this, the cache is persistent, enabling fast warming after either planned or unplanned downtime.
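
The idea of a content-based, deduplicating cache can be illustrated with a toy example. The sketch below is not Infinio’s implementation; it is a minimal illustration under my own assumptions, in which blocks are keyed by a SHA-256 digest of their content, so identical blocks read by different VMs occupy a single cache slot. The class name ContentCache and the LRU eviction policy are hypothetical choices made for the example.

```python
import hashlib
from collections import OrderedDict


class ContentCache:
    """Toy content-addressed read cache with LRU eviction (illustration only)."""

    def __init__(self, max_blocks):
        self.max_blocks = max_blocks
        self.blocks = OrderedDict()  # digest -> block data, oldest entry first
        self.index = {}              # (disk, lba) -> digest of the block stored there

    def put(self, disk, lba, data):
        digest = hashlib.sha256(data).hexdigest()
        self.index[(disk, lba)] = digest
        self.blocks[digest] = data        # identical content reuses the same slot
        self.blocks.move_to_end(digest)   # mark as most recently used
        while len(self.blocks) > self.max_blocks:
            self.blocks.popitem(last=False)  # evict the least recently used block

    def get(self, disk, lba):
        digest = self.index.get((disk, lba))
        data = self.blocks.get(digest)
        if data is None:
            return None  # cache miss: never cached, or already evicted
        self.blocks.move_to_end(digest)
        return data


if __name__ == "__main__":
    cache = ContentCache(max_blocks=2)
    cache.put("vm1.vmdk", 0, b"A" * 4096)
    cache.put("vm2.vmdk", 7, b"A" * 4096)  # same content, no extra slot used
    print(len(cache.blocks))
```

Because the two VMs store the same block content, the cache holds one copy, which is the effect that lets a small amount of RAM cover a much larger working set.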


  Note: Infinio’s transparent server-side cache doesn’t require any changes to the environment!

Let’s move on to the installation. It is easy and entirely non-disruptive, with no reboots or downtime, and can be completed in just a few steps via an automated installation wizard. The wizard collects the vCenter credentials and location and the desired Management Console information, then automatically deploys the console:

  1. Run the Infinio setup and agree to the license terms.
  2. Add the vCenter FQDN and user credentials (in this example we use the SSO admin).
  3. Select the destination ESXi host and the other parameters (datastore and network) for deploying the management console VM from the OVF.
  4. Set the management console hostname and network information (IP address, DNS).
  5. Create an admin user for the management console.
  6. Set up auto-support (in our trial scenario we skip this step).
  7. Preview the configuration and deploy the management console.
  8. Log in to the management console.



In the next article we will provide some real performance results from our lab tests, so stay tuned 🙂


Mysterious Infinio – Product overview

Shared storage performance (IOPS, latency) is crucial for overall vSphere platform performance and user satisfaction. With the advent of SSD and memory cache solutions we have many options to choose from for storage acceleration (local SSD, array-side SSD, server-side SSD). Let’s discuss server-side caching further, that is, the act of caching data on the server.

Data can be cached anywhere and at any point on the server that makes sense. It is common to cache commonly used data from the DB to prevent hitting the DB every time the data is required. We cache the results from competition scores since the operation is expensive in terms of both processor and database usage. It is also common to cache pages or page fragments so that they don’t need to be generated for every visitor.

In this article I would like to introduce one of the commercial server-side caching solutions: Infinio Accelerator 3 from Infinio.


Infinio Accelerator increases IOPS and decreases latency by caching a copy of the hottest data on server-side resources such as RAM and flash devices. Native inline deduplication ensures that all local storage resources are used as efficiently as possible, reducing the cost of performance. Infinio is built on VMware’s VAIO (vSphere APIs for I/O Filters) framework, which is the fastest and most secure way to intercept I/O coming from a virtual machine. Its benefits can be realized on any storage that VMware supports; in addition, integrations with VMware features like DRS, SDRS, VAAI and vMotion all continue to function the same way once Infinio is installed. Finally, future storage innovations that VMware releases will be available immediately through I/O Filter integration.


The I/O Filter is the most direct path to storage for capabilities like caching and replication that need to intercept the data path. (Image courtesy of VMware)


Infinio is licensed per ESXi host in an Infinio cluster. Software may be purchased for perpetual or term use:

  • A perpetual license allows the use of the licensed software indefinitely with an annual cost for support and maintenance.
  • A term license allows the use of software for one year, including support and maintenance.

For more information on licensing and pricing, contact

System requirements

Infinio Accelerator requires at least VMware vSphere ESXi 6 U2 (Standard, Enterprise, or Enterprise Plus) and VMware vCenter 6 U2.

Note! vSphere 6.5 is supported and listed on the VMware HCL!

Infinio works with any VMware supported datastore, including a variety of SAN, NAS, and DAS hardware supporting VMFS, Virtual Volumes (VVOLs), and Virtual SAN (vSAN).

  • Infinio’s cluster size mirrors that of VMware vSphere’s, scaling out to 64 nodes.
  • Infinio’s Management Console VM requires 1 vCPU, 8GB RAM, and 80GB of HDD space.

I’m very happy to announce that we received a very friendly response from Infinio support and were given the option to download a trial version of the software. The next articles will describe the product in more depth and show “real life” examples of its use in our lab environment.

Please, stay tuned 🙂


VMware Auto Deploy Configuration in vSphere 6.5

The architecture of Auto Deploy has changed in vSphere 6.5. One of the main differences is that Image Builder is built into vCenter, so you can create image profiles through the GUI instead of PowerCLI. That is really good news for those who are not keen on PowerCLI. Let’s go through the new Auto Deploy configuration process. Below I have gathered all the steps necessary to configure Auto Deploy in your environment.

  1. Enable the Auto Deploy services on vCenter Server. Go to Administration -> System Configuration -> Related Objects, then find and start the following services:
  • Auto Deploy
  • ImageBuilder Service

You can change the startup type to start them with the vCenter server automatically as well.

Caution! If you do not see any services listed, the vmonapi and vmware-sca services are probably stopped.

To start them, log in to vCenter Server through SSH and use the following commands:

# service-control --status         // to verify the status of these services

# service-control --start vmonapi vmware-sca       // to start the services


Next, go back to the Web Client and refresh the page.


  2. Prepare the DHCP server and configure a DHCP scope, including the default gateway. A Dynamic Host Configuration Protocol (DHCP) scope is the consecutive range of possible IP addresses that the DHCP server can lease to clients on a subnet. Scopes typically define a single physical subnet on your network to which DHCP services are offered. Scopes are the primary way for the DHCP server to manage distribution and assignment of IP addresses and any related configuration parameters to DHCP clients on the network.

When basic DHCP scope settings are ready, you need to configure additional options:

  • Option 066 – with the Boot Server Host Name
  • Option 067 – with the Bootfile Name (the file name shown on the Auto Deploy Configuration tab in vCenter Server – kpxe.vmw-hardwired)


  3. Configure the TFTP server. For lab purposes I nearly always use the SolarWinds TFTP server, as it is very easy to manage. Copy the TFTP Boot Zip files, available on the Auto Deploy Configuration page mentioned in step 2, to the TFTP server’s file folder and start the TFTP service.


At this stage, when you boot a fresh server, it should get an IP address and connect to the TFTP server. In the Discovered Hosts tab of Auto Deploy Configuration you will see the hosts that received IP addresses and some information from the TFTP server, but have no Deploy Rule assigned to them.
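
If a host gets an IP address but never shows up in Discovered Hosts, it can help to check from a workstation whether the TFTP server actually serves the boot file. The following sketch is my own addition, not part of the original procedure. It sends a plain TFTP read request as defined in RFC 1350 and reports the first reply; the helper names and the server name in the usage comment are hypothetical, and the file name should be the one configured in option 067.

```python
import socket
import struct


def build_rrq(filename):
    """Build a TFTP read request (opcode 1) in octet mode, per RFC 1350."""
    return struct.pack("!H", 1) + filename.encode("ascii") + b"\0octet\0"


def describe_reply(packet):
    """Turn the first reply packet into a short human-readable summary."""
    opcode = struct.unpack("!H", packet[:2])[0]
    if opcode == 3:  # DATA: the server has the file and started sending it
        block = struct.unpack("!H", packet[2:4])[0]
        return "DATA block %d, %d bytes" % (block, len(packet) - 4)
    if opcode == 5:  # ERROR: e.g. file not found or access violation
        code = struct.unpack("!H", packet[2:4])[0]
        message = packet[4:].rstrip(b"\0").decode("ascii", "replace")
        return "ERROR %d: %s" % (code, message)
    return "unexpected opcode %d" % opcode


def probe(server, filename, timeout=3.0):
    """Request the file and describe the first reply (raises on timeout)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        packet, _ = sock.recvfrom(4 + 512)
        return describe_reply(packet)


# Hypothetical usage, with the server name and boot file name as placeholders:
# print(probe("tftp.lab.local", "<bootfile name from option 067>"))
```

The probe only reads the first block and does not acknowledge it, so the server will simply time out the transfer; that is enough to tell a missing file from a working setup.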


  4. Create an Image Profile.

Go to the Auto Deploy Configuration page -> Software Depots tab and click Import Software Depot.

Click Image Profiles to see the Image Profiles that are defined in this Software Depot.


The ESXi software depot contains the image profiles and software packages (VIBs) that are used to run ESXi. An image profile is a list of VIBs.


Image profiles define the set of VIBs to boot ESXi hosts with. VMware and VMware partners make image profiles and VIBs available in public depots. Use the Image Builder PowerCLI to  examine the depot and the Auto Deploy rule engine to specify which image profile to assign to which host. VMware customers can create a custom image profile based on the public image profiles and VIBs in the depot and apply that image profile to the host.


  5. Add a Software Depot.

Click the Add Software Depot icon and add a custom depot.


Next, in the newly created custom software depot, select Image Profiles and click New Image Profile.


I selected the minimum VIBs required to boot an ESXi host, which are:

  • esx-base 6.5.0-0.0.4073352 VMware ESXi is a thin hypervisor integrated into server hardware.
  • misc-drivers 6.5.0-0.0.4073352 This package contains miscellaneous vmklinux drivers
  • net-vmxnet3 VMware vmxnet3
  • scsi-mptspi LSI Logic Fusion MPT SPI driver
  • shim-vmklinux-9-2-2-0 6.5.0-0.0.4073352 Package for driver vmklinux_9_2_2_0
  • shim-vmklinux-9-2-3-0 6.5.0-0.0.4073352 Package for driver vmklinux_9_2_3_0
  • vmkplexer-vmkplexer 6.5.0-0.0.4073352 Package for driver vmkplexer
  • vsan 6.5.0-0.0.4073352 VSAN for ESXi.
  • vsanhealth 6.5.0-0.0.4073352 VSAN Health for ESXi.
  • ehci-ehci-hcd 1.0-3vmw.650.0.0.4073352 USB 2.0 ehci host driver
  • xhci-xhci 1.0-3vmw.650.0.0.4073352 USB 3.0 xhci host driver
  • usbcore-usb 1.0-3vmw.650.0.0.4073352 USB core driver
  • vmkusb 0.1-1vmw.650.0.0.4073352 USB Native Driver for VMware

But the list could be different for you.



  6. Create a Deploy Rule.

  7. Activate the Deploy Rule.

  8. That’s it. Now you can restart your host; it should boot and install according to your configuration.
VMware Auto Deploy considerations

According to VMware’s definition, vSphere Auto Deploy can provision hundreds of physical hosts with ESXi software. You can specify the image to deploy and the hosts to provision with the image. Optionally, you can specify host profiles to apply to the hosts, a vCenter Server location (datacenter, folder or cluster), and assign a script bundle for each host. In short, it is a tool to automate your ESXi deployments or upgrades.

As far as I know, it is not a widely used tool, particularly on the Polish market. However, it can help integrator companies deploy new environments much faster. Furthermore, VMware claims that scripted or automated deployment should be used for every deployment with 5 or more hosts. Nonetheless, even if you work as a System Engineer or in another implementation role, I believe you do not install new environments every week. If it happens every month, lucky you.

Well, is it really worth preparing an Auto Deploy environment to deploy, for instance, 8 new hosts? It depends.

IMHO, for such small deployments, if you are really keen on making them a little faster, the better way is to use kickstart scripts. It can be much faster, especially if you use them at least from time to time and have a good template prepared. (With vSphere 6.5 I am changing my mind a little, due to changes that make Auto Deploy preparation much quicker.)

However, Auto Deploy is not only about deployment. It can also be a kind of environment and change management, though only in a specific kind of infrastructure where you use Auto Deploy to boot ESXi hosts instead of booting from local hard drives or SD cards.

Nevertheless, in Poland it is easier to come across a classic PXE deployment booting from SAN than Auto Deploy. Is the same trend seen around the world?

I am looking forward to hearing about your experience with Auto Deploy.

VM Consolidation – Survival Guide

Survival guide for any VM snapshot consolidation problems, all in one place:

Note! Make sure any backup software is turned off or that all jobs are stopped. A reboot of the backup server is required to clear any potential residual locks.

  1. Restart the vCenter Server service.
  2. Restart the management agents on the ESXi hosts in the cluster where the problematic VMs are running, or manually determine “who” is holding the lock.

  3. Use the vmkfstools -D command against the VM snapshot files:

/vmfs/volumes/<datastore># vmkfstools -D <file name>
You see an output similar to:

[root@test-esx1 testvm]# vmkfstools -D test-000008-delta.vmdk
Lock [type 10c00001 offset 45842432 v 33232, hb offset 4116480
gen 2397, mode 2, owner 00000000-00000000-0000-000000000000 mtime 5436998] <-- MAC address of lock owner
RO Owner[0] HB offset 3293184 xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxxx <-- MAC address of read-only lock owner
Addr <4, 80, 160>, gen 33179, links 1, type reg, flags 0, uid 0, gid 0, mode 100600
len 738242560, nb 353 tbz 0, cow 0, zla 3, bs 2097152

//more information in kb:
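
Reading the owner field by eye gets tedious when many snapshot files are involved. The sketch below is my own illustration of the step above: it extracts the last 12 hex digits of the owner field from saved vmkfstools -D output and formats them as a MAC address. It assumes the output layout shown in the example, treats an all-zero owner as “no read-write lock owner”, and the function name lock_owner_mac and the non-zero sample value are hypothetical.

```python
import re

# Matches "owner <8 hex>-<8 hex>-<4 hex>-<12 hex>"; the last group is the MAC.
# It is case-sensitive on purpose, so the "RO Owner[0]" line is not matched.
OWNER = re.compile(
    r"owner\s+[0-9a-fA-F]{8}-[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-([0-9a-fA-F]{12})"
)


def lock_owner_mac(output):
    """Return the lock owner's MAC address as aa:bb:cc:dd:ee:ff, or None.

    None means the owner field is missing or all zeros (assumed here to
    mean that no host holds a read-write lock on the file).
    """
    match = OWNER.search(output)
    if not match:
        return None
    mac = match.group(1).lower()
    if set(mac) == {"0"}:
        return None
    return ":".join(mac[i:i + 2] for i in range(0, 12, 2))


if __name__ == "__main__":
    # Hypothetical owner value, only to show the formatting.
    sample = "gen 2397, mode 1, owner 4a3b2c1d-5e6f7a8b-9c0d-001a2b3c4d5e mtime 5436998]"
    print(lock_owner_mac(sample))
```

The resulting MAC address can then be compared with the management NICs of the ESXi hosts to find the host that holds the lock.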

If an ESXi host is holding the lock, you can restart the management agents as advised above, migrate all VMs and reboot the host, or determine which process is holding the lock. Just run one of these commands:

# lsof file

# lsof | grep -i file

For example:

# lsof | grep test02-flat.vmdk

You should see an output similar to:


71fd60b6- 3661 root 4r REG 0,9 10737418240 23533 Test02-flat.vmdk

Check the process with the PID returned above; in our example:

# ps -ef | grep 3661

To kill the process, run the command:

# kill <PID>

All in all, once the lock problems are solved, we can continue with the VM consolidation process:

  4. Connect directly to the ESXi host where the problematic VM resides.
  5. Power off the problematic VM.
  6. Disable CBT for the virtual machine (very often the ctk files are corrupt, for example when a backup job was run on a VM with an active snapshot, which is an unsupported configuration). For more information, see:

  7. Remove any files ending with -ctk.vmdk from the virtual machine directory.

  8. Enable CBT for the virtual machine again, see:
  9. Remove the VM from the inventory and add it again (just to verify VM configuration integrity; in case of any vmx problems you will get an error message and need to correct the VM configuration). More information in KB:
  10. Create a snapshot:

Right-click the virtual machine.

Click Snapshot.

Click Take Snapshot.

  11. Perform a Delete All operation:

Right-click the virtual machine.

Click Snapshot.

Click Snapshot Manager.

Click Delete All.

TIP: To verify that the snapshots are being merged, run the commands:

# watch "ls -lhut --time-style=full-iso *-delta.vmdk"

# watch "ls -lh --full-time *-delta.vmdk *-flat.vmdk"

//more info in kb:

  12. Power on the VM and verify the fix.


However, if the above does not solve the problem, we have two alternative options:

  a) clone or Storage vMotion the problematic VMs to a different datastore
  b) use VMware Converter and perform a V2V operation

That’s it, my survival guide for VM snapshot consolidation problems. I wonder if you have any additions or a different approach?

VCP Datacenter 6.5 Beta

VMware announced the VCP Datacenter beta exam, updated for the new features in vSphere 6.5.

It is especially valuable for those who need to recertify or who do not hold a valid VCP certification. For people who already hold an active VCP6-DCV certification, passing this beta exam will not bring any big reward, since even the title stays the same. There won’t be a VCP6.5-DCV or similar; it’s still the VCP6-DCV certification.

Anyway, like other VMware beta exams it costs only $50, so it’s worth considering if your certification is going to expire soon.

Just keep in mind that beta exams contain far more questions than normal exams 🙂 In this case that’s 150 questions and 180 minutes.

Good luck!

Adding a sound card to ESXi hosted VM

A sound card in a vSphere virtual machine is an unsupported configuration. This feature is intended for virtual machines created in VMware Workstation. However, you can still add an HD Audio device to a vSphere virtual machine by manually editing the .vmx file. I have tested it in our lab environment and it works just fine.

Below is the procedure:

  1. Check which datastore the VM without a sound card resides on.
  2. Log in as root via SSH to the ESXi host where the VM resides.
  3. Navigate to /vmfs/volumes/<VM LUN>/<VM folder>
     In my example it was:
     ~# cd /vmfs/volumes/Local_03esx-mgmt_b/V11_GSS_DO
  4. Shut down the VM.
  5. Edit the .vmx file using the vi editor.

Make a backup copy of the .vmx file first. If your edits break the virtual machine, you can roll back to the original version of the file.
For more information about editing files on an ESXi host, refer to the KB article:

  6. Once you have opened the .vmx file for editing, navigate to the bottom of the file and add the following lines:
    sound.present = "true"
    sound.allowGuestConnectionControl = "false"
    sound.virtualDev = "hdaudio"
    sound.fileName = "-1"
    sound.autodetect = "true"
  7. Save the file and power on the virtual machine.
  8. Once it has booted and you have enabled the Windows Audio service, sound will work fine.
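
The backup and edit steps can also be scripted. The sketch below is only an illustration of the procedure above, written under my own assumptions: it is meant to be run against a .vmx file of a powered-off VM, it keeps a copy with a .bak suffix next to the original, and it only appends settings that are not present yet (existing sound.* lines are left untouched). The function name add_sound_card is hypothetical.

```python
import shutil
from pathlib import Path

# The settings listed in the procedure above.
SOUND_SETTINGS = {
    "sound.present": "true",
    "sound.allowGuestConnectionControl": "false",
    "sound.virtualDev": "hdaudio",
    "sound.fileName": "-1",
    "sound.autodetect": "true",
}


def add_sound_card(vmx_path):
    """Back up the .vmx file, then append any missing sound.* settings.

    Returns the path of the backup copy.
    """
    path = Path(vmx_path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # backup first, so the edit can be rolled back

    lines = path.read_text().splitlines()
    # Keys already present in the file, compared case-insensitively.
    existing = {
        line.split("=", 1)[0].strip().lower() for line in lines if "=" in line
    }
    for key, value in SOUND_SETTINGS.items():
        if key.lower() not in existing:
            lines.append('%s = "%s"' % (key, value))
    path.write_text("\n".join(lines) + "\n")
    return backup
```

To roll back, copy the .bak file over the edited .vmx file while the VM is still powered off.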

If you go to “Edit Settings” of the VM, you will see information that the device is unsupported. Please be aware that after adding a sound card to your virtual machine you may experience unexpected behavior (tip: in our lab environment this configuration works without issues).