
General vSAN Error


vSAN is a wonderful shared storage option in a vSphere cluster, but managing it takes an administrator with deep product knowledge and an awareness of its quirks and gotchas. I’ve worked with several multi-node vSAN clusters for a few years now, yet it still surprises me sometimes. I recently spent a couple of hours troubleshooting a “General vSAN Error” to figure out why I couldn’t put a host in Maintenance Mode. In the end I found out that the behaviour was by design. I decided to describe my experience to help others resolve their vSAN issues.

Usually, when I want to check a scenario as quickly as possible, I use one of the VMware Hands-on Labs environments and reconfigure it as needed. This time I used “HOL-2008-01-HCI – vSAN – Getting Started”, which is based on vSAN 6.7. I know it’s not a current vSAN version, but it is mature enough for testing. I wanted to check how a three-node cluster would behave if I put one of the nodes in Maintenance Mode with “Full data migration” as the data evacuation option. The VM running in the cluster used the “vSAN Default Storage Policy”. The task failed shortly after it started, with the error message “General vSAN error”. I immediately checked whether there was enough free space on the disks of the remaining nodes, and there was. The “CORE-A” VM was consuming just 492.1 MB of an almost 60 GB vSAN datastore, so even with one host in Maintenance Mode the remaining two nodes would have plenty of space. To confirm this, I opened an SSH session to the vCenter Server Appliance (vCSA) and ran these commands:

rvc administrator@corp.local@vcsa-01a.corp.local
vsan.whatif_host_failures -s 1/RegionA01/computers/RegionA01-COMP01/

It showed me what percentage of storage space was used per node and how these numbers would change after a simulated failure of one node. Nothing looked suspicious.

Next, I checked the “Task Console” in vSphere Client for clues. The description attached to the error message confused me: “Evacuation precheck failed – Retry operation after adding 1 nodes with each node having 1 GB worth of capacity.” I dismissed it without much thought and dived into the VMware Knowledge Base to look for clues there.
I quickly found this article: “out of resources” error when entering maintenance mode on vSAN hosts with large vSAN objects (2149615).
It drew my attention to vSAN’s clomd service, so I decided to check /var/log/clomd.log. I opened an SSH session to an ESXi host and found, in the last four consecutive lines, that a decommission operation had started and then changed its state.


I also wanted to find out whether there were any known problems with decommissioning nodes from vSAN clusters. I quickly found another article, “vSAN Host Maintenance Mode is in sync with vSAN Node Decommission State (51464)”, and used the command it recommends to check whether the vSAN database showed any problems with node decommissioning:

cmmds-tool find -t NODE_DECOM_STATE -f json | grep 'uuid\|decomState'

The results showed that every decomState value was zero, which indicated that no background decommission operation had frozen.

Then I looked for traces in VMware’s community resources and easily found that my issue was well known and had some solutions.
In the post titled “A general system error occurred: Operation failed due to a VSAN error. Another host in the cluster is already entering maintenance mode” I learned that I should try to cancel any in-progress Maintenance Mode operation with this command:

localcli vsan maintenancemode cancel

In order to put a host into Maintenance Mode I should use this command:

localcli system maintenanceMode set -e true -m noAction

I found it useful, but putting a host into Maintenance Mode without data evacuation wasn’t what I was looking for.

Finally, in desperation, I decided to search the product documentation for answers, and my life got easier from the first hit. In the vSAN documentation, in the article titled “Place a Member of vSAN Cluster in Maintenance Mode”, I found this definition of the available data evacuation options:

Ensure accessibility – “This is the default option. When you power off or remove the host from the cluster, vSAN ensures that all accessible virtual machines on this host remain accessible. Select this option if you want to take the host out of the cluster temporarily, for example, to install upgrades, and plan to have the host back in the cluster. This option is not appropriate if you want to remove the host from the cluster permanently.
Typically, only partial data evacuation is required. However, the virtual machine might no longer be fully compliant to a VM storage policy during evacuation. That means, it might not have access to all its replicas. If a failure occurs while the host is in maintenance mode and the Primary level of failures to tolerate is set to 1, you might experience data loss in the cluster.”

And finally the most important note was this one:

“This is the only evacuation mode available if you are working with a three-host cluster or a vSAN cluster configured with three fault domains.”

You can read the rest of the definitions there, but this was the explanation I was looking for.

If you use a three-node vSAN cluster and want to put a host in Maintenance Mode for any service activity, you cannot keep the hosted VMs fully protected: with the default policy there is no spare host on which to rebuild the evacuated components. Full protection during maintenance requires at least four nodes in the cluster.
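The host-count arithmetic behind that note can be sketched in a few lines of Python. This is only an illustration, assuming a RAID-1 (mirroring) policy, where tolerating n failures needs 2n + 1 hosts (n + 1 replicas plus n witness components). The function names are mine and are not part of any VMware tool.

```python
def min_hosts_for_mirroring(ftt: int) -> int:
    """Hosts needed for a RAID-1 policy: ftt + 1 replicas plus ftt witnesses."""
    return 2 * ftt + 1


def can_fully_evacuate(cluster_hosts: int, ftt: int) -> bool:
    """Full data migration needs the remaining hosts to satisfy the policy
    on their own once one host has left the cluster."""
    return cluster_hosts - 1 >= min_hosts_for_mirroring(ftt)


if __name__ == "__main__":
    for hosts in (3, 4):
        print(f"{hosts} hosts, FTT=1, full evacuation possible:",
              can_fully_evacuate(hosts, ftt=1))
```

With three hosts and FTT=1 the check fails, which matches the documentation note; a fourth host makes it pass.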

Remember folks, the old rule “RTFM” still counts!

VSAN real capacity utilization


A few caveats make calculating and planning vSAN capacity tough, and it gets even harder when you try to map the plan to real consumption at the vSAN datastore level.

  1. vSAN disk objects are thin provisioned by default.
  2. Configuring a full reservation of storage space through the Object Space Reservation rule in a Storage Policy does not mean the disk object’s blocks will be inflated on the datastore. It only means the space will be reserved and shown as used in the vSAN Datastore Capacity pane. This makes it even harder to figure out why the size of “files” on the datastore does not match the other capacity information.
  3. To plan capacity you need to include the overhead of Storage Policies. Policies, plural, because I haven’t met an environment that uses only one for all kinds of workloads. Planning should therefore start with dividing workloads into groups that might require different levels of protection.
  4. Apart from disk objects there are other objects, especially SWAP, which are not displayed in the GUI and are easily forgotten. Depending on the size of the environment, they might consume a considerable amount of storage space.
  5. The VM SWAP object does not adhere to the Storage Policy assigned to the VM. What does that mean? Even if you configure your VM’s disks with PFTT=0, SWAP will always use PFTT=1, unless you configure the advanced option (SwapThickProvisionDisabled) to disable it.

I ran a test to check how much space an empty VM consumes (empty here means without even an operating system).

To see that, I created a VM called Prod-01 with 1 GB of memory, a 2 GB hard disk and the default storage policy assigned (PFTT=1).

According to the Edit Settings window, the VM disk size on the datastore is 4 GB (the maximum size based on disk size and policy). However, the used storage space is 8 MB, which means two replicas of 4 MB each. That is fine, as there is no OS installed at all.
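The two numbers from the Edit Settings window follow from the same rule, sketched here in Python. It assumes plain RAID-1 mirroring, where an object occupies one full copy per replica, and ignores witness and metadata overhead; the function name is made up for the illustration.

```python
def mirrored_footprint(size: float, ftt: int) -> float:
    """Raw space taken by a RAID-1 object: ftt + 1 full copies of the data."""
    return size * (ftt + 1)


if __name__ == "__main__":
    # 2 GB disk with PFTT=1: the maximum the object may grow to, in GB.
    print("maximum size (GB):", mirrored_footprint(2, ftt=1))
    # 4 MB actually written per replica: current consumption, in MB.
    print("used space (MB):", mirrored_footprint(4, ftt=1))
```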

Screenshot: VM powered off

However, when you browse the datastore files and look at the Virtual Disk object, you will notice that its size is 36,864 KB, which gives us 36 MB. So it’s neither 4 GB nor the 8 MB shown as consumption in Edit Settings.

Screenshot: vSAN files

Meanwhile, datastore provisioned space is listed as 5.07 GB.

Screenshot: VM with 2 GB disk, default policy and 1 GB RAM, powered off


So let’s power on that VM.

Now the disk sizes remain intact, but other files appear: SWAP, for instance, has been created, as well as log and other temporary files.

Screenshot: vSAN VM powered on


Datastore provisioned space now shows 5.9 GB, which again is confusing. Even if we forget the previous findings, powering on the VM triggers SWAP creation, and according to the theory SWAP should be protected with PFTT=1 and thick provisioned. If that were the case, provisioned storage consumption should have increased by 2 GB, not 0.83 GB (part of which is consumed by logs and other small files in the Home namespace object).
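To make the mismatch explicit, here is the same comparison as a small Python sketch. It assumes the swap object is sized as configured memory minus memory reservation and is mirrored with PFTT=1; the helper names are hypothetical and the input values are the ones quoted above.

```python
def expected_swap_gb(memory_gb: float, reservation_gb: float = 0.0, ftt: int = 1) -> float:
    """Thick swap footprint: (memory - reservation) stored ftt + 1 times."""
    return (memory_gb - reservation_gb) * (ftt + 1)


def observed_growth_gb(before_gb: float, after_gb: float) -> float:
    """Growth in provisioned space, rounded to the two decimals the UI shows."""
    return round(after_gb - before_gb, 2)


if __name__ == "__main__":
    expected = expected_swap_gb(1.0)            # VM with 1 GB RAM, no reservation
    observed = observed_growth_gb(5.07, 5.9)    # powered off vs powered on
    print("expected growth (GB):", expected)
    print("observed growth (GB):", observed)
    print("unexplained gap (GB):", round(expected - observed, 2))
```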


Screenshot: VM with 2 GB disk, default policy and 1 GB RAM, powered on

Moreover, during those observations I noticed that while the VM boots, provisioned space peaks at 7.11 GB for a very short time.

After a few seconds this value decreases to 5.07 GB. Even after a few reboots those values stay consistent.

Screenshot: VM with 2 GB disk, default policy and 1 GB RAM, during boot

The question is why this information is not consistent, and what happens during VM boot that causes the peak in provisioned space.

That’s the quest for now: to figure it out 🙂



Perennial reservations – weird behaviour when not configured correctly


When using RDM disks in your environment you might notice long (even extremely long) boot times of your ESXi hosts. That’s because, during startup, the storage mid-layer attempts to discover all devices presented to the host during the device claiming phase. MSCS LUNs that hold a permanent SCSI reservation lengthen this process, because the host cannot interrogate a LUN while an active MSCS node hosted on another ESXi host keeps a persistent SCSI reservation on it. To avoid this, ESXi uses a different technique to determine whether Raw Device Mapped (RDM) LUNs are used for MSCS cluster devices: a configuration flag that marks each device participating in an MSCS cluster as perennially reserved.

Configuring the device to be perennially reserved is local to each ESXi host, and must be performed on every ESXi host that has visibility to each device participating in an MSCS cluster. This improves the start time for all ESXi hosts that have visibility to the devices.

The process is described in this KB and requires issuing the following command on each ESXi host (naa.xxxxxxxxxxxx stands for the device identifier):

esxcli storage core device setconfig -d naa.xxxxxxxxxxxx --perennially-reserved=true

You can check the status using the following command:

esxcli storage core device list -d naa.xxxxxxxxxxxx

In the output of the esxcli command, search for the entry Is Perennially Reserved: true. This shows that the device is marked as perennially reserved.

However, I recently came across a problem with snapshot consolidation; even Storage vMotion was not possible for a particular VM.

While checking the VM settings I saw that one of the disks was locked and running on a delta disk, which means there is a snapshot. However, Snapshot Manager didn’t show any snapshot at all. Moreover, creating a new snapshot and then deleting all snapshots, which in most cases solves the consolidation problem, didn’t help either.


In vmkernel.log, lots of perennial reservation entries were present while I was trying to consolidate the VM. Initially I ignored them, because there were RDMs that were intentionally configured as perennially reserved to prevent long ESXi boot times.


However, after digging deeper and checking a few things, I returned to the perennial reservations and decided to check which LUN was generating these warnings and why it created these entries, especially during consolidation or Storage vMotion of a VM.

To my surprise, I realised that the datastore on which the VM’s disks reside was configured as perennially reserved! It was a mistake: when the PowerCLI script was prepared, someone accidentally configured all available LUNs as perennially reserved. Changing the value to false happily solved the problem.

The moral of the story is simple – logs are not issued to be ignored 🙂

vSphere 6.5 – Stronger security with NFS 4.1


NFS 4.1 has been supported since vSphere 6.0, but now we are looking into providing stronger security. In vSphere 6.5 we have better security thanks to strong cryptographic algorithms with Kerberos (AES). Also, IPv6 is supported but not with Kerberos, and that is another area we are looking into, along with supporting integrity checks.

As we know, the vSphere 6.0 NFS client does not support the more advanced encryption type known as AES. So let’s take a look at what is new in vSphere 6.5 NFS in terms of encryption standards:


  • NFS 4.1 has been supported since vSphere 6.0,
  • Stronger cryptographic algorithms (AES) are now supported with Kerberos authentication,
  • Kerberos integrity check (SEC_KRB5i) is introduced alongside Kerberos authentication in vSphere 6.5,
  • IPv6 support with Kerberos is added,
  • Host Profiles support for NFS 4.1 is added,
  • Better security for customer environments.


vSphere 6.5 – New scale limits for paths & LUNs


In vSphere 6.5 VMware doubled the previous limits and continues to work on reaching new scale in this area. The limits before 6.5 pose a challenge: for example, some customers have 8 paths to a LUN, and in that configuration one can have a maximum of 128 LUNs in a cluster. Also, many customers tend to use smaller LUNs to segregate important data for easy backup and restore. This approach can also exhaust the LUN and path limits.

Larger LUN limits enable larger cluster sizes and hence reduce management overhead.


  • The limit before 6.5 is 256 LUNs and 1024 paths,
  • This limits customer deployments requiring higher path counts,
  • Customers requiring small LUNs for important files/data need larger LUN limits to work with,
  • Larger path/LUN limits can enable larger cluster sizes, reducing the overhead of managing multiple clusters,
  • vSphere 6.5 supports 512 LUNs and 2K paths.
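The 128-LUN figure mentioned earlier comes from dividing the path limit by the number of paths per LUN. A minimal Python sketch of that arithmetic, using only the limits listed above (the function name is made up for the illustration):

```python
def max_usable_luns(path_limit: int, lun_limit: int, paths_per_lun: int) -> int:
    """LUNs a host can address: capped by the LUN limit or by running out of paths."""
    return min(lun_limit, path_limit // paths_per_lun)


if __name__ == "__main__":
    for label, paths, luns in (("before 6.5", 1024, 256), ("vSphere 6.5", 2048, 512)):
        for per_lun in (4, 8):
            print(f"{label}, {per_lun} paths per LUN:",
                  max_usable_luns(paths, luns, per_lun), "LUNs")
```

With 8 paths per LUN the path limit is hit first in both cases, so doubling the path limit is what doubles the usable LUN count.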


vSphere 6.5 – Automatic UNMAP


In vSphere 6.5 VMware is looking into automating the UNMAP process: VMFS would track deleted blocks and reclaim the deleted space from the backend array in the background. This background operation should ensure minimal storage I/O impact from UNMAP operations.


Just to remind you: UNMAP is a VAAI primitive with which we can reclaim dead or stranded space on a thinly provisioned VMFS volume. Currently it can be initiated by running a simple ESXCLI command, which frees up deleted blocks on the storage.


Let’s go through an UNMAP example illustrating our thought process:

  1. A VM is provisioned on a vSphere host and assigned a 6 TB VMDK.
  2. Thin provisioned VMDK storage space is allocated on the storage array.
  3. The user installs a POC data analytics application and creates a 400 GB database VM.
  4. Once the work with this database is done, the user deletes the DB VM, and VMFS initiates space reclamation in the background.
  5. 400 GB of space on the array side should be freed or claimed back.

One of the design goals will be to make sure UNMAP has minimal impact on storage I/O. We are also looking into using the new SESparse format as the snapshot file format to enable this.

Space reclamation is critical when customers use all-flash storage: because of the higher cost of flash, any storage usage optimization provides better ROI.


  • Automatic UNMAP does not require any manual intervention or scripts
  • Space reclamation happens in the background
  • CLI based UNMAP continues to be supported
  • Storage I/O impact due to automatic UNMAP is minimal

Supported in vSphere 6.5 with new VMFS 6 datastores

vSphere 6.5 – VMFS6 & 512e HDD support


vSphere 6.5 introduces a new VMFS 6. Why do we need a new version, you ask? The answer: to support a new HDD type, which points us to the current situation in the storage market. With a 512-byte sector size, HDD vendors are hitting drive capacity limits. They cannot go beyond a certain size without compromising resilience and reliability (not the best option when it comes to our data). To provide large-capacity drives, the storage industry is moving to Advanced Format (AF) drives, which use a larger physical sector size of 4096 bytes.


So how does it help? With the new AF (4K sector size) format, disk drive vendors can create more reliable, larger-capacity HDDs to support growing storage needs. These drives are more cost effective, as they provide a better $/GB ratio.

Two kinds of 4k drives:

  1. 512 Emulation (512e) mode – these drives have a 4K physical sector size but expose a logical sector size of 512 bytes. This mode is important because it continues to work with legacy operating systems and applications while providing large-capacity drives. The main disadvantage of these drives is that they trigger a read-modify-write (RMW) for storage I/O smaller than 4K. This RMW happens in drive firmware and may have some performance impact when a large number of storage I/Os are smaller than 4K.
  2. 4KN drives – these drives expose both logical and physical sector size as 4K. They cannot work with legacy operating systems and applications; the whole stack, from the VM guest OS through ESXi to storage, has to be 4KN-aware.
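Whether a given write forces a 512e drive into a read-modify-write cycle can be reasoned about with a simple alignment check. The Python sketch below is a simplified model, not drive firmware behaviour: it assumes a write avoids RMW only when it starts on a 4K boundary and covers whole 4K physical sectors. The names are illustrative.

```python
PHYSICAL_SECTOR = 4096  # bytes, physical sector of a 512e or 4KN drive
LOGICAL_SECTOR = 512    # bytes, logical sector exposed by a 512e drive


def needs_rmw(offset_bytes: int, length_bytes: int, physical: int = PHYSICAL_SECTOR) -> bool:
    """True when a write touches only part of some physical sector,
    so the drive has to read it, patch it and write it back."""
    return offset_bytes % physical != 0 or length_bytes % physical != 0


if __name__ == "__main__":
    writes = [(0, 4096), (0, LOGICAL_SECTOR), (LOGICAL_SECTOR, 4096), (8192, 8192)]
    for offset, length in writes:
        print(f"offset={offset:5d} length={length:5d} RMW needed: {needs_rmw(offset, length)}")
```

The third case shows why alignment matters as much as size: a 4K write that starts on a 512-byte boundary still straddles two physical sectors.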

Let’s now look at a few advantages of 4k drives.

  • 4K drives require less space for error correction codes than regular 512-byte sector drives. This results in greater data density on 4k drives, which provides a better TCO (total cost of ownership),
  • 4K drives have a larger ECC field for error correction codes and so inherently provide better data integrity,
  • 4k drives are expected to perform better than current 512n drives. However, this is only true when the guest OS has been configured to issue I/Os aligned to the 4K sector size.



  • VMFS-5 does not support 4k drives, even in emulation mode. If a 512e drive is formatted with VMFS-5 it is still recognized, but this configuration is not supported by VMware,
  • VMFS-6 is designed from the ground up to support AF drives in 512e mode,
  • VMFS-6 metadata is designed to align with the 4k sector size,
  • 512e drives can only be used with VMFS-6.

Increase VMware ESXi iSCSI storage performance? – let’s demystify all tips and tricks



Before we start, I would like to describe my main motivation for writing this article, which is quite simple: to gather in one place the basic theoretical background of the iSCSI protocol and the best practices for implementing it on the vSphere platform, with special attention to potential performance tuning tips & tricks. This is the first part of a series in which we (I’m counting on readers’ participation) will try to gather and verify all those “magical” parameters often treated as myths by many admins.

To begin, let’s start with something boring but, as usual, necessary 😉 … the theoretical background.

iSCSI is a network-based storage standard that enables connectivity between an iSCSI initiator (client) and a target (storage device) over a well-known IP network. Put very simply, SCSI commands are encapsulated in IP packets and sent over a traditional TCP/IP network, where targets and initiators de-encapsulate the TCP/IP datagrams to read the SCSI commands. There are a couple of options for implementing the standard, because the TCP/IP network components transporting SCSI commands can be realized in software and/or hardware.


Important iSCSI standard concepts and terminology:

  • Initiator – functions as an iSCSI client. An initiator typically serves the same purpose to a computer as a SCSI bus adapter would, except that, instead of physically cabling SCSI devices (like hard drives and tape changers), an iSCSI initiator sends SCSI commands over an IP network. Initiators can be divided into two broad types:
    • A software initiator implements iSCSI with a code component that uses an existing network card to emulate a SCSI device and communicate through the iSCSI protocol. Software initiators are available for most popular operating systems and are the simplest and most economical method of deploying iSCSI.
    • A hardware initiator is based on dedicated hardware, typically with special firmware running on it that implements iSCSI on top of a network adapter acting as an HBA card in the server. Hardware offload decreases the CPU overhead of iSCSI and TCP/IP processing, which is why it may improve the performance of servers that use the iSCSI protocol to communicate with storage devices.
  • Target – functions as a resource located on an iSCSI server, most often a dedicated network-connected storage device (well known as a storage array) that provides the target as an access gateway to its resources. It may also be a “general-purpose” computer or even a virtual machine, because, as with initiators, an iSCSI target can be realized in software.
  • Logical unit number – in iSCSI terms, LUN stands for logical unit and is identified by a unique number. A LUN represents an individual (logical) SCSI device that is accessible through a target. After the iSCSI connection is established (emulating a connection to a SCSI HDD), initiators treat iSCSI LUNs as they would a raw SCSI or IDE hard drive. In many deployments a LUN represents part of a large RAID (Redundant Array of Independent Disks) array; iSCSI leaves access to the underlying filesystem to the operating system that uses it.
  • Addressing – iSCSI uses TCP/IP ports (usually 860 and 3260) for the protocol itself, while special names are used to address the objects within it; these names refer to both iSCSI initiators and targets. iSCSI provides the following name formats:
    • iSCSI Qualified Name (IQN), which consists of:
      • the literal iqn (iSCSI qualified name)
      • the date that the naming authority took ownership of the domain
      • the reversed domain name of the authority
      • an optional “:” prefixing a storage target name specified by the naming authority.
    • Extended Unique Identifier (EUI). Format: eui.{EUI-64 bit address} (eui.xxxxxxxxx)
    • T11 Network Address Authority (NAA). Format: naa.{NAA 64 or 128 bit identifier} (naa.xxxxxxxxxxx)

Note: IQN format addresses occur most commonly.

  • iSNS – iSCSI initiators can locate appropriate storage resources using the Internet Storage Name Service (iSNS) protocol. iSNS provides iSCSI SANs with the same management model as dedicated Fibre Channel SANs. In practice, administrators can implement many deployment goals for iSCSI without using iSNS.
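The IQN structure described in the addressing item above can be illustrated with a small Python parser. This is a rough sketch, not a full validator of the naming rules: it assumes the date is written as yyyy-mm and that names are compared in lower case. The sample names are invented for the example.

```python
import re

# iqn.<yyyy-mm>.<reversed domain>[:<target name chosen by the naming authority>]
IQN_PATTERN = re.compile(
    r"^iqn\.(?P<date>\d{4}-\d{2})\.(?P<authority>[a-z0-9][a-z0-9.\-]*)(?::(?P<target>.+))?$"
)


def parse_iqn(name: str):
    """Split an IQN into its parts, or return None if it is not IQN-shaped."""
    match = IQN_PATTERN.match(name.lower())
    return match.groupdict() if match else None


if __name__ == "__main__":
    for sample in ("iqn.1998-01.com.vmware:esx-01-12345678",
                   "iqn.2001-04.com.example",
                   "eui.02004567a425678d"):
        print(sample, "->", parse_iqn(sample))
```

The last sample is an EUI-style name, so the parser returns None for it.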

The iSCSI protocol is the responsibility of the IETF; for more information please see RFC 3720, 3721, 3722, 3723, 3747, 3780, 3783, 4018, 4173, 4544, 4850, 4939, 5046, 5047, 5048, 7143.



And finally, for those who dared to read the whole boring theory part, the main dish: my list of performance “tips and tricks” to demystify on this blog series journey:

  1. iSCSI initiator (hardware or software) queue depth:

        //example for software iSCSI initiator

# esxcfg-module -s iscsivmk_LunQDepth=64 iscsi_vmk

  2. Adjusting Round Robin IOPS limit:

        //example for max iops and bytes parameter

# esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=10 --device=naa.xxxxxxxxxxxx

# esxcli storage nmp psp roundrobin deviceconfig set --type=bytes --bytes=8972 --device=naa.xxxxxxxxxxx

  3. NIC/HBA driver and firmware version on the ESXi hypervisor

  4. Using jumbo frames for iSCSI

  5. Controlling LUN queue depth throttling

        //example based on kb:

# esxcli storage core device set --device naa.xxxxxxxxxx --queue-full-threshold 8 --queue-full-sample-size 32

  6. Delayed ACK enable/disable

  7. Port binding considerations: use / do not use



In the next article I will try to gather all the best practices at the ESXi hypervisor configuration level and describe the test environment and test methodology.

So let’s end this pilot episode with an open question: is it worth using/implementing any of them in a vSphere environment?