Today my good friend Duncan Epping surprised me by shooting me an IM notifying me that he was posting on the impact of partition misalignment and asked if I’d care to comment. Fortunately for me I had time to do just that and this post is an expanded version of what I shared in the comments section at YellowBricks.

I’d suggest that anyone interested begin at YellowBricks before proceeding.

Alignment of the data in a virtual infrastructure to the storage array is critical to performance, scaling, hardware life cycle, and storage efficiency. A lack of alignment results in the array retrieving more data than the VM requested. This inefficiency on the array means more storage hardware resources are required to serve a given workload.

Did You Know…

Misalignment can be found with VMFS datastores and inside of VMs?

There was an old issue where VMFS file systems created in Virtual Center 1.x did not align the VMFS partition to a 128KB offset. I wish I had the KB # at hand. A misaligned datastore does not get corrected by a VMFS upgrade. If you have these old datastores, the best course of action is to migrate the VMs and destroy the datastore.

Misalignment within Virtual Machines

To my knowledge the only storage vendors to publish content around the importance of alignment have been EMC & NetApp. In fact, EMC has even corrected their earlier guidance that alignment was not needed on NFS (kudos to Chad Sakac of EMC for getting this erroneous data corrected).

I would suggest that anyone interested in understanding more on this issue read the NetApp technical report TR-3747. The content in this document was reviewed and approved by VMware, Microsoft, Citrix, and NetApp.

As for the VI3 document that Duncan referenced in his post, I have some concerns. It only recommends aligning virtual disks other than the VM’s system drive. The reasoning given is that the system drive does not have a high I/O requirement. While I can agree on the merits of this point, I disagree with the recommendation. Aligning a system drive is hard to accomplish once the system is deployed, and I believe this may be closer to the actual reason for the recommendation.

Furthermore, if one does not align the system partition, the array still has to work harder. Imagine the impact of misaligned data when one reboots a number of VMs at once, say with SRM or View. Misalignment also has a negative impact on data deduplication, which shows up as a reduction in storage savings over time. The reason is that misalignment causes data that is identical inside each VM to be laid out differently on the array, and as such the dedupe savings are reduced. Consider, as an example, deploying a service pack to many VMs.

I believe it is fair to say the premise that misalignment impacts NetApp more than other arrays is oversimplified. Allow me to elaborate.

What is GOS Type?

I know Windows rather well, and as such I will speak to this OS family.

First, modern GOS types like Windows 7, Vista, and 2008 implement GPT versus MBR and as such have a 1MB starting partition offset (versus the traditional 32,256 bytes with MBR). The 1MB offset is aligned and optimized for every storage array vendor, protocol, and platform. I would like to thank Microsoft for listening to their storage partners’ input when they began engineering GPT.

If your VMs run Windows NT through 2003, they are most likely misaligned due to the default starting offset of 32,256 bytes found with MBR partitions. Also, if you upgrade a VM from one of these versions to Windows 7 or 2008, the starting partition offset will remain unchanged.
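The two starting offsets mentioned above can be checked with simple arithmetic. The following is a minimal sketch, assuming 512-byte sectors and a handful of illustrative array block sizes (4KB, 64KB, 128KB); the function and constant names are mine and are not taken from any NetApp or VMware tool.

```python
SECTOR = 512  # assumed sector size in bytes

# Legacy MBR default: the first partition starts at sector 63.
LEGACY_MBR_OFFSET = 63 * SECTOR      # 32,256 bytes
# Newer default starting offset discussed above: 1MB.
ONE_MB_OFFSET = 1024 * 1024          # 1,048,576 bytes


def is_aligned(offset_bytes: int, block_bytes: int) -> bool:
    """A partition is aligned when its start is a whole multiple of the block size."""
    return offset_bytes % block_bytes == 0


for name, offset in (("legacy MBR", LEGACY_MBR_OFFSET), ("1MB", ONE_MB_OFFSET)):
    for block_kb in (4, 64, 128):  # illustrative block sizes only
        state = "aligned" if is_aligned(offset, block_kb * 1024) else "misaligned"
        print(f"{name:10s} offset {offset:>9,d} bytes vs {block_kb:>3d}KB blocks: {state}")
```

The 32,256-byte offset is not a multiple of 4KB, so it cannot line up with any of these block sizes, while the 1MB offset is a multiple of all of them.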

See TR-3747 for more on this point. It is also covered in TR-3428 (VI3) & TR-3749 (vSphere).

 

So Why Bother to Align Existing VMs

It’s simple: consider GOS partition alignment a standard for clouds and virtual infrastructures. When you align, you ensure the best performance for VMs on any storage platform, over any storage protocol, whether it is an internal cloud or an external cloud provider. Isn’t one of the goals of server virtualization hardware independence?

I have discussed this issue in a post titled: I/O Efficiency & Alignment – the Cloud Demands Standards

What Array and Storage File System are You Storing VMs on?

A NetApp array stores data in 4KB blocks whether the data is served via SAN or NAS. So if the GOS partition is misaligned and a VM makes a 4KB read request, we will read two 4KB blocks (8KB), a 100% read overhead. Most data reads aren’t that small and as such don’t carry that much overhead. Say the VM makes a 1MB read request; we would then retrieve 1MB plus an additional 4KB block. In this case the overhead on the array is less than 1%. I would ‘guesstimate’ that most non-busy VMs make requests more in the range of 32KB to 128KB, and as such the overhead for misalignment (one extra 4KB block per request) would be roughly 3% to 12.5%.
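The 4KB arithmetic can be sketched by counting how many array blocks a request touches. This is an illustration only: it assumes each request starts exactly at the partition's starting offset and that the array fetches whole 4KB blocks, and the function names are hypothetical.

```python
def bytes_fetched(offset: int, length: int, block: int = 4096) -> int:
    """Bytes the array reads, in whole blocks, to satisfy one request."""
    first = offset // block                  # first block touched
    last = (offset + length - 1) // block    # last block touched
    return (last - first + 1) * block


def overhead(offset: int, length: int, block: int = 4096) -> float:
    """Extra bytes fetched as a fraction of the bytes requested."""
    return (bytes_fetched(offset, length, block) - length) / length


MISALIGNED = 32256        # legacy MBR starting offset
ALIGNED = 1024 * 1024     # 1MB starting offset

for kb in (4, 32, 128, 1024):
    length = kb * 1024
    print(f"{kb:>5d}KB read: misaligned {overhead(MISALIGNED, length):7.2%}, "
          f"aligned {overhead(ALIGNED, length):7.2%}")
```

Under these assumptions the aligned case never fetches an extra block, while the misaligned case always fetches exactly one more 4KB block than requested.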

Storage arrays from other vendors store data in other block, or chunk, sizes. Say your array stores data in a 64KB block (maybe EMC can confirm this is the size of the storage chunk used in a Symmetrix with LUNs). In this configuration, if the GOS partition is misaligned and a VM makes a 4KB read request, we will read a 64KB block. As I’ve stated before, most data reads aren’t that small, so let’s consider a 1MB read request. In this case the array would retrieve 1MB plus an additional 64KB block, an overhead on the array of around 6%. So if we consider my premise that many non-busy VMs make requests in the 32KB to 128KB range, the overhead with a misaligned 64KB block would be between 50% and 200%.
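The comparison between chunk sizes follows from the simple model used here: every misaligned request costs one extra chunk. A rough sketch of that back-of-the-envelope model, with a hypothetical function name and the request sizes from my guesstimate, looks like this:

```python
def extra_chunk_overhead(request_kb: float, chunk_kb: float) -> float:
    """Overhead when a misaligned request fetches one additional chunk."""
    return chunk_kb / request_kb


for chunk_kb in (4, 64):
    for request_kb in (32, 128, 1024):
        pct = extra_chunk_overhead(request_kb, chunk_kb)
        print(f"{chunk_kb:>2d}KB chunk, {request_kb:>4d}KB request: {pct:.1%} overhead")
```

This is a model, not a measurement; real arrays cache and prefetch, so treat the percentages as a way to compare chunk sizes rather than as predictions.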

While ESX/ESXi does aggregate its read requests, they are ‘decoupled’ when they hit the array and as such do actually experience these read inefficiencies.

Bottom Line

Please align the GOS partitions in your VMs. I’d suggest starting by correcting your templates and your P2V migration process. NetApp provides a free tool, MBRAlign, which can align the partitions for most GOS types. We also provide MBRScan, which can perform an audit of your currently deployed VMs, providing you with feedback as to the state of your current deployment. These tools are only supported on the service console on ESX hosts (but some may have found a way to run them on ESXi).
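To give a feel for what an alignment audit looks at, here is a minimal sketch that reads the MBR partition table from a raw disk image and reports each primary partition's starting offset. This is not MBRScan and does not reflect how that tool works. It assumes a raw image (for example a flat VMDK extent), 512-byte sectors, and a 4KB array block, and it ignores extended partitions and GPT disks (a GPT disk would only show its protective MBR entry). All names are hypothetical.

```python
import struct

SECTOR = 512  # assumed sector size in bytes


def partition_offsets(mbr: bytes) -> list[int]:
    """Return starting byte offsets of the non-empty primary partitions in an MBR."""
    if len(mbr) < SECTOR or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    offsets = []
    for i in range(4):                       # four 16-byte entries start at byte 446
        entry = mbr[446 + 16 * i: 446 + 16 * (i + 1)]
        part_type = entry[4]                 # partition type; 0 means unused entry
        start_lba = struct.unpack_from("<I", entry, 8)[0]  # little-endian start sector
        if part_type != 0:
            offsets.append(start_lba * SECTOR)
    return offsets


def report(path: str, block: int = 4096) -> None:
    """Print the alignment state of each primary partition in a raw disk image."""
    with open(path, "rb") as f:
        mbr = f.read(SECTOR)
    for offset in partition_offsets(mbr):
        state = "aligned" if offset % block == 0 else "MISALIGNED"
        print(f"{path}: partition starts at byte {offset:,d} ({state} to {block} bytes)")
```

Calling `report()` with the path to a raw image prints one line per primary partition. The script only reads the image, so it is safe to run against a copy or a snapshot.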

Alternatively, one can check out tools like vOptimizer Pro from Vizioncore, which provides a more robust interface with scheduling and reporting mechanisms.

Thanks to Duncan for raising awareness on this topic and to Chad for correcting the recommendations for Celerra NFS datastores.