Last time we looked at the first layer of virtualized storage to see which knobs we could turn without inflicting too much pain (or shedding tears while peeling the onion). Let’s see what runs underneath and what additional control points we could use to assure workload performance.
Storage management – more gain or more pain?
We will be looking at typical storage infrastructure available from vendors like NetApp, EMC, etc. It is normally composed of standard components: filers or storage controllers and disk aggregates. One storage controller can manage multiple aggregates, which host disks arranged into RAID groups that are then combined into storage pools. These storage pools can be partitioned into LUNs and offered as volumes to be mapped to the virtualized datastores we discussed earlier.
This is where the actual storage magic happens. RAID groups stripe individual disk spindles together, providing redundancy and/or additional capacity and performance. Storage controllers perform many maintenance and management functions such as disk scrubbing, compression and deduplication. So when a VM uses a virtualized datastore mapped to a NetApp volume and the aggregate hosting that volume is experiencing latency, there could be multiple reasons for that.
The most common one is that the RAID group hosting this volume just doesn’t have enough IOPS capacity to satisfy the combined demand of all VMs using it. A typical SATA drive can handle about 80 IOPS with 4K blocks. Even if you combine 10 such drives for an effective capacity of 800 IOPS, 100 VMs consuming 10 IOPS each add up to 1,000 IOPS, which is already above what this volume can handle. What makes it acceptable is that not every VM uses its 10 IOPS all the time; however, when they start peaking together the disk drives will struggle and the VMs will start experiencing latency.
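The arithmetic above can be written down as a small sketch. The function name and the `active_fraction` parameter are assumptions made for illustration, and the model deliberately ignores RAID write penalties and controller caching, so treat it as a back-of-the-envelope check rather than a sizing tool.

```python
def raid_group_headroom(drives, iops_per_drive, vms, iops_per_vm, active_fraction=1.0):
    """Return (capacity, demand, headroom) in IOPS for one RAID group.

    active_fraction is a hypothetical knob: the share of VMs that are
    busy at the same moment (1.0 means everyone peaks together).
    """
    capacity = drives * iops_per_drive
    demand = vms * iops_per_vm * active_fraction
    return capacity, demand, capacity - demand


# Figures from the text: 10 SATA drives at 80 IOPS each, 100 VMs at 10 IOPS each.
peak = raid_group_headroom(10, 80, 100, 10)

# Assumed for illustration only: 60% of the VMs are busy at the same time.
typical = raid_group_headroom(10, 80, 100, 10, active_fraction=0.6)

print("all VMs peaking:", peak)     # negative headroom: the drives cannot keep up
print("60% of VMs busy:", typical)  # positive headroom: latency stays acceptable
```

A negative headroom in the first case is the "peaking together" scenario; the second case shows why the same volume looks fine most of the time.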
So, what to do in this situation? First, if you know how the datastores are mapped to RAID groups, you could identify a datastore mapped to a less-loaded RAID group and move some VMs there. We already discussed the challenge of selecting the right VMs to move, but let’s assume we figured it out. Now we are moving 5 VMs to a datastore where the resulting IOPS consumption and latency will be better. However, the actual topology could be more complex. Several volumes could be mapped to a single overloaded RAID group, and while you could move individual VMs across datastores, it may take a noticeable amount of time: you are moving VMs one by one through the virtualization layer, which adds overhead.
If there is a less-loaded RAID group and you can measure the latencies that the entire volume experiences on the aggregate side, you could move this volume to another aggregate as a whole. The move is done by the storage controller for all VMs running there and will be much faster, but to determine whether this action is the better one you need to weigh the costs and benefits of both options.
For example, the IOPS consumption of the target aggregate could be visibly lower than that of the overloaded one while its latencies still look pretty high. It could be that the filer is performing some scrubbing or compression and slowing down access to the drives. In this case moving a volume within the same filer won’t accomplish much. If you are running NetApp Clustered Data ONTAP, you could move the volume across storage controllers if there are less-loaded ones. Or, if you can identify shared datastores mapped to an aggregate managed by a less-loaded controller, moving individual VMs could be a better option.
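One way to picture this reasoning is a small selection routine. Everything here is hypothetical: the `Aggregate` fields, the aggregate and controller names, the sample numbers and the 20 ms latency limit are assumptions for the sake of the example, not values from any vendor tool. The point is only that IOPS headroom alone is not enough; a target with low IOPS but high latency (say, because its filer is busy scrubbing) should be skipped.

```python
from dataclasses import dataclass


@dataclass
class Aggregate:
    name: str
    controller: str
    iops_used: float
    iops_capacity: float
    latency_ms: float


def pick_target(source, candidates, latency_limit_ms=20.0):
    """Pick the candidate with the most IOPS headroom among those whose
    observed latency is below the (assumed) limit. Returns None if no
    candidate qualifies."""
    healthy = [a for a in candidates
               if a.name != source.name and a.latency_ms < latency_limit_ms]
    if not healthy:
        return None
    return max(healthy, key=lambda a: a.iops_capacity - a.iops_used)


# Made-up sample data for illustration.
source = Aggregate("aggr1", "ctrl-a", iops_used=780, iops_capacity=800, latency_ms=45)
candidates = [
    # Plenty of IOPS headroom, but latency is high: the same filer may be busy.
    Aggregate("aggr2", "ctrl-a", iops_used=300, iops_capacity=800, latency_ms=38),
    # Less headroom, but a different controller with low latency.
    Aggregate("aggr3", "ctrl-b", iops_used=500, iops_capacity=800, latency_ms=8),
]

target = pick_target(source, candidates)
same_controller = target is not None and target.controller == source.controller
print(target.name if target else None, "same controller:", same_controller)
```

With this sample data the routine passes over the aggregate on the same filer and picks the one behind the other controller, which corresponds to either a cross-controller volume move or moving individual VMs to a datastore on that aggregate.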
Let’s assume that’s the case: we identified such a datastore (you know all these relationships and metrics to make this decision, right?) and found 5 VMs whose move should improve the performance of the entire workload. But these VMs happen to be pretty fat, and when you move them the storage controller will likely perform some snapshot partition management. Usually every volume has a special snapshot partition which keeps periodic snapshots of all VM images so you can always roll back. Changes to the VM images are accumulated and then added to the next snapshot. When this partition becomes full, it can overflow into the main storage area, and what looked like a pretty healthy volume will suddenly run out of space. Moving several fat VMs can also trigger recalculation of the deduplication working set, which can now consume more space and put additional load on the storage controller, causing extra latency.
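The snapshot overflow effect is easy to underestimate, so here is a rough sketch of the space accounting. The model is a simplification, and all names and numbers (volume size, reserve percentage, VM sizes, expected snapshot growth) are invented for illustration; deduplication effects are left out entirely.

```python
def free_after_move(volume_gb, snap_reserve_pct, data_used_gb,
                    snap_used_gb, incoming_gb, extra_snap_gb):
    """Estimate free space in the main data area after moving VMs in.

    Snapshot data that no longer fits in the reserve is assumed to spill
    into the main data area and is subtracted from the free space.
    """
    reserve_gb = volume_gb * snap_reserve_pct / 100
    data_area_gb = volume_gb - reserve_gb
    spill_gb = max(0.0, snap_used_gb + extra_snap_gb - reserve_gb)
    return data_area_gb - data_used_gb - incoming_gb - spill_gb


# Hypothetical target volume: 1000 GB with a 20% snapshot reserve,
# 500 GB of data and 150 GB of snapshots already in place.
# We move in 5 VMs of 40 GB each and assume 120 GB of new snapshot data.
naive = 1000 * 0.8 - 500 - 5 * 40          # ignores snapshot spill
actual = free_after_move(1000, 20, 500, 150, 5 * 40, 120)

print("free space ignoring snapshots:", naive)
print("free space with snapshot spill:", actual)
```

In this made-up case the naive estimate leaves a comfortable margin, while accounting for the spill leaves far less, which is the "healthy volume suddenly runs out of space" scenario in miniature.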
As we see, while we added more knobs we also added more variables and combinations of them to watch, and if we don’t take all of them into account, our actions can do more harm than good.
So the deeper we go into these layers, the more we’re exposed to the complexity of storage management. You may wish you had never gone there: it can be really overwhelming for manual operations. Are there solutions which can do this for you and reduce the risks, complexities and tears while peeling this onion?
Image source: The Mark Wahlberg-Dwayne Johnson Magnum Opus, Pain & Gain