It is difficult to decide how much hardware your datacenter needs in order to ensure application performance. On top of the complexity we discussed before, I would like to concentrate on a single consideration: peak utilization demand.
Problem: Need to ensure performance at peak demand
Many organizations have seasonal demand. It can be something as simple as nightly batch reports or end-of-week/month/year processing; perhaps you are in the pizza business and it's the Super Bowl. By nature, these are important applications for the organization, and you want to plan your capacity for that peak.
One way to address this is to burst to the cloud on an on-demand basis (rent hardware when the demand justifies it). However, often that is simply not an option. Hence, you must buy hardware that will satisfy the business year round.
Solution: Over-provision (…Right?)
Administrators typically address this challenge by taking the safer route: allocating more resources than a VM needs. There are many sophisticated ways to do this, and it is almost never as simple as looking at the memory peak and deciding accordingly. However, no matter how intelligent the calculation, it always comes down to summing up the peaks and buying hardware for those peaks.
This seems like a safe approach: if, for example, your VMs need a total of 140 GB of memory during peak demand, you can buy 10 servers with 16 GB of memory each. Now you have 160 GB of memory to cover a 140 GB peak. You are probably all set.
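The back-of-the-envelope sizing above can be sketched as a short calculation (the numbers mirror the example; the variable names are just illustrative):

```python
import math

# Naive capacity plan: sum the peak demand and buy enough hosts to cover it.
peak_gb = 140      # total memory demand of all VMs at peak (from the example)
host_mem_gb = 16   # memory per server

hosts_needed = math.ceil(peak_gb / host_mem_gb)  # ceil(140 / 16) -> 9
total_capacity_gb = 10 * host_mem_gb             # buy 10 servers: 160 GB

print(hosts_needed)        # 9 hosts would cover the raw sum
print(total_capacity_gb)   # 160 GB >= 140 GB peak, so the plan looks safe
```

Nine hosts would cover the raw sum, and ten gives headroom on top, which is exactly why the plan feels safe at the cluster level.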
Over-provisioning Isn’t Enough, Performance Issues Will Still Occur
Over-provisioning for the peaks doesn't prevent workloads from interfering with each other. The funny thing about peak demand is that it tends to happen at the same time for different workload groups. And although it is predictable to some extent, there are many unknowns: what happened at the end of last quarter isn't necessarily (sometimes isn't at all) an indication of workload demand at the end of the next one.
When you sum up the peaks, you do so for a cluster (or a data center). It would be unwise to do so for a single host, since VMs typically don't stay tied to one host. Now your high demand kicks in and four of your VMs consume 8 GB of memory each. As luck would have it, they happen to land on the same host. Now you have ballooning and swapping in your system.
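A toy check makes the failure mode concrete: the cluster-level total passes, yet one host is still overcommitted (the placement below is hypothetical, mirroring the four 8 GB VMs above):

```python
# Each host has 16 GB; cluster-wide peak demand (140 GB) fits in 160 GB total.
host_mem_gb = 16
hosts = 10

# Hypothetical per-host placement at peak: four 8 GB VMs land on host 0,
# and the remaining 108 GB of demand is spread over the other nine hosts.
demand_per_host = [32] + [12] * 9   # sums to 140 GB

# The cluster-level check passes...
assert sum(demand_per_host) <= hosts * host_mem_gb

# ...but the host-level check does not.
overcommitted = [h for h, d in enumerate(demand_per_host) if d > host_mem_gb]
print(overcommitted)   # [0] -- host 0 must balloon/swap despite safe totals
```

The aggregate numbers never change; only the placement does, and that is enough to break the plan.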
This is only memory, of course. To make things more complicated, other things will break in your stressed environment during high demand. Let's add two more decision points, for example:
- Two of these VMs are very chatty, and if you move one of them to a different host, you will suffer network latency.
- One of them has 16 vCPUs, and the other hosts in the environment are already suffering from CPU ready queues.
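A sketch of how these constraints compound: every name and threshold below is hypothetical (this is not a real placement API), but it shows how each check can veto the "obvious" memory fix.

```python
# Hypothetical checks for moving a VM off the congested host.
def can_move(vm, dest_host):
    """Return (ok, reason) -- any single constraint can veto the migration."""
    if vm["chatty_peer"] is not None and vm["chatty_peer"] not in dest_host["vms"]:
        return False, "would separate chatty VMs -> network latency"
    if vm["vcpus"] + dest_host["vcpus_used"] > dest_host["vcpus"]:
        return False, "destination already has CPU ready-queue pressure"
    if vm["mem_gb"] > dest_host["mem_gb"] - dest_host["mem_used_gb"]:
        return False, "not enough free memory on destination"
    return True, "ok"

# One of the chatty 8 GB VMs: moving it away from its peer is vetoed.
vm = {"chatty_peer": "db1", "vcpus": 4, "mem_gb": 8}
dest = {"vms": {"web1"}, "vcpus": 24, "vcpus_used": 20,
        "mem_gb": 16, "mem_used_gb": 10}
print(can_move(vm, dest))  # (False, "would separate chatty VMs -> network latency")
```

In a real cluster these constraints interact, so relieving memory pressure on one host can trade it for network or CPU pressure elsewhere.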
So not only do you still suffer performance problems due to memory congestion, solving them may not be as easy as trying to "balance" memory across the cluster (even though you bought more memory than the total peak demand of your entire environment!).
Solution: Plan Capacity in Conjunction with Placement and Sizing in Real Time
The only solution is to plan and control in real time, using the same mechanism. Imagine that in the example above there is a solution where everything aligns at that point in time: you avoid network issues, place chatty VMs near each other, and prevent memory ballooning to begin with. Then, ten minutes later, when a new group of VMs' demand kicks in, the system makes corrections to avoid new bottlenecks.
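One way to picture "control in real time" is a rebalancer that reacts to shifting demand as it happens. The toy model below is a minimal sketch under stated assumptions: memory only, a greedy move-to-least-loaded policy, and none of the network or CPU constraints discussed above.

```python
# Toy rebalancer: migrate VMs off overcommitted hosts as demand shifts.
def rebalance(hosts, capacity_gb):
    """Greedily move VMs (memory sizes in GB) until every host fits."""
    moves = []
    for src, vms in hosts.items():
        while sum(vms) > capacity_gb:
            vm = vms.pop()   # pick any VM on the overcommitted host to migrate
            # Least-loaded destination, excluding the source host itself.
            dest = min((h for h in hosts if h != src),
                       key=lambda h: sum(hosts[h]))
            hosts[dest].append(vm)
            moves.append((vm, src, dest))
    return moves

# New demand kicks in: the four 8 GB VMs again pile onto host 0.
hosts = {0: [8, 8, 8, 8], 1: [8], 2: [8], 3: [8]}
moves = rebalance(hosts, capacity_gb=16)
print(moves)                                        # [(8, 0, 1), (8, 0, 2)]
print(all(sum(v) <= 16 for v in hosts.values()))    # True
```

A real system would run this kind of correction continuously, weighing placement and sizing together rather than fixing one resource at a time.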
That is the only way to ensure performance with the hardware you planned for.