Yuri Rabover

Peaking Application Demand and the Impact on Performance

Workload – the four seasons

Some of the examples we reviewed earlier illustrated how workload demand fluctuations can cause peaks in utilization in different parts of the infrastructure. It is becoming clear that without being aware of the workload demand, it is extremely difficult if not impossible to guarantee performance and efficiency.

But let’s look at the nature of peaking loads in the application itself. We already analyzed some examples with video and VDI applications and concluded that peaks are usually triggered by a spike in end-user demand or by infrastructure-related maintenance tasks. Now let’s examine end-user demand in more detail and dive into longer-term seasonal peaks as opposed to short-term spikes.

A spike during the Christmas shopping season is one common example. For large retailers this is one of the busiest times of the year. It greatly surpasses all other holidays, like Labor Day, Mother’s Day, etc. This is a time of increased shopping activity, when online shopping applications experience increased web traffic and a related load on the infrastructure supply.


A traditional way to guarantee that peaks are accommodated is to provide enough capacity during peak times. This can be very expensive, so managers try to rearrange resources in anticipation and either shut down unnecessary services and projects or provide extra capacity elsewhere.

First, one must determine what to shut down. One way is to look at the non-essential projects like some development and maintenance activities. But which ones to choose? The best would be those that consume the resources needed by mission-critical apps. So you need to know the pieces of the workload sharing compute, storage and network and their average and peak consumption. If they don’t peak much you could use their average consumption and see what will happen if you remove them.
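As a rough illustration of the average-versus-peak check described above, here is a minimal Python sketch. The workload names, the sample values, and the 1.5 peak-to-average threshold are all assumptions made up for this example; they do not come from any particular monitoring tool.

```python
from statistics import mean

# Hypothetical memory samples in GB, for illustration only.
samples = {
    "dev-build": [4.0, 4.5, 4.0, 4.5],
    "nightly-tests": [1.0, 9.0, 1.5, 8.5],
}

def summarize(samples_by_workload):
    """Return {workload: (average, peak, peak_to_average_ratio)}."""
    summary = {}
    for name, values in samples_by_workload.items():
        avg = mean(values)
        peak = max(values)
        ratio = peak / avg if avg > 0 else float("inf")
        summary[name] = (avg, peak, ratio)
    return summary

def estimate_freed(summary, candidates, max_ratio=1.5):
    """Estimate capacity freed by removing the candidate workloads.

    Workloads that do not peak much (ratio <= max_ratio, an assumed
    threshold) are counted at their average consumption. Bursty ones
    are returned separately because an average would be misleading.
    """
    freed = 0.0
    needs_review = []
    for name in candidates:
        avg, _peak, ratio = summary[name]
        if ratio <= max_ratio:
            freed += avg
        else:
            needs_review.append(name)
    return freed, needs_review

if __name__ == "__main__":
    summary = summarize(samples)
    freed, needs_review = estimate_freed(summary, list(samples))
    print("Estimated freed memory (GB):", freed)
    print("Bursty workloads to review by hand:", needs_review)
```

The same summary would have to be repeated for every resource the workloads share, which is where the effort starts to add up.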

If you stop and think about this, you realize it is a very non-trivial task. Most of the tools today give you trends in raw resource utilization: say, host physical memory and CPU, or VM virtual memory and CPU. But raw physical resource utilization analysis doesn’t indicate which parts of the workload consume the resources. And virtual resource utilization doesn’t show you the demand (i.e., active virtual memory) but only how the resources are utilized inside a VM.

If we need to do this for virtual memory in vSphere, then we need to look at the consumed and active memory of all VMs and figure out which non-essential VMs consume the most, including their peaks and averages. Then we need to see if some of the memory goes into balloons, as that may affect the calculations. Then we do the same with CPU, I/O and network. And then we switch to the storage portion and look at IOPS and space consumption, including thin provisioning. We may end up with several subsets of candidates for compute, storage and network, and then need to find the optimal overlap that will give the biggest bang for the buck. Today’s trending and reporting tools don’t give you that story; you have to compute it yourself, across a wide variety of loads. And during different seasons these calculations will be different.
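To make the idea of overlapping candidate subsets concrete, here is a small Python sketch. It uses a deliberately simple heuristic (take the top consumers per resource, then intersect the sets), which is an assumption for illustration and not a true optimization. The VM names, resource keys, and numbers are hypothetical.

```python
# Hypothetical peak consumption per non-essential VM, for illustration only.
usage = {
    "dev-1":   {"mem_gb": 32, "cpu_ghz": 8, "iops": 1200},
    "dev-2":   {"mem_gb": 8,  "cpu_ghz": 2, "iops": 300},
    "qa-1":    {"mem_gb": 24, "cpu_ghz": 6, "iops": 200},
    "batch-1": {"mem_gb": 4,  "cpu_ghz": 7, "iops": 1500},
}

def top_consumers(usage_by_vm, resource, n):
    """Return the set of the n VMs with the highest usage of one resource."""
    ranked = sorted(
        usage_by_vm,
        key=lambda vm: usage_by_vm[vm][resource],
        reverse=True,
    )
    return set(ranked[:n])

def overlap_candidates(usage_by_vm, resources, n):
    """Return the VMs that are top-n consumers of every listed resource."""
    candidate_sets = [top_consumers(usage_by_vm, r, n) for r in resources]
    return set.intersection(*candidate_sets)

if __name__ == "__main__":
    resources = ["mem_gb", "cpu_ghz", "iops"]
    for r in resources:
        print(r, "->", sorted(top_consumers(usage, r, 2)))
    print("overlap ->", sorted(overlap_candidates(usage, resources, 2)))
```

Even in this toy form, each resource produces a different candidate list, and the overlap shrinks quickly as more resources are added.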

Another technique is to provide burst capacity by spilling over to the public clouds. Again, you need to figure out which applications to move there, but in addition to what we just reviewed you also need to understand where to place them in the public clouds to deliver the right service.

For example, you may be serving various parts of the country and the world during the peak season and during the day. Then it’d be important to decide which AWS or Azure regions to use and how much load to put there. It could turn out that moving or starting big pieces of the load there is very expensive, storage- and bandwidth-wise. So you need to examine the demand and supply both on premises and in the public clouds to find the equilibrium state where the performance is within the expected limits and the infrastructure is still used properly. And again, you need to take into account seasonal long- and short-term peaks, which can overlap in very intricate patterns.
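One very simplified way to think about "how much load to put there" is to split the burst load across regions in proportion to where the demand comes from, and then check what moving the data would cost. The Python sketch below does only that. The demand shares, the data size, and the per-GB price are hypothetical placeholders, not real cloud pricing, and a real placement decision would also have to weigh latency, storage and on-premises supply.

```python
def split_burst_load(burst_units, demand_by_region):
    """Split burst load across regions in proportion to regional demand."""
    total = sum(demand_by_region.values())
    if total <= 0:
        raise ValueError("total demand must be positive")
    return {
        region: burst_units * demand / total
        for region, demand in demand_by_region.items()
    }

def transfer_cost(data_gb, price_per_gb):
    """Cost of moving data_gb into a region at an assumed flat per-GB price."""
    return data_gb * price_per_gb

if __name__ == "__main__":
    # Hypothetical share of end-user demand served from each region.
    demand = {"us-east-1": 60, "eu-west-1": 30, "ap-southeast-1": 10}
    allocation = split_burst_load(200, demand)
    for region, units in allocation.items():
        print(region, "gets", units, "load units")
    # Hypothetical data size and price, for illustration only.
    print("transfer cost:", transfer_cost(500, 0.05))
```

Rerunning such an estimate for each season, and for each time of day within a season, is what makes the placement problem so laborious by hand.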

All this, while theoretically possible, is very difficult to do in practice. While you can anticipate a spike in demand during some seasons, it is practically impossible to guarantee the needed supply without actively controlling consumption and spending across all resources. This issue is becoming one of the most serious barriers to delivering reliable QoS and requires new, innovative techniques. Let’s look at them in the subsequent posts and figure out the right solutions.
