
Ben Yemini

Crossing the “Virtualization Chasm” – Part I: Why are production apps difficult to virtualize?

Virtualization draws a great divide between performance-insensitive and performance-sensitive applications. Performance-insensitive applications, such as development tools or print servers, can often be virtualized in a matter of minutes, yielding significant gains. It is little wonder that these “low-hanging fruit” account for the enormous success of virtualization and for most of its estimated 20%-30% penetration of IT.

In contrast, performance-sensitive applications remain very complex to virtualize and, once virtualized, very difficult to manage. These applications, accounting for the remaining 70%-80% of non-virtualized IT, consist primarily of production applications. While some of these applications require extreme performance guarantees beyond virtualization technologies, e.g., high-speed trading applications can require sub-millisecond latencies, the bulk of production applications could be virtualized to yield significant ROI gains.

We call this complexity gap, between virtualizing performance-insensitive applications and production applications, the Virtualization Chasm. Figure 1, below, illustrates this chasm. Crossing it, i.e., simplifying the virtualization and management of production applications, is a central challenge for IT organizations and the virtualization industry.


Figure 1: The Virtualization Chasm

This three-part series of posts explores the fundamentals of the virtualization chasm and strategic solutions to cross it.

Managing Production Apps Through Dedicated-Peak-Provisioning

Traditional IT assures the performance of production applications by dedicating physical machines (PMs) with sufficient capacity to handle peak demands.

We use an example to illustrate this Dedicated-Peak-Provisioning (DPP) strategy and its implications for virtualization. Figure 2, based on a recent research paper, depicts the IO capacity utilization of an Exchange server's workload. The blue curve represents average utilization, while the red curve represents utilization peaks.


Figure 2: IO Workload of An Exchange Server

The peak utilization reaches 100% for a short duration around 10PM, and then for a sustained duration between 3AM and 4AM, with lingering effects until 5AM. These workload peaks are associated with nighttime administrative tasks (e.g., backup).

A DPP strategy ensures that the PM dedicated to Exchange has sufficient IO capacity to service peak demands. The PM of Figure 2 supports exactly the peak demand, resulting in 100% utilization during peak times. The price of 100% utilization is congestion, significant delays and other disruptions of IO flows. Indeed, the traffic peak starting at 3AM creates massive sustained congestion, reflected by the peak and average utilization rising to 100% until 4AM and lingering until 5AM. Fortunately, the applications (aka Exchange Roles) generating these peaks are insensitive to congestion and delays. Were administrators concerned with the performance of these Roles, they could allocate additional capacity to keep peak utilization sufficiently low.

DPP can help assure the performance of critical Roles. During most of the day, the average utilization ranges between 3% and 8%. This low utilization assures that performance-sensitive Roles (e.g., email flow) will have sufficient resources to meet stringent performance goals.


Figure 3: 85%-92% Of The Capacity is Wasted Most of The Day

However, the price of this performance assurance is the waste of some 92%-97% of the IO capacity for most of the day. Even if one considers peak utilization, not just averages, the peaks utilize only 8%-15% of the capacity most of the day, leaving 85%-92% idle, as illustrated in Figure 3.
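
To make the waste concrete, here is a minimal back-of-the-envelope sketch in Python. The hourly utilization numbers are hypothetical, chosen only to mimic the shape of Figure 2, not actual measurements:

```python
# Hypothetical hourly average IO utilization of a DPP-provisioned PM,
# loosely shaped like the Exchange curve of Figure 2 (not real data):
# 3AM-4AM saturates at 100%, the rest of the day idles around 5%.
hourly_avg_util = [0.05] * 3 + [1.0] + [0.05] * 20  # index = hour of day

idle_fractions = [1.0 - u for u in hourly_avg_util]

# Idle capacity during a typical (non-peak) hour, and averaged over the day.
typical_idle = idle_fractions[12]  # e.g., noon
daily_idle = sum(idle_fractions) / len(idle_fractions)
print(f"idle at noon: {typical_idle:.0%}, average over the day: {daily_idle:.0%}")
```

With average utilization in the 3%-8% range, the idle fraction lands squarely in the 92%-97% band cited above.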

Virtualization is Set to Resolve the Inefficiencies of Dedicated-Peak-Provisioning

Virtualization replaces DPP with workload consolidation. Applications, packaged into virtual machines (VMs), are consolidated to share a PM. The Exchange Roles could be packaged into VMs and consolidated with the VMs of other applications. These additional VMs could exploit the 85%-92% of capacity left idle by the Exchange Roles.

For example, a print server or a Web application development tool may be packaged into VMs sharing the PM with the Exchange VMs. During nighttime, the print server and Web development tools are inactive, leaving the entire capacity to service the Exchange peaks. During daytime, they could utilize the 85%-92% slack capacity left by Exchange, improving resource utilization efficiency.

Consider a hypothetical virtualization of Exchange, where key Roles are packaged into separate VMs and consolidated with VMs running performance-insensitive applications. Figure 4 depicts the peak (purple) and average (green) utilization curves for the consolidated IO workloads. The dashed red and blue curves in the background depict the peak and average utilization of the Exchange components.


Figure 4: Workload Consolidation

The workload contributions of the additional applications consolidated with Exchange are represented by the differences between the green/purple and blue/red curves. This additional workload has several effects:

  • It yields significant utilization gains over DPP, eliminating its inefficiencies. The average utilization increases to the 40%-50% range during the work day, in contrast with the 3%-8% of DPP, eliminating the capacity waste of Figure 3.
  • But it also interferes with the Exchange traffic, pushing peak utilization close to 100% for long periods. For example, the brief Exchange peak at 10PM (Figures 2, 3) is transformed into sustained congestion between 9PM and midnight. We will consider the significance of these behaviors in the next section.

More generally, consolidation can produce significant utilization gains, resolving the inefficiencies of DPP. However, these gains may come at the price of interference among the workloads competing for resources.
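
The trade-off can be sketched with two hypothetical hourly traces, chosen only to mimic the shapes of Figures 2 and 4 (the names and numbers below are illustrative assumptions, not measurements):

```python
# Hypothetical hourly average utilization (fraction of the PM's IO capacity).
# Index = hour of day; shapes loosely mimic Figures 2 and 4, not real data.
exchange = [0.05] * 3 + [1.0] + [0.05] * 20         # Exchange: 3AM admin peak
daytime_apps = [0.0] * 8 + [0.40] * 10 + [0.0] * 6  # other VMs: active 8AM-6PM

# Consolidation: both workloads share the PM; utilization caps at 100%.
consolidated = [min(e + d, 1.0) for e, d in zip(exchange, daytime_apps)]

workday = consolidated[8:18]
print(f"average workday utilization: {sum(workday) / len(workday):.0%}")
print(f"nighttime peak still served: {consolidated[3]:.0%}")
```

The complementary schedules are what make the gains possible: the daytime VMs fill the slack Exchange leaves, while the Exchange peak still gets the full machine at night. When schedules overlap instead, utilization is pushed toward 100% and the workloads interfere.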

The Root Cause of the Virtualization Chasm: Interference

The consolidated workload of Figure 4 sees its peaks exceed 60% for sustained periods between 9AM and 8PM. During these periods, the IO flows of the consolidated VMs will interfere with each other, resulting in congestion and queueing delays. Applications are often very sensitive to such disruptions of IO flows. Production applications, in particular, may see significant performance degradation that impairs users and business functions. For example, critical Exchange Roles may become very sluggish, disrupting user functions. Such disruptions may be unacceptable.
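
Why does sustained high utilization translate into delays? A textbook single-queue (M/M/1) approximation, used here purely for illustration and not as a model of Exchange IO specifically, shows response time blowing up as utilization approaches 100%:

```python
# M/M/1 approximation: mean response time = S / (1 - rho), where S is the
# mean service time and rho the utilization. Illustrative only; real IO
# subsystems are more complex than a single queue.
def mean_response_time_ms(service_time_ms: float, utilization: float) -> float:
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_ms / (1.0 - utilization)

for rho in (0.05, 0.60, 0.90, 0.99):
    t = mean_response_time_ms(1.0, rho)  # assume a 1 ms mean IO service time
    print(f"utilization {rho:.0%}: mean IO response time {t:5.1f} ms")
```

At 5% utilization delays are negligible; past 60% they grow quickly, and near saturation a 1 ms IO stretches to roughly 100 ms. This nonlinearity is why pushing peaks toward 100% is so disruptive to performance-sensitive flows.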

Sensitivity to interference may vary greatly across applications and users. Consider the brief 10PM peak, at 100% utilization, in Figure 2. The administrative applications creating this peak may not be very sensitive to performance degradation. However, the prolonged congestion between 9PM and midnight depicted in Figure 4 may disrupt the schedules of Exchange administrators and may be unacceptable.

Therefore, while consolidation can yield significant efficiency gains, these benefits come at the price of mutual interference among the consolidated workloads and the resulting disruptions. Performance-insensitive applications may tolerate such interference and thus admit simple, risk-free virtualization. In contrast, production applications may require strict bounds on their exposure to interference to guarantee their performance.

Therefore, the key challenge in virtualizing production applications is this: how can interference be controlled to establish an efficient balance between utilization gains and performance requirements?

Controlling Interference is Key to Virtualization

In summary, the root cause of the Virtualization Chasm is interference between consolidated workloads.

Applications can exhibit a broad range of performance requirements, determining their sensitivity to interference. At one end of the spectrum are applications that cannot tolerate any interference in accessing compute resources (e.g., high-speed trading). Such rare applications should best remain non-virtualized, using a DPP strategy. At the other end of the spectrum are performance-insensitive applications. Virtualizing these “low-hanging-fruit” applications is simple and risk-free. Production applications lie between these two extremes and require strict control of interference to assure their performance.

The next post in this series will consider more closely strategies for interference control and their use in virtualizing production applications.