<img alt="" src="https://secure.bomb5mild.com/193737.png" style="display:none;">

Turbonomic Blog

Why Workload Automation Can’t Assure Performance

Posted by Matt Vetter on Jun 25, 2015 8:51:40 AM

Is workload automation all it's cracked up to be?

Data center and virtualization management vendors overuse the word "automation" in today's IT world. It's used to sound cutting-edge, to promote ease of use, or to create the impression of an intelligent, superior solution. In truth, workload automation can refer to a wide variety of tasks, from the mundane and routine to the extraordinarily complex. When evaluating the strength of a virtualization tool's technology, it is therefore increasingly important to be skeptical whenever a company describes its product as an "automation tool."

If I told you that I could automate your virtual environment so that you never experienced latency, congestion, or slowness again, you would be right to be skeptical rather than take that claim at face value. Many technologies claim to "automate" away the complexity of managing a virtual environment. When we dive into the actual technology, however, we begin to discover that what can really be automated is much less glamorous than what we were told.

The reality of workload automation is not so glamorous

Take the new concept of workload automation as an example. As virtual environments have grown more complex with the introduction of large, mission-critical applications, it has become increasingly important to manage the environment proactively so that every application continues to perform at the highest level. The industry therefore introduced workload automation to ensure that workload placement is managed and configured before resource contention causes end-user complaints, latency, or congestion. A noble cause, to be sure, and an enticing concept for an IT administrator. But, as discussed earlier, we must dive into the concept of automation and peel back the layers of the proverbial onion to determine how it actually works.

Within the idea of workload automation, dynamic thresholds are developed over a period of a few business cycles. That is to say, a tool slowly builds a curve of what it considers "normal" resource utilization, broken out by time of day and by day of the week. Unlike a standard threshold, which is a flat, non-customizable line for a specific metric, a dynamic threshold adjusts to different levels of utilization for different metrics. The assumption is that, because demand for resources fluctuates, what counts as normal utilization on a specific day of the week or at a specific time of day might be an anomaly at a different time.
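A dynamic threshold of this kind can be sketched as a per-(weekday, hour) baseline learned from historical samples. This is a minimal illustration of the concept, not any vendor's actual algorithm; the bucketing scheme and the two-standard-deviation "normal" band are assumptions made for the sketch:

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev

class DynamicThreshold:
    """Learn a per-(weekday, hour) utilization baseline over several business cycles."""

    def __init__(self, band=2.0):
        self.samples = defaultdict(list)  # (weekday, hour) -> [utilization %]
        self.band = band                  # width of the "normal" band, in std devs

    def record(self, ts: datetime, utilization: float) -> None:
        self.samples[(ts.weekday(), ts.hour)].append(utilization)

    def threshold(self, ts: datetime) -> float:
        """Upper edge of 'normal' for this time slot; falls back to 100% until trained."""
        bucket = self.samples[(ts.weekday(), ts.hour)]
        if len(bucket) < 2:
            return 100.0
        return mean(bucket) + self.band * stdev(bucket)

    def is_anomaly(self, ts: datetime, utilization: float) -> bool:
        return utilization > self.threshold(ts)
```

A Monday 9am boot storm recorded over a few weeks raises the Monday-9am threshold, so that expected spike stops alerting, while the same spike at 3am on a quiet night would still trip the threshold.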

A day in the life of an IT admin

Take, for example, the classic IT administrator headache: the morning boot storm. When employees come into the office around 9am, they all log in to their applications or desktops at the same time. With a standard threshold, we receive an alert every morning during this window, which we ignore because the spike is expected, creating unnecessary white noise in the administrator's inbox. With a dynamic threshold, however, the tool learns that this spike in utilization is "normal" and adjusts the threshold over time to account for the expected increase in demand. As a result, we build confidence that the tool understands the environment, and the administrator sees far less white noise from an alerting standpoint.

Once this dynamic threshold has been created, a workload automation tool can begin to act. When the threshold is crossed, workload automation makes decisions to move and resize virtual machines so that the spike in utilization does not degrade performance. The automation is driven by rules, set by the administrator, that govern how the tool responds to spikes. In theory, this style of automation ensures that performance is not degraded when the spike occurs, because actions are taken before alerts or issues ever cross the administrator's desk. Workload automation would thus seem to solve the problem of responding to alerts only after the spike has already caused latency, congestion, and crashes within the environment.
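The rule-driven response described above amounts to a simple trigger loop: compare current utilization to the learned threshold and fire whichever administrator-defined actions match the overshoot. A hedged sketch follows; the rule margins and action names are invented for illustration, not taken from any real product:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    """An admin-defined response: if the overshoot exceeds `margin`, act."""
    margin: float                 # how far past the threshold before acting
    action: Callable[[str], str]  # takes a VM name, returns a log line

def respond(vm: str, utilization: float, threshold: float, rules: list[Rule]) -> list[str]:
    """Apply every rule whose margin is exceeded, most aggressive rule first."""
    overshoot = utilization - threshold
    return [r.action(vm) for r in sorted(rules, key=lambda r: -r.margin)
            if overshoot > r.margin]

# Hypothetical rules an administrator might configure:
rules = [
    Rule(margin=0.0,  action=lambda vm: f"resize {vm}: add vCPU"),
    Rule(margin=15.0, action=lambda vm: f"migrate {vm} to a quieter host"),
]
```

A small overshoot triggers only the resize rule; a large one also triggers the migration, all before a human sees an alert. Note that every decision here still hinges on the historically learned threshold, which is exactly the weakness discussed next.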

The Reality of Workload Automation

In reality, however, workload automation does not assure performance, despite the theoretical benefits above. No matter how sophisticated the administrator's rules, or how quickly actions bring the environment back under the dynamic threshold, a key property of virtualization is missed, and this is the critical flaw: there is no way to assume a normal state for the environment and assure performance at the same time. Utilization is the product of an extraordinarily complex set of metrics that fluctuate constantly with the demand on the infrastructure at any given moment, so we can never assume that historical data predicts how performance will behave in the future. The concept of a dynamic threshold is therefore flawed: to assure performance when automating decisions, we must always consider real-time demand for resources. Rules based on historical data can never assure performance in the data center. Like so many other tools, the glamour of workload automation pales in comparison to the reality of its capabilities.

Instead, we must treat the demand for resources as the basis for assuring performance through automation. Rather than focusing on the underlying supply of resources and trying to keep utilization levels from ever crossing a threshold, we must change the way we think. If we focus on the actual demand at the virtual machine and application level, and make sizing, placement, and capacity decisions that match that demand to the underlying supply in real time, we can be ready for any situation, no matter how complex the environment.
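Matching demand to supply in real time can be pictured as a placement decision made against current measurements rather than historical baselines. A toy greedy sketch, assuming a single scalar resource (CPU demand and capacity); a real decision engine would weigh many coupled resources at once:

```python
def place(vm_demand: dict[str, float], host_capacity: dict[str, float],
          host_used: dict[str, float]) -> dict[str, str]:
    """Greedily place each VM on the host with the most free capacity right now."""
    placement = {}
    used = dict(host_used)
    # Place the hungriest VMs first so large demands aren't stranded.
    for vm, demand in sorted(vm_demand.items(), key=lambda kv: -kv[1]):
        host = max(host_capacity, key=lambda h: host_capacity[h] - used[h])
        if host_capacity[host] - used[host] < demand:
            raise RuntimeError(f"no host can absorb {vm}'s current demand")
        placement[vm] = host
        used[host] += demand
    return placement
```

Because the inputs are measured demand and remaining capacity at decision time, the outcome tracks what the workloads need now, not what a baseline said they needed last Tuesday.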

Understanding workload automation with VMTurbo is like swallowing the right pill in The Matrix

If I told you that VMTurbo can assure performance through demand-driven decision automation, would you believe me? Why not see for yourself?
