With constant budget pressure, how do you address steadily increasing demands on your IT service model? How do you ensure you are not leaving money on the table? There is a misconception in the IT industry about the relationship between hardware and application performance. It seems easy to answer performance questions with more hardware. But that answer is wasteful, and it eventually incurs higher costs for cooling, power, and licensing. Even worse, it can lead to the unintended consequence of decreased application performance. To put it more bluntly:
Your team might be demanding too much hardware to ensure application performance, and you might still be getting angry, gut-wrenching calls from the business side when a high-performance application has just experienced downtime.
Let’s dive into the world of data center trade-offs and why over-provisioning is something you should be on the lookout for in your data center.
vCPU Ready Queue
Application developers often insist on giving their VMs a generous number of vCPUs. Every vCPU must be scheduled against physical CPU, so a VM with four vCPUs needs four physical cores to be available before it can run. Until the hardware frees up, the VM waits. That dreaded wait time, CPU ready time, is often unnecessary: the application might run just fine with two vCPUs. The CPU ready queue, and the way assigning too many vCPUs can hurt performance, is a common example of why over-provisioning alone does not guarantee performance.
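To make the effect concrete, here is a toy simulation. All the numbers (core counts, busy probability) are illustrative assumptions, and the strict "all vCPUs must run at once" model is a simplification of how real hypervisor schedulers behave; but it shows why a wide VM can sit in the ready queue while a narrower one runs freely on the very same host:

```python
import random

def ready_fraction(vm_width, cores=8, background=6, busy_prob=0.6,
                   ticks=100_000, seed=1):
    """Toy co-scheduling model: `background` single-vCPU VMs each occupy
    one core with probability busy_prob per tick. A VM with `vm_width`
    vCPUs can be scheduled only when at least `vm_width` cores are free;
    otherwise it accrues ready time. Returns the fraction of ticks the
    VM spends waiting."""
    rng = random.Random(seed)
    waiting = 0
    for _ in range(ticks):
        busy = sum(rng.random() < busy_prob for _ in range(background))
        if cores - busy < vm_width:
            waiting += 1
    return waiting / ticks

print(f"2-vCPU VM waits {ready_fraction(2):.1%} of the time")
print(f"4-vCPU VM waits {ready_fraction(4):.1%} of the time")
```

In this toy setup the 2-vCPU VM essentially never waits, while the 4-vCPU VM spends a noticeable fraction of its time in the ready queue, even though the host and its load are identical.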
Beyond the CPU ready queue, over-provisioning is an issue you should be on the lookout for across the entire stack.
Storage and IO
Most businesses run analytics applications that need lots of storage. Your sys admins can give these applications comfortable amounts of storage in the hope of never experiencing storage contention. But if all of these apps hit the same storage array at the same time, they contend for IO bandwidth, and application performance slows down. And here is the issue: without visibility into the data center, your sys admins may be inclined to suspect a shortage of storage capacity when the real bottleneck is IO bandwidth.
Your business applications are all trying to reach the ocean of storage you have provided, but there is only one access path. So much storage, yet so hard to reach.
In that situation, you want to separate your IO-heavy VMs or give them priority for flash storage over VMs with lower bandwidth demands, as this decreases the impact of “noisy neighbors”. The question to ask is: do we understand and maintain a healthy trade-off between storage and IO?
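The noisy-neighbor arithmetic can be sketched with a crude equal-share model. The array bandwidth, per-VM demands, and flat fair-share policy are all illustrative assumptions, not how any particular array firmware behaves:

```python
def fair_share(array_mb_s, demands_mb_s):
    """Crude equal-share model of a shared storage array: each VM gets
    at most an equal slice of the array's bandwidth, capped at its own
    demand. (Real arrays redistribute unused share; this toy does not.)"""
    slice_mb_s = array_mb_s / len(demands_mb_s)
    return [round(min(d, slice_mb_s), 1) for d in demands_mb_s]

# Two analytics VMs, each wanting 200 MB/s of a 500 MB/s array: both fine.
print(fair_share(500, [200, 200]))       # [200, 200]

# A noisy neighbor demanding 800 MB/s joins the same array:
# everyone drops to an equal slice of ~166.7 MB/s.
print(fair_share(500, [200, 200, 800]))  # [166.7, 166.7, 166.7]

# Move the noisy VM to its own flash tier and the originals recover.
print(fair_share(500, [200, 200]))       # [200, 200]
```

The point is not the specific numbers but the shape of the problem: adding storage capacity would not help either of the 200 MB/s VMs here, because the contended resource is the access path, not the capacity behind it.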
Java Application Memory and Garbage Collection Time
Let’s talk about memory. A Java application periodically runs a process called garbage collection: the heap, the pool of memory the application allocates objects from, is scanned, and any memory the application no longer references is reclaimed. The more heap you assign, the longer the application can run before a collection is triggered, but the more memory each collection has to scan. A full collection can then pause the application for a noticeable stretch, and during that pause your app performance goes down.
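The trade-off can be sketched with a toy stop-the-world model. The scan rate, allocation rate, and heap sizes below are made-up numbers, and real collectors (generational, mostly concurrent ones like G1) keep actual pauses far shorter; the sketch only shows the direction of the trade-off:

```python
def gc_profile(heap_mb, alloc_rate_mb_s, scan_rate_mb_s=2000):
    """Toy stop-the-world model: a full collection fires when the heap
    fills, and its pause is proportional to the heap it must scan.
    Returns (seconds between collections, pause seconds per collection)."""
    interval = heap_mb / alloc_rate_mb_s   # bigger heap fills more slowly
    pause = heap_mb / scan_rate_mb_s       # ...but takes longer to scan
    return interval, pause

for heap in (1024, 4096, 16384):
    interval, pause = gc_profile(heap, alloc_rate_mb_s=512)
    print(f"{heap:>6} MB heap: GC every {interval:5.1f}s, pause {pause:5.2f}s")
```

Quadrupling the heap makes collections four times rarer but also four times longer in this model, so simply throwing memory at a Java VM trades frequent small hiccups for rare, much larger ones.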