How many times do you revisit your IT architecture? One of the things the public cloud has helped us do is revisit not just our architecture, but also the ways we can infuse new processes into our existing IT platforms.
Virtualization is not going anywhere, even as public cloud and multicloud adoption continues to rise in both scale and popularity. What has become incredibly important is the need to reassess how we operate and design our existing “traditional” infrastructure, to make sure we in the technology organization are competitive in value and capabilities with what the public cloud allure is creating for our CIOs.
The concept of Superclusters is one that I’ve been especially keen on for my 3 decades in the industry, including 20+ years running moderate and large-scale VMware environments. Designing clusters is most often done for logical segmentation rather than physical segmentation. I’ll explain what that means.
SCENARIO: DEV, PROD, DMZ Clusters
You may know this design well, as it’s a common choice. We build separate clusters for development workloads, production workloads, and then a third unique cluster for public-facing and specialized workloads. It seems logical to provide operational separation between workload types, with physical segmentation acting as a bit of a firewall between environments.
The catch is that with virtualization, and the nature of workloads moving and accessing each other through real and virtual firewalls, there is no actual need to physically separate workloads across clusters. That’s one justification we can easily set aside.
Another popular (and misplaced) idea is that resource usage has to be separated between clusters so that apps aren’t fighting each other for resources and stealing performance from their neighbors. Just as we have virtual firewalls, there is a better way to allocate and manage resources in a modern application hosting environment…but let’s not jump right to it yet.
Overhead = Overblown
The biggest issue with the 3-cluster design we just described is an incredible amount of waste that is both costly and unnecessary. We all know that we design hardware clusters for planned capacity and growth. We also know that we (as humans) are particularly bad at guesswork. When in doubt, guess high…which is how we got to this difficult place.
If we are at 60% capacity on each of the 3 clusters, then the 40% idle on each adds up to 1.2 full clusters of capacity sitting entirely unused on a permanent basis. That’s assuming you are actually using the 60% fully at any given time, which, thanks to virtualization and application patterns, we are not.
What makes this even worse is that your DMZ probably has a small number of workloads. I’m not doubting the importance of those workloads, but to keep a resilient environment you are probably vastly overprovisioned. It’s just the way we have always done things. The math doesn’t look good though:
- Dev cluster – 4 servers total – 1+ idle
- Prod cluster – 7 servers total – 3+ idle
- DMZ cluster – 3 servers total – 2+ idle
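The math above can be sketched in a few lines. The server counts and utilization figures below are the illustrative numbers from this post, not measurements from a real environment:

```python
# Idle-capacity math for the three separate clusters.
# Counts are the post's illustrative figures, not real telemetry.
clusters = {
    "dev":  {"total": 4, "in_use": 3},   # ~1 server idle
    "prod": {"total": 7, "in_use": 4},   # ~3 servers idle
    "dmz":  {"total": 3, "in_use": 1},   # ~2 servers idle
}

total_servers = sum(c["total"] for c in clusters.values())
total_idle = sum(c["total"] - c["in_use"] for c in clusters.values())

for name, c in clusters.items():
    idle = c["total"] - c["in_use"]
    print(f"{name}: {c['total']} servers, {idle} idle "
          f"({idle / c['total']:.0%} of the cluster)")

print(f"Across all clusters: {total_idle} of {total_servers} servers idle")

# The 60%-utilization claim: 40% idle on each of 3 clusters
# works out to 1.2 clusters' worth of unused capacity.
print(f"Idle capacity at 60% utilization: {0.40 * 3:.1f} clusters")
```

Run it and you get 6 of 14 servers idle — nearly half the hardware doing nothing.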
Flattening Clusters FTW!
Let’s assume we are able to share networks, storage, and create virtual barriers (micro-segmentation, virtual firewalls etc.) between our workloads. If you’re running VMware then you don’t even need all the NSX bells and whistles. Your software and hardware firewalls already have the ability to present distinct network boundaries into the virtual cluster.
Now we merge the workloads and look at the same virtual utilization target of 60% (which is an artificial target anyway), and suddenly you have a few spare servers. This may not feel useful now because you can’t just send them back for a refund, but you have reduced the need to buy additional gear without impacting performance.
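A rough sketch of the consolidation math, using the same illustrative server counts as before. The 25% headroom target is an assumption standing in for an N+1-style resilience buffer, not a number from any real design:

```python
import math

# Merge the three clusters into one pool.
# Server counts are the post's illustrative figures; the headroom
# target is an assumed buffer for failover and growth.
total_servers = 4 + 7 + 3        # dev + prod + dmz
in_use = 3 + 4 + 1               # approximate active capacity

headroom = 0.25                  # keep 25% spare (assumption)
needed = math.ceil(in_use / (1 - headroom))
spares = total_servers - needed

print(f"Merged pool needs ~{needed} servers; {spares} freed up")
```

With these numbers the merged pool needs about 11 servers, freeing 3 — the “few spare servers” the consolidation promises, while still carrying a healthy buffer.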
If you wanted to dabble with some Kubernetes on bare metal, or try something else in your Ops lab, you now have your own gear to reuse for a greater purpose than just sitting there idle.
If you want to see an example of what merging and flattening clusters does in practice, there is a fun and informative web streaming session on Turbonomic Labs that covers how to do it.
Look for much more in the blog as this is one of the most requested topics and we are excited to share some really great working examples from small, medium, and hyperscale environments.
As famed computer scientist Rear Admiral Grace Murray Hopper said in a 1987 Information Week interview: “The most damaging phrase in the language is ‘We’ve always done it this way!’”