Our team attended the recent Cisco Live US in July, a conference and tradeshow targeted at network engineers, the main user group of Cisco networking technologies. When those network folks (as they like to refer to themselves) stopped by our booth, we had pretty much the same conversation over and over again. In short: networking and data center operations technologies are connected, but the teams don't really talk to each other unless the packets hit the fan, so to speak. That is, nobody talks until an application is down and data center operations suspects a networking issue as the source of the failure.
IT organizations have become comfortable with the fact that part of their daily lives is a blame game
The conversation, according to the networking teams, is usually aggressive and terse, and it ends as soon as the application is running again. Being the communication aficionado that I am, I thought this could not be the normal mode of operations in most IT organizations. Yet there are even blogs out there that make the blame game between networking and other IT teams their central theme.
Businesses depend on the alignment of IT organizations
Every person in an IT department has a different perspective and different priorities: the network team needs to deliver data highways, the application developer wants space to code their application, and the system administrators do the best they can to monitor everything. The reality is that all of these things are interconnected. In times of ever-increasing demand for IT services, with business success depending heavily on IT performance, no one should be happy or comfortable with misalignment, unproductive blame games, and constant frustration.
Creating transparency helps with alignment
Let’s dive a little deeper into the moment a system administrator picks up the phone to call the networking team. To make it a little more colorful, let’s imagine she works for a hospital and is responsible for keeping electronic medical records (EMR) available across the hospital. She just received a call that there seems to be an issue with the EMR: records are not up to date, and some do not display correctly. In some cases that can have pretty big ramifications, and sys admins are under pressure to fix such issues as soon as possible. Her monitoring tools tell her there is a networking problem, but she cannot really tell whether it lies at the host level or at the switch level, hence: a call to networking. The networking team monitors different metrics and cannot see any issue stemming from its side.
And that is exactly when the misalignment gap becomes apparent: unless you have a system in place that is capable of understanding the dependencies between the several layers of the IT stack, situations like these will continue to happen. In any business, be it a top financial institution, a big e-commerce website, or a hospital, people rely every day on applications performing reliably. The complexity of managing this performance is immense.
There is an approach in IT called the systems approach, meaning that an application is a system that is highly dependent on, and interacts with, other systems. The only way you can ensure that the system works is by understanding these inter-dependencies. How can the system administrator make an informed decision about how to fix her EMR issues when all she can do is guess at which layer of the stack the issue lies?
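To make the idea of inter-dependencies concrete, here is a minimal sketch (all names and the health data are hypothetical, not from any real monitoring product): each component in the stack reports its health and the components it depends on, and a simple walk of that dependency chain pinpoints the deepest unhealthy layer instead of leaving the sys admin to guess.

```python
# Toy dependency graph: component -> (healthy?, components it depends on).
# Hypothetical scenario: the EMR app is down because its VM host is unhealthy,
# while the switch port underneath is actually fine.
stack = {
    "emr_app":     (False, ["vm_host"]),
    "vm_host":     (False, ["switch_port"]),
    "switch_port": (True,  []),
}

def root_cause(component, stack):
    """Return the deepest unhealthy component in the dependency chain."""
    healthy, deps = stack[component]
    for dep in deps:
        cause = root_cause(dep, stack)
        if cause is not None:
            return cause
    return None if healthy else component

print(root_cause("emr_app", stack))  # vm_host, not the network
```

In this sketch the answer is the VM host, so the call to the networking team never needs to happen. The point is not the ten lines of code but the shared model: both teams looking at the same dependency chain instead of their own isolated metrics.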
True Performance Assurance Needs a Full-Stack Approach
This systems-thinking approach requires us as technologists to embrace something bigger. We have more responsibility than we ever thought we did. We need to apply a systems approach to our entire stack, from the physical hosts and networking, all the way through the hypervisor or cloud, and up to the application layers.
This is more than just a visibility issue. We need a system that understands the demand of the applications and how all of the physical and virtual resources they consume are stitched together to deliver data to the people who matter: your customers and clients.
It’s no surprise that we built our platform with this in mind. The scale and complexity of today’s environments, even seemingly small ones, simply cannot be handled the way we have handled them for decades. By the time you “see” what the issue is, something has already gone wrong. We just can’t operate that way in 2016 and beyond. This is the time to use a system that lets you get back to engineering and away from troubleshooting performance issues.
Wouldn’t it be nice to call the network team about something other than a problem, for once? From the conversations we had at Cisco Live, trust us, they would like that too.