Recently I was listening to an episode of "Car Talk," the popular NPR radio show hosted by two brothers. They diagnose mysterious car problems on the air based on the symptoms each caller describes. They are very funny and creative in guessing the root cause from often vague descriptions. One caller’s problem in particular reminded me of root cause analysis in the modern data center. The caller couldn’t start his car after he stopped at the grocery store, but it started without incident when he stopped to buy ice cream. The root cause was actually the engine temperature. When he stopped for groceries, the engine had more time to cool off than when he made a quick stop for ice cream.
Troubleshooting application performance issues in modern data centers isn’t much different. We have many monitoring tools to alert us to problems, but it’s still hard to identify the root cause, to predict when issues will ripple across the stack, and to know which applications will be affected. As I have written previously, Application Performance Monitoring (APM) and Network Performance Monitoring (NPM) give IT teams detailed insights on specific parts of the stack, but do not provide the complete picture.
While the individual parts are important, what really matters is how they work together and how they ultimately affect application performance and the end-user experience. We call that “full-stack visibility.”
Here are six things full-stack visibility can reveal:
1. Application-to-infrastructure dependencies. NPM, APM and other monitoring solutions only provide a partial view of dependencies between infrastructure and applications. If you have a complete understanding of how the infrastructure relates to applications, such as the specific storage, hosts, VMs and networks used by each app, you’ll have a clear view of how specific infrastructure issues impact application performance.
2. Application-to-application dependencies. By understanding how application components are tied together, you can build a clear profile of each application. This can be extremely useful if you’re migrating an application to new infrastructure, a new data center or the cloud.
3. Weak points in your infrastructure that matter. With a clear top-to-bottom view of the infrastructure and performance bottlenecks, you can easily identify which infrastructure resources and services are the most important. In turn, that tells you where you have gaps that are the most likely to cause problems.
It’s also important to map both infrastructure resources (compute, storage and networking) and infrastructure services (DNS, DHCP, authentication services, security, load balancing, etc). Performance issues with core services can quickly cripple multiple applications, so staying ahead of these is critical.
4. How traffic flows through the data center. With end-to-end visibility of traffic from the user to the infrastructure, you can more easily identify irregularities. For example, The Wine Group found unauthorized traffic flowing from multiple virtual workstations to Dropbox. Because NPM is typically blind to east-west traffic in a virtual environment, they only discovered this with full-stack visibility. Quarles & Brady, a law firm, identified rogue traffic to its Exchange servers from other servers that should have been decommissioned.
5. Root cause of performance issues anywhere in the stack. With a clear view of end-to-end dependencies and the specific condition of each infrastructure component (such as CPU utilization, storage I/O and throughput), it’s much easier to identify root cause quickly and accurately. In turn, that lets IT teams avoid the typical finger pointing and multiple meetings that happen when there isn’t a consistent, shared view of data center performance.
6. How the end user sees things. This is what matters at the end of the day. If the end users are happy, the IT team will be happy. Full-stack visibility can monitor performance from the end user’s perspective all the way back through to the data center, so you can stay ahead of problems and identify issues before they impact the end user.
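To make the dependency ideas in the list above a little more concrete, here is a minimal Python sketch of an application-to-infrastructure map. It is only an illustration, not how any particular product works: the application names, resource names, metric values and thresholds are all invented, and a real tool would discover dependencies automatically and use far richer correlation than a simple set intersection.

```python
from collections import defaultdict

# Hypothetical dependency map: application -> the infrastructure resources
# and services it relies on. Every name here is invented for illustration.
APP_DEPENDENCIES = {
    "web-store": {"vm-web-01", "host-a", "datastore-1", "dns"},
    "billing": {"vm-db-01", "host-b", "datastore-1", "dns"},
    "email": {"vm-mail-01", "host-b", "datastore-2", "dns"},
}

# Hypothetical current readings for a few resources.
RESOURCE_METRICS = {
    "host-a": {"cpu_pct": 35},
    "host-b": {"cpu_pct": 42},
    "datastore-1": {"latency_ms": 48},
    "datastore-2": {"latency_ms": 4},
}

# Assumed alert thresholds; real values depend on the environment.
THRESHOLDS = {"cpu_pct": 90, "latency_ms": 20}


def impacted_apps(resource):
    """Return the applications that depend on the given resource."""
    return sorted(app for app, deps in APP_DEPENDENCIES.items() if resource in deps)


def shared_dependencies(min_apps=2):
    """Return resources used by at least min_apps applications.

    These are the weak points that matter most: one failure hits many apps.
    """
    usage = defaultdict(set)
    for app, deps in APP_DEPENDENCIES.items():
        for dep in deps:
            usage[dep].add(app)
    return {dep: sorted(apps) for dep, apps in usage.items() if len(apps) >= min_apps}


def unhealthy_resources():
    """Return resources with at least one metric above its assumed threshold."""
    return {
        resource
        for resource, metrics in RESOURCE_METRICS.items()
        if any(value > THRESHOLDS[name] for name, value in metrics.items())
    }


def root_cause_candidates(slow_apps):
    """Return unhealthy resources that every slow application depends on."""
    if not slow_apps:
        return []
    common = set.intersection(*(APP_DEPENDENCIES[app] for app in slow_apps))
    return sorted(common & unhealthy_resources())


if __name__ == "__main__":
    print(impacted_apps("datastore-1"))
    print(root_cause_candidates(["web-store", "billing"]))
```

With this made-up data, asking which applications are affected by "datastore-1" returns the two applications that share it, and asking for root cause candidates when those two applications are slow points at the same datastore, because it is the only shared dependency over its latency threshold.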
If your car’s diagnostic system had all the information on the external temperature, the engine temperature, how long the car had been off and the status of the various internal sensors, along with the intelligence to correlate them, you’d be able to quickly diagnose the root cause when the car didn’t start: the engine was too cold. Like cars, data center infrastructure performance issues can be intermittent and frustratingly difficult to diagnose. Full-stack visibility surfaces the root cause quickly when there are problems, and even lets you get ahead of them.
# # #
Uila's CEO and co-founder, Chia-Chee Kuan, may be reached at: (408) 819-0777, or by email at: firstname.lastname@example.org.