In the Cloud, you have to trust your instruments...

Michael Kopp

Top Stories by Michael Kopp

For a while now I have been writing about how to analyze and optimize Hadoop jobs beyond just tweaking MapReduce options. The other day I took a look at some of our Outage Analyzer Hadoop jobs and put words into action. A simple analysis of the Outage Analyzer jobs with Compuware APM 5.5 identified three hotspots and two potential Hadoop problems in one of our biggest jobs. It took the responsible developer a couple of hours to fix them, and the result is a 2x improvement overall and a 6x improvement on the Reduce part of the job. Let's see how we achieved that.

About Outage Analyzer

Outage Analyzer is a free service provided by Compuware that displays, in real time, any availability problems with the most popular third-party content providers on the Internet. It's available at http://www.outageanalyzer.com. It uses real-time analytical process technologies to do anomaly d... (more)

Troubleshooting Response Time Problems

Production Monitoring is about ensuring the stability and health of our system, and that includes the application. A lot of times we encounter production systems that concentrate on System Monitoring, under the assumption that a stable system leads to stable and healthy applications. So let’s see what System Monitoring can tell us about our Application. Let’s take a very simple two-tier Web Application:

[Figure: A simple two-tier web application]

This is a simple multi-tier eCommerce solution. Users are concerned about bad performance when they do a search. Let's see what we can find out a... (more)

Application Performance Monitoring in Production

Setting up Application Performance Monitoring is a big task, but like everything else it can be broken down into simple steps. You have to know what you want to achieve and, subsequently, where to start. So let’s start at the beginning and take a top-down approach.

Know What You Want

The first thing to do is to be clear about what we want when monitoring the application. Let’s face it: we do not really want to ensure that CPU utilization stays below 90 percent or that network latency stays under one millisecond. We are also not really interested in garbage collection activity or whether the database ... (more)

Application Performance Monitoring in Production

Last time I explained the logical and organizational prerequisites for successful production-level application performance monitoring. I originally wanted to look at the concrete metrics we need on every tier, but was asked how you can correlate data in a distributed environment, so this will be the first thing that we look into. So let’s take a look at the technical prerequisites of successful production monitoring.

Collecting data from a distributed environment

The first problem that we have is the distributed nature of most applications. In order to isolate response time problems or... (more)

How Garbage Collection Differs in the Three Big JVMs

(Note: If you’re interested in WebSphere in a production environment, check out Michael's upcoming webinar with The Bon-Ton Stores.)

Most articles about Garbage Collection ignore the fact that the Sun Hotspot JVM is not the only game in town. In fact, whenever you have to work with either IBM WebSphere or Oracle WebLogic, you will run on a different runtime. While the concept of Garbage Collection is the same, the implementation is not, and neither are the default settings or the ways to tune it. This often leads to unexpected problems when running the first load tests or in the worst case... (more)