For a while now I have been writing about how to analyze and optimize Hadoop
jobs beyond just tweaking MapReduce options. The other day I took a look at
some of our Outage Analyzer Hadoop jobs and put words into action.
A simple analysis of the Outage Analyzer jobs with Compuware APM 5.5
identified three hotspots and two potential Hadoop problems in one of our
biggest jobs. It took the responsible developer a couple of hours to fix them,
and the result is a 2x improvement overall and a 6x improvement on the Reduce
part of the job. Let's see how we achieved that.
About Outage Analyzer
Outage Analyzer is a free service provided by Compuware that displays, in
real time, any availability problems with the most popular third-party content
providers on the Internet. It's available at http://www.outageanalyzer.com.
It uses real-time analytical process technologies to do anomaly d... (more)
In a recent article we showed how the Java Garbage Collection MXBean
counters have changed for the Concurrent Mark Sweep (CMS) Collector. It now
reports all GC runs instead of just major collections. That prompted me to
think about what a major GC actually is or what it should be. It is quite
hard to find any definition of major and minor GCs. This well-known
Java Memory Management Whitepaper only mentions in passing that a full
collection is sometimes referred to as a major collection.
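To see what the MXBean counters mentioned above report on a given JVM, a minimal sketch along these lines can help. It uses the standard `java.lang.management` API. The class name `GcCounters` and the `snapshot` helper are made up for illustration, and the collector names it prints depend on the JVM and the GC flags in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (not from the article): reads the per-collector
// counters that the platform exposes through GarbageCollectorMXBean.
class GcCounters {

    /** Maps collector name to {collection count, accumulated collection time in ms}. */
    static Map<String, long[]> snapshot() {
        Map<String, long[]> result = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 if the value is undefined for this collector.
            result.put(gc.getName(),
                    new long[] { gc.getCollectionCount(), gc.getCollectionTime() });
        }
        return result;
    }

    public static void main(String[] args) {
        snapshot().forEach((name, v) ->
                System.out.printf("%s: count=%d, time=%d ms%n", name, v[0], v[1]));
    }
}
```

Taking two snapshots and subtracting them gives the number of collections in an interval. The counters themselves do not say whether a given run was a major or a minor collection.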
One of the more popular definitions is that a major GC is a stop-the-w... (more)
Setting up Application Performance Monitoring is a big task, but like
everything else it can be broken down into simple steps. You have to know
what you want to achieve and subsequently where to start. So let’s start at
the beginning and take a top-down approach.
Know What You Want
The first thing to do is to be clear about what we want when monitoring the
application. Let’s face it: we do not “want” to keep CPU utilization
below 90 percent or network latency under one millisecond. We are
also not really interested in garbage collection activity or whether the
database ... (more)
The other day I was looking at a web application that was using MongoDB as
its central database. We were analyzing the application for potential
performance problems and within five minutes I detected what I must consider
to be a MongoDB anti-pattern that had a 40% impact on response time. The funny
thing: It was a Java best practice that triggered it.
Analyzing the Application
The first thing I always do is look at the topology of an application to get
a feel for it.
Overall Transaction Flow of the Application
As we can see, it's a modestly complex web application and it's using Mongo... (more)
Anyone who has ever monitored or analyzed an application uses or has used
averages. They are simple to understand and calculate. We tend to ignore just
how wrong the picture is that averages paint of the world. To emphasize the
point, let me give you a real-world example outside of the performance space
that I read recently in a newspaper.
The article was explaining that the average salary in a certain region in
Europe was 1,900 euros (to be clear, this would be quite good in that
region!). However, when looking closer, they found out that the majority,
namely 9 out of 10 people, only ea... (more)
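As a rough illustration of how an average can mislead, the sketch below compares the mean and the median of a small data set. The individual salaries are hypothetical: they were chosen only so that the mean comes out at 1,900 with nine equal low values and one high value, and they are not the figures from the newspaper article. The class name `MeanVsMedian` is likewise made up.

```java
import java.util.Arrays;

// Illustrative only: hypothetical numbers, not data from the article.
class MeanVsMedian {

    /** Arithmetic mean, or NaN for an empty array. */
    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    /** Median of the values, or NaN for an empty array; the input is not modified. */
    static double median(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        // Assumed distribution: nine people at 1000, one person at 10000.
        double[] salaries = {1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 10000};
        System.out.println("mean   = " + mean(salaries));   // 1900.0
        System.out.println("median = " + median(salaries)); // 1000.0
    }
}
```

The same effect shows up with response times: a handful of very slow requests can pull the average far away from what most users experience, which is why a median or another percentile is often the more honest number.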