In my last article I explained what a major Garbage Collection is. While a
major Collection certainly has a negative impact on performance, it is not the
only thing that we need to watch out for. And in the case of CMS we might not
always be able to distinguish between a major and a minor GC. So before we start
tuning the garbage collector, we first need to know what we want to tune
for. From a high level there are two main tuning goals.
Execution Time vs. Throughput
The first thing we need to clarify is whether we want to minimize the time the
application needs to respond to a request or whether we want to maximize
throughput. As with every other optimization, these are competing goals and we
can only fully satisfy one of them. If we want to minimize response time, we
care about the impact a GC has on the response time first and
on resource usage second. If we optimize for throughpu... (more)
Over the last couple of months I have been talking to more and more customers
who are either bringing their Hadoop clusters into production or have already
done so and are now getting serious about operations. This leads to some
interesting discussions about how to monitor Hadoop properly, and one question
comes up quite often: Do they need anything beyond Ganglia? If so, what
should they do beyond it?
As in every other system, monitoring in a Hadoop environment starts with the
basics: system metrics (CPU, disk, memory), you know the drill. Of special
importance in a Hadoo... (more)
For a while now I have been writing about how to analyze and optimize Hadoop
jobs beyond just tweaking MapReduce options. The other day I took a look at
some of our Outage Analyzer Hadoop jobs and put words into action.
A simple analysis of the Outage Analyzer jobs with Compuware APM 5.5
identified three hotspots and two potential Hadoop problems in one of our
biggest jobs. It took the responsible developer a couple of hours to fix them,
and the result is a 2x improvement overall and a 6x improvement on the Reduce
part of the job. Let's see how we achieved that.
About Outage Analyze... (more)
Anyone who has ever monitored or analyzed an application uses or has used
averages. They are simple to understand and calculate. Yet we tend to ignore just
how wrong the picture is that averages paint of the world. To emphasize the
point, let me give you a real-world example from outside the performance space
that I recently read in a newspaper.
The article explained that the average salary in a certain region of
Europe was 1900 euros (to be clear, this would be quite good in that
region!). However, when looking closer, they found out that the majority,
namely 9 out of 10 people, only ea... (more)
(Note: If you’re interested in WebSphere in a production environment, check
out Michael's upcoming webinar with The Bon-Ton Stores)
Most articles about Garbage Collection ignore the fact that the Sun HotSpot
JVM is not the only game in town. In fact, whenever you have to work with
either IBM WebSphere or Oracle WebLogic, you will run on a different runtime.
While the concept of Garbage Collection is the same, the implementation is
not, and neither are the default settings or the way to tune them. This often leads
to unexpected problems when running the first load tests or in the worst case... (more)