
In the Cloud, you have to trust your instruments...

Michael Kopp



Top Stories by Michael Kopp

For a while now I have been writing about how to analyze and optimize Hadoop jobs beyond just tweaking MapReduce options. The other day I took a look at some of our Outage Analyzer Hadoop jobs and put words into action. A simple analysis of the Outage Analyzer jobs with Compuware APM 5.5 identified three hotspots and two potential Hadoop problems in one of our biggest jobs. It took the responsible developer a couple of hours to fix them, and the result is a 2x improvement overall and a 6x improvement on the Reduce part of the job. Let's see how we achieved that.

About Outage Analyzer

Outage Analyzer is a free service provided by Compuware that displays, in real time, any availability problems with the most popular third-party content providers on the Internet. It's available at http://www.outageanalyzer.com. It uses real-time analytical process technologies to do anomaly d... (more)

Major Garbage Collections - Separating Myth from Reality

In a recent article we showed how the Java Garbage Collection MXBean counters have changed for the Concurrent Mark-and-Sweep Collector. They now report all GC runs instead of just major collections. That prompted me to think about what a major GC actually is, or what it should be. It is actually quite hard to find any definition of major and minor GCs. The well-known Java Memory Management Whitepaper mentions only in passing that a full collection is sometimes referred to as a major collection.

Stop-the-world

One of the more popular definitions is that a major GC is a stop-the-w... (more)

Lessons Learned from Real-World Big Data Implementations

In the past few weeks I visited several Cloud and Big Data conferences that provided me with a lot of insight. Some people consider only the technology side of Big Data, such as Hadoop or Cassandra. The real driver, however, is a different one. Business analysts have discovered Big Data technologies as a way to leverage tons of existing data and ask questions about customer behavior and all sorts of relationships to drive business strategy. By doing that they are pushing their IT departments to run ever bigger Hadoop environments and ever faster real-time systems. What's int... (more)

How to Identify a MongoDB Performance Anti Pattern in Five Minutes

The other day I was looking at a web application that was using MongoDB as its central database. We were analyzing the application for potential performance problems, and within five minutes I detected what I consider to be a MongoDB anti-pattern that had a 40% impact on response time. The funny thing: it was a Java best practice that triggered it.

Analyzing the Application

The first thing I always do is look at the topology of an application to get a feel for it.

Figure: Overall Transaction Flow of the Application

As we see, it's a modestly complex web application and it's using Mongo... (more)

Why Averages Are Inadequate, and Percentiles Are Great

Anyone who has ever monitored or analyzed an application uses or has used averages. They are simple to understand and calculate. We tend to ignore just how wrong the picture is that averages paint of the world. To emphasize the point, let me give you a real-world example from outside the performance space that I read recently in a newspaper. The article explained that the average salary in a certain region in Europe was 1,900 euros (to be clear, this would be quite good in that region!). However, when looking closer they found out that the majority, namely 9 out of 10 people, only ea... (more)
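
The gap between an average and what most people actually experience can be sketched in a few lines of Python. The salary figures below are hypothetical, chosen only so that the mean comes out at 1,900; they are not taken from the article. The `percentile` helper is an illustrative nearest-rank implementation, one of several common percentile definitions.

```python
import statistics

# Hypothetical data (an assumption for illustration, not from the article):
# nine people earn 1,000 and one person earns 10,000.
salaries = [1000] * 9 + [10000]


def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the data is less than or equal to it (p is an integer 1..100)."""
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


mean = statistics.mean(salaries)
median = percentile(salaries, 50)
p90 = percentile(salaries, 90)

print("mean:", mean)
print("50th percentile:", median)
print("90th percentile:", p90)
```

With this made-up data the mean is 1,900, while both the 50th and the 90th percentile are 1,000: a single outlier pulls the average far above what nine out of ten people earn.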