
Your phone rings at 3am. A frantic Level 1 support staffer is panicking. The web storefront has gone down. You dig around and quickly find out that the ESB has crashed and the warm standby instance failed to kick in. The logs reveal that the JVM terminated because it ran out of heap memory. How could this have happened? We tested it, right?
The dreaded java.lang.OutOfMemoryError is by far the most common recurring problem I’ve seen in JVM-based applications throughout my career.
So let’s just go over a few things you failed to do.
- You failed to use “Splunk for JMX” to monitor your JVM heap and proactively alert you when a heap usage threshold was breached.
- You failed to use Splunk during load/soak testing of your system so as to facilitate more thorough test case assertions based on Splunk searches. One such assertion could have been detection of abnormal memory growth or long garbage collection times, conditions that your load test harness couldn’t have known about.
- You wrongly assumed that just because the load test didn’t make the system “fall over”, it was production ready.
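To make the first point concrete, here is a minimal sketch of the kind of heap threshold check a monitor would run. It reads the standard java.lang.management MemoryMXBean from inside the JVM; it is not how “Splunk for JMX” is implemented, and the class name HeapThresholdCheck and the 85% threshold are hypothetical values chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapThresholdCheck {

    /** Fraction of the heap in use (0.0 to 1.0). Falls back to committed size when max is undefined (-1). */
    static double heapUsedFraction(MemoryUsage usage) {
        long limit = usage.getMax() > 0 ? usage.getMax() : usage.getCommitted();
        return limit > 0 ? (double) usage.getUsed() / limit : 0.0;
    }

    /** True when heap usage is at or above the given threshold fraction. */
    static boolean breached(MemoryUsage usage, double threshold) {
        return heapUsedFraction(usage) >= threshold;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        double threshold = 0.85; // hypothetical alert threshold, tune for your system

        System.out.printf("heap used: %.1f%%%n", heapUsedFraction(heap) * 100);
        if (breached(heap, threshold)) {
            System.out.println("ALERT: heap usage threshold breached");
        }
    }
}
```

In practice you would poll the same MBean attribute remotely over JMX and alert on the trend, not on a single sample.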
You found out the painful way that your application has a heap memory issue. Triage time.
Typically, to diagnose exactly what is causing this memory issue, whether you have a leak or are just poorly utilizing the available heap space, you would generate a JVM profiling dump (hprof). You can do this on demand via JMX, or you can instruct the JVM to dump an hprof file when an OutOfMemoryError occurs.
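Both routes can be sketched briefly. On a HotSpot JVM, the startup flags -XX:+HeapDumpOnOutOfMemoryError and -XX:HeapDumpPath=<path> cover the dump-on-error case. For the on-demand case, the sketch below calls the dumpHeap operation on the HotSpot diagnostic MBean from inside the same JVM. It assumes a HotSpot-based JVM; the class name HeapDumper and the default file name heap.hprof are hypothetical. A remote JMX client would invoke the same MBean operation over a connection.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HeapDumper {

    /**
     * Writes a binary hprof heap dump to the target path and returns it.
     * liveOnly = true dumps only reachable objects.
     */
    static Path dumpHeap(Path target, boolean liveOnly) {
        try {
            // dumpHeap refuses to overwrite an existing file, so clear it first
            Files.deleteIfExists(target);
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // Recent JDKs require the file name to end in ".hprof"
            bean.dumpHeap(target.toAbsolutePath().toString(), liveOnly);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical default output file; pass a path as the first argument to override
        Path target = Paths.get(args.length > 0 ? args[0] : "heap.hprof");
        System.out.println("wrote " + dumpHeap(target, true).toAbsolutePath());
    }
}
```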
Either way, what you get is a binary hprof file which has detailed information about memory usage by classes and objects.
Typically you would then feed this hprof file into a profiling tool such as Eclipse MAT (Memory Analyzer Tool), which has heaps of neat wizards for reporting on various potential anomalies.
However, I have written a scripted input that can take this binary hprof file and decode it into ASCII in Splunk’s best-practice semantic format.
You can then index this data and use Splunk as your very own JVM profiling tool!
You can use “Splunk for JMX” to generate hprof dumps (via a JMX MBean operation), the scripted input to monitor and decode this output, and Splunk to index it!
2012-08-10 17:29:40:824+1200 name="HPROF_METRIC" event_id="OBJECT_INSTANCE" OBJ="f22b5320" SZ="80" TRACE="0" CLASS="java/util/TreeMap" KID="f07ec748" navigableKeySet="f22b53e8" root="f22b53c0"
2012-08-10 17:29:40:824+1200 name="HPROF_METRIC" event_id="OBJECT_INSTANCE" OBJ="f22b5350" SZ="44" TRACE="0" CLASS="java/util/HashMap$Entry" KID="f07e14f0" value="f22b4e68" key="f0815ea8"
2012-08-10 17:29:41:153+1200 name="HPROF_METRIC" event_id="PRIMITIVE_ARRAY" ARR="fbfcabb8" SZ="63" TRACE="0" NELEMS="43" ELEMTYPE="byte"
2012-08-10 17:29:36:900+1200 name="HPROF_METRIC" event_id="CLASS" CLS="f0cd5f78" NAME="sun/reflect/GeneratedMethodAccessor44" TRACE="0" SUPER="f07fc0e0" LOADER="f0cd42d8"
2012-08-10 17:29:37:454+1200 name="HPROF_METRIC" event_id="ROOT_SYSTEM_CLASS" ROOT="f095ea28" TYPE="system class" NAME="sun/security/x509/AlgorithmId"
2012-08-10 17:29:37:454+1200 name="HPROF_METRIC" event_id="ROOT_SYSTEM_CLASS" ROOT="f086ff60" TYPE="system class" NAME="com/sun/org/apache/xerces/internal/impl/dtd/XMLSimpleType"
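To show why this key="value" layout is handy, here is a rough sketch of the sort of aggregation you could run over these events, written in plain Java instead of a Splunk search. It sums SZ per CLASS for OBJECT_INSTANCE events. The class name HprofEventStats is hypothetical, and treating SZ as a size in bytes is my assumption from the field name, not something the decoder output states.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HprofEventStats {

    // Matches key="value" pairs such as CLASS="java/util/TreeMap"
    private static final Pattern PAIR = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    /** Extracts every key="value" pair from one decoded event line, in order. */
    static Map<String, String> parseFields(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    /** Sums SZ (assumed to be bytes) per CLASS across OBJECT_INSTANCE events. */
    static Map<String, Long> sizeByClass(List<String> lines) {
        Map<String, Long> totals = new TreeMap<>();
        for (String line : lines) {
            Map<String, String> f = parseFields(line);
            if ("OBJECT_INSTANCE".equals(f.get("event_id"))
                    && f.containsKey("CLASS") && f.containsKey("SZ")) {
                totals.merge(f.get("CLASS"), Long.parseLong(f.get("SZ")), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Two of the sample events shown above
        List<String> sample = Arrays.asList(
            "2012-08-10 17:29:40:824+1200 name=\"HPROF_METRIC\" event_id=\"OBJECT_INSTANCE\" "
                + "OBJ=\"f22b5320\" SZ=\"80\" TRACE=\"0\" CLASS=\"java/util/TreeMap\"",
            "2012-08-10 17:29:40:824+1200 name=\"HPROF_METRIC\" event_id=\"OBJECT_INSTANCE\" "
                + "OBJ=\"f22b5350\" SZ=\"44\" TRACE=\"0\" CLASS=\"java/util/HashMap$Entry\"");
        sizeByClass(sample).forEach((cls, size) -> System.out.println(cls + " " + size));
    }
}
```

Once the events are indexed, the same question becomes a one-line search, which is the whole point of decoding the dump into this format.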
That’s the teaser. If you want more, you’re just going to have to come along to Splunk Conf and check out the Developer track.
System.exit(5150);
----------------------------------------------------
Thanks!
Damien Dallimore