Performance Monitoring Toolset: Connecting the dots using JVM Metrics
As we know, the ColdFusion server runs on top of Java, and the JVM is the engine that helps ColdFusion do amazing things.
When we talk about the performance of ColdFusion, the JVM becomes a major stakeholder by default. The Performance Monitoring Toolset provides detailed metrics about the JVM, as shown in the screenshot below.
The metrics shown under the JVM section are:
- Heap & Non-Heap memory along with distribution
- GC event distribution as time series
- Aggregation of GC Metrics
- Thread and Thread Pool Metrics
- Class Loading Metrics
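To get a feel for where numbers like these come from, here is a minimal sketch that reads similar metric families straight from the JVM's standard `java.lang.management` API. It is only an illustration: `JvmSnapshot` is a hypothetical helper name, and nothing here is claimed to be how the Performance Monitoring Toolset collects its data.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration; not part of ColdFusion or the toolset.
final class JvmSnapshot {

    // Builds one line per metric family using the standard platform MXBeans.
    static String describe() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();

        StringBuilder out = new StringBuilder();
        out.append("heap used/committed (bytes): ")
           .append(heap.getUsed()).append('/').append(heap.getCommitted()).append('\n');
        out.append("non-heap used/committed (bytes): ")
           .append(nonHeap.getUsed()).append('/').append(nonHeap.getCommitted()).append('\n');
        out.append("live threads (peak): ")
           .append(threads.getThreadCount())
           .append(" (").append(threads.getPeakThreadCount()).append(")\n");
        out.append("classes loaded now / unloaded so far: ")
           .append(classes.getLoadedClassCount()).append(" / ")
           .append(classes.getUnloadedClassCount()).append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running `JvmSnapshot.main` inside any Java process prints a one-time snapshot; the toolset's value is that it keeps such values as a time series.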
There is a wealth of information on this page, but how does it help us find the root cause of an issue?
The answer lies in connecting the dots across one or more sets of metrics and drilling down to the root cause.
But before I move on, I want to highlight that, apart from providing metrics, the JVM page also provides the following actions:
- Trigger Heap Dump
- Trigger Thread Dump
- Trigger GC
Users can trigger these actions as needed, and the results help in getting closer to the root cause.
Coming back to how we can connect the dots using the metrics on the JVM page, below are some issues that users come across frequently.
CF server response is slow:
This can happen for several reasons, but I will look at it here in the context of JVM metrics.
The JVM can be slow if Garbage Collection (GC) is happening frequently, which can be confirmed by looking at the memory and GC graphs.
If drops in the memory graph line up with peaks in the GC graph during the slow period, the JVM is busy collecting garbage, and that explains why the JVM (and CF) is slow.
The GC graph tells us how many GCs are happening and whether each one is a major or a minor GC.
Frequent GC may be caused by the JVM not having as much memory as it needs. If CF is slow only for a specific duration, we can look at the type of cfm pages being processed on the CF Server page and use Code Profiling (with the Memory flag ON) if required.
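The idea of "frequent GC" can be made concrete with a small calculation: sample the accumulated GC time twice and divide the difference by the elapsed wall-clock time. The sketch below does this with the standard `GarbageCollectorMXBean`. The class name `GcOverhead` and the 5-second interval are assumptions of mine, and the collection time reported by the JVM is approximate (for concurrent collectors it is not purely pause time), so treat the result as a rough indicator.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration; not part of the toolset.
final class GcOverhead {

    // Returns {total collection count, total accumulated collection time in ms}
    // summed over all collectors (typically one "young" and one "old" collector).
    static long[] totals() {
        long count = 0;
        long timeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for a collector.
            count += Math.max(0, gc.getCollectionCount());
            timeMs += Math.max(0, gc.getCollectionTime());
        }
        return new long[] {count, timeMs};
    }

    // Share of the interval spent in GC, as a percentage.
    static double overheadPercent(long gcMsBefore, long gcMsAfter, long intervalMs) {
        if (intervalMs <= 0) {
            return 0.0;
        }
        return 100.0 * (gcMsAfter - gcMsBefore) / intervalMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long intervalMs = 5_000; // assumed sampling interval
        long[] before = totals();
        Thread.sleep(intervalMs);
        long[] after = totals();
        System.out.println("collections in interval: " + (after[0] - before[0]));
        System.out.println("GC overhead %: " + overheadPercent(before[1], after[1], intervalMs));
    }
}
```

A consistently high percentage during the slow period points toward GC as the cause; a value near zero suggests looking elsewhere, for example at a thread dump.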
JVM Crashes with OutOfMemory:
Sometimes the JVM runs out of memory (OutOfMemory), and this can originate in different memory areas such as Heap or Metaspace.
We can look at the time series graphs of the different memory areas and see whether a memory leak is happening.
If an alert is already configured to take a Heap Dump when memory crosses a given threshold, we can analyze the Heap Dump using any heap analyzer and find out where the memory problem lies.
Because the Performance Monitoring Toolset persists its data, we can go back in time and see how memory usage for the different memory areas varies over time.
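One simple way to reason about "is memory leaking?" is to take usage samples at equal intervals and check whether they trend upward. The sketch below lists the current usage of each memory pool through the standard `MemoryPoolMXBean` and fits a least-squares slope to a series of samples. `MemoryTrend` is a hypothetical name, the sample values in `main` are illustrative rather than measured, and a positive slope is only a hint: a heap dump is still needed to confirm a leak.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the toolset.
final class MemoryTrend {

    // Current used bytes for every memory pool the JVM reports.
    // Pool names depend on the JVM and the garbage collector in use.
    static Map<String, Long> usedByPool() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage != null) {
                used.put(pool.getName(), usage.getUsed());
            }
        }
        return used;
    }

    // Least-squares slope of equally spaced samples (units per sample interval).
    static double slope(long[] samples) {
        int n = samples.length;
        if (n < 2) {
            return 0.0;
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0.0;
        for (long s : samples) {
            meanY += s;
        }
        meanY /= n;
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (samples[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    public static void main(String[] args) {
        usedByPool().forEach((name, bytes) -> System.out.println(name + ": " + bytes));
        long[] usedMb = {410, 432, 455, 470, 498}; // illustrative values, not measurements
        System.out.println("growth per sample (MB): " + slope(usedMb));
    }
}
```

The trend is most meaningful when the samples are taken right after a major GC, because usage between collections naturally rises and falls.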
Triggering Thread Dump when CF Server is slow:
When the CF Server is slow and Garbage Collection is not the root cause, we can trigger a thread dump.
The Performance Monitoring Toolset provides a detailed analysis of the thread dump, which helps in finding out which threads are blocked and which cfm/cfc page they are executing. Please refer to the Thread Dump blog for further details.
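For readers curious what "finding blocked threads" looks like at the JVM level, here is a minimal sketch using the standard `ThreadMXBean.dumpAllThreads` call. It reports each thread in the `BLOCKED` state along with the lock it is waiting for, the thread holding that lock, and the top stack frame. `BlockedThreads` is a hypothetical name, and this is not a description of how the toolset's own thread dump analysis works.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the toolset.
final class BlockedThreads {

    // One description per thread that is currently blocked on a monitor.
    static List<String> find() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> blocked = new ArrayList<>();
        // false, false: skip locked-monitor and synchronizer details,
        // which keeps the call cheap and supported on every JVM.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            StackTraceElement[] stack = info.getStackTrace();
            String top = stack.length > 0 ? stack[0].toString() : "(no stack available)";
            blocked.add(info.getThreadName()
                    + " waiting for " + info.getLockName()
                    + " held by " + info.getLockOwnerName()
                    + " at " + top);
        }
        return blocked;
    }

    public static void main(String[] args) {
        List<String> blocked = find();
        if (blocked.isEmpty()) {
            System.out.println("no blocked threads");
        }
        blocked.forEach(System.out::println);
    }
}
```

When several threads are blocked on a lock held by the same owner, that owner's stack trace is usually the first place to look.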