CF Server Section – Performance Monitoring Toolset

The primary job of the Performance Monitoring Toolset is to monitor CF Servers. It provides end-to-end monitoring for all kinds of setups, including individual nodes, individual clusters, J2EE environments, or any combination of these.

The left navigation bar of the Performance Monitoring Toolset takes you to the CF Server section. Based on your global filter selection (for more information, refer to Global filter), you are taken to the overview of your CF Server(s).

Your global filter selection can be a cluster, a node, or a group. For a cluster or group, you see metrics aggregated across the whole selection. For a node, you see the metrics at the node level.

Cluster/Group

The cluster/group overview screen shows the overall health of the cluster/group at the top. At the top right, as on all pages, there is a time filter that you can adjust to view the data at a finer granularity.

This screen contains the following charts:

Average Response Time : Average response time (ART) aggregated for all requests across the cluster/group, shown as a time series. The ART is broken down into its components to give a clearer picture. A dotted line lets you compare against the baseline ART that you configured in the settings section. If no baseline ART is configured, the line is the simple average of all requests in the time range selected at the top right.

Throughput : Throughput aggregated for all requests across the cluster/group, shown as a time series.

List of nodes : List of nodes in the cluster/group along with their health.

Errors : Error distribution (400/500) aggregated for all nodes across the cluster/group, shown as a time series.

Slow URLs : List of distinct URLs with the slowest ART. You specify the number of URLs to display.

Errors Across Nodes : Pie chart showing the distribution of errors across nodes.
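The aggregations behind these cluster/group charts can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the toolset's implementation: the RequestRecord type, its field names, the 60-second bucket size, and the rule that any status of 400 or above counts as an error are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """Hypothetical request sample; the toolset's real schema may differ."""
    node: str
    url: str
    timestamp: int       # seconds since some reference point
    response_ms: float
    status: int          # HTTP status code


def average_response_time(records, bucket_seconds=60):
    """Mean response time per time bucket, across all nodes."""
    totals = defaultdict(lambda: [0.0, 0])  # bucket -> [sum of ms, count]
    for r in records:
        bucket = r.timestamp - r.timestamp % bucket_seconds
        totals[bucket][0] += r.response_ms
        totals[bucket][1] += 1
    return {b: s / n for b, (s, n) in sorted(totals.items())}


def throughput(records, bucket_seconds=60):
    """Number of requests per time bucket, across all nodes."""
    counts = defaultdict(int)
    for r in records:
        counts[r.timestamp - r.timestamp % bucket_seconds] += 1
    return dict(sorted(counts.items()))


def slowest_urls(records, limit=5):
    """Distinct URLs ranked by average response time, slowest first."""
    totals = defaultdict(lambda: [0.0, 0])  # url -> [sum of ms, count]
    for r in records:
        totals[r.url][0] += r.response_ms
        totals[r.url][1] += 1
    averages = [(url, s / n) for url, (s, n) in totals.items()]
    averages.sort(key=lambda pair: pair[1], reverse=True)
    return averages[:limit]


def errors_by_node(records):
    """Error count per node (assumes status >= 400 is an error)."""
    counts = defaultdict(int)
    for r in records:
        if r.status >= 400:
            counts[r.node] += 1
    return dict(counts)


# Made-up sample data for two nodes in one cluster.
SAMPLE = [
    RequestRecord("node1", "/index.cfm", 0, 100.0, 200),
    RequestRecord("node1", "/report.cfm", 10, 900.0, 500),
    RequestRecord("node2", "/index.cfm", 70, 300.0, 200),
    RequestRecord("node2", "/login.cfm", 80, 150.0, 404),
]

print(average_response_time(SAMPLE))  # {0: 500.0, 60: 225.0}
print(throughput(SAMPLE))             # {0: 2, 60: 2}
print(slowest_urls(SAMPLE, limit=2))  # [('/report.cfm', 900.0), ('/index.cfm', 200.0)]
print(errors_by_node(SAMPLE))         # {'node1': 1, 'node2': 1}
```

Filtering the same records down to a single node before calling these functions gives the node-level view of the same charts.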

Other cluster/group-level metrics are shown in the Applications and Instances tabs.

Node

The node overview screen shows the health of the node at the top. At the top right, as on all pages, there is a time filter that you can adjust to view the data at a finer granularity.

This screen contains the following charts:

Average Response Time : Average response time (ART) aggregated for all requests hitting the node, shown as a time series. A dotted line lets you compare against the baseline ART that you configured in the settings section. If no baseline ART is configured, the line is the simple average of all requests in the time range selected at the top right.

Throughput : Throughput aggregated for all requests hitting the node, shown as a time series.

Errors : Error distribution (400/500) aggregated for all requests hitting the node, shown as a time series.

Success/Failures : Pie chart showing the percentage of requests that succeeded and the percentage that failed.

Top Slow URLs : Table listing the slowest distinct URLs hitting the node.

Info : Basic information about the node, such as the operating system version and JVM version, among other details.

Thread Dump History : Table listing the thread dumps taken, in descending order of time. You can go back in time and analyze thread dumps that you took for the node in the past.

Heap Dump History : Table listing the heap dumps taken, in descending order of time, along with their respective locations.
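As a rough illustration of the node-level summaries above, the following Python sketch computes a success/failure split, an error breakdown, and a newest-first dump history. It is a sketch under stated assumptions rather than the toolset's own logic: it assumes that "400/500" refers to the 4xx and 5xx status classes, that any status of 400 or above counts as a failed request, and that a dump entry is a (time taken, location) pair. The function names and sample paths are hypothetical.

```python
from collections import Counter
from datetime import datetime


def success_failure_split(statuses):
    """Return (success %, failure %); assumes status >= 400 means failure."""
    if not statuses:
        return (0.0, 0.0)
    failed = sum(1 for s in statuses if s >= 400)
    failure_pct = 100.0 * failed / len(statuses)
    return (100.0 - failure_pct, failure_pct)


def error_classes(statuses):
    """Count errors by class: '4xx' (client) and '5xx' (server)."""
    counts = Counter()
    for s in statuses:
        if 400 <= s < 500:
            counts["4xx"] += 1
        elif 500 <= s < 600:
            counts["5xx"] += 1
    return dict(counts)


def newest_first(dumps):
    """Order (taken_at, location) entries in descending order of time."""
    return sorted(dumps, key=lambda d: d[0], reverse=True)


# Made-up status codes for requests hitting one node.
STATUSES = [200, 200, 404, 500]

# Made-up heap dump entries; the paths are placeholders.
DUMPS = [
    (datetime(2024, 1, 1, 9, 0), "/tmp/dump_morning.hprof"),
    (datetime(2024, 1, 1, 12, 0), "/tmp/dump_noon.hprof"),
]

print(success_failure_split(STATUSES))  # (50.0, 50.0)
print(error_classes(STATUSES))          # {'4xx': 1, '5xx': 1}
print(newest_first(DUMPS)[0][1])        # /tmp/dump_noon.hprof
```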

Other node-level metrics are shown in the CF Metrics, System, JVM, Active Monitoring, and Caching tabs.