Metrics data and monitoring

You can monitor metrics data from your node for a variety of reasons and in different ways. Some suggested monitoring scenarios are:

Risk of out of memory error - Monitor the used memory in your node’s HeapMemoryUsage attribute.
High CPU usage - Monitor the SystemCpuLoad property of your node, to check for high CPU measurements.
High flow error rate - Check for repeated errors in the flows being used on your node. Flows are the way CorDapps perform their functions. A high level of errors may indicate either an issue with your node or a bug in the CorDapp or the flow itself.
Network parameter update proposal not accepted - Check to see whether network parameters that you or another party has proposed to the Network Map have yet been accepted. The updates could still be awaiting approval.
Processing messages takes too long - Measure the time taken to process peer-to-peer (P2P) messages. If latency is high, you can choose to flag this as an error.
Transaction commit time - Measure how long it takes to commit an executed action on the network.
Transaction signing time - Where a transaction requires a signature, you can measure how long signing takes to complete.
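
As a rough illustration of the first two scenarios, the sketch below reads the HeapMemoryUsage and SystemCpuLoad attributes over JMX. It is a minimal example, not a recommended monitoring setup. It queries the platform MBean server of the local JVM as a stand-in; to monitor a real node you would pass in an MBeanServerConnection to the node's JVM instead, assuming JMX access has been enabled for it. The class name, method names, and the 0.9 thresholds are hypothetical. SystemCpuLoad is a JVM-specific extension attribute, so the sketch returns -1 if it is not present.

```java
import java.lang.management.ManagementFactory;
import javax.management.AttributeNotFoundException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

class NodeHealthCheck {

    // Fraction of the heap in use, from the standard java.lang:type=Memory MXBean.
    static double heapUsedFraction(MBeanServerConnection mbs) throws Exception {
        CompositeData heap = (CompositeData) mbs.getAttribute(
                new ObjectName("java.lang:type=Memory"), "HeapMemoryUsage");
        long used = (Long) heap.get("used");
        long max = (Long) heap.get("max");
        long committed = (Long) heap.get("committed");
        // "max" is -1 when no maximum is defined; fall back to committed memory.
        long limit = max > 0 ? max : committed;
        return (double) used / limit;
    }

    // System-wide CPU load in the range 0.0 to 1.0, or a negative value if unavailable.
    static double systemCpuLoad(MBeanServerConnection mbs) throws Exception {
        try {
            Object value = mbs.getAttribute(
                    new ObjectName("java.lang:type=OperatingSystem"), "SystemCpuLoad");
            return ((Number) value).doubleValue();
        } catch (AttributeNotFoundException e) {
            return -1.0; // this JVM does not expose the attribute
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the local JVM stands in for the node being monitored.
        MBeanServerConnection mbs = ManagementFactory.getPlatformMBeanServer();
        double heap = heapUsedFraction(mbs);
        double cpu = systemCpuLoad(mbs);
        System.out.printf("heap used: %.1f%%, system CPU load: %.2f%n", heap * 100, cpu);
        // Hypothetical alert thresholds; tune them for your own deployment.
        if (heap > 0.9) System.out.println("WARNING: risk of out of memory error");
        if (cpu > 0.9) System.out.println("WARNING: high CPU usage");
    }
}
```

In practice a monitoring tool would sample these values periodically and alert on sustained readings rather than on a single measurement.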

You can see a complete list of scenarios, along with guidance on monitoring each one, in the Monitoring scenarios docs.

Metrics data

A Corda node exports a number of metrics via JMX so that you can monitor the health of the node.
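
One way to see what a JVM exports over JMX is to list the registered MBeans in a given domain. The sketch below does this with the standard queryNames call. It runs against the local platform MBean server and defaults to the java.lang domain, which every JVM provides. The class and method names are hypothetical, and the domain under which your node registers its metrics is not assumed here: pass it as an argument once you know it.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

class MBeanLister {

    // Returns the canonical names of all MBeans registered under the given domain, sorted.
    static Set<String> listNames(MBeanServerConnection mbs, String domain) throws Exception {
        Set<String> names = new TreeSet<>();
        // The pattern "<domain>:*" matches every MBean in that domain.
        for (ObjectName name : mbs.queryNames(new ObjectName(domain + ":*"), null)) {
            names.add(name.getCanonicalName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the local JVM stands in for the node being inspected.
        MBeanServerConnection mbs = ManagementFactory.getPlatformMBeanServer();
        String domain = args.length > 0 ? args[0] : "java.lang";
        listNames(mbs, domain).forEach(System.out::println);
    }
}
```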

You can get metrics for your node from these key sources:

Caches - A Corda node maintains a number of caches. For each cache metric, you must supply the name of the cache in the component field to show metrics for that cache.
Flows - Flow metrics can be used to measure key data about the activity on your node. Metrics include the total number of flows in flight at a given time, the total number of completed flows, and the total number of flows that failed with an error.
Actions - Actions are reified IO actions to execute as part of state machine transitions. These metrics are only exposed when the relevant action gets executed for the first time.
Metering - Metering metrics can be used to get an overview of the performance of commands that are persisted, the number of persisted signing events, the length of a queue of events waiting to be persisted, and more.
P2P - Messaging between parties can be measured in a number of ways, including the latency between messages being sent and received between nodes, the size of sent messages, and the interval between received P2P messages.
Other metrics - Measure the time taken to sign a transaction, or check whether proposed network parameter updates have been accepted yet.
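
Counters such as the flow totals described above are cumulative, so a rate is usually derived from the difference between two samples. The sketch below shows that arithmetic for a flow error rate. It is illustrative only: the class, method, and parameter names are hypothetical, and it assumes the "finished" count you pass in covers every flow that ended, whether it succeeded or failed. Check how your node's counters are defined before relying on the result.

```java
class FlowErrorRate {

    // Error rate over one sampling interval: newly failed flows / newly finished flows.
    // Assumption: "finished" counts all flows that ended, successful or not.
    static double errorRate(long finishedBefore, long failedBefore,
                            long finishedNow, long failedNow) {
        long finished = finishedNow - finishedBefore;
        long failed = failedNow - failedBefore;
        if (finished <= 0) {
            return 0.0; // no flows finished in this interval, so there is nothing to report
        }
        return (double) failed / finished;
    }

    // True when the rate exceeds a caller-chosen threshold.
    static boolean shouldAlert(double rate, double threshold) {
        return rate > threshold;
    }

    public static void main(String[] args) {
        // Two hypothetical samples: 100 flows finished in the interval, 25 of them failed.
        double rate = errorRate(100, 5, 200, 30);
        System.out.println("error rate: " + rate);               // prints "error rate: 0.25"
        System.out.println("alert: " + shouldAlert(rate, 0.10)); // prints "alert: true"
    }
}
```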

Take a look at the Node metrics documentation for a complete range of the metrics data available from your node.