Monitoring Notary Latency

To measure notary latency, combine three metrics:

  • P2P.ReceiveDuration - The time between receiving a message and queueing a flow.
  • Flows.StartupQueueTime - The time a flow spends in the queue.
  • FlowDuration.(Non)ValidatingFlow - The time the flow spends running.

P2P.ReceiveDuration measures the latency for a consumer to deliver the message to the state machine. At the end of the delivery, a flow state machine is created. The Flows.StartupQueueTime timer starts after the flow state machine is created, and ends when FlowDuration starts. The FlowDuration timer starts immediately before the flow state machine calls the flow’s call() function.

Using FlowDuration alone to measure latency doesn’t take account of the queue time (Flows.StartupQueueTime) or the time until a flow is queued (P2P.ReceiveDuration).
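
As a minimal sketch of combining the three timers, the snippet below sums their mean values to approximate end-to-end notary latency. The sample values, the function name notary_latency_ms, and the concrete metric name FlowDuration.NonValidatingFlow (one reading of the (Non)ValidatingFlow pattern) are assumptions for illustration; how you read the values from your metrics backend is not shown.

```python
# Hypothetical helper: approximate end-to-end notary latency by summing the
# mean of each component timer. Metric values are illustrative, not measured.
COMPONENT_METRICS = (
    "P2P.ReceiveDuration",             # message received -> flow queued
    "Flows.StartupQueueTime",          # time spent waiting in the queue
    "FlowDuration.NonValidatingFlow",  # assumed name for a non-validating notary
)


def notary_latency_ms(means_ms):
    """Return the sum of the three component means, in milliseconds."""
    missing = [name for name in COMPONENT_METRICS if name not in means_ms]
    if missing:
        raise KeyError(f"missing metrics: {missing}")
    return sum(means_ms[name] for name in COMPONENT_METRICS)


sample = {
    "P2P.ReceiveDuration": 2.0,
    "Flows.StartupQueueTime": 5.5,
    "FlowDuration.NonValidatingFlow": 12.5,
}
print(notary_latency_ms(sample))  # prints 20.0
```

Summing works for means because the mean of a sum equals the sum of the means. Percentiles do not add this way, so a sum of per-metric 99th percentiles is only a rough upper-bound estimate.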

If notary workers are performing badly, the latency of the entire cluster can be affected. There are several metrics that you can monitor to alert you to problems within your notary cluster:

  • Flows.Actions.PersistCheckpoint - This measures checkpoint latency and should remain stable. If it increases, there is likely a problem with the notary worker database.
  • Flows.Actions.CommitTransaction - This metric can vary, but a sudden spike can indicate problems with the notary worker database.
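
One way to turn "a sudden spike" into an alert condition is to compare the latest reading against a baseline built from recent history. The sketch below uses the median of previous samples and a multiplier of 3; the function name is_spike, the multiplier, and the sample values are all assumptions to be tuned for your own cluster.

```python
from statistics import median


def is_spike(history_ms, latest_ms, factor=3.0):
    """Return True when latest_ms exceeds factor times the median of history_ms."""
    if not history_ms:
        return False  # no baseline yet, so nothing to compare against
    return latest_ms > factor * median(history_ms)


# Illustrative Flows.Actions.CommitTransaction readings in milliseconds.
commit_history = [40.0, 55.0, 38.0, 61.0, 47.0]
print(is_spike(commit_history, 52.0))   # prints False (normal variation)
print(is_spike(commit_history, 400.0))  # prints True (well above 3 x median)
```

The median is used rather than the mean so that a single earlier outlier does not inflate the baseline. The same check can be applied to Flows.Actions.PersistCheckpoint with a smaller factor, since that metric is expected to stay stable.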

For cluster-specific monitoring:

  • UniquenessProvider.Rollback - This counter increases every time a transaction fails and is rolled back. An increase may indicate attempted double-spends, race conditions between workers, or database errors. This metric should remain stable.
  • UniquenessProvider.BatchSignLatency - This metric shows how long it takes to generate a batch signature. This includes building a Merkle tree, so the time is expected to vary with the number of events to be signed. This metric should otherwise remain stable.
  • P2P.SendQueueSize - This metric should be monitored to ensure the outbound bandwidth is sufficient. If the outbound bandwidth is not sufficient, the cluster may fail.
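
The cluster metrics above are a counter and a gauge, so they call for slightly different checks. The following sketch, under the assumption that you sample each metric at a fixed interval, reports how much a counter such as UniquenessProvider.Rollback grew over a window and whether a gauge such as P2P.SendQueueSize has been rising steadily. Both function names and the sample data are hypothetical.

```python
def counter_increase(samples):
    """Return how much a monotonically increasing counter grew over the window."""
    if len(samples) < 2:
        return 0
    return samples[-1] - samples[0]


def is_steadily_growing(samples, min_points=3):
    """Return True when each of the last min_points samples exceeds the one before it."""
    if len(samples) < min_points + 1:
        return False  # not enough data to judge a trend
    recent = samples[-(min_points + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


rollback_samples = [10, 10, 12, 15]    # illustrative counter readings
send_queue_sizes = [5, 4, 6, 9, 14]    # illustrative gauge readings

print(counter_increase(rollback_samples))     # prints 5
print(is_steadily_growing(send_queue_sizes))  # prints True
```

A send queue that keeps growing across consecutive samples suggests messages are being produced faster than the outbound link can carry them, which is the condition the P2P.SendQueueSize guidance is meant to catch early.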
