Monitoring replication slave

Note: this recipe works with ArangoDB 2.5. You need a collectd curl_json plugin with correct boolean type mapping.


How to monitor the slave status using the collectd curl_json plugin.


Since ArangoDB reports the replication status in JSON, integrating it with the collectd curl_json plugin is straightforward. However, only very recent versions of collectd handle boolean flags correctly.

Our test master/slave setup runs with the master listening on tcp:// and the slave (which we query) listening on tcp://. They replicate a database named testDatabase.

Since replication appliers are active per database and our example doesn't use the default _system database, we need to specify the database name in the URL like this: _db/testDatabase.
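As a minimal sketch of how that URL is put together, the following Python helper prefixes the API path with _db/ and the database name. The function name applier_state_url and the base URL argument are assumptions made for illustration; only the path layout comes from this recipe.

```python
from urllib.parse import quote


def applier_state_url(base_url: str, database: str) -> str:
    """Build the applier-state URL for one database (hypothetical helper)."""
    # Percent-encode the database name so it is safe inside a URL path.
    db = quote(database, safe="")
    return f"{base_url.rstrip('/')}/_db/{db}/_api/replication/applier-state"


print(applier_state_url("http://localhost:8530", "testDatabase"))
```

Monitoring a different database on the same slave then only requires changing the second argument.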

We need to parse a document from a request like this:

curl --dump-header - http://localhost:8530/_db/testDatabase/_api/replication/applier-state
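For scripted checks, the same request could be issued from Python's standard library. This is only a sketch under assumptions: the helper name fetch_applier_state and its opener parameter are hypothetical (the latter exists so the function can be tried without a live server), and authentication is left out.

```python
import json
import urllib.request

# URL taken from the curl call above; adjust host and port for your slave.
APPLIER_STATE_URL = (
    "http://localhost:8530/_db/testDatabase/_api/replication/applier-state"
)


def fetch_applier_state(url=APPLIER_STATE_URL,
                        opener=urllib.request.urlopen,
                        timeout=5.0):
    """Fetch the applier-state document and decode it into a dict.

    `opener` is a hypothetical injection point; it defaults to urlopen.
    """
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_applier_state() against a reachable slave should return the decoded document, whose "state" member holds the "running" flag.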

If replication is not running, the document looks like this:

{
  "state": {
    "running": false,
    "lastAppliedContinuousTick": null,
    "lastProcessedContinuousTick": null,
    "lastAvailableContinuousTick": null,
    "safeResumeTick": null,
    "progress": {
      "time": "2015-11-02T13:24:07Z",
      "message": "applier shut down",
      "failedConnects": 0
    },
    "totalRequests": 1,
    "totalFailedConnects": 0,
    "totalEvents": 0,
    "totalOperationsExcluded": 0,
    "lastError": {
      "time": "2015-11-02T13:24:07Z",
      "errorMessage": "no start tick",
      "errorNum": 1413
    },
    "time": "2015-11-02T13:31:53Z"
  },
  "server": {
    "version": "2.7.0",
    "serverId": "175584498800385"
  },
  "endpoint": "tcp://",
  "database": "testDatabase"
}

A running replication will return something like this:

{
  "state": {
    "running": true,
    "lastAppliedContinuousTick": "1150610894145",
    "lastProcessedContinuousTick": "1150610894145",
    "lastAvailableContinuousTick": "1151639153985",
    "safeResumeTick": "1150610894145",
    "progress": {
      "time": "2015-11-02T13:49:56Z",
      "message": "fetching master log from tick 1150610894145",
      "failedConnects": 0
    },
    "totalRequests": 12,
    "totalFailedConnects": 0,
    "totalEvents": 2,
    "totalOperationsExcluded": 0,
    "lastError": {
      "errorNum": 0
    },
    "time": "2015-11-02T13:49:57Z"
  },
  "server": {
    "version": "2.7.0",
    "serverId": "175584498800385"
  },
  "endpoint": "tcp://",
  "database": "testDatabase"
}

We create a simple collectd configuration in /etc/collectd/collectd.conf.d/slave_testDatabase.conf that matches our API:

TypesDB "/etc/collectd/collectd.conf.d/slavestate_types.db"
<Plugin curl_json>
  # Adjust the URL so collectd can reach your arangod slave instance:
  <URL "http://localhost:8530/_db/testDatabase/_api/replication/applier-state">
    # Set your authentication to that database here:
    # User "foo"
    # Password "bar"
    <Key "state/running">
      Type "boolean"
    </Key>
    <Key "state/totalOperationsExcluded">
      Type "counter"
    </Key>
    <Key "state/totalRequests">
      Type "counter"
    </Key>
    <Key "state/totalFailedConnects">
      Type "counter"
    </Key>
  </URL>
</Plugin>

To get nice metric names, we specify our own types.db file in /etc/collectd/collectd.conf.d/slavestate_types.db:

boolean                     value:ABSOLUTE:0:1

So state/running is reported through the collectd monitor as 1 while the applier is running and as 0 while it is not.
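To make the resulting values concrete, here is a rough Python sketch of the mapping from an applier-state document to the four monitored values. It only illustrates the intended outcome and is not how collectd implements it; the name to_metrics and the trimmed sample document are assumptions.

```python
import json

# Keys mirror the <Key> entries of the collectd configuration in this recipe.
COUNTER_KEYS = ("totalOperationsExcluded", "totalRequests", "totalFailedConnects")


def to_metrics(document: str) -> dict:
    """Map an applier-state JSON document to numeric values (hypothetical helper)."""
    state = json.loads(document)["state"]
    # The boolean flag becomes 1 (running) or 0 (not running).
    metrics = {"state/running": 1 if state["running"] else 0}
    for key in COUNTER_KEYS:
        metrics[f"state/{key}"] = int(state[key])
    return metrics


# Trimmed-down sample based on the running example in this recipe.
sample = json.dumps({"state": {"running": True, "totalRequests": 12,
                               "totalFailedConnects": 0,
                               "totalOperationsExcluded": 0}})
print(to_metrics(sample))
```

An alerting rule would then typically trigger when state/running stays at 0.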

Author: Wilfried Goesgens

Tags: #monitoring #foxx #json