Working with Indexes
Learn how to use different indexes efficiently by going through the
ArangoDB Performance Course.
Index Identifiers and Handles
An index handle uniquely identifies an index in the database. It is a string and
consists of the collection name and an index identifier separated by a /. The
index identifier part is a numeric value that is auto-generated by ArangoDB.
A specific index of a collection can be accessed using its index handle or
index identifier as follows:
db.collection.index("<index-handle>");
db.collection.index("<index-identifier>");
db._index("<index-handle>");
For example, assume that the index handle, which is stored in the id
attribute of the index, is demo/362549736 and that the index was created in a
collection named demo. Then this index can be accessed as:
db.demo.index("demo/362549736");
Because the index handle is unique within the database, you can leave out the
collection and use the shortcut:
db._index("demo/362549736");
An index may also be looked up by its name. Since names are only unique within
a collection, rather than within the database, the lookup must also include the
collection name.
db._index("demo/primary")
db.demo.index("primary")
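The handle format described above (collection name, a /, then the index identifier or index name) can be taken apart with plain string operations. The following is a minimal sketch only: parseIndexHandle is a hypothetical helper, not part of the ArangoDB API, and it assumes that the collection name itself contains no /.

```javascript
// Hypothetical helper (not part of ArangoDB): splits an index handle such as
// "demo/362549736" or "demo/primary" into collection name and identifier.
function parseIndexHandle(handle) {
  var pos = handle.indexOf("/");
  // Reject strings without a separator or with an empty part on either side.
  if (pos <= 0 || pos === handle.length - 1) {
    throw new Error("not an index handle: " + handle);
  }
  return {
    collection: handle.slice(0, pos),
    identifier: handle.slice(pos + 1)
  };
}

var parts = parseIndexHandle("demo/362549736");
// parts.collection is "demo", parts.identifier is "362549736"
```

In arangosh, the two parts would correspond to the collection-level lookup (db.demo.index with the identifier), while the full handle is what db._index expects.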
Collection Methods
Listing all indexes of a collection
returns information about the indexes
getIndexes()
Returns an array of all indexes defined for the collection.
Since ArangoDB 3.4, indexes() is an alias for getIndexes().
Note that _key implicitly has an index assigned to it.
arangosh> db.test.ensureIndex({ type: "persistent", fields: [
........> "attribute", "secondAttribute.subAttribute"] });
arangosh> db.test.getIndexes();
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"attribute",
"secondAttribute.subAttribute"
],
"id" : "test/71443",
"isNewlyCreated" : true,
"name" : "idx_1733157480581562369",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : false,
"code" : 201
}
[
{
"fields" : [
"_key"
],
"id" : "test/0",
"name" : "primary",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "primary",
"unique" : true
},
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"attribute"
],
"id" : "test/71435",
"name" : "idx_1733157480580513793",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : true
},
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"uniqueAttribute"
],
"id" : "test/71439",
"name" : "idx_1733157480581562368",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : true
},
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"attribute",
"secondAttribute.subAttribute"
],
"id" : "test/71443",
"name" : "idx_1733157480581562369",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : false
}
]
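Because getIndexes() returns a plain array of index descriptions, it can be filtered with ordinary array methods. The sketch below looks up the names of all indexes that include a given attribute; indexNamesForField is a hypothetical helper, and sampleIndexes is an abbreviated, hand-written copy of the result printed above rather than live output.

```javascript
// Hypothetical helper: returns the names of all indexes whose fields
// include the given attribute path.
function indexNamesForField(indexes, field) {
  return indexes
    .filter(function (idx) { return idx.fields.indexOf(field) !== -1; })
    .map(function (idx) { return idx.name; });
}

// Abbreviated sample, shaped like the getIndexes() result of the example.
var sampleIndexes = [
  { name: "primary", type: "primary", fields: ["_key"] },
  { name: "idx_1733157480580513793", type: "persistent", fields: ["attribute"] },
  { name: "idx_1733157480581562369", type: "persistent",
    fields: ["attribute", "secondAttribute.subAttribute"] }
];

var names = indexNamesForField(sampleIndexes, "attribute");
// names contains the names of the two persistent indexes
```

In arangosh, the same helper could be fed with db.test.getIndexes() directly.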
Creating an index
ensures that an index exists
collection.ensureIndex(index-description)
Ensures that an index according to the index-description exists. A
new index will be created if none exists with the given description.
The index-description must contain at least a type attribute.
Other attributes may be necessary, depending on the index type.
type can be one of the following values:
- persistent: persistent index
- fulltext: fulltext index
- geo: geo index, with one or two attributes
name can be a string. Index names are subject to the same character
restrictions as collection names. If omitted, a name will be auto-generated so
that it is unique with respect to the collection, e.g. idx_832910498.
sparse can be true or false.
For persistent, the sparsity can be controlled; fulltext and geo
are sparse by definition.
unique can be true or false and is supported by persistent.
Calling this method returns an index object. Whether or not the index
object existed before the call is indicated in the return attribute
isNewlyCreated.
deduplicate can be true or false and is supported by array indexes of
type persistent. It controls whether inserting duplicate index values
from the same document into a unique array index will lead to a unique constraint
error or not. The default value is true, so only a single instance of each
non-unique index value will be inserted into the index per document. Trying to
insert a value into the index that already exists in the index will always fail,
regardless of the value of this attribute.
estimates can be true or false and is supported by indexes of type
persistent. This attribute controls whether index selectivity estimates are
maintained for the index. Not maintaining index selectivity estimates can have
a slightly positive impact on write performance.
The downside of turning off index selectivity estimates will be that
the query optimizer will not be able to determine the usefulness of different
competing indexes in AQL queries when there are multiple candidate indexes to
choose from.
The estimates attribute is optional and defaults to true if not set. It will
have no effect on indexes other than persistent (with hash and skiplist
being mere aliases for persistent nowadays).
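The effect of deduplicate on a unique array index, as described above, can be illustrated with a toy in-memory model. This is only a sketch of the documented behavior under simplifying assumptions (one indexed array attribute, values compared with JavaScript equality); insertIntoUniqueArrayIndex is a made-up name and does not reflect how ArangoDB implements its indexes.

```javascript
// Toy model of a unique array index (illustration only, not ArangoDB code).
// `index` is a Set holding all values indexed so far; `values` are the array
// values of one document that is being inserted.
function insertIntoUniqueArrayIndex(index, values, deduplicate) {
  // With deduplicate: true, repeated values of one document are collapsed first.
  var candidates = deduplicate ? Array.from(new Set(values)) : values.slice();
  var seen = new Set();
  candidates.forEach(function (value) {
    // A value that already exists in the index, or that repeats within this
    // document (only possible without deduplication), violates uniqueness.
    if (index.has(value) || seen.has(value)) {
      throw new Error("unique constraint violated for value " + value);
    }
    seen.add(value);
  });
  // Only modify the index once the whole document has been validated.
  seen.forEach(function (value) { index.add(value); });
  return index;
}

var uniqueIndex = new Set();
// Accepted: "a" is stored only once for this document.
insertIntoUniqueArrayIndex(uniqueIndex, ["a", "a", "b"], true);
// With deduplicate set to false, ["a", "a"] would be rejected instead.
// Inserting "b" again from another document is rejected in either mode.
```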
Examples
arangosh> db.test.ensureIndex({ type: "persistent", fields: [ "a" ], sparse: true });
arangosh> db.test.ensureIndex({ type: "persistent", fields: [ "a", "b" ], unique: true });
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"a"
],
"id" : "test/71401",
"isNewlyCreated" : true,
"name" : "idx_1733157480567930881",
"selectivityEstimate" : 1,
"sparse" : true,
"type" : "persistent",
"unique" : false,
"code" : 201
}
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"a",
"b"
],
"id" : "test/71405",
"isNewlyCreated" : true,
"name" : "idx_1733157480568979456",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : true,
"code" : 201
}
Dropping an index via a collection handle
drops an index
collection.dropIndex(index)
Drops the index. If the index does not exist, then false is
returned. If the index existed and was dropped, then true is
returned. Note that you cannot drop some special indexes (e.g. the primary
index of a collection or the edge index of an edge collection).
collection.dropIndex(index-handle)
Same as above, but an index handle is given instead of an index object.
arangosh> db.example.ensureIndex({ type: "persistent", fields: ["a", "b"] });
arangosh> var indexInfo = db.example.getIndexes();
arangosh> indexInfo;
arangosh> db.example.dropIndex(indexInfo[0])
arangosh> db.example.dropIndex(indexInfo[1].id)
arangosh> indexInfo = db.example.getIndexes();
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"a",
"b"
],
"id" : "example/71274",
"isNewlyCreated" : true,
"name" : "idx_1733157480516550657",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : false,
"code" : 201
}
[
{
"fields" : [
"_key"
],
"id" : "example/0",
"name" : "primary",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "primary",
"unique" : true
},
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"a",
"b"
],
"id" : "example/71274",
"name" : "idx_1733157480516550657",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : false
}
]
false
true
[
{
"fields" : [
"_key"
],
"id" : "example/0",
"name" : "primary",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "primary",
"unique" : true
}
]
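Since getIndexes() and dropIndex() can be combined, removing all user-defined indexes of a collection takes only a few lines. The following sketch assumes that the special indexes that cannot be dropped are exactly those of type primary and edge; dropUserIndexes is a hypothetical helper, not a built-in method.

```javascript
// Hypothetical helper: drops every index of a collection except the special
// ones that cannot be dropped (primary index, edge index).
// `collection` must offer getIndexes() and dropIndex(), like the collection
// objects in arangosh.
function dropUserIndexes(collection) {
  var dropped = [];
  collection.getIndexes().forEach(function (idx) {
    if (idx.type === "primary" || idx.type === "edge") {
      return; // special index, leave it alone
    }
    // dropIndex() returns true if the index existed and was dropped.
    if (collection.dropIndex(idx.id)) {
      dropped.push(idx.id);
    }
  });
  return dropped; // handles of the indexes that were actually dropped
}
```

In arangosh, dropUserIndexes(db.example) would then return the handles of the dropped indexes, leaving only the primary index in place for the example collection above.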
Load Indexes into Memory
Loads all indexes of this collection into memory.
collection.loadIndexesIntoMemory()
This function tries to cache all index entries
of this collection in main memory.
To do so, it iterates over all indexes of the collection
and stores the indexed values, not the entire document data,
in memory.
Lookups that can be answered from the cache are much faster
than lookups that are not cached, so you get a performance boost.
The cache is also guaranteed to be consistent with the stored data.
This function honors memory limits. If the indexes you want to load are smaller
than your memory limit, this function guarantees that most index values are
cached. If an index is larger than your memory limit, this function fills the
cache with values up to this limit. For the time being, there is no way to
control which indexes of the collection take priority over others.
arangosh> db.example.loadIndexesIntoMemory();
Database Methods
Fetching an index by handle
finds an index
db._index(index-handle)
Returns the index with index-handle or null if no such index exists.
arangosh> db.example.ensureIndex({ type: "persistent", fields: [ "a", "b" ] });
arangosh> var indexInfo = db.example.getIndexes().map(function(x) { return x.id; });
arangosh> indexInfo;
arangosh> db._index(indexInfo[0])
arangosh> db._index(indexInfo[1])
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"a",
"b"
],
"id" : "example/66796",
"isNewlyCreated" : true,
"name" : "idx_1733157426655395841",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : false,
"code" : 201
}
[
"example/0",
"example/66796"
]
{
"fields" : [
"_key"
],
"id" : "example/0",
"name" : "primary",
"sparse" : false,
"type" : "primary",
"unique" : true,
"code" : 200
}
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"a",
"b"
],
"id" : "example/66796",
"name" : "idx_1733157426655395841",
"sparse" : false,
"type" : "persistent",
"unique" : false,
"code" : 200
}
Dropping an index via a database handle
drops an index
db._dropIndex(index)
Drops the index. If the index does not exist, then false is
returned. If the index existed and was dropped, then true is
returned.
db._dropIndex(index-handle)
Drops the index with index-handle.
arangosh> db.example.ensureIndex({ type: "persistent", fields: [ "a", "b" ] });
arangosh> var indexInfo = db.example.getIndexes();
arangosh> indexInfo;
arangosh> db._dropIndex(indexInfo[0])
arangosh> db._dropIndex(indexInfo[1].id)
arangosh> indexInfo = db.example.getIndexes();
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"a",
"b"
],
"id" : "example/71917",
"isNewlyCreated" : true,
"name" : "idx_1733157480683274241",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : false,
"code" : 201
}
[
{
"fields" : [
"_key"
],
"id" : "example/0",
"name" : "primary",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "primary",
"unique" : true
},
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"a",
"b"
],
"id" : "example/71917",
"name" : "idx_1733157480683274241",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : false
}
]
false
true
[
{
"fields" : [
"_key"
],
"id" : "example/0",
"name" : "primary",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "primary",
"unique" : true
}
]
Revalidating whether an index is used
verifies that a query uses an index
So you’ve created an index, and since its maintenance isn’t free,
you want to know whether your query can actually use it.
You can use explain to verify that a certain index is used:
arangosh> var explain = require("@arangodb/aql/explainer").explain;
arangosh> db.example.ensureIndex({ type: "persistent", fields: [ "a", "b" ] });
arangosh> explain("FOR doc IN example FILTER doc.a < 23 RETURN doc", {colors: false});
{
"deduplicate" : true,
"estimates" : true,
"fields" : [
"a",
"b"
],
"id" : "example/66810",
"isNewlyCreated" : true,
"name" : "idx_1733157426660638720",
"selectivityEstimate" : 1,
"sparse" : false,
"type" : "persistent",
"unique" : false,
"code" : 201
}
Query String (47 chars, cacheable: true):
FOR doc IN example FILTER doc.a < 23 RETURN doc
Execution plan:
 Id   NodeType        Est.   Comment
  1   SingletonNode      1   * ROOT
  6   IndexNode          0     - FOR doc IN example
  5   ReturnNode         0     - RETURN doc
Indexes used:
 By   Name                      Type         Collection   Unique   Sparse   Selectivity   Fields         Ranges
  6   idx_1733157426660638720   persistent   example      false    false       100.00 %   [ `a`, `b` ]   (doc.`a` < 23)
Optimization rules applied:
 Id   RuleName
  1   use-indexes
  2   remove-filter-covered-by-index
  3   remove-unnecessary-calculations-2
Optimization rules with highest execution times:
 RuleName                              Duration [s]
 use-indexes                                0.00002
 remove-filter-covered-by-index             0.00001
 reduce-extraction-to-projection            0.00000
 remove-unnecessary-calculations-2          0.00000
 move-calculations-up                       0.00000
40 rule(s) executed, 1 plan(s) created
(If you omit colors: false you will get nice colors in ArangoShell.)
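The same check can be done programmatically by inspecting the execution plan as a data structure instead of reading the printed output. The sketch below assumes a plan object whose nodes array contains entries with a type attribute and, for nodes of type IndexNode, an indexes array of index descriptions; indexesUsedByPlan is a hypothetical helper, and the plan shape should be verified against your ArangoDB version.

```javascript
// Hypothetical helper: collects the names of all indexes referenced by the
// IndexNodes of an execution plan.
// Assumed plan shape:
//   { nodes: [ { type: "IndexNode", indexes: [ { name: "..." } ] }, ... ] }
function indexesUsedByPlan(plan) {
  var names = [];
  (plan.nodes || []).forEach(function (node) {
    if (node.type === "IndexNode") {
      (node.indexes || []).forEach(function (idx) {
        names.push(idx.name);
      });
    }
  });
  return names;
}

// In arangosh, assuming that explain() on a statement returns an object with
// a plan attribute, the helper could be used like this:
// var query = "FOR doc IN example FILTER doc.a < 23 RETURN doc";
// var plan = db._createStatement({ query: query }).explain().plan;
// indexesUsedByPlan(plan);
```

An empty result means that no IndexNode is part of the plan, i.e. the query does not use an index in this form.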