Version: Graph 1.x.x

Indexing

Indexes can make graph database queries faster and more efficient. To create an index on a vertex property or label in Aerospike Graph, edit the configuration file you use to start the Aerospike Graph Service (AGS) Docker image.

Vertex property index creation

To create an index on a vertex property, add the configuration parameter aerospike.graph.index.vertex.properties to the configuration file and assign it a comma-separated list of the vertex property keys to index. In the following example, the vertex properties property_key1 and property_key2 are specified for indexing:

aerospike.graph.index.vertex.properties=property_key1,property_key2

AGS takes the union of the vertex property indexes configured on all AGS instances. This means that if one AGS instance has an index on vertex property property_key1 and another has an index on vertex property property_key2, AGS creates indexes for both properties. If an index is created on any AGS instance in a cluster, the other instances detect it and use it as well.
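The union behavior can be pictured with a small Python sketch. This is only a conceptual model, not how AGS computes the result internally; the function name effective_indexes is hypothetical, and each input string stands for the value of aerospike.graph.index.vertex.properties on one instance.

```python
def effective_indexes(instance_values):
    """Union of the property keys listed by each instance (conceptual model only)."""
    indexed = set()
    for value in instance_values:
        # Each value is a comma-separated list of property keys.
        indexed |= {key.strip() for key in value.split(",") if key.strip()}
    return sorted(indexed)


# One instance lists property_key1, another lists property_key2.
print(effective_indexes(["property_key1", "property_key2"]))
```

Under this model, both keys end up indexed even though no single instance lists both.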

When a vertex property index is first created on an existing dataset, the time it takes to create the index is proportional to the amount of data in the Aerospike database: larger datasets take longer to index. You can create a property index either before or after populating the database, but creating it beforehand is faster.

Note: Vertex property indexes have a value limit of 2k bytes. Property values larger than 2k bytes cannot be indexed.
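Before indexing a property, it can be useful to check whether any of its values might exceed the limit. The following Python sketch rests on two assumptions that the text above does not confirm: that "2k bytes" means 2048 bytes, and that the UTF-8 encoded length of a value approximates its stored size. The helper name oversized_values is hypothetical.

```python
ASSUMED_LIMIT_BYTES = 2048  # assumption: "2k bytes" is read as 2048 bytes


def oversized_values(values, limit=ASSUMED_LIMIT_BYTES):
    """Return the values whose UTF-8 encoding is longer than `limit` bytes."""
    # Assumption: UTF-8 length approximates the size that counts toward the limit.
    return [v for v in values if len(str(v).encode("utf-8")) > limit]


candidates = ["short value", "x" * 3000]
print(len(oversized_values(candidates)))  # number of values that may be too large
```

Treat the result as a rough pre-check only; the authoritative limit is whatever AGS enforces.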

Vertex label index creation

To create indexes on all vertex labels, add the configuration parameter aerospike.graph.index.vertex.label.enabled to the configuration file and set it to true.

aerospike.graph.index.vertex.label.enabled=true

If you create a label index on one AGS instance, all the other AGS instances in the cluster detect the change and leverage the same index.

Example

Consider an Aerospike Graph database with the following schema:

VERTICES:
label: "Person"
{
  "name": "John Doe",
  "age": 30,
  "address": "123 Main St",
  "city": "San Francisco",
  "state": "CA",
  "country": "USA",
  "zip": "94105"
}

EDGES:
label: "knows"
{
}

To create indexes on the name and age properties, as well as a vertex label index, add the following lines to the Aerospike Graph configuration file:

aerospike.graph.index.vertex.properties=name,age
aerospike.graph.index.vertex.label.enabled=true
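If configuration files are generated by a script, the two lines above could be produced with a small helper such as the following Python sketch. The function name index_config_lines is hypothetical; only the two parameter names come from this page.

```python
def index_config_lines(property_keys, label_index=False):
    """Build the index-related lines for an AGS configuration file."""
    lines = []
    if property_keys:
        # Comma-separated list of vertex property keys to index.
        lines.append(
            "aerospike.graph.index.vertex.properties=" + ",".join(property_keys)
        )
    if label_index:
        lines.append("aerospike.graph.index.vertex.label.enabled=true")
    return lines


print("\n".join(index_config_lines(["name", "age"], label_index=True)))
```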

Impact of indexes on traversals

A vertex property index affects only the first step of a traversal; subsequent steps are not affected. However, if a traversal's initial steps filter on both an indexed property and a non-indexed property, Aerospike Graph automatically reorders them so that the indexed step runs first and the traversal benefits from the index.

For maximum benefit, the best vertex properties to index are ones that a query can use to narrow the dataset down to one or very few vertices which the traversal can start from. Properties that tend to have distinct values and a low level of duplication throughout the dataset are best to index.
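One way to compare candidate properties is to estimate how distinct their values are in a sample of vertices. The Python sketch below is an illustration only: the sample data and the function names distinct_ratio and rank_index_candidates are hypothetical, and a real assessment would use a representative sample of the actual dataset.

```python
def distinct_ratio(vertices, key):
    """Fraction of distinct values among the vertices that carry `key`."""
    values = [v[key] for v in vertices if key in v]
    return len(set(values)) / len(values) if values else 0.0


def rank_index_candidates(vertices, keys):
    """Sort property keys from most distinct to least distinct."""
    return sorted(keys, key=lambda k: distinct_ratio(vertices, k), reverse=True)


# Hypothetical sample: every name is unique, but most countries repeat.
sample = [
    {"name": "Lyndon", "country": "USA"},
    {"name": "Simon", "country": "USA"},
    {"name": "Ada", "country": "UK"},
    {"name": "Lin", "country": "USA"},
]
print(rank_index_candidates(sample, ["country", "name"]))  # ['name', 'country']
```

In this sample, name narrows a lookup to a single vertex while country matches most of the data, so name is the better property to index.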

Example traversals

The following traversals use the schema and indexes shown in the index example.

Single indexed vertex property

This traversal uses the index on the name field:

       ______ The first step uses the index, so it is fast and efficient.
      |
      |                      _______________ Subsequent steps do not use
      |                     |     |    |     the index because they are not at the
      |                     |     |    |     start of the traversal.
      v                     v     v    v
g.V().has("name", "Lyndon").out().in().has("name", "Simon").toList()

Single non-indexed vertex property

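This traversal does not use an index and may perform poorly. Although name is indexed, the has step on name is not at the start of the traversal, so its index is not used.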

       ______ This step does not use an index and must scan the entire database
      |       for the `country` property.
      |
      |                      __________ These steps do not use the index because they
      |                     |     |     are not at the start of the traversal.
      v                     v     v
g.V().has("country", "USA").out().has("name", "Lyndon").toList()

One indexed and one unindexed vertex property

This traversal begins with two has steps, one on the unindexed country property and one on the indexed name property. Aerospike Graph combines the two has steps and runs the indexed one first, improving the traversal's performance.

g.V().has("country", "USA").has("name", "Lyndon").out().has("name", "Simon").toList()
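As a rough mental model of this reordering (not the actual AGS optimizer), the leading has steps can be thought of as a list that is stably sorted so that steps on indexed properties come first. The function name and the tuple representation of a step in this Python sketch are hypothetical.

```python
def reorder_leading_has_steps(has_steps, indexed_keys):
    """Move has steps on indexed properties ahead of the others, keeping relative order."""
    # sorted() is stable: False (indexed) sorts before True (not indexed).
    return sorted(has_steps, key=lambda step: step[0] not in indexed_keys)


leading_steps = [("country", "USA"), ("name", "Lyndon")]
indexed = {"name", "age"}
print(reorder_leading_has_steps(leading_steps, indexed))
# [('name', 'Lyndon'), ('country', 'USA')]
```

The indexed name lookup narrows the candidate vertices first, and the country filter is then applied only to those candidates.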

Two indexed vertex properties

This traversal performs two initial has steps, both on indexed properties. AGS uses cardinality metadata from the Aerospike database to determine which step to run first for maximum efficiency.

Note: Cardinality metadata in Aerospike is updated once per hour, so index efficiency information may not always be current.

g.V().has("age", 29).has("name", "Lyndon").out().has("name", "Simon").toList()
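The choice between two indexed steps can be sketched in Python as picking the step whose property has the higher cardinality. This is an assumption about the selection rule, made for illustration only; the cardinality figures and the function name pick_first_step are hypothetical, and the real metadata comes from the Aerospike database.

```python
def pick_first_step(has_steps, cardinality):
    """Pick the has step whose property has the most distinct values."""
    # Assumption: a higher-cardinality index is expected to match fewer vertices.
    return max(has_steps, key=lambda step: cardinality.get(step[0], 0))


# Hypothetical cardinality metadata: far more distinct names than ages.
assumed_cardinality = {"age": 80, "name": 50000}
steps = [("age", 29), ("name", "Lyndon")]
print(pick_first_step(steps, assumed_cardinality))  # ('name', 'Lyndon')
```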

Label index and indexed vertex property

This traversal's first two steps are a hasLabel step which uses the instance's label index, and a has step which uses the name property index. AGS performs the has step first, because property indexes usually have higher cardinality than label indexes.

g.V().hasLabel("Person").has("name", "Lyndon").out().has("name", "Simon").toList()

Label index and unindexed vertex property

This traversal begins with a hasLabel step which uses the instance's label index, and a has step which involves the unindexed country property. AGS performs the hasLabel step first and uses the index, but the country step may be slow and inefficient.

g.V().hasLabel("Person").has("country", "USA").out().has("name", "Simon").toList()