Aerospike Graph Technical Overview

High-level architecture

Aerospike Graph is built on Apache TinkerPop, an open-source graph computing framework for both online transaction processing (OLTP) and online analytical processing (OLAP) graph queries. TinkerPop is a mature codebase that has been developed and tested since 2009. In Aerospike Graph, TinkerPop is the graph computing engine and Aerospike Database is its underlying persistence layer.

Aerospike Graph Service is the name of the Graph API implementation that provides a deep integration between TinkerPop and Aerospike. This is where we’ve created a highly optimized data model to represent graph elements, such as vertices and edges, in the Aerospike data model, using records, bins, and other Aerospike features. This is also where we’ve implemented numerous traversal strategies and step optimizations by taking advantage of secondary indexes, expressions and other features supported by the Aerospike server.

Aerospike Graph exposes a Gremlin interface. Gremlin is the graph query language of the TinkerPop ecosystem, and Aerospike Graph supports it out of the box as a first-class citizen.
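To make the Gremlin interface concrete, here is a minimal sketch of a traversal issued from Python with the open-source `gremlinpython` driver (version 3.5 or later, for the snake_case method names). The endpoint URL, the `person` label, the `knows` edge label, and the `name` property are all assumptions chosen for illustration; they are not part of any Aerospike Graph schema.

```python
# Assumed endpoint: Gremlin Server's default port (8182) and path (/gremlin)
# on a local machine. Adjust for your own deployment.
GREMLIN_URL = "ws://localhost:8182/gremlin"


def open_traversal_source(url=GREMLIN_URL):
    """Return (g, connection) for a remote Gremlin endpoint.

    The driver is imported lazily so this module loads even where
    gremlinpython is not installed.
    """
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
    from gremlin_python.process.anonymous_traversal import traversal

    connection = DriverRemoteConnection(url, "g")
    return traversal().with_remote(connection), connection


def names_of_known_people(g, person_name):
    """List the names of vertices reachable over outgoing 'knows' edges.

    'person', 'knows', and 'name' are hypothetical schema elements.
    """
    return (
        g.V()
        .has("person", "name", person_name)  # start at the matching vertex
        .out("knows")                        # follow outgoing 'knows' edges
        .values("name")                      # project the 'name' property
        .to_list()                           # execute and collect the results
    )


# Example use against a running service:
#   g, connection = open_traversal_source()
#   try:
#       print(names_of_known_people(g, "alice"))
#   finally:
#       connection.close()
```

The traversal itself is plain Gremlin, so the same steps should carry over to the drivers for other languages with only syntactic changes.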

[Figure: Aerospike Graph high-level architecture]

To sum up:

  • Gremlin provides the interface to TinkerPop.
  • TinkerPop serves as the computing engine.
  • Aerospike Graph Service provides integration between TinkerPop and Aerospike Database, the data storage foundation.

Deployment model

The Aerospike Graph deployment model consists of three main components: an application layer, a query language layer, and a data storage layer.

[Figure: Aerospike Graph deployment model]

  1. The application layer may consist of multiple applications or application threads, using drivers in one or more programming languages. See Gremlin Drivers and Variants for more information.
  2. In the query language layer, applications communicate with Aerospike Graph Service instances over the WebSocket protocol.
  3. In the data storage layer, Aerospike clusters may run on bare metal, on on-premises servers, or in cloud-based installations.

For optimal performance in production environments, you should run the three components as separate instances or virtual machines. For development purposes, they can all run on a single machine.
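As an illustration of the application-to-service link in the deployment model, the following sketch builds WebSocket endpoint URLs for several Aerospike Graph Service instances and rotates through them. It assumes each instance listens on Gremlin Server's default port (8182) and path (`/gremlin`). The helper names are hypothetical, and client-side round-robin is only one simple way an application might spread connections, not a documented Aerospike mechanism.

```python
from itertools import cycle

GREMLIN_PORT = 8182        # Gremlin Server default; assumed unchanged here
GREMLIN_PATH = "/gremlin"  # Gremlin Server default websocket path


def websocket_endpoints(hosts, port=GREMLIN_PORT, secure=False):
    """Build one websocket URL per Aerospike Graph Service host."""
    scheme = "wss" if secure else "ws"
    return [f"{scheme}://{host}:{port}{GREMLIN_PATH}" for host in hosts]


def endpoint_rotation(hosts, **kwargs):
    """Return an endless round-robin iterator over the endpoint URLs."""
    endpoints = websocket_endpoints(hosts, **kwargs)
    if not endpoints:
        raise ValueError("at least one Aerospike Graph Service host is required")
    return cycle(endpoints)
```

An application thread could call `next()` on the rotation each time it opens a new driver connection. On a single development machine, the host list would simply be `["localhost"]`.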