Aerospike's primary key index is a blend of distributed hash table technology with a distributed tree structure in each server. The entire keyspace in the namespace is separated via a robust hash function into partitions. A total of 4096 partitions are equally distributed across cluster nodes. See data-distribution for details on hashing and partitioning.
Within each partition, Aerospike stores index entries in red-black trees called sprigs. You can configure the number of sprigs for each partition. Choosing the number of sprigs is a trade-off: more sprigs mean shallower trees and better parallel access, at the cost of extra space overhead.
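The sprig trade-off can be illustrated with a rough calculation. This is a minimal sketch, not Aerospike code: it assumes records are spread uniformly across the 4096 partitions and across the sprigs in each partition, and it uses the height of a perfectly balanced binary tree as an approximation (a real red-black tree can be deeper). The function names are hypothetical.

```python
import math

N_PARTITIONS = 4096  # partition count stated in the text above


def records_per_sprig(total_records: int, sprigs_per_partition: int) -> float:
    """Average records in one sprig, assuming a uniform spread (assumption)."""
    return total_records / (N_PARTITIONS * sprigs_per_partition)


def approx_tree_depth(total_records: int, sprigs_per_partition: int) -> int:
    """Height of a perfectly balanced binary tree holding one sprig's records.

    This is only a lower-bound approximation for a red-black tree.
    """
    per_sprig = records_per_sprig(total_records, sprigs_per_partition)
    return math.ceil(math.log2(per_sprig + 1))


if __name__ == "__main__":
    total = 1_000_000_000
    for sprigs in (1, 256, 4096):
        print(sprigs, round(records_per_sprig(total, sprigs), 1),
              approx_tree_depth(total, sprigs))
```

More sprigs per partition give fewer records per tree and therefore shorter lookups, while each additional sprig adds some fixed overhead that this sketch does not model.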
Where sprigs are stored is determined by the index-type
configuration parameter. For more information, see Index storage.
Most Aerospike instances use hybrid storage,
with indexes in memory and data on SSD.
The primary index is built on a 20-byte hash of the specified primary key, called the digest. Although this makes the indexed key larger than some original keys (for example, an integer key is only 8 bytes), it is beneficial because the index code behaves predictably regardless of input key size or distribution.
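The idea of a fixed-size digest that also selects a partition can be sketched as follows. This is an illustration under stated assumptions, not Aerospike's implementation: SHA-1 is used here only as a stand-in because it happens to produce 20 bytes, and deriving the partition from the first two digest bytes is likewise an assumption. The function names are hypothetical.

```python
import hashlib

N_PARTITIONS = 4096   # partition count stated in the text
DIGEST_BYTES = 20     # digest size stated in the text


def digest_for(set_name: str, user_key: bytes) -> bytes:
    """Return a 20-byte digest for a key.

    SHA-1 is a stand-in hash (assumption); the real hash function and
    its exact input layout may differ.
    """
    return hashlib.sha1(set_name.encode("utf-8") + user_key).digest()


def partition_id(digest: bytes) -> int:
    """Map a digest to one of 4096 partitions.

    Taking 12 bits from the first two bytes is an assumption made for
    illustration only.
    """
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS


if __name__ == "__main__":
    small_key = (42).to_bytes(8, "big")   # an 8-byte integer key
    large_key = b"x" * 500                # a much longer key
    for key in (small_key, large_key):
        d = digest_for("users", key)
        print(len(key), len(d), partition_id(d))
```

Whatever the input key length, the digest is always 20 bytes, so every index entry has the same size and the tree code never has to handle variable-length keys.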
When a server fails, the indexes on another server are immediately available. If the failed server remains down, data starts rebalancing, and replicated indexes are built on new nodes.
Currently, each index entry requires 64 bytes. In addition to the 20-byte digest, the following metadata is stored in the index:
Generation count: Tracks all writes to the record; used for resolving conflicting updates.
Expiration time or TTL: Tracks time when a key expires. The eviction subsystem uses this metadata.
Last Update Time: Tracks the time of the last write to the record (Citrusleaf epoch). Used for conflict resolution during cold restart, conflict resolution during migration (depending on your configuration settings), Filter Expressions, incremental backup scans, and the truncate and truncate-namespace commands.
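Because every entry has a fixed size, the space needed for the primary index can be estimated with simple arithmetic. The sketch below assumes that each replica of a record has its own 64-byte entry and that entries are spread evenly across nodes; the replication factor and node count are inputs you supply, and the function names are hypothetical.

```python
INDEX_ENTRY_BYTES = 64  # per-entry size stated in the text


def primary_index_bytes(n_records: int, replication_factor: int) -> int:
    """Cluster-wide primary index size in bytes.

    Assumes one 64-byte entry per record per replica (assumption).
    """
    return INDEX_ENTRY_BYTES * n_records * replication_factor


def primary_index_bytes_per_node(n_records: int, replication_factor: int,
                                 n_nodes: int) -> float:
    """Average per-node index size, assuming an even spread across nodes."""
    return primary_index_bytes(n_records, replication_factor) / n_nodes


if __name__ == "__main__":
    total = primary_index_bytes(1_000_000_000, 2)
    per_node = primary_index_bytes_per_node(1_000_000_000, 2, 8)
    print(f"cluster: {total / 2**30:.1f} GiB, per node: {per_node / 2**30:.1f} GiB")
```

The estimate covers only the fixed index entries; it does not include sprig overhead or secondary indexes.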
The primary index is derived from the data itself and can be rebuilt from that data, depending on the configuration setting for fast restart.
Fast restart feature
Aerospike's fast restart feature enables upgrades with minimal downtime in Aerospike Database Enterprise Edition (EE) and Aerospike Database Standard Edition (SE). Fast restart allocates index memory from a Linux shared memory segment. After a planned shutdown, such as for an upgrade, the restarted server re-attaches to the shared memory segment and activates the primary indexes without scanning the data in storage.
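The underlying mechanism, data in a shared memory segment outliving the handle that created it, can be shown with Python's standard multiprocessing.shared_memory module. This is only a conceptual sketch of detaching from and re-attaching to a segment; it says nothing about how the Aerospike server lays out or validates its index, and the helper names are hypothetical.

```python
from multiprocessing import shared_memory


def write_index(payload: bytes) -> str:
    """Create a shared memory segment, store the payload, then detach."""
    seg = shared_memory.SharedMemory(create=True, size=len(payload))
    seg.buf[:len(payload)] = payload
    name = seg.name
    seg.close()  # detach only; the segment itself keeps existing
    return name


def reattach_index(name: str, length: int) -> bytes:
    """Attach to an existing segment by name and read the payload back."""
    seg = shared_memory.SharedMemory(name=name)
    try:
        return bytes(seg.buf[:length])
    finally:
        seg.close()


def destroy_index(name: str) -> None:
    """Remove the segment once it is no longer needed."""
    seg = shared_memory.SharedMemory(name=name)
    seg.close()
    seg.unlink()


if __name__ == "__main__":
    payload = b"index-entries"
    segment_name = write_index(payload)
    print(reattach_index(segment_name, len(payload)))
    destroy_index(segment_name)
```

The reader gets the stored bytes back without recomputing them, which is the same reason a restarted server can skip the storage scan when its index segment is still present.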
To enable fast restarts, set the index-type configuration parameter to
shmem (shared memory) or
pmem (persistent memory).
Where the server stores a primary index is determined by the index-type
configuration parameter. The following options are available:
|shmem|Linux shared memory.|
|flash|A block storage device (typically NVMe SSD).|
|pmem|Persistent memory (e.g. Intel Optane DC Persistent Memory).|
The index-type configuration option is available only in Aerospike Server Enterprise Edition (EE).
Community Edition (CE) stores primary and secondary indexes in volatile process memory.
For more information about primary index storage methods, see Configure the Primary Index.