Resilience

Overview

Aerospike optimizes writing to disk by grouping multiple record writes together. If a namespace is configured to store data on an SSD device, or is in-memory with persistence to a device or filesystem, the new version of the record is placed in a streaming write buffer pending a flush to the storage device. The record's metadata entry in the primary index is adjusted, updating its pointer to the new location of the record. Aerospike performs a copy-on-write for create/update/replace.

Block size and cache size

The write-block-size configuration parameter defines the size in bytes of each I/O block that is written to the disk. You can increase or decrease the write block size depending on your record size. The default value is 1M, and the configured value must be a power of 2. The options are: 128K, 256K, 512K, 1M, 2M, 4M, and 8M. To identify the optimal setting for your workload, Aerospike recommends running a benchmark tool (ACT). Enterprise licensees can contact Aerospike Support for guidance.
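As an illustration of the allowed values, the following minimal Python sketch converts a write-block-size string into bytes and rejects anything outside the list above. The function name `write_block_size_bytes` is hypothetical, and the sketch assumes that K and M denote 1024 and 1024 * 1024 bytes; it is not part of any Aerospike tool.

```python
# Allowed values, copied from the list of options in the text above.
ALLOWED_WRITE_BLOCK_SIZES = {"128K", "256K", "512K", "1M", "2M", "4M", "8M"}

# Assumption: K and M are binary units (1024 and 1024 * 1024 bytes).
_UNIT_BYTES = {"K": 1024, "M": 1024 * 1024}


def write_block_size_bytes(value):
    """Return the size in bytes for a listed write-block-size value.

    Hypothetical helper for illustration only. Raises ValueError for
    any value that is not one of the listed options.
    """
    normalized = value.strip().upper()
    if normalized not in ALLOWED_WRITE_BLOCK_SIZES:
        raise ValueError("unsupported write-block-size: %r" % value)
    size = int(normalized[:-1]) * _UNIT_BYTES[normalized[-1]]
    # Every listed option is a power of 2, so exactly one bit is set.
    assert size & (size - 1) == 0
    return size
```

For example, `write_block_size_bytes("1M")` evaluates to 1048576 under the binary-unit assumption, while a value such as "3M" raises `ValueError`.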

Each device associated with a namespace has a write queue and a cache. When the write queue can't immediately flush a streaming write buffer to a write block on the disk, the pending write blocks are held in the cache. The max-write-cache configuration parameter controls how many bytes of pending write blocks the system is allowed to keep before it starts failing writes.

Writes throttling circuit-breaker

The size of the write cache is the number of devices in the namespace multiplied by max-write-cache. This value is a baseline, not a limit. The system throttles various write types at specific thresholds past the baseline, returning Error Code 18: Device overload to the client when appropriate. Each threshold has its own "queue too deep" error in the server logs. The 'max' number in the following example log messages assumes a baseline of 512 write blocks.

  1. At baseline (the calculated write cache size), master writes fail with this error message in the server logs - write fail: queue too deep: exceeds max 512.
  2. At baseline, UDF writes fail with this error message in the server logs - UDF fail: queue too deep: exceeds max 512. All UDF writes fail by design.
  3. At baseline, duplicate resolutions fail with this error message in the server logs - dup-res fail: queue too deep: exceeds max 512.
  4. At baseline + 32 write blocks, durable deletes fail with this error message in the server logs - durable delete fail: queue too deep: exceeds max 544.
  5. At baseline + 64 write blocks, immigration writes stop with this error message in the server logs - immigrate fail: queue too deep: exceeds max 576. This causes retransmits until the write queue drops below the threshold.
  6. At baseline + 128 write blocks, defrag writes stop (the margin was 100 write blocks before 5.7). The defrag process sleeps until the cache is back under the limit. There's no associated log message for this throttling.
  7. At baseline + 192 write blocks, replica writes stop with this error message in the server logs - replica write: queue too deep: exceeds max 704.
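The arithmetic behind the 'max' values in these log messages can be sketched in a few lines of Python. This is only an illustration of the margins listed above: the names `MARGINS` and `throttle_thresholds` are hypothetical, and the dictionary keys are shorthand labels chosen for this example rather than identifiers from the server.

```python
# Margin, in write blocks above the baseline, at which each write type
# is throttled. Values are taken from the numbered list above; the keys
# are illustrative labels only.
MARGINS = {
    "master write": 0,
    "UDF write": 0,
    "duplicate resolution": 0,
    "durable delete": 32,
    "immigration write": 64,
    "defrag write": 128,
    "replica write": 192,
}


def throttle_thresholds(baseline_blocks):
    """Return the 'max' value, in write blocks, for each write type.

    baseline_blocks is the calculated write cache size expressed in
    write blocks (512 in the example log messages).
    """
    return {
        write_type: baseline_blocks + margin
        for write_type, margin in MARGINS.items()
    }


if __name__ == "__main__":
    for write_type, limit in throttle_thresholds(512).items():
        print("%-22s exceeds max %d" % (write_type, limit))
```

With a baseline of 512 write blocks, this reproduces the 512, 544, 576, and 704 values that appear in the example log messages, plus 640 for defrag writes, which have no log message.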

Introduced in v.5.7

Various write types are throttled at margins greater than the write cache baseline.

Introduced in v.5.1

Defrag writes are throttled when the write cache reaches 100 write blocks greater than the calculated write cache baseline. Throttling defrag does not affect migration and replica writes. Client writes are allowed until the sum of all in-use streaming write buffers (swb) equals the number of devices multiplied by the configured value of max-write-cache. In systems prior to v.5.1, if any single device reaches max-write-cache, all devices block client writes.
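The difference between the two behaviors can be illustrated with a small Python sketch. It models only the rule stated above and is not server code: the function name `client_writes_allowed` and its parameters are hypothetical, and per-device usage is expressed in the same unit as max-write-cache.

```python
def client_writes_allowed(in_use_per_device, max_write_cache, pre_5_1=False):
    """Model whether client writes are accepted for a namespace.

    in_use_per_device: in-use streaming write buffer amount per device,
        in the same unit as max_write_cache (hypothetical input).
    pre_5_1: if True, apply the older per-device rule.
    """
    if pre_5_1:
        # Older rule: one device reaching max-write-cache blocks
        # client writes on all devices.
        return all(used < max_write_cache for used in in_use_per_device)
    # v.5.1 and later: compare the namespace-wide sum against
    # (number of devices) * max-write-cache.
    namespace_limit = len(in_use_per_device) * max_write_cache
    return sum(in_use_per_device) < namespace_limit
```

For example, with two devices, a max-write-cache of 64, and usage of `[64, 0]`, the pre-5.1 rule blocks client writes because one device has reached its limit, while the newer rule still allows them because the sum (64) is below 2 * 64.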