
System Overview

The mission of Aerospike Database is to be very fast, highly scalable, and extremely reliable for use in real-time big data applications. The Operations Manual explains how to create and maintain an Aerospike implementation: plan, install, configure, manage, monitor, tune, and troubleshoot. This introduction summarizes each subsection of the Operations Manual to help guide you to the right material.

Plan

This section covers how to plan and select the best hardware configuration for your application.

Install

This section describes how to install Aerospike on Amazon EC2, various Linux distributions, macOS, Windows, and several cloud providers.

Configure

Each Aerospike database node has a single configuration file that specifies parameters for network, namespace, logging, and cross-datacenter replication. For a given namespace, most of the settings will be the same in every node's configuration file.

  • Amazon EC2 - recommendations for configuring port, IP address, heartbeat mode, rack awareness and other parameters
  • Google Cloud Compute - recommendations for configuring network, firewall, and clusters
  • Network - configure port, IP address, heartbeat mode, rack awareness and other parameters
  • Namespace - configure data storage location, data retention and data replication
  • Access Control - configure Access Control for user, role, and privilege creation and maintenance
  • LDAP - configure using an External Authentication system
  • Encryption at Rest - configure encrypting database record data on storage devices using symmetric AES-128 encryption
  • Consistency - configure namespace with "strong-consistency"
  • Log - configure log location and logging level, and learn how to use the logrotate tool
  • Cross-Datacenter Replication - establish and configure Cross-Datacenter Replication (XDR) for Aerospike Enterprise Edition customers (set parameters, establish topology, configure network and specify data replication)
  • Non-Root - set up Aerospike to run as a non-root user
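
Because each node carries its own configuration file and a namespace's settings are expected to match across nodes, it can be useful to compare the files before a rollout. The following is a minimal sketch, not an Aerospike tool: `parse_conf` and `stanza_diff` are hypothetical helpers, the sample file contents are illustrative only, and the parser assumes a simple brace-delimited layout with one `key value` pair per line (a repeated key within a block keeps only its last value).

```python
# Hypothetical helper: parse brace-delimited configuration text into nested dicts.
def parse_conf(text):
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):
            # Open a new block, e.g. "namespace test {" becomes key "namespace test".
            block = {}
            stack[-1][line[:-1].strip()] = block
            stack.append(block)
        elif line == "}":
            stack.pop()  # close the current block
        else:
            key, _, value = line.partition(" ")
            stack[-1][key] = value.strip()
    return root


# Hypothetical helper: report settings that differ between two nodes in one stanza.
def stanza_diff(conf_a, conf_b, stanza):
    a = conf_a.get(stanza, {})
    b = conf_b.get(stanza, {})
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Illustrative file contents only; real files contain many more settings.
TEMPLATE = """
network {
    service {
        address ADDRESS
        port 3000
    }
}
namespace test {
    replication-factor 2
    default-ttl 30d
}
"""
NODE_A = TEMPLATE.replace("ADDRESS", "10.0.0.1")
NODE_B = TEMPLATE.replace("ADDRESS", "10.0.0.2")

conf_a = parse_conf(NODE_A)
conf_b = parse_conf(NODE_B)

print("namespace differences:", stanza_diff(conf_a, conf_b, "namespace test"))
print("network differences:", stanza_diff(conf_a, conf_b, "network"))
```

In this sketch the namespace stanza is identical on both nodes, while the network stanza differs in the node-specific service address, which is the kind of difference you would expect between nodes.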

Manage

Aerospike management functions include starting and stopping Aerospike and XDR services, adjusting data retention policies, and managing Aerospike features like indexes, queries, scans, and UDFs.

  • Aerospike Daemon - control the Aerospike Daemon with the SysV init script
  • Aerospike systemd - control the Aerospike Daemon with systemd
  • Consistency - add and remove nodes in strong-consistency namespaces, and detect and repair unavailability
  • Log Files - working with the Log File
  • Storage Capacity - setting data eviction, time-to-live and defragmentation parameters
  • Migrations - understanding, managing and monitoring migrations
  • Sets - use asadm to set and manage parameters for a set
  • Indexes - use asadm to create and manage secondary indexes
  • Queries - use asadm to set and update parameters for queries across a cluster
  • Scans - set configuration parameters to manage scans
  • UDFs - using asadm and aql tools or a Java, C# or C client to manage UDFs

Upgrade

Aerospike supports upgrading a cluster or repairing a server without service downtime and without data loss.

Monitor

It is important to monitor your Aerospike system in order to decrease operational response time to outage events such as hardware failure and software errors. Also, some monitoring tools (such as the Aerospike Monitoring Stack) can provide trend data to allow your operations team to effectively recognize and address future scale hurdles. Important metrics can be gathered in the areas of applications, memory, networks, storage, services and trends.

  • Key Metrics - recommended metrics to use for monitoring and trending
  • Latency - access latency trends from Aerospike Logs

Troubleshoot

Note: Consult and search the Knowledge Base topics, which cover a wide variety of issues and remediations.

This section explains, step by step, how to diagnose system problems, and gives specific guidance for the areas listed below.

  • Install - problems with installation
  • Startup - problems with: ASD daemon, file descriptors in log, defrag loop, network device replacement
  • Node - adjusting eviction rate to avoid an out of memory (OOM) problem
  • Cluster - cluster integrity fault; check for node down; Paxos/fabric health issues after network glitch or cluster size change
  • Dynamic Config - using asadm to dynamically change parameters, and a list of several common parameter settings
  • Misc - fire-forget feature, transaction-pending-limit, response to stack trace, "key field too big"

Reference Manuals