Cold Restart
Overview
This page describes how to manage a cold restart. A cold restart (AKA coldstart) of the Aerospike daemon (asd) rebuilds namespace indexes by reading namespace data storage. A warm restart reattaches to persisted indexes.
The speed of a cold restart depends on where the namespace data is stored: shared memory is the fastest, followed by Intel Optane Persistent Memory (PMem), then SSD devices.
When does Aerospike cold restart?
- Always for Aerospike Database Community Edition (CE).
- When explicitly instructed to cold restart from the command line.
    # systemd's systemctl doesn't support application commands, so you can run
    asd-coldstart
    # or
    service aerospike coldstart
- After the Aerospike Daemon (asd) crashes.
- If namespace indexes are stored in shared memory, and their shmem segments are missing, for example after the host machine reboots. To speed up planned reboots, see the Aerospike Shared Memory Tool (ASMT).
- Before Database 7.0, in-memory namespaces, with the exception of
data-in-index namespaces, could not warm restart.
However, after a clean shutdown such namespaces would 'cool restart', which was
faster than a regular cold restart. The indexes were present, but namespace
data needed to be reloaded from the persistence layer into process memory.
INFO (namespace): (namespace_ee.c:360) {test} beginning cool restart
- Before Database 6.1, namespaces with secondary indexes could not warm restart.
- Changing the value of partition-tree-sprigs, for example, from a value of 512 sprigs per partition to 1024:
    WARNING (namespace): (namespace_ee.c:495) {test} persistent memory partition-tree-sprigs 512 doesn't match config 1024
- Namespace data storage was wiped clean while the server was down. Cold restart will be fast in this case, since there is no data to read.
Namespaces restart independently: some may cold restart and some may warm restart, depending on the conditions described above. The Aerospike server node only joins the cluster after all its namespaces have restarted.
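The conditions above can be read as a per-namespace decision. The following is a minimal sketch of that reading, not the server's actual logic: the `NamespaceState` class and all of its field names are hypothetical, and the version-specific cases (before Database 7.0 and before Database 6.1) are deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NamespaceState:
    """Hypothetical snapshot of one namespace at startup (illustrative field names)."""
    name: str
    enterprise: bool            # False models Community Edition (CE)
    coldstart_requested: bool   # node was started with an explicit cold restart command
    clean_shutdown: bool        # False models an asd crash
    index_present: bool         # persisted index (for example, shmem segments) still exists
    persisted_sprigs: int       # partition-tree-sprigs stored with the persisted index
    configured_sprigs: int      # partition-tree-sprigs in the current configuration
    storage_wiped: bool         # data storage was wiped while the server was down


def cold_restart_reasons(ns: NamespaceState) -> List[str]:
    """Return each listed condition that applies; an empty list means warm restart."""
    reasons = []
    if not ns.enterprise:
        reasons.append("Community Edition always cold restarts")
    if ns.coldstart_requested:
        reasons.append("cold restart was requested explicitly")
    if not ns.clean_shutdown:
        reasons.append("asd crashed")
    if not ns.index_present:
        reasons.append("persisted index is missing")
    if ns.persisted_sprigs != ns.configured_sprigs:
        reasons.append("partition-tree-sprigs changed")
    if ns.storage_wiped:
        reasons.append("namespace data storage was wiped")
    return reasons


def restart_mode(ns: NamespaceState) -> str:
    """Each namespace is evaluated on its own, mirroring independent restarts."""
    return "cold" if cold_restart_reasons(ns) else "warm"


# Two namespaces on the same node can end up in different modes.
unchanged = NamespaceState("test", True, False, True, True, 512, 512, False)
resized = NamespaceState("bar", True, False, True, True, 512, 1024, False)
print(restart_mode(unchanged), restart_mode(resized))
```

Returning the list of reasons rather than a bare flag makes it easier to see which condition triggered the cold restart when several apply at once.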
Factors impacting cold restart
Multiple versions of existing records can exist in namespace storage. This depends on the frequency of updates, and the rate of storage defragmentation.
- Starting with Database 7.0, an in-memory namespace without storage-backed persistence will cold restart from shared memory. This option is much faster than an in-memory namespace cold restarting from a persistence layer.
- Starting with Database 7.0, an in-memory namespace with storage-backed persistence will restart from shared memory (and not the persistence layer) if the number of sprigs per partition is changed. This will be fast.
- You may be able to speed up the building of secondary indexes by setting sindex-startup-device-scan to true.
- Cold restarts will be slowed down if evictions are triggered during restart, which can happen if NSUP has not been keeping up. You can set disable-cold-start-eviction to true for the namespace. For more, see the article How do I speed up cold start eviction?
- Records that have not been durably deleted may be reverted to an older version.
- In strong-consistency enabled namespaces, cold restarts might be affected by the following:
  - All records will be marked as unreplicated (refer to the appeals_tx_remaining stat).
    - Unreplicated records on a single node should not have a significant impact, as they should be quickly resolved (checked against the current master) prior to migrations starting.
    - In case of multiple nodes being cold restarted, partitions that have all their roster replicas on the restarting nodes will incur additional latency, as the initial transaction to an unreplicated record must first re-replicate the record.
  - When recovering from an ungraceful shutdown (power loss, for example), partitions will be marked as untrusted, excluding them from being counted as a valid replica in the roster (for example, they will not count toward a super majority).
    - This would degrade overall availability in case of subsequent network partitions (until migrations complete).
    - In case of multiple nodes cold restarting following an ungraceful shutdown, partitions having all their roster replicas on those nodes will be marked as dead upon re-formation of the complete roster (refer to the dead_partitions metric).
It is always recommended to back up the data on the cluster prior to maintenance, especially when a cold restart would be triggered.
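The two strong-consistency statistics named above, appeals_tx_remaining and dead_partitions, can be checked after a cold restart. The sketch below assumes the statistics arrive as a semicolon-separated `key=value` string; the helper names and the sample string are made up for illustration and are not real server output.

```python
from typing import Dict, List


def parse_info_stats(response: str) -> Dict[str, str]:
    """Split a 'key=value;key=value' statistics string into a dict of strings."""
    stats = {}
    for pair in response.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            stats[key.strip()] = value.strip()
    return stats


def sc_restart_warnings(stats: Dict[str, str]) -> List[str]:
    """Report non-zero values of the two statistics mentioned in this section."""
    warnings = []
    appeals = int(stats.get("appeals_tx_remaining", "0"))
    dead = int(stats.get("dead_partitions", "0"))
    if appeals > 0:
        warnings.append(f"appeals_tx_remaining is {appeals}: unreplicated records are still being resolved")
    if dead > 0:
        warnings.append(f"dead_partitions is {dead}: some partitions have no trusted roster replica")
    return warnings


# Made-up sample input for illustration only.
sample = "objects=1000;appeals_tx_remaining=12;dead_partitions=0"
for line in sc_restart_warnings(parse_info_stats(sample)):
    print(line)
```

Keeping the parsing separate from the check means the same check can run on statistics gathered in any other way, as long as they end up in a dict.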
Avoiding resurrecting records with a cold restart
If the following conditions are met, a cold restart will not resurrect previously deleted and expired records.
- Records are exclusively deleted using the durable delete policy.
- Records with expiration never had their void-time shortened.
- Records have never been evicted.
You risk resurrecting records in the following situations:
- Records have been deleted without the durable delete policy.
- In CE, records have been truncated. Truncation is durable in EE and SE.
- Records with an expiration have had their void-time shortened.
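The two lists above amount to a checklist. As a minimal sketch, the function below (hypothetical name and parameters) turns a namespace's known history into the list of reasons a cold restart might resurrect records. It only encodes the conditions stated in this section and cannot detect them on its own.

```python
from typing import List


def resurrection_risks(non_durable_deletes: bool,
                       truncated_on_ce: bool,
                       void_time_shortened: bool,
                       evictions_occurred: bool) -> List[str]:
    """Return the reasons a cold restart could bring back old records."""
    risks = []
    if non_durable_deletes:
        risks.append("records were deleted without the durable delete policy")
    if truncated_on_ce:
        risks.append("records were truncated on Community Edition")
    if void_time_shortened:
        risks.append("records had their void-time shortened")
    if evictions_occurred:
        risks.append("records were evicted")
    return risks


# A namespace that only ever used durable deletes and never shortened a TTL.
print(resurrection_risks(False, False, False, False))  # prints []
```

An empty result corresponds to the case where all three safe conditions hold; any non-empty result suggests using one of the approaches described next.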
Here are ways to handle a cold restart without bringing back previously deleted records.
Wipe the node's persistent storage before restarting
The namespace must have a replication factor of 2 or more.
- Wipe the namespace persistent storage:
  - When using files for persistence, delete the files associated with the namespace.
  - When using raw SSD partitions, initialize them.
- Introduce the empty node into the cluster.
- Wait for migrations to complete, which will fully repopulate the node with data, before proceeding with the next node.
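The "wait for migrations to complete" step can be automated with a polling loop. This sketch assumes the namespace statistic `migrate_partitions_remaining` (not named on this page) reaches 0 when migrations are done. `fetch_stats` is a hypothetical caller-supplied function that returns the statistics as a dict. Checking a single node's counter is a simplification, and a real procedure would likely check every node in the cluster.

```python
import time
from typing import Callable, Dict


def wait_for_migrations(fetch_stats: Callable[[], Dict[str, str]],
                        poll_seconds: float = 10.0,
                        max_polls: int = 360) -> bool:
    """Poll until no partition migrations remain; return False if polls run out."""
    for _ in range(max_polls):
        # A missing statistic is treated as "not done" rather than as success.
        remaining = int(fetch_stats().get("migrate_partitions_remaining", "-1"))
        if remaining == 0:
            return True
        time.sleep(poll_seconds)
    return False


# Fake statistics source standing in for a real cluster query.
fake = iter([
    {"migrate_partitions_remaining": "42"},
    {"migrate_partitions_remaining": "7"},
    {"migrate_partitions_remaining": "0"},
])
print(wait_for_migrations(lambda: next(fake), poll_seconds=0))  # prints True
```

Only once the function returns True would it be safe, under these assumptions, to proceed with the next node.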
Use the cold-start-empty configuration parameter
The namespace must have a replication factor of 2 or more.
The cold-start-empty configuration parameter instructs the server to ignore the data on disk upon cold restart. Migrations will then rebalance the data across the cluster and repopulate this node. It is therefore necessary to wait for the completion of migrations before proceeding with the next node. After a node restarts with cold-start-empty true, we don't recommend removing this configuration without a full 'cleaning' of the persistent storage as described in the previous section.
Although the data on the persistent storage is ignored, it is not removed from the device. Therefore, a subsequent cold restart would potentially resurrect older records (if the cold-start-empty configuration is set to false). On the other hand, setting cold-start-empty to false would prevent data unavailability (and potential data loss) in the event of multiple nodes cold restarting in close succession (before migrations complete).
Cold restart the node as-is
If the use case can handle the potential resurrection of deleted records, the node can be cold restarted without any further action.
As deleted records may be resurrected, more data than previously existed may be loaded. If evictions are not effective during the cold restart (for example, records do not have any TTL set), the stop-writes-pct threshold could be breached. In that event, the node will abort and not complete the startup.