Cold Start
This page discusses aspects and effects of cold starting the Aerospike Server.
When Aerospike cold restarts
After a shutdown, Aerospike will rebuild the primary index, which Enterprise Edition stores in shared memory, in the following situations:
- Community Edition always cold starts since the fast-start feature is exclusive to the Enterprise Edition.
- After stopping unexpectedly (for example a segmentation fault, out of memory situation, or a kernel freeze).
- After a server reboot (for example due to a kernel upgrade or RAM addition).
- For server versions prior to 3.15.1.3, if the namespace is configured with `data-in-memory true`.
- Cold restart is forced using the `coldstart` command-line option (see the example after this list).
- Changing the value of `partition-tree-sprigs`.
- If applicable, as part of an expected upgrade path - e.g. upgrading to version 4.2.0.2.
See 'When does fast restart NOT happen?' for more details.
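The exact mechanism for forcing a cold start depends on how the server is installed; the following is a hedged sketch assuming a SysV init script that supports a `coldstart` argument, or direct invocation of the `asd` daemon (the paths and flag support should be verified against your installed version):

```bash
# If the packaged init script supports it:
sudo service aerospike coldstart

# Or invoke the daemon directly with the cold-start option
# (path to asd and to the configuration file are assumptions):
sudo /usr/bin/asd --config-file /etc/aerospike/aerospike.conf --cold-start
```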
Impact of cold restart
After a cold restart, the index is repopulated from storage. Multiple copies of a record can exist on the device or file, depending on how frequently the record was updated and whether the blocks holding the older copies have already been defragmented and overwritten with new data. Here are some consequences of a cold restart:
- Aerospike daemon start up takes much longer (compared to a fast restart - Enterprise Edition only). This can be exacerbated if evictions are triggered during the cold restart.
- Non-durably deleted records may be reverted to an older version.
- When running `strong-consistency` enabled namespaces, cold restarts have further potential impacts:
  - All records will be marked as unreplicated (refer to the `appeals_tx_remaining` stat).
    - Unreplicated records on a single node should not have a significant impact as they should be quickly resolved (checking against the current master) prior to migrations starting.
    - In case of multiple nodes being cold restarted, partitions having all their roster replicas on those nodes will incur additional latency, as the initial transaction against an unreplicated record has to re-replicate that record.
  - When recovering from an ungraceful shutdown (power loss for example), partitions will be marked as untrusted, excluding them from being counted as a valid replica in the roster (they will not count toward a super majority, for example).
    - This would degrade overall availability in case of subsequent network partitions (until migrations complete).
    - In case of multiple nodes cold restarting following an ungraceful shutdown, partitions having all their roster replicas on those nodes will be marked as dead upon re-formation of the complete roster (refer to the `dead_partitions` metric). See the example after this list for checking these statistics.
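As an illustration of checking the statistics mentioned above, the following commands query them with `asinfo` and `asadm`; the namespace name `test` is a placeholder and the exact tooling syntax may vary between versions:

```bash
# Per-namespace statistics on one node, filtered for the two metrics above
asinfo -v "namespace/test" | tr ';' '\n' | grep -E "appeals_tx_remaining|dead_partitions"

# Or across the whole cluster with asadm
asadm -e "show statistics namespace like appeals_tx_remaining dead_partitions"
```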
Different use cases to consider
In order to clearly understand the consequences, quantify the type of data and the use case your application runs against the cluster.
The following details can be applied on a per-namespace basis, depending on each namespace's particular usage.
It is always recommended to back up the data on the cluster prior to maintenance, especially when a cold restart would be triggered.
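For example, a namespace can be backed up with `asbackup` before the maintenance window (the namespace name and destination directory below are placeholders):

```bash
# Back up the 'test' namespace to a local directory prior to maintenance
asbackup --namespace test --directory /var/backups/aerospike/test
```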
1 Data expiring naturally or durably deleted, without ever having any record's expiration time shortened
This is the best case scenario. Here are the exact conditions required:
- Records are exclusively deleted using the durable delete policy.
- Records with TTL set never have their expiration time shortened.
- Records have never been evicted.
In this situation, a cold restart will reload the same data that existed previously. Durably deleted or expired data will not be resurrected.
2 Data explicitly non-durably deleted, evicted, or records having their expiration time shortened
Here are the conditions for this scenario:
- Records have been deleted without the durable delete policy.
- In Community Edition, records have been deleted through the `truncate` or `truncate-namespace` info commands (see the example after this list). Note that truncation is durable in the Enterprise Edition.
- Records with a TTL set have had their expiration time shortened.
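For reference, these info commands are typically issued through `asinfo`; a hedged sketch follows, where the namespace and set names are placeholders and the exact syntax can vary across server versions:

```bash
# Truncate a single set within a namespace
asinfo -v "truncate:namespace=test;set=demo"

# Truncate an entire namespace
asinfo -v "truncate-namespace:namespace=test"
```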
Here are ways to handle a cold restart without bringing back previously deleted records.
As stated previously on this page, it is always recommended to back up the data on the cluster prior to maintenance, especially when a cold restart would be triggered.
2a Manually clean up the node's persistent storage before starting it back up
The namespace must have a replication factor of 2 or more.
- Clean up the persistent storage:
- If using a file for persistence, simply delete the file.
- If using a raw device, initialize it. Refer to Initializing Solid State Drives (SSDs).
- Introduce the empty node into the cluster.
- Wait for migrations to complete (which will fully repopulate the data on this node) before proceeding with the next node.
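A minimal sketch of these steps, assuming a file-backed namespace at /opt/aerospike/data/test.dat (or a raw device /dev/nvme0n1) and a namespace named test; adjust paths, device names, namespace, and service manager to your deployment:

```bash
# 1. Stop the node and clean its persistent storage
sudo systemctl stop aerospike            # or: sudo service aerospike stop

# File-backed namespace: remove the data file
sudo rm /opt/aerospike/data/test.dat

# Raw device: discard the device instead (see the SSD initialization guide)
# sudo blkdiscard /dev/nvme0n1

# 2. Start the node empty so it rejoins the cluster
sudo systemctl start aerospike

# 3. Wait for migrations to complete before proceeding with the next node
asinfo -v "namespace/test" | tr ';' '\n' | grep -E "migrate_(rx|tx)_partitions_remaining"
```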
2b Use the cold-start-empty configuration parameter
The namespace must have a replication factor of 2 or more.
The `cold-start-empty` configuration parameter instructs the server to ignore the data on disk upon cold restart. Migrations will then rebalance the data across the cluster and repopulate this node. It is therefore necessary to wait for the completion of migrations before proceeding with the next node. Once a node has been restarted with `cold-start-empty true`, it is typically not recommended to remove this configuration without fully cleaning the persistent storage as described in (2a).
Note that although the data on the persistent storage is ignored, it is not removed from the device. A subsequent cold restart could therefore resurrect older records (if the `cold-start-empty` configuration is set back to false). On the other hand, setting `cold-start-empty` back to false would prevent data unavailability (and potential data loss) in the event of multiple nodes cold restarting within a close interval (before migrations complete).
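For illustration, `cold-start-empty` is a static parameter set in the namespace's storage-engine stanza of aerospike.conf before starting the node; the surrounding stanza below is a simplified placeholder:

```
namespace test {
    replication-factor 2
    memory-size 4G
    storage-engine device {
        device /dev/nvme0n1
        # Ignore existing data on this device during the next cold start
        cold-start-empty true
    }
}
```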
2c Cold restart the node as is
If the use case can handle the potential resurrection of deleted records, the node can be cold restarted without any further action.
As deleted records may be resurrected, much more data than previously existed may be loaded.
In the event of a cold restart causing evictions to occur (disk or memory high water mark breached), the start up time could be drastically increased. Refer to the following article for details on Speeding up cold restart evictions.
If evictions are not effective during the cold restart (for example, if records do not have any TTL set), the `stop-writes-pct` threshold could be breached. In that event, the node will abort and not complete the start up.
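As a hedged pre-restart check, the relevant thresholds can be read from the namespace configuration; the namespace name `test` is a placeholder and these parameter names apply to pre-7.0 server versions:

```bash
# Inspect the eviction and stop-writes thresholds for the namespace
asinfo -v "get-config:context=namespace;id=test" | tr ';' '\n' \
  | grep -E "high-water-disk-pct|high-water-memory-pct|stop-writes-pct"
```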
Coldstart with systemd
If you prefer `systemd` and `systemctl` to manage your services for such needs as cold starts, see Aerospike systemd Daemon Management.