asbackup command-line options
The asbackup utility backs up namespaces, sets, or partitions from an Aerospike cluster to local storage. By default, backups are not parallelized. This can be changed with the --parallel command line option (refer to the --parallel flag under Connection options). If a record is created or updated after its partition was backed up, the backup will not reflect that change.
When run with the --directory option, asbackup creates multiple .asb backup files in the given directory. The backup consists of all created files. Alternatively, --output-file makes asbackup store the complete backup in a single file. If - is specified as the file, asbackup writes the backup to stdout. This allows for pipelines:
asbackup --output-file - [...] | gzip -1 [...] > backup.asb.gz
Required permissions
The asbackup utility requires the read access role or higher. For more information about Aerospike's role-based access control system, see Privileges, permissions, and scopes.
When running in directory mode, each worker thread creates its own backup file. If, after completing the scan, the backup file is not full (that is, it is smaller than the --file-limit size), the file is placed on a queue to be reused by another backup job.
Usage
The -Z or --help option of asbackup gives an overview of all supported command line options.
asbackup --help
The simplest way to run asbackup is to specify the cluster to back up (--host), the namespace to back up (--namespace), and the local directory for the backup files (--directory). For example, a cluster contains a node with IP address 1.2.3.4. To back up the test namespace on this cluster to the directory backup_2015_08_24, issue the following command:
asbackup --host 1.2.3.4 --namespace test --directory backup_2015_08_24
Estimating the backup size
When the --estimate command line option is passed to asbackup (and --directory and --output-file are omitted), asbackup creates a temporary test backup of 10,000 records from the namespace. Based on the observed record sizes, it then outputs an estimate of the average size of a record in the backup. To estimate the total size of the backup file or files, multiply this size by the number of records in the namespace and add 10% for indexes and overhead.
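The arithmetic above can be sketched in shell. This is a minimal illustration, not part of asbackup: the function name estimate_backup_bytes and the sample figures are hypothetical, the average record size is assumed to be in bytes, and the record count must be obtained separately.

```shell
# Hypothetical helper: estimate the total backup size in bytes.
#   $1 = average record size (assumed bytes), as reported by asbackup --estimate
#   $2 = number of records in the namespace (obtained separately)
estimate_backup_bytes() {
    avg_bytes=$1
    record_count=$2
    # Multiply, then add 10% for indexes and overhead (integer arithmetic).
    echo $(( avg_bytes * record_count * 11 / 10 ))
}

# Example with made-up numbers: 200-byte records, 1,000,000 records.
estimate_backup_bytes 200 1000000
```

Integer arithmetic rounds down, which is acceptable here because the result is only a rough planning figure.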
Per-record filters (filter-exp, modified-after, modified-before, no-ttl-only, after-digest, and partition-list) and node-list are not accounted for in the estimate, so using these options has no effect on it.
--parallel and --estimate are mutually exclusive.
Backup-to-file estimate
Before a backup-to-file is run, asbackup runs an estimate on the namespace being backed up and uses it to calculate a 99.9% confidence upper bound on the total size of the backup file. The number of estimate samples taken can be controlled with --estimate-samples, with the default being 10000, just as in normal estimate mode.
Incremental backup
Timestamps can be specified so that only records updated since a given timestamp are backed up. An operational routine can be established to do incremental daily backups. Refer to the --modified-after option under Namespace data selection options.
Additionally, you can run incremental backups using the --partition-list option. For more information, refer to Partition list.
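A daily incremental routine could look like the following sketch. It only prints the command rather than running it. The helper name format_modified_after, the host, the namespace, and the directory name are placeholders, and the sketch assumes GNU date (for the -d @EPOCH form).

```shell
# Hypothetical helper: format an epoch timestamp the way --modified-after
# expects (YYYY-MM-DD_HH:MM:SS, interpreted in the system's local timezone).
# Assumes GNU date, which accepts "-d @EPOCH".
format_modified_after() {
    date -d "@$1" '+%Y-%m-%d_%H:%M:%S'
}

# Select records modified in roughly the last 24 hours.
since=$(format_modified_after "$(( $(date +%s) - 86400 ))")

# Print the command instead of running it; host, namespace, and directory
# are placeholders.
echo asbackup --host 1.2.3.4 --namespace test \
    --directory "incremental_$(date +%Y_%m_%d)" --modified-after "$since"
```

In practice the cutoff would more likely be the recorded start time of the previous backup than a fixed 24-hour window, so that no updates fall between two runs.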
Connection options
Option | Default | Description |
---|---|---|
-h HOST1:TLSNAME1:PORT1,... or --host HOST1:TLSNAME1:PORT1,... | 127.0.0.1 | The host that acts as the entry point to the cluster. Any of the cluster nodes can be specified. The remaining cluster nodes will be automatically discovered. |
-p PORT or --port PORT | 3000 | Port to connect to. |
-U USER or --user USER | - | User name with read permission. Mandatory if the server has security enabled. |
-P PASSWORD or --password | - | Password to authenticate the given user. The first form passes the password on the command line. The second form prompts for the password. |
-A or --auth | INTERNAL | Set authentication mode when user and password are defined. Modes are (INTERNAL, EXTERNAL, EXTERNAL_INSECURE, PKI) and the default is INTERNAL. This mode must be set EXTERNAL when using LDAP. |
-l or --node-list ADDR1:TLSNAME1:PORT1,... | localhost:3000 | While --host and --port automatically discover all cluster nodes, --node-list can be used to back up a subset of cluster nodes. This is done by first calculating the subset of partitions owned by the listed nodes, and then backing up that list of partitions. This option is mutually exclusive with --partition-list and --after-digest . |
--parallel N | 1 | Maximum number of scan calls to run in parallel. If only one partition range is given, or the entire namespace is being backed up, the range of partitions will be evenly divided by this number to be processed in parallel. Otherwise, each filter cannot be parallelized individually, so you may only achieve as much parallelism as there are partition filters. |
--tls-enable | disabled | Indicates a TLS connection should be used. |
-S or --services-alternate | false | Use to connect to alternate-access-address when the cluster nodes publish IP addresses through access-address which are not accessible over WAN and alternate IP addresses accessible over WAN through alternate-access-address . |
Timeout options
Option | Default | Description |
---|---|---|
--socket-timeout MS | 10000 | Socket timeout in milliseconds. If this value is 0, it is set to total-timeout. If both are 0, there is no socket idle time limit. |
--total-timeout MS | 0 | Total socket timeout in milliseconds. Default is 0, that is, no timeout. |
--max-retries N | 5 | Maximum number of retries before aborting the current transaction. |
--sleep-between-retries MS | 0 | The amount of time to sleep between retries. |
TLS options
Option | Default | Description |
---|---|---|
--tls-cafile=TLS_CAFILE | - | Path to a trusted CA certificate file. |
--tls-capath=TLS_CAPATH | - | Path to a directory of trusted CA certificates. |
--tls-name=TLS_NAME | - | The default TLS name used to authenticate each TLS socket connection. Note: this must also match the cluster name. |
--tls-protocols=TLS_PROTOCOLS | - | Set the TLS protocol selection criteria. The format is the same as Apache's SSLProtocol directive. If not specified, asbackup uses TLSv1.2 if supported. Otherwise it uses -all +TLSv1 . |
--tls-cipher-suite=TLS_CIPHER_SUITE | - | Set the TLS cipher selection criteria. The format is the same as OpenSSL's Cipher List Format. |
--tls-keyfile=TLS_KEYFILE | - | Path to the key for mutual authentication (if the Aerospike cluster supports it). |
--tls-keyfile-password=TLS_KEYFILE_PASSWORD | - | Password to load a protected TLS keyfile. Can be one of the following: 1) Environment variable: env:VAR 2) File: file:PATH 3) String: PASSWORD. The user is prompted on the command line if --tls-keyfile-password is specified and no password is given. |
--tls-certfile=TLS_CERTFILE <path> | - | Path to the chain file for mutual authentication (if the Aerospike cluster supports it). |
--tls-cert-blacklist <path> | - | Path to a certificate blocklist file. The file should contain one line for each blocklisted certificate. Each line starts with the certificate serial number expressed in hex. Each entry may optionally specify the issuer name of the certificate (serial numbers are only required to be unique per issuer). Example: 867EC87482B2 /C=US/ST=CA/O=Acme/OU=Engineering/CN=TestChainCA |
--tls-crl-check | - | Enable CRL checking for the leaf certificate. An error occurs if a valid CRL file cannot be found in TLS_CAPATH . |
--tls-crl-checkall | - | Enable CRL checking for the entire certificate chain. An error occurs if a valid CRL file cannot be found in TLS_CAPATH . |
--tls-log-session-info | - | Enable logging session information for each TLS connection. |
TLS_NAME is only used when connecting to a TLS-enabled server.
The following example creates a backup with the following parameters:
- Cluster nodes 1.2.3.4 and 5.6.7.8
- Port 3000
- Namespace test
- Output directory backup_2015_08_24
- TLS enabled
HOST is "HOST1:TLSNAME1:PORT1,...".
asbackup --host 1.2.3.4:cert1:3000,5.6.7.8:cert2:3000 --namespace test --directory backup_2015_08_24 --tls-enable --tls-cafile /cluster_name.pem --tls-protocols TLSv1.2 --tls-keyfile /cluster_name.key --tls-certfile /cluster_name.pem
Output options
Option | Default | Description |
---|---|---|
-d PATH or --directory PATH | - | Directory to store the .asb backup files in. If the directory does not exist, it will be created before use. Mandatory, unless --output-file or --estimate is given. |
-o PATH or --output-file PATH | - | The single file to write the backup to. - means stdout . Mandatory, unless --directory or --estimate is given. |
-q DESIRED-PREFIX or --output-file-prefix DESIRED-PREFIX | - | Must be used with the --directory option. A desired prefix for all output files. |
-e or --estimate | - | Specified in lieu of --directory or --output-file , estimates the average size of a single record in the backup file. Useful for estimating the expected size of a backup before actually starting it. Multiply the returned value by the number of records in the namespace and add 10% for overhead. This option is mutually exclusive to --remove-artifacts and --continue . |
--estimate-samples N | 10000 | Sets the number of record samples to take in a backup estimate. This also sets the number of estimate samples taken for the estimate run before backup-to-file. |
-F LIMIT or --file-limit LIMIT | 250 MiB | File size limit (in MiB) for --directory . If a .asb backup file crosses this size threshold, asbackup will switch to a new file. |
-r or --remove-files | - | Clear directory or remove output file. By default, asbackup refuses to write to a non-empty directory or to overwrite an existing backup file. This option clears the given --directory or removes an existing --output-file . Mutually exclusive to --continue . |
--remove-artifacts | - | Clear directory or remove output file, like --remove-files , without running a backup. This option is mutually exclusive to --continue and --estimate . |
-C or --compact | - | Do not base-64 encode BLOB values. For better readability of backup files, asbackup base-64 encodes BLOB values by default. This option disables the encoding step, which saves space in the backup file. However, be prepared to encounter odd-looking binary data in your backup files. |
-N BANDWIDTH or --nice BANDWIDTH | - | Throttles asbackup 's write operations to the backup file(s) to not exceed the given bandwidth in MiB/s. Effectively also throttles the scan on the server side as asbackup refuses to accept more data than it can write. |
-y ENCRYPTION-ALG or --encrypt ENCRYPTION-ALG | none | The encryption algorithm to be used on backup files as they are written. The options available are aes128 and aes256 . This option must be accompanied by either --encryption-key-file or --encryption-key-env . Refer to compression and encryption. |
-z COMPRESSION-ALG or --compress COMPRESSION-ALG | none | The compression algorithm to be used on backup files as they are written. The only available option is zstd . Refer to compression and encryption. |
--compression-level N | 3 | The zstd compression level to be used. Refer to the zstd manual for more information. |
Namespace data selection options
Option | Default | Description |
---|---|---|
-n NAMESPACE or --namespace NAMESPACE | - | Namespace to back up. Mandatory. |
-s SETS or --set SETS | All sets | The set(s) to back up. Accepts a comma-separated list of sets (version 3.6.1+). Starting with asbackup 3.9.0, server version 5.2 or later is required for multi-set backup. Note: multi-set backup cannot be used with --filter-exp . |
-B BIN1,BIN2,... or --bin-list BIN1,BIN2,... | All bins | The bins to back up. |
-x or --no-bins | - | Back up only record metadata (digest, TTL, generation count, key). WARNING: No data (bin contents) is backed up. Also, this is unrelated to the single-bin option in the Aerospike server configuration file. |
-R or --no-records | - | Do not back up any record data (metadata or bin data). By default, asbackup includes record data, secondary index definitions, and UDF modules. |
-I or --no-indexes | - | Do not back up any secondary index definitions. |
-u or --no-udfs | - | Do not back up any UDF modules. |
-M or --max-records N | 0 = all records. | An approximate limit for the number of records to process. Available in server 4.9 and above. Note: this option is mutually exclusive to --partition-list and --after-digest . |
-a YYYY-MM-DD_HH:MM:SS or --modified-after YYYY-MM-DD_HH:MM:SS | - | Back up data with last-update-time after the specified date-time. The system's local timezone applies. Available in server 3.12 and later. Starting with asbackup 3.9.0, server version 5.2 or later is required. |
-b YYYY-MM-DD_HH:MM:SS or --modified-before YYYY-MM-DD_HH:MM:SS | - | Back up data with last-update-time before the specified date-time. The system's local timezone applies. Available in server 3.12 and later. Starting with asbackup 3.9.0, server version 5.2 or later is required. |
--no-ttl-only | - | Include only records that have no TTL; that is, persistent records. Starting with asbackup 3.9.0, server version 5.2 or later is required. |
Partition scanning backup options
Partition list
-X, --partition-list LIST
Back up list of partition filters. Partition filters can be ranges, individual partitions, or records after a specific digest within a single partition.
This option is mutually exclusive with the -D, --after-digest option described in After specific digest, --node-list, and --max-records.
Default number of partitions to back up: 0 to 4095: all partitions.
- LIST format: FILTER1,FILTER2,...
- FILTER format: BEGIN-PARTITION[-PARTITION-COUNT]|DIGEST
- BEGIN-PARTITION: 0 to 4095.
- Either the optional PARTITION-COUNT: 1 to 4096. Default: 1
- Or the optional DIGEST: Base64-encoded string of the desired digest to start at in the specified partition.
Note: when using multiple partition filters, each partition filter is a single scan call and cannot be parallelized with the --parallel option. To achieve more parallelism, either break up the partition filters into smaller ones or run the backup with a single partition filter.
When backing up a single partition range, the range is automatically divided into --parallel segments of near-equal size, which are backed up in parallel.
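To make "near-equal segments" concrete, the sketch below splits one partition range into N pieces and prints them in the BEGIN-COUNT filter notation. It illustrates one possible even split only; the exact division asbackup performs internally is not described here, and the function name split_range is hypothetical.

```shell
# Hypothetical helper: split a partition range into N near-equal segments.
#   $1 = first partition, $2 = number of partitions, $3 = number of segments
# Output: comma-separated BEGIN-COUNT filters.
split_range() {
    begin=$1
    count=$2
    n=$3
    base=$(( count / n ))   # minimum size of each segment
    rem=$(( count % n ))    # the first "rem" segments get one extra partition
    out=""
    i=0
    while [ "$i" -lt "$n" ]; do
        size=$base
        if [ "$i" -lt "$rem" ]; then size=$(( size + 1 )); fi
        if [ "$size" -gt 0 ]; then
            out="${out:+$out,}${begin}-${size}"
            begin=$(( begin + size ))
        fi
        i=$(( i + 1 ))
    done
    printf '%s\n' "$out"
}

# All 4096 partitions split four ways.
split_range 0 4096 4
```

The same output could be passed to --partition-list to break one large filter into several smaller ones by hand.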
Examples
-X 361
- Back up only partition 361
-X 361,529,841
- Back up partitions 361, 529, and 841
-X 361-481
- Back up 481 partitions, starting at 361 (that is, partitions 361 through 841)
-X VSmeSvxNRqr46NbOqiy9gy5LTIc=
- Back up all records after the digest VSmeSvxNRqr46NbOqiy9gy5LTIc= in its partition (which in this case is partition 2389)
-X 0-1000,2222,EjRWeJq83vEjRRI0VniavN7xI0U=
- Back up partitions 0 to 999 (1000 partitions starting from 0)
- Then back up partition 2222
- Then back up all records after the digest EjRWeJq83vEjRRI0VniavN7xI0U= in its partition
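Because the number after the hyphen is a partition count rather than an end partition, it can help to convert a numeric filter into the inclusive range it covers. The following sketch does that. The function name filter_to_range is hypothetical, and digest filters are deliberately rejected because their partition cannot be derived with simple arithmetic.

```shell
# Hypothetical helper: convert a numeric partition filter (BEGIN or
# BEGIN-COUNT) into the inclusive range FIRST-LAST that it covers.
filter_to_range() {
    case $1 in
        ''|*[!0-9-]*) return 1 ;;   # empty input or digest filter: not handled
        *-*) begin=${1%%-*}; count=${1#*-} ;;
        *)   begin=$1; count=1 ;;   # PARTITION-COUNT defaults to 1
    esac
    printf '%s-%s\n' "$begin" "$(( begin + count - 1 ))"
}

filter_to_range 361-481   # 481 partitions starting at partition 361
filter_to_range 0-1000    # 1000 partitions starting at partition 0
```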
After specific digest
-D, --after-digest DIGEST
Back up records after the specified record digest in that record's partition and all succeeding partitions.
This option is mutually exclusive with the -X, --partition-list option described in Partition list, --max-records, and --node-list.
DIGEST format: Base64-encoded string of the desired digest. This is the same encoding used for digests in backup files, so you can copy and paste digest identifiers from backup files to use as the command-line argument with -D.
Example
-D EjRWeJq83vEjRRI0VniavN7xI0U=
Filter expression
Backups can be made of only a subset of data matching a provided Aerospike Expression. Pass the base-64 encoding of the filter expression with --filter-exp. The encoding can be generated with the C client (as_exp_build_b64) or the Java client (Expression.getBytes()).
This option is mutually exclusive with multi-set backup (that is, --set with more than one set specified).
Example
To build an expression that filters for bin "name" = "bob", first build the expression in the C client and print out its base-64 encoding:
as_exp_build_b64(b64_exp, as_exp_cmp_eq(as_exp_bin_str("name"), as_exp_str("bob")));
printf("%s\n", b64_exp);
This should print kwGTUQOkbmFtZaQDYm9i. Then, to run a backup with this filter expression, run:
asbackup --filter-exp kwGTUQOkbmFtZaQDYm9i ...
Backup resumption
Option | Default | Description |
---|---|---|
--continue STATE-FILE | disabled | Enables the resumption of an interrupted backup from provided state file. All other command line arguments should match those used in the initial run (except --remove-files , which is mutually exclusive with --continue ). |
--state-file-dst | see below | Specifies where to save the backup state file to. If this points to a directory, the state file is saved within the directory using the same naming convention as backup-to-directory state files. If this does not point to a directory, the path is treated as a path to the state file. |
Default backup state file location
For backups to a file, the backup state is saved to a file with the same name and location as the backup file, with .state appended as a suffix. For backups to a directory, the backup state is saved in the directory with the name NAMESPACE.asb.state. If --output-file-prefix is supplied, its value is used in place of NAMESPACE.
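The naming rules above can be expressed as two small shell helpers, which may be handy when scripting a resume with --continue. Both function names are hypothetical, and the paths in the examples are placeholders.

```shell
# Hypothetical helpers mirroring the default state file naming rules.

# Backup to a single file: append ".state" to the backup file path.
state_file_for_output_file() {
    printf '%s.state\n' "$1"
}

# Backup to a directory: DIR/NAMESPACE.asb.state, or DIR/PREFIX.asb.state
# when an --output-file-prefix value is passed as the optional third argument.
state_file_for_directory() {
    dir=$1
    namespace=$2
    prefix=${3:-$namespace}
    printf '%s/%s.asb.state\n' "${dir%/}" "$prefix"
}

state_file_for_output_file backup.asb
state_file_for_directory backup_2015_08_24 test
```

These helpers cover only the default location. If --state-file-dst is given, the state file is written where that option points instead.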
Backup to S3
To back up files to Amazon S3, prefix file and directory names with s3://BUCKET/KEY, where BUCKET is the name of the S3 bucket to upload to or download from, and KEY is either the key of the object or the prefix of the files in the S3 "directory". If using the default S3 endpoint, --s3-region REGION must be set to the region where the bucket is located. If using another endpoint, specify that endpoint with --s3-endpoint-override URL.
Files are uploaded in parts asynchronously. The maximum number of simultaneous asynchronous upload parts across all threads is controlled with --s3-max-async-uploads. Each connection to S3 is throttled, so this number may need adjustment to maximize throughput.
An S3 multipart upload can have at most 10,000 parts. Each part must be between 5MB and 5GB, except for the last part, which has no lower bound. When backing up to a directory, the value of --file-limit is used to calculate the part size (that is, max(file-limit / 10000, 5MB)). When backing up to a file, the estimate run before starting the backup is used in the same formula to calculate the upload part size. The upload part size may also be overridden with --s3-min-part-size, though this is unlikely to be needed in practice.
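As a worked example of the part-size formula, the sketch below computes max(file-limit / 10000, 5MB) for a given --file-limit value. It assumes that the 5MB floor means 5 MiB and that --file-limit is given in MiB. The function name s3_part_size_bytes is hypothetical and the result is illustrative only.

```shell
MIB=1048576   # assumption: "MB" in the formula is treated as MiB here

# Hypothetical helper: part size in bytes for a given --file-limit (in MiB).
s3_part_size_bytes() {
    file_limit_bytes=$(( $1 * MIB ))
    part=$(( file_limit_bytes / 10000 ))   # spread the file over 10,000 parts
    min=$(( 5 * MIB ))                     # lower bound on part size
    if [ "$part" -lt "$min" ]; then part=$min; fi
    echo "$part"
}

s3_part_size_bytes 250      # default file limit: the 5 MiB floor applies
s3_part_size_bytes 100000   # very large file limit: file-limit / 10000 wins
```

With the default 250 MiB file limit the floor always applies, which is consistent with the note that --s3-min-part-size rarely needs to be set by hand.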
Required permissions
asbackup requires certain permissions for successful use with Amazon S3. The IAM JSON policy should include the following elements. Replace backup-bucket with the name of the S3 bucket you are using for the backup.
{
  "Statement": [
    {
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::backup-bucket"
      ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::backup-bucket/*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
S3 backup options
Option | Default | Description |
---|---|---|
--s3-region REGION | - | Sets the S3 region of the bucket being uploaded to or downloaded from. Must be set if using the default S3 endpoint. |
--s3-endpoint-override URL | - | Sets the S3 endpoint to use. Must point to an S3-compatible storage system. |
--s3-profile PROFILE_NAME | default | Sets the S3 profile to use for credentials. |
--s3-min-part-size SIZE IN MEGABYTES | - | An override for the minimum S3 part size to use for file uploads. By default, this size is calculated based on the expected backup file size (found either with the value of --file-limit for backup-to-directory or from the backup estimate run before backup-to-file). |
--s3-max-async-downloads N | 32 | The maximum number of simultaneous download requests from S3. |
--s3-max-async-uploads N | 16 | The maximum number of simultaneous upload requests to S3. |
--s3-connect-timeout MILLISECONDS | 1000 | The AWS S3 client's connection timeout in milliseconds. Equivalent to cli-connect-timeout in the AWS CLI, or connectTimeoutMS in the aws-sdk-cpp client configuration. |
--s3-log-level LEVEL | Fatal | The log level of the AWS S3 C++ SDK. The possible levels are, from least to most granular. |
Example
To back up all records from a namespace test to an S3 bucket test-bucket in region us-west-1 under directory test-dir, run:
asbackup -n test -d s3://test-bucket/test-dir --s3-region us-west-1
Configuration file options
asbackup can be configured by using tools configuration files. Refer to Aerospike Tools Configuration for more details. The following options affect configuration file behavior.
Option | Default | Description |
---|---|---|
--no-config-file | disabled | Do not read any configuration file. The configuration file options --no-config-file and --only-config-file are mutually exclusive. |
--instance SUFFIX | - | In the configuration file, you can specify a group of clusters that share a common suffix with the --instance option. Refer to Instances for more information. |
--config-file PATH | - | Read this file after default configuration file. |
--only-config-file PATH | - | Read only this configuration file. The configuration file options --no-config-file and --only-config-file are mutually exclusive. |
Other options
Option | Default | Description |
---|---|---|
-v or --verbose | disabled | Output considerably more information about the running backup. |
-m or --machine PATH | - | Output machine-readable status updates to the given path, typically a FIFO. |
-L or --records-per-second RPS | 0 | Available only for server version 4.7 and later. Limit total returned records per second (RPS). If RPS is zero (the default), a records-per-second limit is not applied. |
Resource usage
See how to estimate asbackup resource usage at asbackup and asrestore resource usage.