
Aerospike Benchmark (asbench)

The Aerospike Benchmark tool measures the performance of an Aerospike cluster. It can mimic real-world workloads with configurable record structures, various access patterns, and UDF calls.

asbench is written in C and replaces the older Java benchmark tool.

Usage

The --help option of asbench gives an overview of all supported command line options.

asbench --help

Connection Options

Option | Default | Description
-h or --hosts <host1>[:<tlsname1>][:<port1>][,...] | 127.0.0.1 | List of seed hosts. The tlsname is only used when connecting to a TLS-enabled server. If the port is not specified, the default port is used. IPv6 addresses must be enclosed in square brackets.
-p or --port <port> | 3000 | Set the default port on which to connect to Aerospike.
-U or --user <user> | - | User name. Mandatory if security on the server is enabled.
-P[<password>] | - | User's password for Aerospike servers that require authentication. If -P is set, the actual password is optional. If the password is not given, the user is prompted on the command line. If the password is given, it must be provided directly after -P with no intervening space (i.e. -Pmypass).
--auth <mode> | INTERNAL | Set the authentication mode when user/password is defined. Modes are [INTERNAL, EXTERNAL, EXTERNAL_INSECURE, PKI]. This mode must be set to EXTERNAL when using LDAP.
-tls or --tls-enable | disabled | Use TLS/SSL sockets.
--services-alternate | false | Use to connect to alternate-access-address when the cluster nodes publish IP addresses through access-address, which are not accessible over WAN, and alternate IP addresses accessible over WAN through alternate-access-address.
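
For example, a run against a cluster with security enabled might look like the following; the host addresses and user name here are only illustrative placeholders:

asbench --hosts 1.2.3.4:3000,5.6.7.8:3000 --namespace test -U benchuser -P --auth INTERNAL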

TLS Options

Option | Default | Description
--tls-cafile=TLS_CAFILE | - | Path to a trusted CA certificate file.
--tls-capath=TLS_CAPATH | - | Path to a directory of trusted CA certificates.
--tls-name=TLS_NAME | - | The default TLS name used to authenticate each TLS socket connection. Note: this must also match the cluster name.
--tls-protocols=TLS_PROTOCOLS | - | Set the TLS protocol selection criteria. This format is the same as Apache's SSLProtocol documented at https://httpd.apache.org/docs/current/mod/mod_ssl.html#sslprotocol . If not specified, the benchmark uses '-all +TLSv1.2' if it supports TLSv1.2, otherwise it uses '-all +TLSv1'.
--tls-cipher-suite=TLS_CIPHER_SUITE | - | Set the TLS cipher selection criteria. The format is the same as OpenSSL's Cipher List Format documented at https://www.openssl.org/docs/manmaster/man1/ciphers.html .
--tls-keyfile=TLS_KEYFILE | - | Path to the key for mutual authentication (if the Aerospike cluster supports it).
--tls-keyfile-password=TLS_KEYFILE_PASSWORD | - | Password to load a protected tls-keyfile. It can be one of the following: 1) Environment variable: 'env:<VAR>'; 2) File: 'file:<PATH>'; 3) String: 'PASSWORD'. The user is prompted on the command line if --tls-keyfile-password is specified and no password is given.
--tls-certfile=TLS_CERTFILE <path> | - | Path to the chain file for mutual authentication (if the Aerospike cluster supports it).
--tls-cert-blacklist <path> | - | Path to a certificate blacklist file. The file should contain one line for each blacklisted certificate. Each line starts with the certificate serial number expressed in hex. Each entry may optionally specify the issuer name of the certificate (serial numbers are only required to be unique per issuer). Example: 867EC87482B2 /C=US/ST=CA/O=Acme/OU=Engineering/CN=TestChainCA
--tls-crl-check | - | Enable CRL checking for the leaf certificate. An error occurs if a valid CRL file cannot be found in tls_capath.
--tls-crl-check-all | - | Enable CRL checking for the entire certificate chain. An error occurs if a valid CRL file cannot be found in tls_capath.
--tls-log-session-info | - | Log TLS connected session info.
--tls-login-only | - | Use TLS for node login only.

The tlsname is only used when connecting to a TLS-enabled server. The following example runs the default benchmark on a cluster of nodes 1.2.3.4 and 5.6.7.8, using the default Aerospike port of 3000, with TLS configured.

HOST is "host1[:tlsname1][:port1],...".

asbench --hosts 1.2.3.4:cert1:3000,5.6.7.8:cert2:3000 --namespace test --tls-enable --tls-cafile /cluster_name.pem --tls-protocols TLSv1.2 --tls-keyfile /cluster_name.key --tls-certfile /cluster_name.pem

Global Options

Option | Default | Description
-z or --threads <count> | 16 | The number of threads used to perform synchronous read/write commands.
--compress | disabled | Enable binary data compression through the Aerospike client. Internally, this sets the compression policy to true.
--socket-timeout <ms> | 30000 | Read/Write socket timeout in milliseconds.
--read-socket-timeout <ms> | 30000 | Read socket timeout in milliseconds.
--write-socket-timeout <ms> | 30000 | Write socket timeout in milliseconds.
-T or --timeout <ms> | 0 | Read/Write total timeout in milliseconds.
--read-timeout <ms> | 0 | Read total timeout in milliseconds.
--write-timeout <ms> | 0 | Write total timeout in milliseconds.
--max-retries <number> | 1 | Maximum number of retries before aborting the current transaction.
-d or --debug | disabled | Run benchmark in debug mode.
-S or --shared | disabled | Use shared memory cluster tending.
-C or --replica {master,any,sequence} | master | Which replica to use for reads.
-N or --read-mode-ap {one,all} | one | Read mode for AP (availability) namespaces.
-B or --read-mode-sc {session,linearize,allowReplica,allowUnavailable} | session | Read mode for SC (strong consistency) namespaces.
-M or --commit-level {all,master} | all | Write commit guarantee level.
-Y or --conn-pools-per-node <num> | 1 | Number of connection pools per node.
-D or --durable-delete | disabled | All transactions set the durable-delete flag, which indicates to the server that if the transaction results in a delete, a tombstone should be generated for the deleted record.
-c or --async-max-commands <command count> | 50 | Maximum number of concurrent asynchronous commands that are active at any time.
-W or --event-loops <thread count> | 1 | Number of event loops (or selector threads) when running in asynchronous mode.
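
As an illustration, the following run combines several of these options to use 32 threads, tighter timeouts, and asynchronous mode with 4 event loops; the values are arbitrary and only show the syntax:

asbench --namespace test -z 32 --socket-timeout 5000 -T 10000 --max-retries 2 --async -c 200 -W 4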

Namespace and Record Format Options

Option | Default | Description
-n or --namespace <ns> | test | The Aerospike namespace to perform all operations under.
-s or --set <set name> | testset | The Aerospike set to perform all operations in.
-b or --bin <bin name> | testbin | The base name to use for bins. The first bin is <bin_name>, the second is <bin_name>_2, and so on.
-K or --start-key <key> | 0 | Set the starting value of the working set of keys. If using an 'insert' workload, start-key indicates the first value to write. Otherwise, start-key indicates the smallest value in the working set of keys.
-k or --keys <count> | 1000000 | Set the number of keys the client is dealing with. If using an 'insert' workload (detailed below), the client writes this number of keys, starting from value = start-key. Otherwise, the client reads and updates randomly across the values between start-key and start-key + num_keys. start-key can be set using '-K' or '--start-key'.
-o or --object-spec <obj_spec> | I4 | Describes a comma-separated bin specification. See object spec below for more details.
--compression-ratio <ratio> | 1 | Sets the desired compression ratio for binary data. Causes the benchmark tool to generate binary data which will roughly compress by this proportion. Note: this is only applied to B<n> binary data, not any of the other types of record data.
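
For example, the following run (all values illustrative) inserts 10 million records into namespace test, set testset, with two bins named mybin and mybin_2 holding an 8-byte random integer and a 32-character random string:

asbench --namespace test --set testset --bin mybin --workload I --start-key 0 --keys 10000000 --object-spec "I8,S32"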

Object Spec

The object spec is a flexible way to describe how to structure records being written to the database. It is a comma-separated list of bin specs, and each bin spec is one of the following:

Variable Scalars:

Type | Format | Description
Boolean | b | A random boolean bin/value.
Integer | I<n> | A random integer with the lower n bytes randomized (and the rest remaining 0). n can range from 1 to 8. Note: the nth byte is guaranteed to not be 0, except in the case n=1.
Double | D | A random double bin/value (8 bytes).
String | S<n> | A random string of length n of either lowercase letters a-z or numbers 0-9.
Binary Data | B<n> | Random binary data of length n. Note: if --compression-ratio is set, only the first ratio * n bytes are random, and the rest are 0.

Constant Scalars:

Type | Format | Example
Const Boolean | true/T or false/F | true
Const Integer | A decimal, hex (0x...), or octal (0...) number | 123
Const Double | A decimal number containing a decimal point | 123.456
Const String | A backslash-escaped string enclosed in double quotes | "this -> \" is a double quote\n"

Collection Bins:

Type | Format | Notes
List | [<bin_spec>,...] | A list of one or more bin specs separated by commas.
Map | {<scalar_bin_spec>:<bin_spec>,...} | A list of one or more mappings from a scalar bin spec (i.e. anything but a list or map) to a bin spec. These describe the key-value pairs that the map will contain.

Multipliers

A multiplier is a positive integer constant followed by a "*" and placed before a bin spec.

In the root-level object spec, multipliers indicate how many times to repeat a bin spec across separate bins. For example, the following object specs are equivalent:

I, I, I, I, S10, S10, S10         = 4*I, 3*S10
123, 123, 123, "string", "string" = 3*123, 2*"string"

In a list, multipliers indicate how many times to repeat a bin spec in the list. The following are equivalent:

[I, I, I, I, S10, S10, S10] = [4*I, 3*S10]

In a map, multipliers must precede variable scalar keys, and they indicate how many unique key-value pairs of the given format to insert into the map. Multipliers may not precede const key bin specs or value bin specs in a key-value mapping. The following are equivalent:

{I:B10, I:B10, I:B10} = {3*I:B10}
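
Putting these pieces together, a single object spec can mix scalars, collections, and multipliers. The following example is purely illustrative; it writes five 4-byte integer bins, one 12-character string bin, one list bin, and one map bin with five random string keys:

asbench --namespace test --workload I --keys 100000 --object-spec "5*I4,S12,[3*I8,B100],{5*S10:D}"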

Workloads

There are four main types of workloads, each defining a way in which the benchmark tool interacts with the Aerospike database. The four types are:

  • I) Linear Insert: Runs over the range of keys specified and inserts a record with that key.
  • RU,<read_pct>) Random Read/Update: Randomly picks keys, and either writes a record with that key or reads a record from the database with that key, with probability according to the given read percentage.
    • <read_pct> can be anything between 0 and 100. 0 would mean to only do writes, and 100 to only do reads.
  • RUF,<read_pct>,<write_pct>) Random Read/Update/Function: Same as RU, except it may also perform an apply command on the random key with a given UDF function.
    • The percentage of operations that are function calls (i.e. UDFs) is 100 - <read_pct> - <write_pct>. This value must not be negative, which is checked for at initialization.
  • DB) Delete bins: Same as I, but deletes the record with the given key from the database.
info

In order for DB to delete entire records, it must delete every bin that the record contains. Since bins are named based on their position in the object spec, typically you want to make sure when running this workload you are using the same object spec you used to generate the records being deleted.

  • DB uses write-bins to determine which bins to delete (by default, all bins), so if you only want to delete a subset of bins, you can use write-bins to select which bins to delete.
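
For example, a typical sequence (with illustrative key counts) is a linear insert to load the data, followed by a timed read-heavy random workload over the same keys:

asbench --namespace test --workload I --keys 1000000
asbench --namespace test --workload RU,80 --keys 1000000 --duration 60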

Workload Options

Option | Default | Description
--read-bins | all bins | Specifies which bins from the object-spec to load from the database on read transactions. Must be given as a comma-separated list of increasing bin numbers, starting from 1 (e.g. "1,3,4,6").
--write-bins | all bins | Specifies which bins from the object-spec to generate and store in the database on write transactions. Must be given as a comma-separated list of bin numbers, starting from 1 (e.g. "1,3,4,6").
-R or --random | disabled | Use dynamically generated random bin values for each write transaction instead of fixed values (one per thread) created at the beginning of the workload.
-t or --duration <seconds> | 10 for RU and RUF workloads, 0 for I and DB workloads | Specifies the minimum amount of time the benchmark will run for. For random workloads with no finite amount of work to be done, this value must be above 0 for anything to happen. For workloads with a finite amount of work, like linear insertion/deletion, this value should be set to 0.
-w or --workload <workload> | RU,50 | Desired workload. See the workload types above.
-g or --throughput <tps> | 0 | Throttle transactions per second to a maximum value. If tps is zero, throughput is not throttled.
--batch-size <size> | 1 | Enable batch mode with a number of records to process in each batch get call. Batch mode is valid only for RU or RUF workloads. If the batch size is 1, batch mode is disabled.
-a or --async | disabled | Enable asynchronous mode, which uses the asynchronous variant of every Aerospike C client method for transactions.
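
For instance, a throttled, batched, read-mostly workload might be configured as follows; the values are only illustrative:

asbench --namespace test --object-spec "I4,S20" --workload RU,95 --duration 120 --throughput 5000 --batch-size 16 --random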

Workload Stages

Multiple workloads may be run in sequence using a workload stage config file, written in YAML. The config file must consist only of a list of workload stages in the following format:

- stage: 1
  # required arguments
  workload: <workload type>
  # optional arguments
  duration: <seconds>
  tps: max possible with 0 (default), or specified transactions per second
  object-spec: Object spec for the stage. Otherwise, inherits from the previous
               stage, with the first stage inheriting the global object spec.
  key-start: Key start, otherwise inheriting from the global context
  key-end: Key end, otherwise inheriting from the global context
  read-bins: Which bins to read if the workload includes reads
  write-bins: Which bins to write to if the workload includes writes
  pause: max number of seconds to pause before the stage starts. Waits a random
         number of seconds between 1 and the pause.
  async: when true/yes, uses asynchronous commands for this stage. Default is false
  random: when true/yes, randomly generates new objects for each write. Default is false
  batch-size: specifies the batch size of reads for this stage. Default is 1
- stage: 2
  ...

Each stage must begin with stage: <stage number>, where stage number is the position of the stage in the list. The stages must appear in order.

When arguments say they inherit from the global context, the value they inherit either comes from a command line argument, or is the default value if no command line argument for that value was given.
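
As a concrete sketch, a two-stage file following this format might look like the example below; all values are illustrative, and the file is assumed to be passed to asbench via its --workload-stages option (not listed in the tables in this section). The first stage loads data with a linear insert, and the second runs a throttled read/update workload over the same keys:

- stage: 1
  workload: I
  key-start: 0
  key-end: 1000000
  object-spec: I4,S20
- stage: 2
  workload: RU,80
  duration: 60
  tps: 5000
  batch-size: 8
  random: true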

Latency Histograms

There are multiple ways to record latencies measured throughout a benchmark run. All latencies are recorded in microseconds.

Option | Default | Description
--output-file | stdout | Specifies an output file to write periodic latency data, which enables the tracking of transaction latencies in microseconds in a histogram. Currently uses a default layout. The file is opened in append mode.
-L or --latency | disabled | Enables the periodic HDR histogram summary of latency data.
--percentiles <p1>[,<p2>[,<p3>...]] | "50,90,99,99.9,99.99" | Specifies the latency percentiles to display in the periodic latency histogram.
--output-period <seconds> | 1 | Specifies the period between successive snapshots of the periodic latency histogram.
--hdr-hist <path/to/output> | disabled | Enables the cumulative HDR histogram and specifies the directory to dump the cumulative HDR histogram summary.
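
For example, to print selected percentiles every 5 seconds during a one-minute run and also dump the cumulative HDR histograms to a directory (the path and values are illustrative):

asbench --namespace test --workload RU,50 --duration 60 --latency --percentiles 50,90,99,99.9 --output-period 5 --hdr-hist ./hdr_output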

Periodic Latency Histogram

Periodic latency data is recorded in histograms with three ranges of fixed bucket sizes, which are currently not configurable. There is one histogram for reads, one for writes, and one for UDF calls. The three ranges are:

  • 100us to 4000us, bucket width 100us
  • 4000us to 64000us, bucket width 1000us
  • 64000us to 128000us, bucket width 4000us

Format of the histogram output file:

<hist_name> <UTC time>, <period time>, <num records>, <bucket 1 lower bound>:<num records>, ...

First the name of the histogram is printed (either read_hist, write_hist, or udf_hist). This is followed by the UTC time of the event being recorded (i.e. the end of the interval), then the length of the interval in seconds, and then the total number of transaction latencies recorded in the interval. After this, each bucket with at least one recorded latency is displayed in ascending order of lower bounds, written as the bucket's lower bound, a colon, and the number of transaction latencies falling within that bucket's range.

HDR Histogram

Transaction latencies can also be recorded in an HDR histogram. There is one HDR histogram for reads, one for writes, and one for UDF calls. The two ways to enable the HDR histograms are either to use --latency, which displays select percentiles from the HDR histograms every output-period seconds, or to use --hdr-hist, which writes the full HDR histograms to the given directory in both a human-readable text format (.txt) and a binary encoding of the HDR histogram (.hdrhist).

The percentiles printed when --latency is enabled can be configured with --percentiles followed by a comma-separated list of percentiles. This list must be in ascending order, and each percentile must be at least 0 and strictly less than 100.

UDFs

UDF calls are the "function" part of RUF (read/update/function) workloads. A key is chosen at random from the given range of keys, and an Aerospike apply call is made on that key with the given UDF function (--udf-function-name) from the given UDF package (--udf-package-name). Optionally, --udf-function-values may be supplied; it takes an object spec and generates the arguments passed to each call.

info

Note: the UDF function arguments follow the same rules as the object spec used for records, and they are only randomly regenerated for each call if --random is supplied as an argument.

Option | Default | Description
-upn or --udf-package-name <package_name> | - | The package name of the UDF to be called.
-ufn or --udf-function-name <function_name> | - | The name of the UDF function in the package to be called.
-ufv or --udf-function-values <fn_vals> | none | The arguments to be passed to the UDF when called, given as an object spec (see object spec above).
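
As an example, the following run performs roughly 50% reads, 40% writes, and 10% UDF calls; the package name, function name, and arguments are placeholders for a UDF module assumed to be already registered on the server:

asbench --namespace test --workload RUF,50,40 --udf-package-name test_udf --udf-function-name my_udf --udf-function-values "I4,S10" --random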