Managing Batch Operations
Batch-Index protocol
A batch is a series of requests that are sent together to the database server. A batch groups multiple operations into one unit and passes it in a single network trip to each database node. Refer to Batch Operations.
Tuning batches
Background
An incoming batch request from a client is assigned to a specific batch response thread if the status is not "full". A thread is declared "full" when batch-max-buffers-per-queue (255 default) are in use in a queue.
Each batch response thread has a queue of 128 KiB buffers. Batch requests already assigned to a response thread can allocate buffers beyond batch-max-buffers-per-queue as needed. The "full" designation only prevents new incoming batch requests from being assigned to that queue or response thread. If all response threads are at "full" status, new incoming batch requests are not accepted and an error is returned to the client.
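The admission rule described above can be sketched as a small model. This is only an illustration, not the server's implementation: the class `ResponseQueue`, the function `assign_request`, and the first-non-full selection policy are assumptions made for the example, since the text does not say how the server chooses among non-full threads.

```python
from dataclasses import dataclass

# Error code cited in the configuration table for "all batch queues are full".
AS_ERR_BATCH_QUEUES_FULL = 152


@dataclass
class ResponseQueue:
    """Hypothetical model of one batch response thread's queue."""
    buffers_in_use: int = 0

    def is_full(self, max_buffers_per_queue: int) -> bool:
        # A queue is "full" once batch-max-buffers-per-queue buffers are in use.
        return self.buffers_in_use >= max_buffers_per_queue


def assign_request(queues, max_buffers_per_queue=255):
    """Return the index of a queue that can take a new batch request.

    Assumption: the first non-full queue is picked. Raises when every
    queue is full, mirroring the rejection described in the text.
    """
    for index, queue in enumerate(queues):
        if not queue.is_full(max_buffers_per_queue):
            return index
    raise RuntimeError(
        f"error {AS_ERR_BATCH_QUEUES_FULL}: all batch queues are full"
    )


queues = [ResponseQueue(255), ResponseQueue(10)]
print(assign_request(queues))  # queue 0 is full, so queue 1 is chosen
```

Note that the model only gates new requests; a request that has already been accepted may still push `buffers_in_use` past the limit.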
When response threads need a new buffer, they take a buffer from a single pool of unused buffers. This pool is empty at startup and serves all the response threads. Response threads allocate the buffers, and then return them to the pool of unused buffers when done.
A destination client is associated with each 128 KiB buffer, allowing the same response thread to serve multiple batch requests. Excess buffers are destroyed if the number of buffers in the unused pool exceeds batch-max-unused-buffers (default 256). Records that require more than 128 KiB are allocated a "huge" buffer, which is destroyed after use and is not saved in the unused buffer pool.
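The buffer lifecycle described above can be modeled with a few counters. The sketch below is a simplified, hypothetical model (the class `UnusedBufferPool` and its method names are invented for illustration). It assumes a buffer is returned to the pool whenever the pool holds fewer than batch-max-unused-buffers entries, and that huge buffers are not counted as destroyed regular buffers.

```python
BUFFER_SIZE = 128 * 1024  # 128 KiB response buffer


class UnusedBufferPool:
    """Hypothetical model of the shared pool of unused response buffers."""

    def __init__(self, max_unused=256):
        self.max_unused = max_unused  # models batch-max-unused-buffers
        self.unused = 0               # the pool is empty at startup
        self.created = 0
        self.destroyed = 0
        self.huge = 0

    def acquire(self, record_size):
        """Take a buffer for a record of record_size bytes."""
        if record_size > BUFFER_SIZE:
            self.huge += 1            # temporary buffer, never pooled
            return "huge"
        if self.unused > 0:
            self.unused -= 1          # reuse a pooled buffer
        else:
            self.created += 1         # pool is empty: create a new buffer
        return "normal"

    def release(self, kind):
        """Return a buffer once the response has been sent."""
        if kind == "huge":
            return                    # destroyed after use
        if self.unused < self.max_unused:
            self.unused += 1          # a slot is free: keep for reuse
        else:
            self.destroyed += 1       # pool is at its limit


pool = UnusedBufferPool(max_unused=2)
held = [pool.acquire(1024) for _ in range(3)]
for kind in held:
    pool.release(kind)
print(pool.created, pool.unused, pool.destroyed)  # 3 2 1
```

In this model, a steadily rising `created` count alongside a rising `destroyed` count indicates that the pool limit is too small for the workload.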
The server provides the following configuration variables for batch performance tuning. Refer to Configuration Reference.
Name | Default | Max | Dynamic | Description |
---|---|---|---|---|
batch-max-requests | 5000 | | true | Maximum number of keys in a sub-batch sent to a cluster node. It prevents unexpectedly large batch requests from causing server instability due to excessive memory consumption. If the sub-batch size is exceeded, the server returns error code 151 (AS_ERR_BATCH_MAX_REQUESTS). |
batch-max-buffers-per-queue | 255 | | true | Maximum number of 128 KiB response buffers allowed in each batch queue before it is designated "full". Additional buffers beyond batch-max-buffers-per-queue can be allocated for accepted batch requests, if needed. If all batch queues are full, new batch requests are rejected with error code 152 (AS_ERR_BATCH_QUEUES_FULL). |
batch-max-unused-buffers | 256 | | true | Maximum number of 128 KiB response buffers allowed in the unused buffer pool for reuse by any response thread. If the limit is reached, new buffers created by response threads at runtime are destroyed upon completion of the batch request. This limits the size of the unused buffer pool that serves all response threads. |
batch-index-threads | #cpu | 256 | true | Number of batch index response worker threads. Each thread has its own queue. These threads only handle returning batch response buffers to the client using sockets. The maximum memory consumption can be computed as: batch-index-threads × batch-max-buffers-per-queue × 128 KiB. Setting batch-index-threads to 0 disables batch functionality, rejecting batch commands with error code 150 (AS_ERR_BATCH_DISABLED). |
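
As a worked example of the memory formula in the batch-index-threads row, the following sketch evaluates it for a node with 8 batch index threads and the default batch-max-buffers-per-queue. The thread count is an illustrative assumption and the function name is hypothetical. Because accepted requests can allocate buffers beyond the per-queue limit, the result is best read as the nominal figure given by the formula.

```python
BUFFER_SIZE_BYTES = 128 * 1024  # one 128 KiB response buffer


def max_batch_buffer_bytes(batch_index_threads, batch_max_buffers_per_queue=255):
    """batch-index-threads x batch-max-buffers-per-queue x 128 KiB, in bytes."""
    return batch_index_threads * batch_max_buffers_per_queue * BUFFER_SIZE_BYTES


total = max_batch_buffer_bytes(8)
print(total / 2**20)  # 255.0 (MiB)
```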
The tools package 6.0.x or later is required to use asadm manage config commands. Otherwise, use the equivalent asinfo set-config command.
Example: Increasing the number of idle batch response buffers:
asadm -e "enable; manage config service param batch-max-unused-buffers to 512"
Statistics
The server provides the following batch statistics variables. Refer to Metrics Reference.
Name | Description |
---|---|
batch_index_initiate | Number of batch requests received. |
batch_index_queue | Number of batch requests and response buffers remaining on each batch queue. Format: <q1 requests>:<q1 buffers>,<q2 requests>:<q2 buffers>,... |
batch_index_complete | Number of completed batch requests. |
batch_index_timeout | Number of timed-out batch requests. |
batch_index_error | Number of batch requests rejected because of errors. |
batch_index_unused_buffers | Number of available 128 KiB response buffers in the buffer pool. |
batch_index_huge_buffers | Number of temporary response buffers created that exceeded 128 KiB. A huge buffer is created when a retrieved record is larger than 128 KiB. Huge records do not benefit from batching and can result in excessive memory thrashing on the server. |
batch_index_created_buffers | Number of 128 KiB response buffers created. Response buffers are created when there are no buffers left in the pool. If this number consistently increases and there is available memory, then batch-max-unused-buffers should be increased. |
batch_index_destroyed_buffers | Number of 128 KiB response buffers destroyed. Response buffers are destroyed when there is no slot left to put the buffer back into the pool. The maximum response buffer pool size is batch-max-unused-buffers. |
batch-index | Batch performance histogram. |
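
The batch_index_queue value packs one requests:buffers pair per queue, which is easier to reason about once parsed. A minimal sketch follows, assuming the format shown in the table; the function name and the sample value are made up for illustration and are not real server output.

```python
def parse_batch_index_queue(value):
    """Parse '<q1 requests>:<q1 buffers>,<q2 requests>:<q2 buffers>,...'.

    Returns a list of (requests, buffers) tuples, one per batch queue.
    """
    queues = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        requests, buffers = entry.split(":")
        queues.append((int(requests), int(buffers)))
    return queues


sample = "0:0,2:17,0:0,1:255"  # illustrative value for four queues
queues = parse_batch_index_queue(sample)

# Queues whose buffer count has reached the default batch-max-buffers-per-queue.
full = [i for i, (_, buffers) in enumerate(queues) if buffers >= 255]
print(full)  # [3]
```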
Batch sub transactions
The server also provides the following batch sub transaction statistics for reads, writes, deletes, UDFs, and language (Lua) operations. Refer to Metrics Reference.
Name | Description |
---|---|
batch_sub_delete_success | Number of batch delete sub transactions that were completed. |
batch_sub_delete_error | Number of batch delete sub transactions that failed with an error. |
batch_sub_delete_timeout | Number of batch delete sub transactions that timed out. |
batch_sub_delete_not_found | Number of batch delete sub transactions that were not found. |
batch_sub_delete_filtered_out | Number of batch delete sub transactions that were filtered out. Transactions filtered out at the bin level by a filter expression. |
batch_sub_lang_read_success | Number of batch sub language read transactions that were completed. |
batch_sub_lang_write_success | Number of batch sub language write transactions that were completed. |
batch_sub_lang_delete_success | Number of batch sub language delete transactions that were completed. |
batch_sub_lang_error | Number of batch language sub transactions that failed with an error. |
batch_sub_udf_complete | Number of batch udf sub transactions that were completed. |
batch_sub_udf_error | Number of batch udf sub transactions that failed with an error. |
batch_sub_udf_timeout | Number of batch udf sub transactions that timed out. |
batch_sub_udf_filtered_out | Number of batch udf sub transactions that were filtered out. Transactions filtered out at the bin level by a filter expression. |
batch_sub_write_success | Number of batch write sub transactions that were completed. |
batch_sub_write_error | Number of batch write sub transactions that failed with an error. |
batch_sub_write_timeout | Number of batch write sub transactions that timed out. |
batch_sub_write_filtered_out | Number of batch write sub transactions that were filtered out. Transactions filtered out at the bin level by a predicate expression. |
batch_sub_proxy_complete | Number of proxied batch sub transactions that completed. |
batch_sub_proxy_error | Number of proxied batch sub transactions that failed with an error. |
batch_sub_proxy_timeout | Number of proxied batch sub transactions that timed out. |
batch_sub_read_error | Number of batch read sub transactions that failed with an error. |
batch_sub_read_not_found | Number of batch read sub transactions that resulted in not found. |
batch_sub_read_success | Number of successful batch read sub transactions. |
batch_sub_read_timeout | Number of batch read sub transactions that timed out. |
batch_sub_tsvc_error | Number of batch read sub transactions that failed with an error in the transaction service, before attempting to handle the transaction. For example, protocol errors or security permission mismatch. |
batch_sub_tsvc_timeout | Number of batch read sub transactions that timed out in the transaction service, before attempting to handle the transaction. For example, protocol errors or security permission mismatch. |
retransmit_all_batch_sub_dup_res | Number of retransmits that occurred during batch sub transactions that were being duplicate resolved. Note this includes retransmits originating on the client as well as proxying nodes. |
retransmit_batch_sub_dup_res | Number of retransmits that occurred during batch sub transactions that were being duplicate resolved. Replaced with retransmit_all_batch_sub_dup_res as of version 4.5.1.5. |
early_tsvc_batch_sub_error | Number of errors early in the transaction for batch sub transactions. For example, bad/unknown namespace name or security authentication errors. |
from_proxy_batch_sub_delete_success | Number of records successfully deleted by batch sub delete transaction proxied from another node. |
from_proxy_batch_sub_delete_error | Number of records that were not deleted and failed with an error by batch sub delete transaction. |
from_proxy_batch_sub_delete_timeout | Number of records that were not deleted due to a timeout by batch sub delete transaction. |
from_proxy_batch_sub_delete_not_found | Number of records that were not deleted because they were not found by batch sub transaction. |
from_proxy_batch_sub_delete_filtered_out | Number of records that were not deleted because they were filtered out by batch sub transaction. Transactions filtered out at the bin level by a filter expression. |
from_proxy_sub_lang_read_success | Number of records that were completed by batch sub language read transactions. |
from_proxy_sub_lang_write_success | Number of records that were completed by batch sub language write transactions. |
from_proxy_sub_lang_delete_success | Number of records that succeeded by batch sub language delete transactions proxied from another node. |
from_proxy_sub_lang_error | Number of records that failed with an error by batch sub language transaction. |
from_proxy_sub_udf_complete | Number of records that were completed by batch sub udf transaction. |
from_proxy_sub_udf_error | Number of records that failed with an error by batch sub udf transaction. |
from_proxy_sub_udf_timeout | Number of records of batch udf sub transactions proxied from another node that timed out, before attempting to handle this transaction. |
from_proxy_sub_udf_filtered_out | Number of records of batch udf sub transactions proxied from another node that did not happen because the record was filtered out via a filter expression. |
from_proxy_sub_write_success | Number of records successfully written by batch sub transactions proxied from another node. |
from_proxy_sub_write_error | Number of batch write sub transactions proxied from another node that failed with an error. |
from_proxy_sub_write_timeout | Number of batch write sub transactions proxied from another node that timed out. |
from_proxy_sub_write_filtered_out | Number of batch write sub transactions proxied from another node that did not happen because the record was filtered out via a predicate expression. |
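
One way to use these counters is to turn them into outcome fractions. The sketch below does this for the four batch_sub_read_* counters listed above. It assumes the statistics have already been collected into a dictionary of name to value; the function name and the sample numbers are hypothetical. If the counters are cumulative, take the difference between two samples before computing rates for a time window.

```python
def batch_sub_read_summary(stats):
    """Return the fraction of batch read sub transactions per outcome.

    Only the batch_sub_read_* counters from the table above are used.
    """
    outcomes = ["success", "error", "timeout", "not_found"]
    counts = {o: int(stats.get(f"batch_sub_read_{o}", 0)) for o in outcomes}
    total = sum(counts.values())
    if total == 0:
        return {o: 0.0 for o in outcomes}  # no traffic yet
    return {o: counts[o] / total for o in outcomes}


# Illustrative numbers, not real server output.
sample = {
    "batch_sub_read_success": 980,
    "batch_sub_read_error": 5,
    "batch_sub_read_timeout": 5,
    "batch_sub_read_not_found": 10,
}
summary = batch_sub_read_summary(sample)
print(summary["success"])  # 0.98
```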
Batch log file histograms
Periodically, the Aerospike server writes histograms to the log file. Refer to Latency Monitoring for latency histograms of batch and its sub transactions. In addition, review the Batch Transaction Analysis.
Name | Description |
---|---|
batch_sub_write_master | Time taken for writing all the copies of a record to the master. This only applies to namespaces with strong consistency enabled. |
batch_sub_udf_master | Time taken from partition reservation, or from the end of duplicate resolution, until the master record is applied. |
batch_sub_repl_write | Time taken from when the master record is written until the replica(s) are written. |