Capacity Planning for Specific Data Types
Summary
Type | In Memory | In Memory Indexing | On Disk | On Disk Metadata |
---|---|---|---|---|
Boolean | 0 | n/a | 1 | n/a |
Integer/float | 0 | n/a | 0-255: 1, 256-64K: 2, 64K-4B: 4, 4B-2^64: 8 | n/a |
String | string-len | n/a | string-len | n/a |
GeoJSON | string-len + 12 | n/a | string-len + 12 | n/a |
List | 10 + msgpack-array [1] | ⌊element-count / 128⌋ * 4 | msgpack-array | n/a |
Map | msgpack-map | msgpack-ext + 1 | msgpack-map | 4 [2] |
HyperLogLog | 11 + hll | n/a | 11 + hll | n/a |
Note: All sizes are in bytes unless otherwise noted.
List
The list data type is serialized as a MessagePack array, with 1, 3 or 5 header bytes, and each element serialized as well.
Example
For a list of 3 integer elements [0, 1000, 255]:
1 byte header for 3 elements
+1 byte for integer 0
+3 bytes for integer 1000
+2 bytes for integer 255
1 + 1 + 3 + 2 = 7 bytes.
If this list is stored in-memory, we need to add 10 bytes for metadata.
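The arithmetic above can be reproduced with a short Python sketch. It assumes that every element is an integer packed in the smallest MessagePack encoding, which matches the byte counts in the example; the helper names (`int_size`, `array_header_size`, `list_size`) are illustrative and not part of any Aerospike API.

```python
def int_size(value: int) -> int:
    """Bytes used by the smallest MessagePack encoding of an integer."""
    if -32 <= value <= 127:
        return 1  # fixint: the value lives in the type byte itself
    if value >= 0:
        limits = (0xFF, 0xFFFF, 0xFFFFFFFF)        # uint8, uint16, uint32
    else:
        limits = (0x80, 0x8000, 0x80000000)        # int8, int16, int32
        value = -value
    for limit, size in zip(limits, (2, 3, 5)):     # type byte + payload
        if value <= limit:
            return size
    return 9                                       # 64-bit encodings


def array_header_size(count: int) -> int:
    """MessagePack array header: 1, 3 or 5 bytes depending on element count."""
    if count <= 15:
        return 1
    if count <= 0xFFFF:
        return 3
    return 5


def list_size(values, in_memory: bool = False) -> int:
    """Serialized size of an integer list, plus 10 bytes of metadata in memory."""
    size = array_header_size(len(values)) + sum(int_size(v) for v in values)
    return size + (10 if in_memory else 0)


print(list_size([0, 1000, 255]))                   # 7
print(list_size([0, 1000, 255], in_memory=True))   # 17
```

Lists that hold strings or nested collections would need the corresponding MessagePack sizes for those element types, which this sketch does not cover.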
Map
The map data type is serialized as a MessagePack map, with 1, 3 or 5 header bytes, and with map-key/map-value pairs serialized as well.
On Disk Metadata
When Aerospike maps are stored on disk, the associated metadata costs a flat 4 bytes, unless the map is unordered, in which case there is no metadata. Despite this saving, there is no practical advantage to using an unordered map, and key-ordered maps perform better. See Development guidelines and tips.
Example
A K-ordered map with 3 elements {a: 1, bb: 2000, ccc: 300000}:
1 byte header for 3 pairs
2 bytes for 'a' and 1 byte for 1
3 bytes for 'bb' and 3 bytes for 2000
4 bytes for 'ccc' and 5 bytes for 300000
1 + 3 + 6 + 9 = 19 bytes for the data itself + 4 bytes metadata = 23 bytes.
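A minimal Python sketch of the same calculation follows. It assumes string keys and integer values, each packed in the smallest MessagePack encoding, with strings counted as in the example above (header plus UTF-8 length). The function names are hypothetical helpers, not Aerospike calls.

```python
def int_size(value: int) -> int:
    """Bytes used by the smallest MessagePack encoding of an integer."""
    if -32 <= value <= 127:
        return 1
    if value >= 0:
        limits = (0xFF, 0xFFFF, 0xFFFFFFFF)
    else:
        limits = (0x80, 0x8000, 0x80000000)
        value = -value
    for limit, size in zip(limits, (2, 3, 5)):
        if value <= limit:
            return size
    return 9


def str_size(text: str) -> int:
    """MessagePack string: 1, 2, 3 or 5 header bytes plus the UTF-8 payload."""
    n = len(text.encode("utf-8"))
    if n <= 31:
        header = 1
    elif n <= 0xFF:
        header = 2
    elif n <= 0xFFFF:
        header = 3
    else:
        header = 5
    return header + n


def map_header_size(count: int) -> int:
    """MessagePack map header: 1, 3 or 5 bytes depending on pair count."""
    if count <= 15:
        return 1
    if count <= 0xFFFF:
        return 3
    return 5


def map_data_size(mapping: dict) -> int:
    """Serialized size of a {str: int} map, without metadata."""
    pairs = sum(str_size(k) + int_size(v) for k, v in mapping.items())
    return map_header_size(len(mapping)) + pairs


def map_disk_size(mapping: dict, ordered: bool = True) -> int:
    """On-disk size: data plus the flat 4-byte metadata for ordered maps."""
    return map_data_size(mapping) + (4 if ordered else 0)


example = {"a": 1, "bb": 2000, "ccc": 300000}
print(map_data_size(example))   # 19
print(map_disk_size(example))   # 23
```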
In Memory Indexing
When Aerospike maps are stored in an in-memory namespace, additional memory is taken up by the map's indexes (offset and value).
msgpack-ext = header + offset-index + value-index
index = element-count * size/element
element-count = number of elements in the map
Type | Indexes |
---|---|
unordered | None |
key ordered | offset |
key and value ordered | offset + value |
Index Size/Element
var [3] | size/element |
---|---|
< 2^8 | 1 |
< 2^16 | 2 |
< 2^24 | 3 |
>= 2^24 | 4 |
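The two tables above combine as in the following Python sketch. It computes only the offset-index and value-index portions of msgpack-ext. The header size is not given in this section, so it is left out rather than guessed. The function names and the order labels (`"unordered"`, `"key_ordered"`, `"key_value_ordered"`) are illustrative assumptions.

```python
def bytes_per_element(var: int) -> int:
    """Index entry width chosen from the size/element table."""
    if var < 2**8:
        return 1
    if var < 2**16:
        return 2
    if var < 2**24:
        return 3
    return 4


def map_index_size(order: str, element_count: int, msgpack_size: int) -> int:
    """Bytes used by the indexes of an in-memory map (header excluded).

    var is msgpack_size for the offset index and element_count for the
    value index, as described in the footnote to the table.
    """
    offset_index = element_count * bytes_per_element(msgpack_size)
    value_index = element_count * bytes_per_element(element_count)
    if order == "unordered":
        return 0
    if order == "key_ordered":
        return offset_index
    if order == "key_value_ordered":
        return offset_index + value_index
    raise ValueError(f"unknown map order: {order}")


# The 3-element, 19-byte map from the earlier example:
print(map_index_size("key_ordered", 3, 19))            # 3
print(map_index_size("key_value_ordered", 3, 19))      # 6
# A hypothetical 1000-element map that serializes to 20000 bytes:
print(map_index_size("key_value_ordered", 1000, 20000))  # 4000
```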
HyperLogLog
The HyperLogLog data type has an array of 2^n_index_bits registers.
Each register contains 6 bits of HyperLogLog value and n_minhash_bits optional bits of MinHash value. Adding MinHash bits enables HyperMinHash functionality, a superset of HyperLogLog.
The storage size of the registers is rounded up to the nearest byte.
hll = 11 bytes + roundUpToByte(2^n_index_bits * (6 + n_minhash_bits))
Example
A HyperLogLog bin with n_index_bits = 12 (2^12 = 4096 registers) and no MinHash bits uses approximately the following memory, where dividing by 8 converts bits to bytes:
11 bytes + ((2^12 * 6) bits / 8) = 3083 bytes.
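The formula can be checked with a few lines of Python. This is a sketch of the arithmetic only: `hll_size` is a hypothetical helper, and the valid ranges for `n_index_bits` and `n_minhash_bits` are not stated in this section, so the function does not validate them.

```python
def hll_size(n_index_bits: int, n_minhash_bits: int = 0) -> int:
    """11 bytes plus the register array, rounded up to a whole byte."""
    register_count = 1 << n_index_bits                  # 2^n_index_bits
    register_bits = register_count * (6 + n_minhash_bits)
    register_bytes = (register_bits + 7) // 8           # round up to a byte
    return 11 + register_bytes


print(hll_size(12))  # 3083
```

Passing a non-zero `n_minhash_bits` widens every register, so the size grows in proportion to `6 + n_minhash_bits`.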
[2] No metadata if map is unordered.
[3] *var* is msgpack-size for offset-index and element-count for value-index.