
Capacity Planning for Specific Data Types

Summary

| Type | In Memory | In Memory Indexing | On Disk | On Disk Metadata |
|---|---|---|---|---|
| Boolean | 0 | n/a | 1 | n/a |
| Integer/float | 0 | n/a | 0-255: 1, 256-64K: 2, 64K-4B: 4, 4B-2^64: 8 | n/a |
| String | string-len | n/a | string-len | n/a |
| GeoJSON | string-len + 12 | n/a | string-len + 12 | n/a |
| List | 10 + msgpack-array ¹ | element-count / 128 * 4 | msgpack-array | n/a |
| Map | msgpack-map | msgpack-ext + 1 | msgpack-map | 4 ² |
| HyperLogLog | 11 + hll | n/a | 11 + hll | n/a |

Note: All sizes are in bytes unless otherwise noted.
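
The integer row of the summary table can be read as a simple lookup on the value's magnitude. The following is a minimal sketch of that reading, not Aerospike's implementation: the function name is hypothetical, the range boundaries are assumed to fall on powers of two (2^8, 2^16, 2^32), and negative integers and floats are not covered because the table does not break them out.

```python
def integer_on_disk_bytes(value: int) -> int:
    """On-disk bytes for a non-negative integer, following the summary table.

    Assumption: the table's boundaries are 2^8, 2^16, and 2^32.
    """
    if value < 0 or value >= 2**64:
        raise ValueError("sketch only covers 0 <= value < 2^64")
    if value < 2**8:    # 0-255
        return 1
    if value < 2**16:   # 256-64K
        return 2
    if value < 2**32:   # 64K-4B
        return 4
    return 8            # 4B-2^64


for sample in (0, 255, 256, 70_000, 5_000_000_000):
    print(sample, integer_on_disk_bytes(sample))
```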

List

The list data type is serialized as a MessagePack array, with 1, 3 or 5 header bytes, and each element serialized as well.

Example

For a list of 3 integer elements [0, 1000, 255]:

1 byte header for 3 elements
+ 1 byte for integer 0
+ 3 bytes for integer 1000
+ 2 bytes for integer 255

1 + 1 + 3 + 2 = 7 bytes.

If this list is stored in memory, add 10 bytes for metadata, for a total of 17 bytes.
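
The arithmetic above can be reproduced with a short size calculator. This is a sketch under stated assumptions: it handles only lists of non-negative integers, it assumes each value is stored in the smallest MessagePack encoding that fits, and the function names are hypothetical rather than part of any Aerospike client.

```python
from typing import Sequence


def msgpack_uint_size(value: int) -> int:
    """Bytes MessagePack needs for a non-negative integer (smallest encoding)."""
    if value < 0 or value >= 2**64:
        raise ValueError("sketch only covers 0 <= value < 2^64")
    if value < 2**7:
        return 1  # positive fixint
    if value < 2**8:
        return 2  # uint 8: 1 type byte + 1 data byte
    if value < 2**16:
        return 3  # uint 16
    if value < 2**32:
        return 5  # uint 32
    return 9      # uint 64


def msgpack_array_header_size(element_count: int) -> int:
    """Header bytes of a MessagePack array: 1, 3, or 5."""
    if element_count < 16:
        return 1  # fixarray
    if element_count < 2**16:
        return 3  # array 16
    if element_count < 2**32:
        return 5  # array 32
    raise ValueError("too many elements for a MessagePack array")


def list_size(values: Sequence[int], in_memory: bool = False) -> int:
    """Serialized list size, plus 10 metadata bytes when stored in memory."""
    size = msgpack_array_header_size(len(values))
    size += sum(msgpack_uint_size(v) for v in values)
    return size + 10 if in_memory else size


print(list_size([0, 1000, 255]))                  # on disk
print(list_size([0, 1000, 255], in_memory=True))  # in memory
```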

Map

The map data type is serialized as a MessagePack map, with 1, 3 or 5 header bytes, and with map-key/map-value pairs serialized as well.

On Disk Metadata

When Aerospike maps are stored on disk, the associated metadata has a flat cost of 4 bytes, unless the map is unordered, in which case there is no metadata. There is no advantage to choosing an unordered map, and a key-ordered map performs better. See Development guidelines and tips.

Example

A K-ordered map with 3 elements {a: 1, bb: 2000, ccc: 300000}

1 byte header for 3 pairs
2 bytes for 'a' and 1 byte for 1
3 bytes for 'bb' and 3 bytes for 2000
4 bytes for 'ccc' and 5 bytes for 300000

1 + 3 + 6 + 9 = 19 bytes for the data itself + 4 bytes metadata = 23 bytes.
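
The same calculation can be sketched in code. The helper names below are hypothetical, and the sketch assumes string keys, non-negative integer values, and the smallest MessagePack encoding for each, matching the byte counts used in the example rather than any documented Aerospike API.

```python
from typing import Mapping


def msgpack_uint_size(value: int) -> int:
    """Bytes MessagePack needs for a non-negative integer (smallest encoding)."""
    if value < 0 or value >= 2**64:
        raise ValueError("sketch only covers 0 <= value < 2^64")
    if value < 2**7:
        return 1  # positive fixint
    if value < 2**8:
        return 2  # uint 8
    if value < 2**16:
        return 3  # uint 16
    if value < 2**32:
        return 5  # uint 32
    return 9      # uint 64


def msgpack_str_size(text: str) -> int:
    """Bytes MessagePack needs for a string: header plus UTF-8 payload."""
    n = len(text.encode("utf-8"))
    if n < 32:
        return 1 + n  # fixstr
    if n < 2**8:
        return 2 + n  # str 8
    if n < 2**16:
        return 3 + n  # str 16
    return 5 + n      # str 32


def msgpack_map_header_size(pair_count: int) -> int:
    """Header bytes of a MessagePack map: 1, 3, or 5."""
    if pair_count < 16:
        return 1  # fixmap
    if pair_count < 2**16:
        return 3  # map 16
    return 5      # map 32


def map_on_disk_size(pairs: Mapping[str, int], ordered: bool = True) -> int:
    """Serialized map size plus the flat 4-byte metadata for ordered maps."""
    data = msgpack_map_header_size(len(pairs))
    data += sum(msgpack_str_size(k) + msgpack_uint_size(v) for k, v in pairs.items())
    return data + (4 if ordered else 0)


example = {"a": 1, "bb": 2000, "ccc": 300000}
print(map_on_disk_size(example))                 # K-ordered
print(map_on_disk_size(example, ordered=False))  # unordered, no metadata
```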

In Memory Indexing

When Aerospike maps are stored in an in-memory namespace, the offset and value indexes take up additional memory.

msgpack-ext = header + offset-index + value-index
index = element-count * size/element
element-count = number of elements in the map

| Type | Indexes |
|---|---|
| unordered | None |
| key ordered | offset |
| key and value ordered | offset + value |

Index Size/Element

| var ³ | size/element |
|---|---|
| < 2^8 | 1 |
| < 2^16 | 2 |
| < 2^24 | 3 |
| >= 2^24 | 4 |
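
The index formulas and the size/element table can be combined into a rough estimator. This is a sketch only: the function names are hypothetical, the size of the msgpack-ext header is not stated in this section, so it is exposed as a `header_bytes` parameter defaulting to 0, and an unordered map is assumed to carry no index bytes at all.

```python
def index_entry_bytes(var: int) -> int:
    """Bytes per index element, from the size/element table."""
    if var < 2**8:
        return 1
    if var < 2**16:
        return 2
    if var < 2**24:
        return 3
    return 4


def map_index_bytes(element_count: int, msgpack_size: int,
                    order: str = "key", header_bytes: int = 0) -> int:
    """Estimate msgpack-ext = header + offset-index + value-index.

    order is "unordered", "key", or "key_value" (labels chosen for this sketch).
    header_bytes is an assumption; the header size is not given in the text.
    """
    if order not in ("unordered", "key", "key_value"):
        raise ValueError("unknown order: " + order)
    if order == "unordered":
        return 0  # no indexes
    # The offset index is sized by the map's msgpack size.
    total = header_bytes + element_count * index_entry_bytes(msgpack_size)
    if order == "key_value":
        # The value index is sized by the element count.
        total += element_count * index_entry_bytes(element_count)
    return total


print(map_index_bytes(3, 19, order="key_value"))
print(map_index_bytes(1000, 70_000, order="key"))
```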

HyperLogLog

The HyperLogLog data type has an array of 2^n_index_bits registers.

Each register contains 6 bits of HyperLogLog value and n_minhash_bits optional bits of MinHash value. Adding MinHash bits enables HyperMinHash functionality, a superset of HyperLogLog.

The storage size of the registers is rounded up to the nearest byte.

hll = 11 bytes + roundUpToByte(2^n_index_bits * (6 + n_minhash_bits))
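
The formula above translates directly into a small function. The function name is hypothetical, and the rounding is done with integer arithmetic as one way to express roundUpToByte.

```python
def hll_bytes(n_index_bits: int, n_minhash_bits: int = 0) -> int:
    """Storage size of a HyperLogLog bin: 11 bytes plus the register array."""
    register_count = 2 ** n_index_bits
    register_bits = register_count * (6 + n_minhash_bits)
    register_bytes = (register_bits + 7) // 8  # round up to a whole byte
    return 11 + register_bytes


print(hll_bytes(12))     # 12 index bits, no MinHash bits
print(hll_bytes(12, 4))  # 12 index bits, 4 MinHash bits
```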

Example

A HyperLogLog bin with n_index_bits = 12 (2^12 = 4096 registers) and no MinHash bits uses approximately the following storage, where dividing by 8 converts bits to bytes:

11 bytes + ((2^12 * 6) bits / 8) = 3083 bytes.

¹ [Msgpack specification](https://github.com/msgpack/msgpack/blob/master/spec.md)
² No metadata if map is unordered.
³ *var* is msgpack-size for offset-index and element-count for value-index.