
User-Defined Functions (UDF) Development Guide

An Aerospike User-Defined Function (UDF) is a piece of code, written by a developer in the Lua programming language (or in C called from Lua), that runs inside the Aerospike database server.

There are two types of UDFs in Aerospike: Record UDFs and Stream UDFs. A Record UDF operates on a single record. A Stream UDF operates on a stream of records, and it can comprise multiple stream operators to perform very complex queries.

The complexity of UDFs can range from a single function that is only a few lines long, to a multi-thousand line module that contains many internal functions and multiple external functions.

When contemplating the construction of a new UDF, it is important to consider the application data model and the interaction between record bins. The general pattern (or life-cycle) for UDF development is:

  • Design the application data model
  • Design the UDFs to perform desired functions on the data model
  • Create/Test the UDFs
  • Register UDFs with the Aerospike database
  • Iterate function test/development cycle
  • System Test
  • System Deployment

UDF design and development is usually an iterative activity, where the first version is simple and then, potentially, evolves over time to something complex.

Developing UDFs in Aerospike

Probably the best way to get familiar with the UDF mechanism is to create the "Hello World" Record UDF. Following the instructions below, you will see how to build, register and execute your "Hello World" UDF.

function hello_world(rec)
    return "Hello World!!"
end

This "Hello World" Record UDF is invoked when a specific record is referenced, and it simply returns the "Hello World!!" string to the caller.

In general, the Developing UDF Modules section covers the preparation of a UDF, and the Managing UDFs Guide covers the registration and execution of the UDF.

Developing Record-Based UDFs

A Record UDF is invoked once for each record in an Aerospike key-value (KV) operation. Typically, excluding batch operations, only a single record is the target of a KV operation. In general, a Record UDF can do the following:

  • Create/delete bins in the record.
  • Read any/all bins of the record.
  • Modify any/all bins of the record.
  • Delete/Create the specified record.
  • Access parameters constructed and passed in from the client.
  • Construct results as any server data type (e.g. string, number, list, map) to be returned over the wire to the client.
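As a minimal sketch of several of these capabilities, the hypothetical Record UDF below reads a bin, modifies it using parameters passed in from the client, and returns the new value. The function name `add_to_bin` and its parameters are assumptions made for illustration; they do not come from the guide's own examples.

```lua
-- Hypothetical Record UDF (name and parameters are assumptions for illustration).
-- Adds `amount` to the numeric bin `bin_name` and returns the new value.
function add_to_bin(rec, bin_name, amount)
    -- Leave missing records untouched; this sketch only updates existing ones.
    if not aerospike:exists(rec) then
        return nil
    end

    -- Read the bin; a bin that is not set reads as nil, so default to 0.
    local current = rec[bin_name] or 0

    -- Modify (or create) the bin, then persist the change.
    rec[bin_name] = current + amount
    aerospike:update(rec)

    -- The return value is sent back to the client.
    return rec[bin_name]
end
```

Returning `nil` for a missing record is a design choice of this sketch; a UDF could equally create the record instead.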

Example: Record Create or Update

In this simple example (Annotated Record UDF), we show how the UDF can create or update an Aerospike record.

In addition, UDF developers can use the following Aerospike commands (Lua: aerospike Module) to perform record operations:

status = aerospike:create( rec )
status = aerospike:exists( rec )
status = aerospike:update( rec )
status = aerospike:remove( rec )
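The commands above can be combined into a create-or-update pattern. The following is a minimal sketch, not the Annotated Record UDF itself; the function name `put_bin` and its parameters are hypothetical.

```lua
-- Hypothetical Record UDF: write one bin, creating the record if it is missing.
function put_bin(rec, bin_name, value)
    -- Set the bin on the in-memory record first.
    rec[bin_name] = value

    -- Persist with update() for an existing record, create() otherwise.
    local status
    if aerospike:exists(rec) then
        status = aerospike:update(rec)
    else
        status = aerospike:create(rec)
    end

    -- Hand the status code back to the caller unchanged.
    return status
end
```

The sketch passes the status through without interpreting it, so the client can decide how to react to a failed write.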

Example: Basic Statistics Management

In this moderately complex example (Statistics Record UDF), we show how a UDF can manage some numerical statistics (Max Value, Min Value, Ave Value, Count) of values that are kept in various KV Record bins.

In addition, in this example we make use of the Lua: logging Module functions (which are explained in more detail in the UDF Best Practices Guide). The logging functions write messages to the log or the console:

info(message)
debug(message)
trace(message)
warn(message)

These log functions will generate output in the log or console. The following Lua line:

trace("[ENTER]<%s>  Value(%s) valType(%s)",  meth, tostring(newValue), type(newValue));

will generate output as follows:

Sep 21 2013 22:25:22 GMT: DETAIL (ldt): (/home/aerospike/src/lua/udf_samples.lua:160) [ENTER]<unique_set_write()> Value(Map("B"->71, "A"->70)) valType(userdata)

The log functions are controlled by the log/console stanzas in the Aerospike server config file.
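To make the statistics idea concrete, here is a minimal sketch of a Record UDF that maintains count, minimum, maximum and average in record bins and logs each update. It is not the Statistics Record UDF referenced above; the function name `update_stats` and the bin names `count`, `sum`, `min`, `max` and `ave` are assumptions chosen for illustration.

```lua
-- Hypothetical Record UDF: fold `value` into running statistics kept in the
-- bins 'count', 'sum', 'min', 'max' and 'ave' (bin names are assumptions).
function update_stats(rec, value)
    -- Missing bins read as nil, so a new record starts from zero.
    local count = (rec['count'] or 0) + 1
    local sum = (rec['sum'] or 0) + value
    local min = rec['min']
    local max = rec['max']
    if min == nil or value < min then min = value end
    if max == nil or value > max then max = value end

    rec['count'] = count
    rec['sum'] = sum
    rec['min'] = min
    rec['max'] = max
    rec['ave'] = sum / count

    -- Log the update using the logging module's info() function.
    info("[update_stats] value(%s) count(%s) ave(%s)",
         tostring(value), tostring(count), tostring(rec['ave']))

    -- Persist the bins: update an existing record, otherwise create it.
    if aerospike:exists(rec) then
        return aerospike:update(rec)
    end
    return aerospike:create(rec)
end
```

Keeping the running sum in its own bin lets the average be recomputed exactly on each call instead of being derived from the previous average.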

Aerospike Client Record UDF Invocation

The following language-specific links provide information on invoking Record UDFs from the respective Aerospike client.

Record UDF: More Detail

The Record UDF Development section provides more details on Record UDF development, and the Record UDF Examples section shows several Record UDFs that we've developed for both testing and documentation purposes.

Stream UDF

A Stream UDF is invoked on a stream of records rather than a single record.

Using secondary indexes on bins in a record, a subset of records matching the query criteria can be streamed out. A Stream UDF can be used to extract bin values from these records, count the records, or compute similar statistics over the extracted stream.

The process can be summarized as:

  • Create a Secondary Index
  • Run a Query on a Secondary Index
  • Apply a Stream UDF to the results of the secondary index query.
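The last step can be sketched with a small Stream UDF that counts the records produced by the query. Creating the index and running the query happen outside the UDF and are not shown. The function names here are illustrative; `map` and `reduce` are stream operators.

```lua
-- Map each record in the stream to the number 1.
local function one(rec)
    return 1
end

-- Add two partial counts together.
local function add(a, b)
    return a + b
end

-- Hypothetical Stream UDF: count the records produced by the query.
function count(stream)
    return stream : map(one) : reduce(add)
end
```

The UDF returns the transformed stream rather than a value; the final reduced result is what reaches the client.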

The following language-specific links provide information on using Stream UDFs from the respective Aerospike client.

Example: Simple Statistics

In this example, Stream UDF - Simple Statistics, we calculate simple statistics on a data set.
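A minimal sketch of such a computation is shown below; it is not the referenced example itself. It assumes the values live in a numeric bin whose name is passed in as the hypothetical parameter `bin_name`, and it uses the `aggregate` and `reduce` stream operators together with the Aerospike `map` type.

```lua
-- Hypothetical Stream UDF: count, sum, min and max of the numeric bin `bin_name`.
function simple_stats(stream, bin_name)
    -- Fold one record into the running statistics map.
    local function accumulate(stats, rec)
        local v = rec[bin_name]
        if v ~= nil then
            stats['count'] = stats['count'] + 1
            stats['sum'] = stats['sum'] + v
            if stats['min'] == nil or v < stats['min'] then stats['min'] = v end
            if stats['max'] == nil or v > stats['max'] then stats['max'] = v end
        end
        return stats
    end

    -- Combine two partial results, for example from different nodes.
    local function merge(a, b)
        a['count'] = a['count'] + b['count']
        a['sum'] = a['sum'] + b['sum']
        if b['min'] ~= nil and (a['min'] == nil or b['min'] < a['min']) then
            a['min'] = b['min']
        end
        if b['max'] ~= nil and (a['max'] == nil or b['max'] > a['max']) then
            a['max'] = b['max']
        end
        return a
    end

    return stream : aggregate(map{count = 0, sum = 0}, accumulate) : reduce(merge)
end
```

The average is left to the caller as `sum / count`, because averages of partial results cannot be merged directly while sums and counts can.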

Example: Word Count

In this example, Stream UDF - Word Count, we count the number of times each word appears in a book.
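The idea can be sketched as follows, under the assumption that each record holds a piece of the book's text in a string bin whose name is passed as the hypothetical parameter `bin_name`. This is an illustrative sketch rather than the referenced example; it uses the `aggregate` and `reduce` stream operators and `map.pairs` to iterate an Aerospike map.

```lua
-- Hypothetical Stream UDF: count word occurrences in the text bin `bin_name`.
function word_count(stream, bin_name)
    -- Add the words of one record to the running counts.
    local function count_words(counts, rec)
        local text = rec[bin_name]
        if type(text) == 'string' then
            -- "%a+" matches runs of ASCII letters, so "don't" splits in two.
            for word in string.gmatch(string.lower(text), "%a+") do
                counts[word] = (counts[word] or 0) + 1
            end
        end
        return counts
    end

    -- Merge two partial word-count maps by summing per-word counts.
    local function merge_counts(a, b)
        for word, n in map.pairs(b) do
            a[word] = (a[word] or 0) + n
        end
        return a
    end

    return stream : aggregate(map(), count_words) : reduce(merge_counts)
end
```

Lowercasing before matching makes the count case-insensitive; a real word counter may need a different tokenization rule for punctuation and non-ASCII text.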

Stream UDF: More Detail

The Stream UDF Development section provides more details on Stream UDF development, and the Stream UDF Examples section shows several Stream UDFs that we've developed for both testing and documentation purposes.

Functional Benefits of UDFs

When contemplating the use of a UDF, make sure to first review the extensive atomic operations for the List and Map data types. Multiple operations can be chained to execute in order on a single record, using the operate() command of the Aerospike client.

Such single-record transactions are faster and scale better than UDFs. However, there are several situations where a UDF may be advantageous compared to implementing similar functionality on the application side:

  • Extending the functionality of the collection data types with new atomic operations, or implementing new collection types (see Record UDF examples).
  • A background UDF, a record UDF applied to multiple records via scan or query, can be used to carry out maintenance.
  • Implementing aggregate functions using a stream UDF.

More Information