CDT Indexing in Aerospike

For an interactive Jupyter notebook experience:

This tutorial describes the secondary index and query functionality pertaining to CDTs available in Aerospike Database 6.1+.

This notebook requires the Aerospike Database running locally with Java kernel and Aerospike Java Client. To create a Docker container that satisfies the requirements and holds a copy of Aerospike notebooks, visit the Aerospike Notebooks Repo.

Introduction

In this notebook, we will illustrate the CDT indexing and query capabilities in Aerospike Database 6.1+. The detailed description is available in the accompanying blog post Query JSON Documents Faster (and More) with CDT Indexing.

The specific topics covered in this notebook include:

New CDT indexing functionality
Code examples
- Non-collection elements
- Collection elements
- JSON documents

Prerequisites

This tutorial assumes familiarity with the following topics:

Setup

Ensure database is running

This notebook requires that Aerospike database is running.

import io.github.spencerpark.ijava.IJava;
import io.github.spencerpark.jupyter.kernel.magic.common.Shell;
IJava.getKernelInstance().getMagics().registerMagics(Shell.class);
%sh asd

Download and install additional components.

Install the Java client version 6.1.0 or above that supports the new CDT indexing capabilities.

%%loadFromPOM
<dependencies>
  <dependency>
    <groupId>com.aerospike</groupId>
    <artifactId>aerospike-client</artifactId>
    <version>6.1.0</version>
  </dependency>
  <dependency>
  <groupId>com.aerospike</groupId>
  <artifactId>aerospike-document-api</artifactId>
  <version>1.1.3</version>
 </dependency>
 <dependency>     
    <groupId>org.apache.commons</groupId>
     <artifactId>commons-io</artifactId>
     <version>1.3.2</version>
 </dependency>
</dependencies>

Initialize Client

Initialize the client.

import com.aerospike.client.AerospikeClient;

AerospikeClient client = new AerospikeClient("localhost", 3000);
System.out.println("Initialized the client and connected to the cluster.");;

Output:

Initialized the client and connected to the cluster.

Import Modules

// import all needed modules
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import com.aerospike.client.AerospikeException;
import com.aerospike.client.Key;
import com.aerospike.client.Bin;
import com.aerospike.client.policy.WritePolicy;
import com.aerospike.client.policy.ClientPolicy;
import com.aerospike.client.Record;
import com.aerospike.client.Operation;
import com.aerospike.client.Value;
import com.aerospike.client.Value.ListValue;
import com.aerospike.client.Value.MapValue;
import com.aerospike.client.cdt.ListOperation;
import com.aerospike.client.cdt.ListPolicy;
import com.aerospike.client.cdt.ListOrder;
import com.aerospike.client.cdt.ListWriteFlags;
import com.aerospike.client.cdt.MapOperation;
import com.aerospike.client.cdt.MapPolicy;
import com.aerospike.client.cdt.MapOrder;
import com.aerospike.client.cdt.MapWriteFlags;
import com.aerospike.client.query.Statement;
import com.aerospike.client.query.Filter;
import com.aerospike.client.query.RecordSet;
import com.aerospike.client.Record;
import com.aerospike.client.policy.QueryPolicy;
import com.aerospike.client.exp.Exp;
import com.aerospike.client.query.IndexType;
import com.aerospike.client.query.IndexCollectionType;
import com.aerospike.client.task.IndexTask;
import com.aerospike.client.AerospikeException;
import com.aerospike.client.ResultCode;
import com.aerospike.client.cdt.CTX;

Define Constants and Helper Functions

Define constants for the namespaces test, set cdt-indexing, and helper functions createIndex, dropIndex, executeQueryAndPrintResults, and truncateTestData.

String Namespace = "test";
String Set = "cdt-indexing";

// convenience function to create an index - essentially a pass-through to the client API
void createIndex(String idxName, String binName, IndexType idxType, 
                 IndexCollectionType colType, CTX... ctx) {
    try {
            IndexTask task = client.createIndex(null,
                                       Namespace,
                                       Set,
                                       idxName,
                                       binName,
                                       idxType,
                                       colType,
                                       ctx);
            task.waitTillComplete(1000, 0);
        }
        catch (AerospikeException ae) {
            if (ae.getResultCode() != ResultCode.INDEX_ALREADY_EXISTS) {
                throw ae;
            }
        } 
        System.out.format("Created index %s on ns=%s set=%s bin=%s.\n", 
                                    idxName, Namespace, Set, binName);
}

// convenience function to drop an index - essentially a pass-through to the client API
void dropIndex(String idxName) {
    try {
        IndexTask task = client.dropIndex(null, Namespace, Set, idxName);
    }
    catch (AerospikeException ae) {
        if (ae.getResultCode() != ResultCode.INDEX_NOTFOUND) {
            throw ae;
        }
    } 
    System.out.format("Dropped index %s.\n", idxName);
}

// convenience function to execute a query using the input filter and print the results
void executeQueryAndPrintResults(Filter filter, String binName) {
    Statement stmt = new Statement();
    stmt.setNamespace(Namespace);
    stmt.setSetName(Set);
    stmt.setFilter(filter);
    stmt.setBinNames(binName);
    RecordSet rs = client.query(null, stmt);
    while (rs.next()) {
        Key key = rs.getKey();
        Record record = rs.getRecord();
        System.out.format("key=%s bins=%s\n", key.userKey, record.bins);
    }
    //System.out.println();
    rs.close();
}

// convenience function to truncate test data
void truncateTestData() {
    try {
        client.truncate(null, Namespace, null, null);
    }
    catch (AerospikeException e) {
        // ignore
    }
}

Open a Terminal Tab

You can execute shell commands including Aerospike tools like aql to examine data in the terminal tab. Open a terminal tab by selecting File->Open from the notebook menu, and then New->Terminal.

CDT Test Data

We will illustrate how to create indexes on CDTs and issue queries using two CDTs - a nested List and a nested Map that are populated in 10 records each.

A nested List in records with user_key (id) 1-10. The list is held in bin list_bin and has the following numeric, string, and List elements.
- Numeric elements: user_key to user_key+4.
  - For example, for user_key=1: 1, 2, 3, 4, 5.
- String elements: "s"+user_key to "s"+(user_key+4).
  - For example, for user_key=1: "s1", "s2", "s3", "s4", "s5"
- Nested List element: holds numeric elements 10*user_key+1 to 10*user_key+5.
  - For example, for user_key=1: [11, 12, 13, 44, 15].
So, the value in list_bin for the record with user_key=1 looks like:
[ 1, "s1", 2, "s2", 3, "s3", 4, "s4", 5, "s5", [ 11, 12, 13, 14, 15 ] ].
A nested Map in records with user_key (id) 101-110. The map is held in bin map_bin and has the following numeric, string, and List elements. Below, we have used i with a shorthand for i=user_key-100.
- "oid": i.
  - For example, for i=1: "oid": 1.
- "obj": a nested map with the following key-value pairs: {"attr1": 10*i+1, "attr2": "s"+10*i+2, "subobj": {"attr3": 100*i+1, "attr4": 100*i+2}}.
  - For example, for i=1: "obj": {"attr1": 11, "attr2": "s12", "subobj": {"attr3": 101, "attr4": 102}}.
- i: ["s"+i, "s"+i+1].
  - For example, for i=1: 1: ["s1", "s2"].
So, the value in map_bin for the record with user_key=101 (or, equivalently, i=1) looks like:
{ "oid": 1, "obj": {"attr1": 11, "attr2": "s12", "subobj": {"attr3": 101, "attr4": 102}}, 1: ["s1", "s2"] }

Populate Test Data

Execute the cell below to populate the test data described above.

final String ListBin = "list_bin";
final String MapBin = "map_bin";
WritePolicy wpolicy = new WritePolicy();
wpolicy.sendKey = true;

// create the list_bin with 
//   [i..i+4, "s"+i.."s"+4, [10*i+1..10*i+5]]

for (int user_key = 1; user_key <= 10; user_key++) {
    Key key = new Key(Namespace, Set, user_key);

    // append integer and string elements
    ArrayList<Value> list_val = new ArrayList<Value>();
    for (int n = user_key; n < user_key+5; n++) {
        list_val.add(Value.get(n));
        list_val.add(Value.get("s"+Integer.toString(n)));
    }

    // append the nested list element
    ArrayList<Integer> list_int = new ArrayList<Integer>();
    for (int n=user_key*10+1; n < user_key*10+6; n++) {
        list_int.add(n);
    }
    ListValue list_int_val = new ListValue(list_int);

    // create the list bin
    Record record = client.operate(wpolicy, key,
                                    ListOperation.clear(ListBin), 
                                    ListOperation.appendItems(ListBin, list_val), 
                                    ListOperation.append(ListBin, list_int_val));
}

// create map bin with 
//    "oid": i,
//    "obj": {"attr1": 10*i+1, "attr2": "s"+10*+i+2, "subobj": {"attr3": 100*1+1, "attr4": 100*1+2}}, 
//    i: ["s"+i, "s"+i+1]

for (int user_key = 101; user_key <= 110; user_key++) {
    Integer i = user_key - 100;
    
    Key key = new Key(Namespace, Set, user_key);
    
    // construct obj map value
    HashMap <String, Value> obj = new HashMap <String, Value>();
    obj.put("attr1", Value.get(10*i+1));    
    obj.put("attr2", Value.get("s"+Integer.toString(10*i+2)));   
    HashMap <String, Integer> subobj = new HashMap <String, Integer>();
    subobj.put("attr3", 100*i+1);    
    subobj.put("attr4", 100*i+2);    
    obj.put("subobj", new MapValue(subobj));   

    // construct arr list value
    ArrayList<String> arr = new ArrayList<String>();
    arr.add("s"+Integer.toString(i*10+1));
    arr.add("s"+Integer.toString(i*10+2));
    
    // create the map in map_bin
    MapPolicy mPolicy = new MapPolicy(MapOrder.UNORDERED, MapWriteFlags.DEFAULT);
    Record record = client.operate(wpolicy, key,
                                MapOperation.clear(MapBin), 
                                MapOperation.put(mPolicy, MapBin, 
                                            Value.get("oid"), Value.get(i)),
                                MapOperation.put(mPolicy, MapBin, 
                                            Value.get("obj"), new MapValue(obj)),
                                MapOperation.put(mPolicy, MapBin, 
                                            Value.get(i), new ListValue(arr))
                                ); 
}

Examine Test Data

View the test data by executing the following commands in a Jupyter terminal tab.

aql -c "select list_bin from test.cdt-indexing"

aql -c "select map_bin from test.cdt-indexing"

Non-Collection Indexes

We will illustrate how indexes on non-collection elements are created and used. We will start with results in mind, and then create the appropriate index, issue a query using the index, and show the results.

Equality queries

Get records with a specific integer or string value at a specific index, rank, or key position of a List or a Map.

Records with a specific value at index X of a List or Map

The top level List in list_bin has 3 at position 2 (0-indexed)

// Create an index for element at position 2 of the top List in list_bin. 
// idx_list_bin_top_pos_2_num
createIndex("idx_list_bin_top_pos_2_num", ListBin, IndexType.NUMERIC, IndexCollectionType.DEFAULT, 
            CTX.listIndex(2));

// issue the query and print results
System.out.println("Records with the value 3 at position 2 of the top List in list_bin:");
Filter filter = Filter.equal(ListBin, 3, CTX.listIndex(2));
executeQueryAndPrintResults(filter, ListBin);

Output:

Created index idx_list_bin_top_pos_2_num on ns=test set=cdt-indexing bin=list_bin.
Records with the value 3 at position 2 of the top List in list_bin:
key=2 bins={list_bin=[2, s2, 3, s3, 4, s4, 5, s5, 6, s6, [21, 22, 23, 24, 25]]}

The top level Map in map_bin has 7 at position 2

// Create an index for element at position 2 of the top Map in map_bin. Note, in the key ordered sequence, 
//   "oid" element is in position 2 after the numeric key and "obj".
// idx_map_bin_top_pos_2_num
createIndex("idx_map_bin_top_pos_2_num", MapBin, IndexType.NUMERIC, IndexCollectionType.DEFAULT, 
            CTX.mapIndex(2));

// issue the query and print results
System.out.println("Records with the value 7 at position 2 of the top Map in map_bin:");
Filter filter = Filter.equal(MapBin, 7, CTX.mapIndex(2));
executeQueryAndPrintResults(filter, MapBin);

Output:

Created index idx_map_bin_top_pos_2_num on ns=test set=cdt-indexing bin=map_bin.
Records with the value 7 at position 2 of the top Map in map_bin:
key=107 bins={map_bin={obj={attr2=s72, attr1=71, subobj={attr4=702, attr3=701}}, oid=7, 7=[s71, s72]}}

Records with a specific value at rank X of a List or Map

The nested List in list_bin has 35 in rank -1 (highest value)

// Create an index for element at rank -1 of the nested List in list_bin. Note the nested List is at the 
//   last poistion (index -1) in the top List.
// idx_list_bin_nested_rnk_-1_num
createIndex("idx_list_bin_nested_rnk_-1_num", ListBin, IndexType.NUMERIC, IndexCollectionType.DEFAULT, 
            CTX.listIndex(-1), CTX.listRank(-1));

// issue the query and print results
System.out.println("Records with the value 35 in rank -1 (highest value) in the nested List in list_bin:");
Filter filter = Filter.equal(ListBin, 35, CTX.listIndex(-1), CTX.listRank(-1));
executeQueryAndPrintResults(filter, ListBin);

Output:

Created index idx_list_bin_nested_rnk_-1_num on ns=test set=cdt-indexing bin=list_bin.
Records with the value 35 in rank -1 (highest value) in the nested List in list_bin:
key=3 bins={list_bin=[3, s3, 4, s4, 5, s5, 6, s6, 7, s7, [31, 32, 33, 34, 35]]}

The nested "subobj" Map in map_bin has 901 in rank 0 (lowest value)

// Create an index for element at rank 0 (lowest value) in the nested "subobj" Map in map_bin. 
// idx_map_bin_subobj_rnk_0_num
createIndex("idx_map_bin_subobj_rnk_0_num", MapBin, IndexType.NUMERIC, IndexCollectionType.DEFAULT, 
            CTX.mapKey(Value.get("obj")), CTX.mapKey(Value.get("subobj")), CTX.mapRank(0));

// issue the query and print results
System.out.println("Records with the value 901 in rank 0 (lowest value) in the nested \"subobj\" Map in map_bin:");
Filter filter = Filter.equal(MapBin, 901, 
            CTX.mapKey(Value.get("obj")), CTX.mapKey(Value.get("subobj")), CTX.mapRank(0));
executeQueryAndPrintResults(filter, MapBin);

Output:

Created index idx_map_bin_subobj_rnk_0_num on ns=test set=cdt-indexing bin=map_bin.
Records with the value 901 in rank 0 (lowest value) in the nested "subobj" Map in map_bin:
key=109 bins={map_bin={9=[s91, s92], obj={attr2=s92, attr1=91, subobj={attr4=902, attr3=901}}, oid=9}}

Records with a specific value at key X of a Map

The top level Map in map_bin has 8 at key "oid"

// Create an index for element with key "oid" in the top level Map in map_bin. 
// idx_map_bin_top_key_oid_num
createIndex("idx_map_bin_top_key_oid_num", MapBin, IndexType.NUMERIC, IndexCollectionType.DEFAULT, 
            CTX.mapKey(Value.get("oid")));

// issue the query and print results
System.out.println("Records with the value 8 at key \"oid\" in the top level Map in map_bin:");
Filter filter = Filter.equal(MapBin, 8, 
            CTX.mapKey(Value.get("oid")));
executeQueryAndPrintResults(filter, MapBin);

Output:

Created index idx_map_bin_top_key_oid_num on ns=test set=cdt-indexing bin=map_bin.
Records with the value 8 at key "oid" in the top level Map in map_bin:
key=108 bins={map_bin={8=[s81, s82], obj={attr2=s82, attr1=81, subobj={attr4=802, attr3=801}}, oid=8}}

The nested "obj" Map in map_bin has "s42" at key "attr2"

// Create an index for element with key "attr2" in the nested "obj" Map in map_bin. 
// idx_map_bin_obj_key_attr2_str
createIndex("idx_map_bin_obj_key_attr2_str", MapBin, IndexType.STRING, IndexCollectionType.DEFAULT, 
            CTX.mapKey(Value.get("obj")), CTX.mapKey(Value.get("attr2")));

// issue the query and print results
System.out.println("Records with the value \"s42\" at key \"attr2\" in the nested \"obj\" Map in map_bin:");
Filter filter = Filter.equal(MapBin, "s42", 
            CTX.mapKey(Value.get("obj")), CTX.mapKey(Value.get("attr2")));
executeQueryAndPrintResults(filter, MapBin);

Output:

Created index idx_map_bin_obj_key_attr2_str on ns=test set=cdt-indexing bin=map_bin.
Records with the value "s42" at key "attr2" in the nested "obj" Map in map_bin:
key=104 bins={map_bin={4=[s41, s42], obj={attr2=s42, attr1=41, subobj={attr4=402, attr3=401}}, oid=4}}

Range Queries

Get records having an integer value within a range at a specific index, rank, or key position of a List or a Map. Range queries are supported on integer values only.

We will use the indexes defined above for the range queries too.

Records with value in a specific range at index X of a List or Map

The top level List in list_bin is in range 3-5 at position 2 (0-indexed)

// issue the query and print results
System.out.println("Records with value in range 3-5 at position 2 in the top level List in list_bin:");
Filter filter = Filter.range(ListBin, 3, 5, CTX.listIndex(2));
executeQueryAndPrintResults(filter, ListBin);

Output:

Records with value in range 3-5 at position 2 in the top level List in list_bin:
key=3 bins={list_bin=[3, s3, 4, s4, 5, s5, 6, s6, 7, s7, [31, 32, 33, 34, 35]]}
key=2 bins={list_bin=[2, s2, 3, s3, 4, s4, 5, s5, 6, s6, [21, 22, 23, 24, 25]]}
key=4 bins={list_bin=[4, s4, 5, s5, 6, s6, 7, s7, 8, s8, [41, 42, 43, 44, 45]]}

The top level Map in map_bin is in range 5-7 at position 2

// issue the query and print results
System.out.println("Records with value in range 5-7 at position 2 of the top Map in map_bin:");
Filter filter = Filter.range(MapBin, 5, 7, CTX.mapIndex(2));
executeQueryAndPrintResults(filter, MapBin);

Output:

Records with value in range 5-7 at position 2 of the top Map in map_bin:
key=107 bins={map_bin={obj={attr2=s72, attr1=71, subobj={attr4=702, attr3=701}}, oid=7, 7=[s71, s72]}}
key=105 bins={map_bin={5=[s51, s52], obj={attr2=s52, attr1=51, subobj={attr4=502, attr3=501}}, oid=5}}
key=106 bins={map_bin={obj={attr2=s62, attr1=61, subobj={attr4=602, attr3=601}}, 6=[s61, s62], oid=6}}

Records with value in a specific range at rank X of a List or Map

The nested List in list_bin is in range 20-50 at rank -1 (highest value)

// issue the query and print results
System.out.println("Records with value in range 20-50 in rank -1 (highest value) in the nested List in list_bin:");
Filter filter = Filter.range(ListBin, 20, 50, CTX.listIndex(-1), CTX.listRank(-1));
executeQueryAndPrintResults(filter, ListBin);

Output:

Records with value in range 20-50 in rank -1 (highest value) in the nested List in list_bin:
key=3 bins={list_bin=[3, s3, 4, s4, 5, s5, 6, s6, 7, s7, [31, 32, 33, 34, 35]]}
key=2 bins={list_bin=[2, s2, 3, s3, 4, s4, 5, s5, 6, s6, [21, 22, 23, 24, 25]]}
key=4 bins={list_bin=[4, s4, 5, s5, 6, s6, 7, s7, 8, s8, [41, 42, 43, 44, 45]]}

The nested "subobj" Map in map_bin is in range 500-800 at rank 0 (lowest value)

// issue the query and print results
System.out.println("Records with value in range 500-800 in rank 0 (lowest value) in the nested \"subobj\" Map in map_bin:");
Filter filter = Filter.range(MapBin, 500, 800,
            CTX.mapKey(Value.get("obj")), CTX.mapKey(Value.get("subobj")), CTX.mapRank(0));
executeQueryAndPrintResults(filter, MapBin);

Output:

Records with value in range 500-800 in rank 0 (lowest value) in the nested "subobj" Map in map_bin:
key=107 bins={map_bin={obj={attr2=s72, attr1=71, subobj={attr4=702, attr3=701}}, oid=7, 7=[s71, s72]}}
key=105 bins={map_bin={5=[s51, s52], obj={attr2=s52, attr1=51, subobj={attr4=502, attr3=501}}, oid=5}}
key=106 bins={map_bin={obj={attr2=s62, attr1=61, subobj={attr4=602, attr3=601}}, 6=[s61, s62], oid=6}}

Records with value in a specific range at key X of a Map

The top level Map in map_bin has a value in range 4-6 at key "oid"

// issue the query and print results
System.out.println("Records with value in range 4-6 at key \"oid\" in the top level Map in map_bin:");
Filter filter = Filter.range(MapBin, 4, 6, CTX.mapKey(Value.get("oid")));
executeQueryAndPrintResults(filter, MapBin);

Output:

Records with value in range 4-6 at key "oid" in the top level Map in map_bin:
key=105 bins={map_bin={5=[s51, s52], obj={attr2=s52, attr1=51, subobj={attr4=502, attr3=501}}, oid=5}}
key=104 bins={map_bin={4=[s41, s42], obj={attr2=s42, attr1=41, subobj={attr4=402, attr3=401}}, oid=4}}
key=106 bins={map_bin={obj={attr2=s62, attr1=61, subobj={attr4=602, attr3=601}}, 6=[s61, s62], oid=6}}

Collection Indexes

Collection indexes are defined on List and Map elements with the collection type LIST (list values in the List), MAPKEYS (key values in the Map), or MAPVALUES (values in the Map). The indexes can be used for equality queries on integr and string elements, as well as range queries on integer elements. We will illustrate these variations below.

Equality Queries

Get records with a specific integer or string value in a List or Map.

Records with a List or Map containing a specific value.

The nested List in map_bin contains "s91".

// Create an index for the nested List in map_bin. Note, in key-ordered sequence, the numeric key is the first one.
// idx_map_bin_i_list_str
createIndex("idx_map_bin_i_list_str", MapBin, IndexType.STRING, IndexCollectionType.LIST, 
            CTX.mapIndex(0));

// issue the query and print results
System.out.println("Records with the value \"s91\" in the nested list \"arr\" in map_bin:");
Filter filter = Filter.contains(MapBin, IndexCollectionType.LIST, "s91", CTX.mapIndex(0));
executeQueryAndPrintResults(filter, MapBin);

Output:

Created index idx_map_bin_i_list_str on ns=test set=cdt-indexing bin=map_bin.
Records with the value "s91" in the nested list "arr" in map_bin:
key=109 bins={map_bin={9=[s91, s92], obj={attr2=s92, attr1=91, subobj={attr4=902, attr3=901}}, oid=9}}

The top level Map in map_bin contains a key 5.

// Create an index for keys in top level Map in map_bin. 
// idx_map_bin_top_mapkeys_num
createIndex("idx_map_bin_top_mapkeys_num", MapBin, IndexType.NUMERIC, IndexCollectionType.MAPKEYS);

// issue the query and print results
System.out.println("Records with key 5 in the top level Map in map_bin:");
Filter filter = Filter.contains(MapBin, IndexCollectionType.MAPKEYS, 5);
executeQueryAndPrintResults(filter, MapBin);

Output:

Created index idx_map_bin_top_mapkeys_num on ns=test set=cdt-indexing bin=map_bin.
Records with key 5 in the top level Map in map_bin:
key=105 bins={map_bin={5=[s51, s52], obj={attr2=s52, attr1=51, subobj={attr4=502, attr3=501}}, oid=5}}

The nested "obj" Map in map_bin contains a value "s12".

// Create an index for values in nested Map "obj" in map_bin. 
// idx_map_bin_obj_mapvals_str
createIndex("idx_map_bin_obj_mapvals_str", MapBin, IndexType.STRING, IndexCollectionType.MAPVALUES, 
            CTX.mapKey(Value.get("obj")));

// issue the query and print results
System.out.println("Records with the value \"s12\" in the nested Map \"obj\" in map_bin:");
Filter filter = Filter.contains(MapBin, IndexCollectionType.MAPVALUES, "s12", CTX.mapKey(Value.get("obj")));
executeQueryAndPrintResults(filter, MapBin);

Output:

Created index idx_map_bin_obj_mapvals_str on ns=test set=cdt-indexing bin=map_bin.
Records with the value "s12" in the nested Map "obj" in map_bin:
key=101 bins={map_bin={1=[s11, s12], obj={attr2=s12, attr1=11, subobj={attr4=102, attr3=101}}, oid=1}}

Range Queries

Get records with a range of integer values in a List or Map.

Records with a List or Map containing an integer in a given range.

The nested list in list_bin has a value in range 85-93.

// Create an index for the nested List in map_bin. Note, the nested list is the last element.
// idx_list_bin_nested_list_num
createIndex("idx_list_bin_nested_list_num", ListBin, IndexType.NUMERIC, IndexCollectionType.LIST, 
            CTX.listIndex(-1));

// issue the query and print results
System.out.println("Records with value in range 85-93 in the nested list in list_bin:");
Filter filter = Filter.range(ListBin, IndexCollectionType.LIST, 85, 93, CTX.listIndex(-1));
executeQueryAndPrintResults(filter, ListBin);

Output:

Created index idx_list_bin_nested_list_num on ns=test set=cdt-indexing bin=list_bin.
Records with value in range 85-93 in the nested list in list_bin:
key=8 bins={list_bin=[8, s8, 9, s9, 10, s10, 11, s11, 12, s12, [81, 82, 83, 84, 85]]}
key=9 bins={list_bin=[9, s9, 10, s10, 11, s11, 12, s12, 13, s13, [91, 92, 93, 94, 95]]}

The top level Map in map_bin has a key in range 5-7.

// use the index created earlier
// issue the query and print results
System.out.println("Records with key in range 5-7 in the top level Map in map_bin:");
Filter filter = Filter.range(MapBin, IndexCollectionType.MAPKEYS, 5, 7);
executeQueryAndPrintResults(filter, MapBin);

Output:

Records with key in range 5-7 in the top level Map in map_bin:
key=107 bins={map_bin={obj={attr2=s72, attr1=71, subobj={attr4=702, attr3=701}}, oid=7, 7=[s71, s72]}}
key=105 bins={map_bin={5=[s51, s52], obj={attr2=s52, attr1=51, subobj={attr4=502, attr3=501}}, oid=5}}
key=106 bins={map_bin={obj={attr2=s62, attr1=61, subobj={attr4=602, attr3=601}}, 6=[s61, s62], oid=6}}

The nested Map "subobj" in map_bin has a value in range 300-500.

// Create an index for values in nested Map "subobj" in map_bin. 
// idx_map_bin_subobj_mapvals_num
createIndex("idx_map_bin_subobj_mapvals_num", MapBin, IndexType.NUMERIC, IndexCollectionType.MAPVALUES, 
            CTX.mapKey(Value.get("obj")), CTX.mapKey(Value.get("subobj")));

// issue the query and print results
System.out.println("Records with value in range 300-500 in the nested Map \"subobj\" in map_bin:");
Filter filter = Filter.range(MapBin, IndexCollectionType.MAPVALUES, 300, 500, 
            CTX.mapKey(Value.get("obj")), CTX.mapKey(Value.get("subobj")));
executeQueryAndPrintResults(filter, MapBin);

Output:

Created index idx_map_bin_subobj_mapvals_num on ns=test set=cdt-indexing bin=map_bin.
Records with value in range 300-500 in the nested Map "subobj" in map_bin:
key=103 bins={map_bin={obj={attr2=s32, attr1=31, subobj={attr4=302, attr3=301}}, oid=3, 3=[s31, s32]}}
key=104 bins={map_bin={4=[s41, s42], obj={attr2=s42, attr1=41, subobj={attr4=402, attr3=401}}, oid=4}}

Indexing JSON Documents

We will illustrate how JSON documents can be indexed and accessed using CDT indexing.

Initialize Document Client

// Import modules
import org.apache.commons.io.FileUtils;
import com.fasterxml.jackson.databind.JsonNode;
import com.aerospike.documentapi.*;

// Initialize Document client from Aerospike client
try {
    client.close();
}
catch (Exception ae) {
    //ignore
}
ClientPolicy cPolicy = new ClientPolicy();
cPolicy.writePolicyDefault.sendKey = true;
AerospikeClient client = new AerospikeClient(cPolicy, "localhost", 3000);
AerospikeDocumentClient documentClient = new AerospikeDocumentClient(client);
System.out.println("Initialized document client from the Aerospike client.");;

Initialized document client from the Aerospike client.

String Namespace = "test";
String Set = "cdt-indexing";

// convenience function to create an index - essentially a pass-through to the client API
void createIndex(String idxName, String binName, IndexType idxType, 
                 IndexCollectionType colType, CTX... ctx) {
    try {
            IndexTask task = client.createIndex(null,
                                       Namespace,
                                       Set,
                                       idxName,
                                       binName,
                                       idxType,
                                       colType,
                                       ctx);
            task.waitTillComplete(1000, 0);
        }
        catch (AerospikeException ae) {
            if (ae.getResultCode() != ResultCode.INDEX_ALREADY_EXISTS) {
                throw ae;
            }
        } 
        System.out.format("Created index %s on ns=%s set=%s bin=%s.\n", 
                                    idxName, Namespace, Set, binName);
}

// convenience function to drop an index - essentially a pass-through to the client API
void dropIndex(String idxName) {
    try {
        IndexTask task = client.dropIndex(null, Namespace, Set, idxName);
    }
    catch (AerospikeException ae) {
        if (ae.getResultCode() != ResultCode.INDEX_NOTFOUND) {
            throw ae;
        }
    } 
    System.out.format("Dropped index %s.\n", idxName);
}

// convenience function to execute a query using the input filter and print the results
void executeQueryAndPrintResults(Filter filter, String binName) {
    Statement stmt = new Statement();
    stmt.setNamespace(Namespace);
    stmt.setSetName(Set);
    stmt.setFilter(filter);
    stmt.setBinNames(binName);
    RecordSet rs = client.query(null, stmt);
    while (rs.next()) {
        Key key = rs.getKey();
        System.out.format("key=%s\n", key.userKey);
        Record record = rs.getRecord();
        //System.out.format("key=%s bins=%s\n", key.userKey, record.bins);
        Map<String,?> jsonMap = (Map<String,?>) record.getValue(JsonBin);
        for (String field: jsonMap.keySet()) {
            System.out.printf("\t%s=%.200s\n", field, jsonMap.get(field));
        }
    }
    //System.out.println();
    rs.close();
}

// convenience function to truncate test data
void truncateTestData() {
    try {
        client.truncate(null, Namespace, null, null);
    }
    catch (AerospikeException e) {
        // ignore
    }
}

Example 1: Nobel Prizes JSON Dataset

Read JSON File and Populate Database

We will load JSON documents from the file "nobel_prizes.json", and store each prize entry in the database in a separate record with user_key starting at 1000 in bin "json_bin". You can view the file's contents by opening it from Jupyter's file browser.

String JsonFilePath = "/home/jovyan/notebooks/java/nobel_prizes.json";
String JsonBin = "json_bin";
Integer UserKeyOffset = 1000;

// Read the json document into a string.
String jsonString = FileUtils.readFileToString(new File(JsonFilePath));
System.out.format("Read JSON doc from: %s\n", JsonFilePath);;

// Convert JSON string to a JsonNode
JsonNode rootNode = JsonConverters.convertStringToJsonNode(jsonString);

// Add the prizes in JSON in separate records
JsonNode prizesNode = rootNode.path("prizes");

Integer prizeIndex = 0;
for (JsonNode prizeNode : prizesNode) {
    if (prizeIndex == 0) {
        System.out.format("Example prize schema: \n%s \n\n", prizeNode.toPrettyString());
    }
    Key jsonDocKey = new Key(Namespace, Set, UserKeyOffset+prizeIndex+1); 
    documentClient.put(jsonDocKey, JsonBin, prizeNode);   
    prizeIndex++;
}
System.out.format("Stored %d JSON documents in database \n", prizeIndex);;

Read JSON doc from: /home/jovyan/notebooks/java/nobel_prizes.json
Example prize schema: 
{
  "year" : "2021",
  "category" : "chemistry",
  "laureates" : [ {
    "id" : "1002",
    "firstname" : "Benjamin",
    "surname" : "List",
    "motivation" : "\"for the development of asymmetric organocatalysis\"",
    "share" : "2"
  }, {
    "id" : "1003",
    "firstname" : "David",
    "surname" : "MacMillan",
    "motivation" : "\"for the development of asymmetric organocatalysis\"",
    "share" : "2"
  } ]
} 

Stored 658 JSON documents in database 

Execute Query with Index

Find nobel prizes in a given year.

// Create an index on the year field.
// idx_json_bin_year_str
createIndex("idx_json_bin_year_str", JsonBin, IndexType.STRING, IndexCollectionType.DEFAULT, 
            CTX.mapKey(Value.get("year")));

// Find nobel prizes in year 1969.
System.out.println("Nobel prizes in year 1969:");
Filter filter = Filter.equal(JsonBin, "1969", CTX.mapKey(Value.get("year")));
executeQueryAndPrintResults(filter, JsonBin);

Output:

Created index idx_json_bin_year_str on ns=test set=cdt-indexing bin=json_bin.
Nobel prizes in year 1969:
key=1314
    category=economics
    year=1969
    laureates=[{firstname=Ragnar, share=2, id=677, surname=Frisch, motivation="for having developed and applied dynamic models for the analysis of economic processes"}, {firstname=Jan, share=2, id=678, surname=Tinb
key=1313
    category=chemistry
    year=1969
    laureates=[{firstname=Derek, share=2, id=237, surname=Barton, motivation="for their contributions to the development of the concept of conformation and its application in chemistry"}, {firstname=Odd, share=2, i
key=1315
    category=literature
    year=1969
    laureates=[{firstname=Samuel, share=1, id=643, surname=Beckett, motivation="for his writing, which - in new forms for the novel and drama - in the destitution of modern man acquires its elevation"}]
key=1318
    category=medicine
    year=1969
    laureates=[{firstname=Max, share=3, id=391, surname=Delbrück, motivation="for their discoveries concerning the replication mechanism and the genetic structure of viruses"}, {firstname=Alfred D., share=3, id=392
key=1317
    category=physics
    year=1969
    laureates=[{firstname=Murray, share=1, id=90, surname=Gell-Mann, motivation="for his contributions and discoveries concerning the classification of elementary particles and their interactions"}]
key=1316
    category=peace
    year=1969
    laureates=[{share=1, firstname=International Labour Organization, id=527, motivation="for creating international legislation insuring certain norms for working conditions in every country"}]

Find nobel prizes in a given category.

// Create an index on the category field.
// idx_json_bin_category_str
createIndex("idx_json_bin_category_str", JsonBin, IndexType.STRING, IndexCollectionType.DEFAULT, 
            CTX.mapKey(Value.get("category")));

// Find nobel prizes for peace.
System.out.println("Nobel peace prizes:");
Filter filter = Filter.equal(JsonBin, "peace", CTX.mapKey(Value.get("category")));
executeQueryAndPrintResults(filter, JsonBin);

Output:

Created index idx_json_bin_category_str on ns=test set=cdt-indexing bin=json_bin.
Nobel peace prizes:
key=1190
    category=peace
    year=1990
    laureates=[{firstname=Mikhail, share=1, id=552, surname=Gorbachev, motivation="for the leading role he played in the radical changes in East-West relations"}]
key=1561
    category=peace
    year=1920
    laureates=[{firstname=Léon, share=1, id=484, surname=Bourgeois, motivation="for his longstanding contribution to the cause of peace and justice and his prominent role in the establishment of the League of Natio
key=1401
    category=peace
    year=1952
    laureates=[{firstname=Albert, share=1, id=513, surname=Schweitzer, motivation="for his altruism, reverence for life, and tireless humanitarian work which has helped making the idea of brotherhood between men an
key=1386
    category=peace
    year=1955
    overallMotivation="No Nobel Prize was awarded this year. The prize money was allocated to the Special Fund of this prize section."
key=1591
    category=peace
    year=1914
    overallMotivation="No Nobel Prize was awarded this year. The prize money was allocated to the Special Fund of this prize section."
key=1646
    category=peace
    year=1903
    laureates=[{firstname=Randal, share=1, id=466, surname=Cremer, motivation="for his longstanding and devoted effort in favour of the ideas of peace and arbitration"}]
... (truncated)

Example 2: Space Companies JSON Dataset

Read JSON File and Populate Database

We will load JSON documents from the file "space_companies.json", and store each prize entry in the database in a separate record with user_key starting at 2000 in bin "json_bin". You can view the file contents by opening it from Jupyter's file browser.

String JsonFilePath = "/home/jovyan/notebooks/java/space_companies.json";
String JsonBin = "json_bin";
Integer UserKeyOffset = 2000;

// Read the json document into a string.
String jsonString = FileUtils.readFileToString(new File(JsonFilePath));
System.out.format("Read JSON doc: %s\n", JsonFilePath);;

// Convert JSON string to a JsonNode
JsonNode rootNode = JsonConverters.convertStringToJsonNode(jsonString);

// Add the companies in JSON in separate records
JsonNode companiesNode = rootNode.path("companies");

Integer companyIndex = 0;
for (JsonNode companyNode : companiesNode) {
    if (companyIndex == 1) {
        System.out.format("Example company schema: \n%s \n\n", companyNode.toPrettyString());
    }
    Key jsonDocKey = new Key(Namespace, Set, UserKeyOffset+companyIndex+1); 
    documentClient.put(jsonDocKey, JsonBin, companyNode);   
    companyIndex++;
}
System.out.format("Stored %d JSON documents in database \n", companyIndex);;

Output:

Read JSON doc: /home/jovyan/notebooks/java/space_companies.json
Example company schema: 
{
  "_id" : "58d163cd6ed7bb3c6ee62e63",
  "company" : {
    "_id" : "58d164516ed7bb3c6ee62e64",
    "about" : "Telespazio VEGA UK is a dynamic and experienced consulting, technology and engineering services business based in Luton, UK. Following our merger into the Telespazio Group in 2011 we now have access to a vast array of world leading facilities and services, which we provide both within the UK and to our export markets. With a strong history in the European space arena starting in 1978, we have rich relationships across the space industry, with staff working closely with the European Space Agency (ESA) on programs and projects for over 35 years. ",
    "backgroundImage" : "https://www.spaceindividuals.com/var/www/images/8907e7e06a094fe31098d167567fa1dc.jpg",
    "companyLocation" : "Capability Green, Luton LU1 3LU, UK",
    "companyName" : "Telespazio VEGA UK",
    "createdAt" : "2017-06-05T00:00:00.000Z",
    "culture" : "As an organisation that aspires to excellence, Telespazio VEGA recognises the need to appreciate our people for their achievements. Without the skills, dedication and enthusiasm of our people, we would not be where we are today. Telespazio VEGA continually strives to create a rewarding environment, offer ample opportunity for personal and professional development and support our people in balancing work and life.\nTo ensure that we remain relevant and of value to our clients, Telespazio VEGA has created an environment that encourages and rewards innovation and entrepreneurial spirit. This, offered with the financial stability afforded us as part of the Finmeccanica group of companies, means that valid business ideas are given the appropriate backing and support to enable them to succeed.\nWe also recognise that each individual has their own aspirations for personal development. Therefore, we offer a range of career options that enable consultants to enhance individual niche capabilities, embrace the challenge of business development, and take on the responsibility of a management role. These opportunities are made available through a structured career development programme which enables our people to look closely at themselves, where they are today, and where they see themselves in the future. We recognise and support all these ambitions and look to retain a rich mix of skills and capabilities that will continue to support our clients’ evolving business requirements.\n\nContinuous Learning\nAs a business, Telespazio VEGA places great emphasis on allowing our people to learn and develop, both personally and professionally. We share ideas with colleagues and clients, challenge each other and learn through our involvement with a variety of projects.\nTelespazio VEGA provides ample opportunity for more formal training. We encourage our people to undertake professional qualifications, and support those who undertake study which relates to their work\n",
    "email" : "careers@telespazio.com",
    "galleryImages" : [ "https://spaceindividuals.com/var/www/images/4feb7c40b8b3d5a3ea83b600d6731b94.jpg", "https://spaceindividuals.com/var/www/images/ab664d6614614b4b771c3fe99e55383b.jpg", "https://spaceindividuals.com/var/www/images/d3ab90c334abfe301c93b14b1ff0f2d4.jpg" ],
    "isPublished" : true,
    "kindOfEmployeeLookFor" : null,
    "logoURL" : "https://spaceindividuals.com/var/www/images/ca6d9b5d356a2186fb9144b6bb34ab03.png",
    "mission" : null,
    "paymentType" : "enterprise",
    "phone" : "+44 (0)1582 399000",
    "sizeCompany" : "100-250 employees",
    "specialities" : "We now serve a wide range of public and private commercial markets globally by providing efficient, skilled support, the most up to date space data and services, and fundamentally, understanding and integrating the needs of our customers.\n\nWe have developed and harnessed our knowledge and expertise to deliver unique complex solutions for:\nGEO Information, Satellite Systems and Applications &amp; Satellite Communications.\nWe offer robust and reliable ground segment systems for satellite missions and utilise space assets to develop downstream applications across a wide variety of sectors. \nWe provide world leading scientific services on earth observation and space science missions.\nOur valued and experienced knowledgeable experts deliver consulting and engineering support to space missions worldwide.",
    "testimonials" : null,
    "updatedAt" : "2021-01-20T09:45:41.731Z",
    "weOfferChallenges" : null,
    "weOfferKindOfChallenges" : "Telespazio VEGA’s culture is defined by our people, and drives the value we deliver through the relationships we have with our stakeholders.\n\nOur culture is distinct and discernible, mirroring the multi-national, multi-disciplinary environment in which we work. It is based on a practical engineering ethos that thrives on challenge, and engenders a rigorous, analytical, results-oriented approach to problem solving. We are open and honest, say what we believe, and challenge that which we do not.\n\nAt Telespazio VEGA, our culture is also based on a pursuit of excellence and thought leadership – not only in our understanding of the technologies with which we work and the market domains in which they are applied, but also in the quality of the ideas, methodologies, services and solutions we deliver.\n\nWe share with our clients an unparalleled enthusiasm for Space and its application in society. Therefore, we are able to empathise strongly, and invest this understanding and commitment into long-term relationships with our clients and stakeholders.\n",
    "weOfferKindOfTrainings" : null,
    "weOfferTrainings" : null,
    "webSite" : "http://telespazio-vega.com"
  },
  "createdAt" : "2017-06-05T00:00:00.000Z",
  "email" : "careers@telespazio.com",
  "jobs" : null,
  "numberOfJobs" : 0,
  "updatedAt" : "2021-01-20T09:47:36.694Z"
} 

Stored 154 JSON documents in database

Execute Query with Index

Find companies by number of job postings.

// Create an index on the numberOfJobs field.
// idx_json_bin_number_of_jobs_num
createIndex("idx_json_bin_number_of_jobs_num", JsonBin, IndexType.NUMERIC, IndexCollectionType.DEFAULT, 
            CTX.mapKey(Value.get("numberOfJobs")));

// Find companies with more than 10 job postings.
System.out.println("Companies with numberOfJobs greater than 10:");
Filter filter = Filter.range(JsonBin, 11, 100, CTX.mapKey(Value.get("numberOfJobs")));
executeQueryAndPrintResults(filter, JsonBin);

Output:

Created index idx_json_bin_number_of_jobs_num on ns=test set=cdt-indexing bin=json_bin.
Companies with numberOfJobs greater than 10:
key=2013
    createdAt=2018-03-04T14:00:04.000Z
    jobs=null
    numberOfJobs=11
    company={backgroundImage=https://spaceindividuals.com/var/www/images/971f13def35f0b21e379fd1555b3e10d.jpg, isPublished=true, companyName=AKKA Technologies, about=As a consulting and engineering Group and Euro
    _id=593fdfae9839ef0a13b6b92d
    email=kontakt@akka.eu
    updatedAt=2021-01-27T14:04:41.487Z
key=2070
    createdAt=2018-03-29T14:00:04.000Z
    jobs=null
    numberOfJobs=62
    company={backgroundImage=https://spaceindividuals.com/var/www/images/02873814215a7b781e0f5f10e726926b.jpg, companyName=CERN, about=Imagine taking part in the largest scientific experiment in the world. CERN n
    _id=59df28ff947f49c77101e477
    email=recruitment.service@cern.ch
    updatedAt=2021-01-27T09:28:02.836Z
key=2084
    createdAt=2018-04-01T14:00:04.000Z
    jobs=null
    numberOfJobs=30
    company={backgroundImage=https://spaceindividuals.com/var/www/images/81e0f8e35dbab0242b8729f795986047.jpg, isPublished=true, companyName=RHEA Group, about=25 years of experience in the professional engineerin
    _id=58d965bfde9c102b036040f7
    email=temp@rheagroup.com
    updatedAt=2021-01-27T10:16:40.854Z
key=2107
    createdAt=2018-11-21T15:48:41.941Z
    jobs=null
    numberOfJobs=95
    company={backgroundImage=/profile-pic/1574021440639.jpg, isPublished=true, companyName=European Space Agency - ESA, about=The European Space Agency (ESA) is Europe’s gateway to space. Its mission is to shape 
    _id=5bf57e594739c05820a92eee
    email=ZINEB.ELOMRI@ESA.INT
    updatedAt=2021-01-25T09:34:07.372Z
key=2071
    createdAt=2018-03-07T14:02:08.000Z
    jobs=null
    numberOfJobs=16
    company={backgroundImage=https://spaceindividuals.com/var/www/images/093fb01f43d429801105c6fc0b224d40.jpg, companyName=Terma Group, about=At Terma our people make the difference. We employ talented, dedicated
    _id=590a258c7cd117274a0ae681
    email=recruitment.de@terma.com
    updatedAt=2021-01-27T13:24:59.183Z
key=2083
    createdAt=2018-03-04T15:01:08.000Z
    jobs=null
    numberOfJobs=52
    company={backgroundImage=https://spaceindividuals.com/var/www/images/98181cdb3d430d1e90c695e44caeea90.jpg, isPublished=true, companyName=Serco Europe, about=The Serco Group is a FTSE 250 corporation with over
    _id=58f78ba25a0782e629c7b621
    email=tatiana.thiessen@serco.com
    updatedAt=2020-11-30T09:49:19.914Z
key=2051
    createdAt=2018-04-17T13:04:35.890Z
    jobs=null
    numberOfJobs=26
    company={backgroundImage=https://spaceindividuals.com/var/www/images/604d16af11bf4f83e8db7aa57a10c3d0.jpg, isPublished=true, companyName=OHB-System AG, about=3,2,1, lift off...

Find companies by a nested field such as the payment type.

The paymentType field may be truncated in the result output, but can be verified with the AQL query in the terminal tab: aql -c "select json_bin from test.cdt-indexing where PK=key".

// Create an index on the nested paymentType field.
// idx_json_bin_payment_type_str
createIndex("idx_json_bin_payment_type_str", JsonBin, IndexType.STRING, IndexCollectionType.DEFAULT, 
            CTX.mapKey(Value.get("company")), CTX.mapKey(Value.get("paymentType")));

// Find companies with "enterprise" payment type.
System.out.println("Companies with enterprise payment type:");
Filter filter = Filter.equal(JsonBin, "enterprise", 
            CTX.mapKey(Value.get("company")), CTX.mapKey(Value.get("paymentType")));
executeQueryAndPrintResults(filter, JsonBin);

Output:

Created index idx_json_bin_payment_type_str on ns=test set=cdt-indexing bin=json_bin.
Companies with enterprise payment type:
key=2013
    createdAt=2018-03-04T14:00:04.000Z
    jobs=null
    numberOfJobs=11
    company={backgroundImage=https://spaceindividuals.com/var/www/images/971f13def35f0b21e379fd1555b3e10d.jpg, isPublished=true, companyName=AKKA Technologies, about=As a consulting and engineering Group and Euro
    _id=593fdfae9839ef0a13b6b92d
    email=kontakt@akka.eu
    updatedAt=2021-01-27T14:04:41.487Z
key=2002
    createdAt=2017-06-05T00:00:00.000Z
    jobs=null
    numberOfJobs=0
    company={backgroundImage=https://www.spaceindividuals.com/var/www/images/8907e7e06a094fe31098d167567fa1dc.jpg, isPublished=true, companyName=Telespazio VEGA UK, about=Telespazio VEGA UK is a dynamic and exper
    _id=58d163cd6ed7bb3c6ee62e63
    email=careers@telespazio.com
    updatedAt=2021-01-20T09:47:36.694Z
key=2045
    createdAt=2018-03-24T14:00:04.000Z
    jobs=null
    numberOfJobs=8
    company={backgroundImage=https://spaceindividuals.com/var/www/images/0786a572d77f4a0b8a48c5b054766c54.jpg, isPublished=true, companyName=Telespazio Germany GmbH, about=Telespazio Germany is the first choice a
    _id=58e13089de9c102b03604295
    email=recruitment@telespazio.de
    updatedAt=2021-01-27T11:04:38.881Z
key=2031
    createdAt=2018-04-25T12:27:23.286Z
    jobs=null
    numberOfJobs=0
    company={paymentTypeOrdered=enterprise, backgroundImage=https://www.spaceindividuals.com/var/www/images/5207d3c4216068b9423ade06215f1529.PNG, isPublished=true, companyName=European GNSS Agency, about=Space is
    _id=5ae0742b4392412b350ca064
    email=jobs@gsa.europa.eu
    updatedAt=2020-11-13T17:25:53.261Z
key=2107
    createdAt=2018-11-21T15:48:41.941Z
    jobs=null
    numberOfJobs=95
    company={backgroundImage=/profile-pic/1574021440639.jpg, isPublished=true, companyName=European Space Agency - ESA, about=The European Space Agency (ESA) is Europe’s gateway to space. Its mission is to shape 
    _id=5bf57e594739c05820a92eee
    email=ZINEB.ELOMRI@ESA.INT
    updatedAt=2021-01-25T09:34:07.372Z
key=2084
    createdAt=2018-04-01T14:00:04.000Z
    jobs=null
    numberOfJobs=30
    company={backgroundImage=https://spaceindividuals.com/var/www/images/81e0f8e35dbab0242b8729f795986047.jpg, isPublished=true, companyName=RHEA Group, about=25 years of experience in the professional engineerin
    _id=58d965bfde9c102b036040f7
    email=temp@rheagroup.com
    updatedAt=2021-01-27T10:16:40.854Z
... (truncated)

Query Values Across Multiple Elements

When the elements are distributed in multiple places in the CDT, a List or Map is not readily available for indexing. A simple example of this is to find the nobel prize(s) by the winner's name.

{
  "year" : "2021",
  "category" : "chemistry",
  "laureates" : [ {
    "id" : "1002",
    "firstname" : "Benjamin",
    "surname" : "List",
    "motivation" : "\"for the development of asymmetric organocatalysis\"",
    "share" : "2"
  }, {
    "id" : "1003",
    "firstname" : "David",
    "surname" : "MacMillan",
    "motivation" : "\"for the development of asymmetric organocatalysis\"",
    "share" : "2"
  } ]
} 

Create List of Record Values to Query

To be able to find the nobel prize records by the winner's name, we will create a separate List of the names of the laureates, and index that List. The List is stored in its own bin laureate_names.

laureate_names: ["benjamin list", "david macmillan", ..]

// iterate over records and update the laureate_names bin
import com.aerospike.client.Bin;
import com.aerospike.client.ScanCallback;
import com.aerospike.client.policy.ScanPolicy;
import com.aerospike.client.exp.Exp.Type;

public class ScanParallel implements ScanCallback {
    public void scanCallback(Key key, Record record) {
        //System.out.format("key=%s bins=%s\n", key.userKey, record.bins);
        Object binval = record.bins.get(JsonBin);
        List<String> laureateNamesList = new ArrayList<String>();
        List<HashMap<String,String>> laureates = (List<HashMap<String,String>>)
                                                    ((HashMap<String,Value>)binval).get("laureates");
        if (laureates == null) {
            return;
        }
        for (HashMap<String,String> laureate: laureates) {
            //System.out.format("firstname: %s, surname: %s\n", laureate.get("firstname"), 
            //                    laureate.get("surname"));
            laureateNamesList.add(laureate.getOrDefault("firstname","").toLowerCase() + " " + 
                                    laureate.getOrDefault("surname","").toLowerCase());
        }
        //System.out.println(laureateNamesList);
        Bin laureateNamesBin = new Bin("laureate_names", laureateNamesList);
        client.put(null, key, laureateNamesBin);
    }
}
// expression filter 1001 <= user-key < 2000 is specifed in the policy
ScanPolicy policy = new ScanPolicy();
policy.filterExp = Exp.build(
                        Exp.and(
                            Exp.ge(Exp.key(Type.INT), Exp.val(1001)),
                            Exp.lt(Exp.key(Type.INT), Exp.val(2000))));
System.out.format("Scan with expression filter results:\n");
client.scanAll(policy, Namespace, Set, new ScanParallel(), JsonBin);

Find nobel prizes by winner name.

// Create an index on the numberOfJobs field.
// idx_laureate_names_list_str
createIndex("idx_laureate_names_list_str", "laureate_names", IndexType.STRING, IndexCollectionType.LIST);

// Find nobel prizes by winner name "albert einstein".
System.out.println("Prizes with laureate name \"marie curie\":");
Filter filter = Filter.contains("laureate_names", IndexCollectionType.LIST, "marie curie");
executeQueryAndPrintResults(filter, JsonBin);

Output:

Created index idx_laureate_names_list_str on ns=test set=cdt-indexing bin=laureate_names.
Prizes with laureate name "marie curie":
key=1647
    category=physics
    year=1903
    laureates=[{firstname=Henri, share=2, id=4, surname=Becquerel, motivation="in recognition of the extraordinary services he has rendered by his discovery of spontaneous radioactivity"}, {firstname=Pierre, share=
key=1604
    category=chemistry
    year=1911
    laureates=[{firstname=Marie, share=1, id=6, surname=Curie, motivation="in recognition of her services to the advancement of chemistry by the discovery of the elements radium and polonium, by the isolation of ra

Clean up

Remove tutorial data and close connection.

client.truncate(null, Namespace, null, null);
client.close();
System.out.println("Removed tutorial data and closed server connection.");

Output:

Removed tutorial data and closed server connection.

Takeaways

The tutorial illustrated with many examples the secondary index functionality with CDTs that is available in Aerospike Database 6.1.0.0+. The CDT indexing enables faster queries on JSON documents and other complex objects stored as CDTs.

Further Exploration and Resources

Here are some links for further exploration.

Resources:

Next steps

Visit Aerospike notebooks repo to run additional Aerospike notebooks. To run a different notebook, download the notebook from the repo to your local machine, and then click on File->Open, and select Upload.

CDT Indexing in Aerospike

For an interactive Jupyter notebook experience:

Introduction​

Prerequisites​

Setup​

Ensure database is running​

Download and install additional components.​

Initialize Client​

Import Modules​

Define Constants and Helper Functions​

Open a Terminal Tab​

CDT Test Data

Populate Test Data​

Examine Test Data​

Non-Collection Indexes

Equality queries​

Records with a specific value at index X of a List or Map​

The top level List in list_bin has 3 at position 2 (0-indexed)​

The top level Map in map_bin has 7 at position 2​

Records with a specific value at rank X of a List or Map​

The nested List in list_bin has 35 in rank -1 (highest value)​

The nested "subobj" Map in map_bin has 901 in rank 0 (lowest value)​

Records with a specific value at key X of a Map​

The top level Map in map_bin has 8 at key "oid"​

The nested "obj" Map in map_bin has "s42" at key "attr2"​

Range Queries​

Records with value in a specific range at index X of a List or Map​

The top level List in list_bin is in range 3-5 at position 2 (0-indexed)​

The top level Map in map_bin is in range 5-7 at position 2​

Records with value in a specific range at rank X of a List or Map​

The nested "subobj" Map in map_bin is in range 500-800 at rank 0 (lowest value)​

Records with value in a specific range at key X of a Map​

The top level Map in map_bin has a value in range 4-6 at key "oid"​

Collection Indexes

Equality Queries​

Records with a List or Map containing a specific value.​

The nested List in map_bin contains "s91".​

The top level Map in map_bin contains a key 5.​

The nested "obj" Map in map_bin contains a value "s12".​

Range Queries​

Records with a List or Map containing an integer in a given range.​

The nested list in list_bin has a value in range 85-93.​

The top level Map in map_bin has a key in range 5-7.​

The nested Map "subobj" in map_bin has a value in range 300-500.​

Indexing JSON Documents

Initialize Document Client​

Example 1: Nobel Prizes JSON Dataset​

Read JSON File and Populate Database​

Execute Query with Index​

Find nobel prizes in a given year.​

Find nobel prizes in a given category.​

Example 2: Space Companies JSON Dataset​

Read JSON File and Populate Database​

Execute Query with Index​

Find companies by number of job postings.​

Find companies by a nested field such as the payment type.​

Query Values Across Multiple Elements

Create List of Record Values to Query​

Find nobel prizes by winner name.​

Clean up

Takeaways

Further Exploration and Resources

Next steps​

Introduction

Prerequisites

Setup

Ensure database is running

Download and install additional components.

Initialize Client

Import Modules

Define Constants and Helper Functions

Open a Terminal Tab

Populate Test Data

Examine Test Data

Equality queries

Records with a specific value at index X of a List or Map

The top level List in list_bin has 3 at position 2 (0-indexed)

The top level Map in map_bin has 7 at position 2

Records with a specific value at rank X of a List or Map

The nested List in list_bin has 35 in rank -1 (highest value)

The nested "subobj" Map in map_bin has 901 in rank 0 (lowest value)

Records with a specific value at key X of a Map

The top level Map in map_bin has 8 at key "oid"

The nested "obj" Map in map_bin has "s42" at key "attr2"

Range Queries

Records with value in a specific range at index X of a List or Map

The top level List in list_bin is in range 3-5 at position 2 (0-indexed)

The top level Map in map_bin is in range 5-7 at position 2

Records with value in a specific range at rank X of a List or Map

The nested "subobj" Map in map_bin is in range 500-800 at rank 0 (lowest value)

Records with value in a specific range at key X of a Map

The top level Map in map_bin has a value in range 4-6 at key "oid"

Equality Queries

Records with a List or Map containing a specific value.

The nested List in map_bin contains "s91".

The top level Map in map_bin contains a key 5.

The nested "obj" Map in map_bin contains a value "s12".

Range Queries

Records with a List or Map containing an integer in a given range.

The nested list in list_bin has a value in range 85-93.

The top level Map in map_bin has a key in range 5-7.

The nested Map "subobj" in map_bin has a value in range 300-500.

Initialize Document Client

Example 1: Nobel Prizes JSON Dataset

Read JSON File and Populate Database

Execute Query with Index

Find nobel prizes in a given year.

Find nobel prizes in a given category.

Example 2: Space Companies JSON Dataset

Read JSON File and Populate Database

Execute Query with Index

Find companies by number of job postings.

Find companies by a nested field such as the payment type.

Create List of Record Values to Query

Find nobel prizes by winner name.

Next steps