
Look-Aside Cache for MongoDB


This sample notebook shows how to use Aerospike as a read (look-aside) cache.

  • Aerospike acts as the cache, and Mongo acts as the primary datastore.
  • Mongo must run as a separate container, started with docker run --name some-mongo -d mongo:latest

To test: run get_data("key", "value") once to fetch the data from Mongo and populate Aerospike.

Running it again fetches the data from the Aerospike cache.
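The two runs described above can be sketched without either database. This is only a minimal illustration, assuming plain dictionaries stand in for Aerospike (the cache) and Mongo (the primary store); the class LookAsideReader and its attribute names are hypothetical and belong to neither client library.

```python
class LookAsideReader:
    """Read path of a look-aside cache, with dicts standing in for both stores."""

    def __init__(self, primary):
        self.primary = primary  # stand-in for Mongo (assumption for this sketch)
        self.cache = {}         # stand-in for Aerospike (assumption for this sketch)

    def get(self, key):
        # Cache hit: answer directly from the cache
        if key in self.cache:
            return self.cache[key], "cache"
        # Cache miss: read the primary store and populate the cache
        value = self.primary.get(key)
        if value is not None:
            self.cache[key] = value
        return value, "primary"


reader = LookAsideReader({"key": "value"})
first = reader.get("key")   # miss: served by the primary store
second = reader.get("key")  # hit: served by the cache
print(first)
print(second)
```

The second element of each result only records which store answered, so the difference between the first and second run is visible.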

Ensure that the Aerospike Database is running

!asd >& /dev/null
!pgrep -x asd >/dev/null && echo "Aerospike database is running!" || echo "**Aerospike database is not running!**"

Output:

Aerospike database is running!
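The pgrep check only works when the database runs on the same machine as the notebook. A different technique, shown here as a hedged sketch, is to test whether the service port accepts TCP connections. The helper name is_port_open is hypothetical, and the host and port are the defaults assumed elsewhere in this notebook.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out
        return False


# Assumed defaults: Aerospike on localhost, port 3000
if is_port_open("127.0.0.1", 3000):
    print("Aerospike port is reachable")
else:
    print("Aerospike port is not reachable")
```

An open port only shows that something is listening there; it does not confirm that the listener is a healthy Aerospike node.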

Import all dependencies

import aerospike
import pymongo
from pymongo import MongoClient
import sys

Configure the clients

The configuration assumes:

  • An Aerospike database running on localhost (IP 127.0.0.1), port 3000, which is the default.
  • Mongo running in a separate container, whose IP can be found with docker inspect <containerid> | grep -i ipaddress

Modify the configuration if your environment is different (for example, if the Aerospike database runs on a different host or port).
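One way to adapt the configuration without editing the notebook is to read the hosts and ports from environment variables. The sketch below assumes this approach; the function load_settings and the variable names AEROSPIKE_HOST, AEROSPIKE_PORT, MONGO_HOST and MONGO_PORT are illustrative choices, not settings that either client reads on its own.

```python
import os


def load_settings(env=os.environ):
    """Build connection settings from a mapping, falling back to the notebook defaults."""
    return {
        "aerospike_host": env.get("AEROSPIKE_HOST", "127.0.0.1"),
        "aerospike_port": int(env.get("AEROSPIKE_PORT", "3000")),
        "mongo_host": env.get("MONGO_HOST", "172.17.0.3"),
        "mongo_port": int(env.get("MONGO_PORT", "27017")),
    }


settings = load_settings()
# Same shape as the 'hosts' entry the Aerospike client configuration expects
aero_config = {"hosts": [(settings["aerospike_host"], settings["aerospike_port"])]}
print(aero_config)
```

Ports are converted to integers because environment variables are always strings.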

# Define a few constants
AEROSPIKE_HOST = "127.0.0.1"
AEROSPIKE_PORT = 3000
AEROSPIKE_NAMESPACE = "test"
AEROSPIKE_SET = "demo"
MONGO_HOST = "172.17.0.3"
MONGO_PORT = 27017
MONGO_DB = "test-database"
MONGO_COLLECTION = "test-collection"

# Aerospike configuration
aero_config = {
    'hosts': [(AEROSPIKE_HOST, AEROSPIKE_PORT)]
}
try:
    aero_client = aerospike.client(aero_config).connect()
except Exception as e:
    print("Failed to connect to the cluster with", aero_config['hosts'], e)
    sys.exit(1)
print("Connected to Aerospike")

# Mongo configuration
try:
    mongo_client = MongoClient(MONGO_HOST, MONGO_PORT)
    # MongoClient connects lazily, so ping the server to confirm it is reachable
    mongo_client.admin.command('ping')
    print("Connected to Mongo")
except Exception as e:
    print("Failed to connect to Mongo", e)
    sys.exit(1)

Output:

Connected to Aerospike
Connected to Mongo

Store data in Mongo and clear the keys in Aerospike if any

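db = mongo_client[MONGO_DB]
collection = db[MONGO_COLLECTION]

def store_data(data_id, data):
    m_data = {data_id: data}
    # Start from an empty collection so only this document is present
    collection.drop()
    # Clear any cached copy so the next read goes to Mongo
    aero_key = (AEROSPIKE_NAMESPACE, AEROSPIKE_SET, data_id)
    if aero_client.exists(aero_key)[1]:
        aero_client.remove(aero_key)
    collection.insert_one(m_data)

store_data("key", "value")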
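Clearing the cached key on every write matters because a look-aside cache would otherwise keep serving the old value after Mongo has changed. The following sketch illustrates that write path under the assumption that two dictionaries stand in for Mongo and Aerospike; store_and_invalidate is a hypothetical helper name.

```python
def store_and_invalidate(primary, cache, key, value):
    """Write to the primary store, then drop any cached copy of the same key."""
    primary[key] = value
    cache.pop(key, None)  # no error if the key was never cached


primary = {"key": "old"}  # stand-in for Mongo (assumption for this sketch)
cache = {"key": "old"}    # stand-in for Aerospike, holding a copy that is about to go stale
store_and_invalidate(primary, cache, "key", "value")
print(primary)
print(cache)
```

After the call the cache no longer holds the key, so the next read falls through to the primary store and repopulates the cache with the new value.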

Fetch the data. In this example we use a simple key-value pair.

If the data exists in the cache, it is returned from there. If not, the data is read from Mongo, put in the cache, and then returned.

def get_data(data_id, data):
    aero_key = (AEROSPIKE_NAMESPACE, AEROSPIKE_SET, data_id)
    # exists() returns (key, metadata); metadata is None when the record is not cached
    data_check = aero_client.exists(aero_key)
    if data_check[1]:
        (key, metadata, record) = aero_client.get(aero_key)
        print("Data retrieved from Aerospike cache")
        print("Record::: {} {}".format(data_id, record['value']))
        return record['value']
    # Cache miss: read from Mongo, populate the cache, then return the value
    mongo_data = collection.find_one({data_id: data})
    if mongo_data is None:
        print("Data not present in Aerospike cache or in Mongo")
        return None
    print("Data not present in Aerospike cache, retrieved from mongo {}".format(mongo_data))
    aero_client.put(aero_key, {'value': mongo_data[data_id]})
    return mongo_data[data_id]

result = get_data("key", "value")

Output (when the record is already in the cache, for example on a second call):

Data retrieved from Aerospike cache
Record::: key value