Skip to main content

Local Cache

For an interactive Jupyter notebook experience: Binder

This notebook describes the pattern where a local in-memory cache is used in front of the Aerospike database. It shows with a simple model how a local in-memory cache can enhance performance for specific hit ratio and cache speed scenarios.

This notebook requires Aerospike database running on localhost and that python and the Aerospike python client have been installed (pip install aerospike). Visit Aerospike notebooks repo for additional details and the docker container.


Caches are ubiquitous. Aerospike is commonly and effectively deployed as a cache in front of a backend database that is remote, slow, and/or limited in throughput. This notebook illustrates use of a local cache on the client machine that sits in front of the Aerospike database, with specific scenarios when it can be beneficial. The pattern is applicable for a standalone Aerospike database as well as when Aerospike itself acts as a cache.

A cache provides faster access and improved throughput by having a copy of the data closer to where it is processed. Cache libraries, external caching servers, and distributed cache infrastructure are deployed for specific caching needs and solutions. Aerospike CacheDB is designed for fast, reliable, consistent, and cost-effective access across the globally distributed data infrastructure.

The notebook first illustrates a local in-memory cache fronting the Aerospike database through a simple interface, and then analyses the performance impact of a cache in various scenarios through a simple mathematical model.


This tutorial assumes familiarity with the following topics:


Ensure database is running

This notebook requires that Aerospike database is running. [Include the right code cell for Java or Python from the two cells below.]

!asd >& /dev/null
!pgrep -x asd >/dev/null && echo "Aerospike database is running!" || echo "**Aerospike database is not running!**"


Aerospike database is running!

Connect to database.

We need a client connected to the database.

# import the module
from __future__ import print_function
import aerospike

# Configure the client
config = {
'hosts': [ ('', 3000) ],
'policy' : {'key': aerospike.POLICY_KEY_SEND}

# Create a client and connect it to the cluster
client = aerospike.client(config).connect()
import sys
print("failed to connect to the cluster with", config['hosts'])
print('Client successfully connected to the database.')


Client successfully connected to the database.

Populate database with test data.

The following code populates the test data in set "local_cache_tutorial" in namespace "test". The data consists of a 10000 records with user keys 1-10000 and bins populated with random data.

namespace = 'test'
tutorial_set = 'local_cache_tutorial'
max_data_size = 10000
# Records are addressable via a tuple of (namespace, set, key)

import random
for i in range(max_data_size):
# Write the records
client.put((namespace, tutorial_set, 'id-'+str(i+1)),
{ 'age': random.randint(20,81),
'birth_month': random.choice(['Jan','Feb','Mar','Apr','May','Jun',
'gender': random.choice(['M', 'F']),
'favorite_colors': random.sample(['red','orange','yellow','green',
'blue','violet','black','white','grey'], k=3) } )
except Exception as e:
import sys
print("error: {0}".format(e), file=sys.stderr)

print('Test data populated.')


Test data populated.

Cache Design

We will illustrate a local cache fronting the Aerospike database. The key benefit of a local cache stems from its proximity hence speed advantage: Aerospike provides average access time of a millisecond or less for 99%+ requests, while local memory cache access can be in microsecond range. Even so, as we will see, limitations of memory size need to be taken into account as do cost consideration since a local cache needs to be implemented at each client host.

First, let us define a simple interface for the local cache.

Cache Interface

The cache consists of cache entries with the following key operations:

  • get(key)
    • Get from cache if available and TTL is not expired, else retrieve from the database and add to the cache.
  • update(key, data)
    • Update the database by replacing the current record, and also add or replace the cache entry.
  • add(key, entry)
    • If the cache is full, evict appropriate entry and add the cache entry.

Cache Eviction

A local cache can be implemented with TTL based garbage collection. Another alternative is to maintain a Least-Recently-Used (LRU) list and remove LRU entry when the cache is full to make room for a new entry. LRU can be implemented by maintaining a doubly linked list of cache entries - below, the classes CacheEntry and LocalCache allude to LRU links, but the implementation is not provided here.

A simplistic eviction scheme below for illustrative purpose selects an arbitrary entry for eviction, using dict.popitem(), although better randomized eviction is possible with implementations like randomdict.

Working with Aerospike TTL

Aerospike has TTL based eviction that allows for:

  1. No eviction or the record is permanent,
  2. At record creation a specific TTL is set,
  3. TTL may be updated any time.

Since the TTL can change at origin while the record resides in the local cache, the cache will sync up with the origin at the time of caching. This is no different from if the record is deleted at the origin.

import time


class CacheEntry:
_key = None
_ttl = 0
_data = None
# LRU not implemented
# _lru_prev = None
# _lru_next = None

def __init__(self, key, ttl, data):
self._key = key
self._ttl = ttl
self._data = data.copy()

class LocalCache:
_cache = {} # dict of key->cache_entry
_max_size = 0
# LRU not implemented
# _lru_head = None
# _lru_tail = None

def __init__(self, max_size):
self._max_size = max_size

def get(self, key):
entry = self._cache.get(key)
if entry is None or entry._ttl < time.time():
#print("a cache miss", key[2])
# get it from aerospike database
_, meta, bins = client.get(key)
if meta is None:
#print("entry expired at origin", key[2])
if entry:
return None
if entry: # in cache, replace
entry._ttl = time.time()+meta['ttl']
entry._data = bins
else: # not in cache, add
entry = CacheEntry(key, time.time()+meta['ttl'], bins)
self.add(key, entry)
#print("a cache hit", key[2])
return entry._data.copy()

def add(self, key, entry):
if len(self._cache) == self._max_size:
#print("cache full, evicting")
_ = self._cache.popitem() # remove a "random" entry
self._cache[key] = entry
#print("added entry", key[2])

def update(self, key, data):
# update aerospike database and extend TTL
meta = {'ttl': DEFAULT_TTL}
policy = {'exists': aerospike.POLICY_EXISTS_REPLACE}
_, meta, _ = client.operate(key, data, meta=meta, policy=policy)
print('failed to udpdate database')
entry = self._cache.get(key)
if entry is None:
entry = CacheEntry(key, time.time()+DEFAULT_TTL, data)
self.add(key, entry)
entry._ttl = time.time()+DEFAULT_TTL
entry._data = rec.copy()

Comparing Performance

There are two key factors that in general impact cache performace.

  1. Hit ratio H or the fraction of the requests served directly from the cache without having to go to the origin server. Hit ratio depends on factors like the cache size, invalidation (update) rate, and access pattern.
  2. Speed ratio S or how fast the cache access is as compared to the origin server. Or equivalently, the speed gain of a cache hit over a cache miss.

Below we run a simple test and compare execution times for "without cache" and "with cache" scenarios over a large number of requests. (Feel free to experiment with different values of data_size and cache_size to adjust the hit ratio. Speed ratio for this implementation can vary and is unknown.)

data_size = 5000
cache_size = 2000
print('Cache hit ratio: ', '%3.2f'%(cache_size/data_size))
num_requests = 10000

start_time = time.time()
for i in range(num_requests):
user_key = 'id-' + str(random.randint(1, data_size))
key = (namespace, tutorial_set, user_key)
_ = client.get(key)
time_without_cache = time.time() - start_time
print('Execution time without cache: ', '%5.3fs'%time_without_cache)

start_time = time.time()
cache = LocalCache(cache_size)
for i in range(num_requests):
user_key = 'id-' + str(random.randint(1, data_size))
key = (namespace, tutorial_set, user_key)
_ = cache.get(key)
time_with_cache = time.time() - start_time
print('Execution time with cache: ', '%5.3fs'%time_with_cache)

print('Speedup with cache: ', '%5.1f%%'%((time_without_cache/time_with_cache - 1) * 100))


Cache hit ratio:  0.40
Execution time without cache: 0.637s
Execution time with cache: 0.479s
Speedup with cache: 33.0%

Effective Local Cache Scenarios

The theoretical performance boost from a cache ranges from 0 (no improvement) when the hit ration (H) is 0 meaning all requests are served from the orgin server, up to the speed ratio (S) when H is 1 meaning all requests are served from the cache.

When Aerospike database is used as a cache, it provides performance and throughput gain through:

  • a robust hit ratio with its ability to scale to petabyte range and mechanisms to keep the cache in sync with the origin.
  • a huge speed ratio with its sub-millisecond predictable. Speedup over the origin database can range from 100 to 1000 or more.

So when can a local cache be beneficial in conjunction with Aerospike? A local cache can be effective if the data for local caching is targeted judiciously in order to keep the hit ratio high in spite of the local cache's size limitations. Thet is, targeting data that is frequently read, infrequently updated, and performance critical. The benefit will need to be balanced against the cost of deploying it across multiple client machines.

Cache Performance Model

We use a simple mathematical model to examine the theoretical performance. A simple model is used with two cache parameters, hit ratio and speed ratio, discussed above.

Speedup Formula

The following discussion is applicable to both a local cache in front of the Aerospike database, as well as Aerospike Cache in front of a backend database.

For N random accesses made directly to the origin database, the time needed is N * T, where T is the average access time for the origin database.

With a hit ratio of H and speed ratio of S:

  • N*H requests are directly served from the cache each with T/S access time. The total access time for cache served requests is N*H*T/S.
  • The remaining N*(1-H) requests are served from the origin in N*(1-H)*T time.
  • The total access time with a cache is the addition of the above two: N*H*T/S + N*(1-H)*T.
  • The speedup in time and throughput is the ratio of total time without a cache to the total time with a cache. That is, N*T / [N*H*T/S + N*(1-H)*T] or 1/(1 - H + H/S).

Note, as expected, as H approaches 1 (all requests served from the cache), the speedup approaches S, the cache speed ratio, and as H approaches 0, the speedup approaches 0 (no speedup).

The following code implements the speedup function with parameters H and S. These are computed below with various H and S values.

import math

def cache_speedup(hit_ratio, speed_ratio):
return 1.0/(1.0 - hit_ratio + hit_ratio / speed_ratio)

hr = [0.1, 0.3, 0.5, 0.7, 0.9]
sr = [10, 100, 1000]
print('S', 'H', 'Effective Speedup', sep='\t')
for s in sr:
for h in hr:
print(s, h, '%5.0f%%'%((cache_speedup(h, s)-1)*100),
math.ceil((cache_speedup(h, s)-1)*10)*'*', sep="\t")


S   H   Effective Speedup
10 0.1 10% *
10 0.3 37% ****
10 0.5 82% *********
10 0.7 170% ******************
10 0.9 426% *******************************************
100 0.1 11% **
100 0.3 42% *****
100 0.5 98% **********
100 0.7 226% ***********************
100 0.9 817% **********************************************************************************
1000 0.1 11% **
1000 0.3 43% *****
1000 0.5 100% **********
1000 0.7 233% ************************
1000 0.9 891% ******************************************************************************************

Thus, a cache can provide performance and throughput boost in both scenarios:

  • A local cache fronting the Aerospike database if a high hit ratio can be attained by targeting right data.
  • Aerospike Cache fronting a backend database because Aerospike provides a large cache size (hence hit ratio) and sub-miilisecond access time (hence high speed ratio).


The key takeaways are:

  • A local cache can augment performance and throughput of Aerospike database. A local cache can be expensive as it has to be deployed at each client host, but can be effective if high hit ratio is achievable with repeat access to a small subset of data.
  • Aerospike provides significant performance and throughput boost over the backend database due to its large cache size and automatic sync mechanisms(which typically transalate into a higher hit ratio), and a sub-millisecond access time.

Clean up

Remove data and close the server connection.

client.truncate(namespace, tutorial_set, 0)
# Close the connection to the Aerospike cluster
print('Removed tutorial data. Connection closed.')


Removed tutorial data. Connection closed.

Further Exploration and Resources

Explore further how Aerospike can be deployed as a cache.

Next steps

Visit Aerospike notebooks repo to run additional Aerospike notebooks. To run a different notebook, download the notebook from the repo to your local machine, and then click on File > Open, and select Upload.