Data Modeling is the exercise of mapping application objects onto the model and mechanisms provided by the database for persistence, performance, consistency, and ease of access.
Aerospike Database is purpose built for applications that require predictable sub-millisecond access to billions and trillions of objects and need to store many terabytes and petabytes of data, while keeping the cluster size - and therefore the operational costs - small. The goals of large data size and small cluster size mean the capacity of high-speed data storage on each node must be high.
Aerospike pioneered the database technology to effectively use SSDs to provide high-capacity high-speed persistent storage per node. Among its key innovations are that Aerospike:
- Accesses SSDs like direct addressable memory which results in superior performance,
- Supports a hybrid memory architecture for index and data in DRAM, PMEM, or SSD,
- Implements proprietary algorithms for consistent, resilient, and scalable storage across cluster nodes, and
- Provides Smart Client for a single-hop access to data while adapting to the changes in the cluster.
Therefore, choosing Aerospike Database as the data store is a significant step toward enabling your application for speed at scale. By choosing the Aerospike Database today, it is possible for a company of any size to leverage large amounts of data to solve real-time business problems and continue to scale in the future while keeping the operational costs low.
Data design should take into account many capabilities that Aerospike provides toward speed-at-scale such as data compression, Collection Data Types (CDTs), secondary indexes, multi-op requests, batch requests, server-side operations, cluster organization, and more. We discuss them later in this post.
NoSQL Data Modeling Principles
Aerospike is a NoSQL database, and does not have a rigid schema as required by relational databases. To enable web-scale applications, Aerospike has a distributed architecture, and allows applications to choose availability or consistency during a network partition per the CAP theorem.
Typically, NoSQL data modeling starts with identifying the patterns of access in the application, that is, how the application reads and updates the data. The goal is to organize data for the required performance, efficiency, and consistency. In some NoSQL databases, design of keys, which serve as handles for access, is an important consideration for collocating them using a common property value. More on this later.
Many data modeling principles that are prevalent in NoSQL databases also apply in Aerospike, including the use of:
Denormalization: Allowing duplication of data, by storing it in multiple places, to simplify and optimize access.
Aggregates: Storing nested entities together in embedded form to simplify and optimize access.
Application joins: Performing joins in the application in rare cases when they are required, for example, to follow the stored references in many-to-many relationships.
Single record transactions: Storing data that must be updated atomically together in one record.
Modeling Object Relationships
Related objects can be modeled either by holding a reference to the objects or by embedding the objects. The choice involves trade-offs in ease, performance, and consistency, and is governed by two key factors: 1) the cardinality of the relationship (1:1, 1:N, or M:N), and 2) access patterns, as described below. Data modeling requires balancing conflicting goals: for example, embedding related objects makes reads easier and faster, but embedding the same object in multiple places can hurt update performance and consistency.
The following factors will dictate whether to embed or to reference an object:
Shared or exclusive relationship
Exclusive relationships 1:1 or 1:N should be embedded. For example, these 1:1 relationships should be stored together: owner and car, citizen and passport, family and residence; and so should these 1:N relationships: account and transactions, person and properties, and company and brands.
Shared objects with M:N relationships should be stored independently. For example, students and courses, tourists and destinations, and donors and charities.
Being accessed together
If the embedded objects in a 1:1 or 1:N relationship and the parent object are accessed and updated independently of each other, they are candidates for storing separately. For example, owner and car, or person and accounts, can have different operations and access patterns. Aggregates are often not optimal when embedded objects are updated frequently and independently. For example, a user and the user's sent or received message folders have very different update patterns.
If an M:N shared object does not change and also is accessed together with the referring object, it should be embedded with the referring object. For example, travelers and favorite destinations, students and completed courses.
The application may be able to tolerate temporary inconsistency in a shared object. If an M:N shared object is accessed together with the referring object, updated infrequently, and may remain slightly out-of-date while all its embedded copies are being updated, it is a candidate for embedding. For example, students and current course instructors.
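The guidelines above can be condensed into a small decision helper. The following is a minimal sketch in Python, not an Aerospike API: the `Relationship` fields and the `placement` function are hypothetical names introduced here, and the three update-rate labels are an assumed simplification of the access patterns just described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    shared: bool                  # True for M:N, False for exclusive 1:1 or 1:N
    accessed_together: bool       # read/updated along with the parent or referrer
    update_rate: str = "never"    # assumed labels: "never", "rare", "frequent"
    tolerates_stale: bool = False # copies may briefly lag behind each other


def placement(rel: Relationship) -> str:
    """Return "embed" or "reference" following the guidelines in the text."""
    if not rel.shared:
        # Exclusive objects are embedded unless they live independent lives.
        return "embed" if rel.accessed_together else "reference"
    if not rel.accessed_together:
        return "reference"
    if rel.update_rate == "never":
        return "embed"
    if rel.update_rate == "rare" and rel.tolerates_stale:
        return "embed"
    return "reference"
```

For instance, students and completed courses would be described as `Relationship(shared=True, accessed_together=True)` and come out as "embed", while students and courses in general (shared, not always read together) come out as "reference". Real designs weigh more factors, such as record size, than this sketch does.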
Beyond the standard NoSQL modeling techniques and guidelines, data modeling in Aerospike involves additional considerations as discussed below.
Aerospike is a record-oriented store. It's easy to view a key-value store as a special case of the record-oriented store, where a record holds just one (nameless) field.
In the Aerospike data model, a record is a schema-less list of bins (fields), which means a record can hold a variable number of arbitrary bins. A bin is type-less, which means it can hold a value of any supported type. Aerospike supports scalar types like Integer, String, Boolean, and Double; Collection Data Types (CDTs) like List and Map; and special data types like Blob (bytes), GeoJSON, and HyperLogLog (HLL).
Records are created in a namespace. A record is typically assigned to a set (similar to a table) within the namespace. A database can have multiple namespaces, and each namespace has dedicated storage devices and policy for how indexes and data are stored, for example, hybrid DRAM and flash, all flash, and so on.
Aerospike Collection Data Types provide an efficient way to store hierarchical objects, including JSON documents. Application objects can be stored in multiple ways in Aerospike:
As a record: Object fields are stored in record bins
As a Map: Object fields are stored as key-value pairs.
As a List: Object field values are stored in the List in a specific order.
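To make the three options concrete, the sketch below shapes one application object each way using plain Python values. The `user` object, the bin name `user`, and `FIELD_ORDER` are illustrative assumptions; writing these values through a client library is not shown.

```python
# A hypothetical application object.
user = {"id": 42, "name": "Ada", "city": "London"}

# 1. As a record: each object field becomes its own bin.
as_record_bins = dict(user)

# 2. As a Map: the whole object sits in one bin as key-value pairs.
as_map_bins = {"user": dict(user)}

# 3. As a List: only the values are stored, in an agreed field order.
FIELD_ORDER = ("id", "name", "city")
as_list_bins = {"user": [user[field] for field in FIELD_ORDER]}


def from_list(values):
    """Rebuild the object from its List form using the agreed order."""
    return dict(zip(FIELD_ORDER, values))
```

The List form is the most compact because field names are not stored, but the reader and the writer must agree on the field order, which is why `from_list` depends on `FIELD_ORDER`.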
We will defer the discussion of CDTs to a future post.
Design of Record Keys
Records are accessed with a unique key. The Aerospike record key (or digest) is a hash of the tuple (set, user-key) and is unique within a namespace, where user-key is an application-provided id.
Unlike some other databases, Aerospike does not provide a way to place records on the same node for locality of access through complex key design schemes. Aerospike uniformly distributes records across nodes for load balancing, optimal resource utilization, and performance, and so no effort need be spent in designing keys for collocation.
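The effect of hashing the (set, user-key) tuple can be illustrated with a short sketch. It is not the client's real digest routine: BLAKE2b from Python's standard library stands in for Aerospike's own hash function, and the byte layout of the hashed tuple is an assumption made for illustration. The 20-byte digest length and the 4096 partitions per namespace match Aerospike; the way digest bits select a partition is simplified here.

```python
import hashlib

N_PARTITIONS = 4096  # Aerospike divides each namespace into 4096 partitions


def digest(set_name: str, user_key: str) -> bytes:
    """Stand-in 20-byte digest of the (set, user-key) tuple."""
    # Assumed layout: set name, a zero separator byte, then the user key.
    payload = set_name.encode("utf-8") + b"\x00" + user_key.encode("utf-8")
    return hashlib.blake2b(payload, digest_size=20).digest()


def partition_id(record_digest: bytes) -> int:
    """Map a digest to a partition (simplified: low 12 bits of two bytes)."""
    return int.from_bytes(record_digest[:2], "little") % N_PARTITIONS
```

Because the partition depends only on the hash output, neighboring user-keys such as `user-1` and `user-2` end up on unrelated partitions, which is why key naming cannot be used to collocate records.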
At the same time, it is possible for the application to compose the key to quickly access related objects as described below.
Modeling Related Objects
There are multiple ways in which related objects can be organized:
Sets provide a mechanism to keep records organized by some criterion, such as type of objects, metadata vs data, a logical mapping, and so on.
Related objects can be held in CDTs either in one record that has the group-id as its key, or multiple records whose keys are generated by appending sub-group ids to the group-id. For example, ticker for a stock can be organized in records by stock (group id) + date (sub-group id).
A List can be used to store a group of related objects as a List of Lists or a List of Maps.
A Map can also be used as a Map of Lists or a Map of Maps.
CDTs provide many advantages such as greater density of objects per node by reducing the per-record system overhead, powerful server-side operations, as well as element ordering. We will cover use of CDTs in a future post.
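A sketch of the stock ticker example follows, assuming a hypothetical user-key format of `<symbol>:<ISO date>` and plain Python containers in place of a client library. Each record holds one day's ticks for one stock as a List of Maps.

```python
from datetime import date, datetime, timedelta


def ticker_key(symbol: str, day: date) -> str:
    """Compose a user-key from the group id (stock) and sub-group id (date)."""
    return f"{symbol}:{day.isoformat()}"


def keys_for_range(symbol: str, start: date, end: date) -> list:
    """User-keys for every day from start to end, inclusive."""
    days = (end - start).days
    return [ticker_key(symbol, start + timedelta(days=i)) for i in range(days + 1)]


def group_ticks(symbol: str, ticks: list) -> dict:
    """Group ticks into per-day records; each value is a List of Maps."""
    records = {}
    for tick in ticks:
        when: datetime = tick["time"]
        entry = {"time": when.isoformat(), "price": tick["price"]}
        records.setdefault(ticker_key(symbol, when.date()), []).append(entry)
    return records
```

Since the application can compute every key in a date range with `keys_for_range`, it can fetch a week of ticks for one stock by key without any index lookup.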
Understanding Transactions and Consistency
It is important to understand transactions in Aerospike to ensure data consistency and correctness. Aerospike allows a namespace to be configured for Availability or Strong Consistency (SC). Multiple read consistency levels are possible in the SC mode for the application to strike the right balance of performance and level of consistency.
In the SC mode, Aerospike provides transactional guarantees for single record operations. This includes multiple operations on a single record that can be performed in a single request. Therefore, data that needs to be updated atomically must be stored in one record. CDTs provide an easy way to store such objects in one record.
While transactional updates are currently not available across multiple records, delayed consistency across multiple records can be achieved through known schemes.
Managing Temporary Objects
Aerospike has useful mechanisms that should be leveraged to manage objects with a defined lifespan. Such records can be marked with an expiration time (or time-to-live, TTL; the default is no expiration). Expired objects are automatically removed and their space recovered through garbage collection. This mechanism provides a convenient and efficient way for the application to manage its temporary objects that have a specific lifetime and must be removed after that.
If data needs to be archived based on some age criterion to another location, sets and secondary indexes can be used to efficiently identify the records to archive.
A namespace’s index and data can be placed in different storage types with different speed, size, and cost characteristics such as DRAM, PMEM, and SSD. Applications can allocate data to different namespaces depending on the speed and size needs for different objects.
The “data in index” option is available for high-speed counters. It stores a single numerical value, typically a counter, that is updated at high frequency and for which access speed as well as consistency are critical. For example, it is important to accurately read and update the number of seats available for a popular event when the tickets go on sale to avoid under-booking or over-booking. Similarly, objects that need fast access can be stored in a PMEM or fast SSD namespace, and a large low-cost “all-flash” namespace can store objects with less stringent access latency requirements.
It is also possible to split an object across multiple namespaces using the same set and user-key, and therefore the same record digest, which serves as an implicit reference. For example, one namespace may hold archived versions, and another the latest version.
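As a sketch of this implicit reference, the helper below builds one key per namespace from the same set and user-key. The namespace names `latest` and `archive`, the set name `documents`, and the helper itself are hypothetical; the (namespace, set, user-key) tuple form follows the key convention of the Aerospike Python client.

```python
def keys_across_namespaces(set_name, user_key, namespaces=("latest", "archive")):
    """Return {namespace: (namespace, set, user_key)} for one logical object."""
    return {ns: (ns, set_name, user_key) for ns in namespaces}


keys = keys_across_namespaces("documents", "doc-1001")
```

No cross-reference needs to be stored in either record: knowing the set and user-key of the latest version is enough to address its archived counterpart.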
Other namespace configuration options significant for data modeling decisions include maximum record size and choice of Availability versus Strong Consistency.
Maximum Record Size
A namespace is configured with a maximum record size, up to an upper limit of 8MB, which represents the unit of transfer in a device IO operation. Record data cannot exceed the configured maximum record size, which is an important consideration in designing large-object and multi-object records. The application design may consider workarounds such as a multi-record map.
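One possible shape of the multi-record map workaround is sketched below. It is an assumption about how such a scheme could look, not a built-in Aerospike feature: map entries are spread over a fixed number of records whose user-keys are `<group id>:<bucket number>`, and SHA-256 is used only to pick a stable bucket for each map key. The function names and the default of 16 buckets are illustrative.

```python
import hashlib


def bucket_key(group_id: str, map_key: str, buckets: int = 16) -> str:
    """User-key of the record that holds `map_key` for this group."""
    # A stable hash is needed; Python's built-in hash() varies per process.
    h = hashlib.sha256(map_key.encode("utf-8")).digest()
    return f"{group_id}:{int.from_bytes(h[:8], 'big') % buckets}"


def split_map(group_id: str, big_map: dict, buckets: int = 16) -> dict:
    """Split one large map into {record user-key: smaller map}."""
    records = {}
    for k, v in big_map.items():
        records.setdefault(bucket_key(group_id, k, buckets), {})[k] = v
    return records


def lookup(records: dict, group_id: str, map_key: str, buckets: int = 16):
    """Find one entry by recomputing which record holds it."""
    return records[bucket_key(group_id, map_key, buckets)][map_key]
```

A single entry can still be reached with one record read, because the bucket is computed from the map key. The bucket count should be chosen so that each record stays comfortably below the configured maximum record size.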
Using compression can significantly compact data and reduce the data storage requirements, thus increasing the data density per node, reducing the cluster size, and lowering the cost.
In addition to improving storage density, compression can also improve wire transfer speed for large objects. Compression can be enabled for efficient client-server data transfer.
To achieve optimal performance, many mechanisms are available in Aerospike including the following.
Scan operations use primary indexes on namespace and sets, whereas query operations use secondary indexes. Secondary indexes can be created on a bin’s Integer, String, and GeoJSON values. Secondary indexes improve query performance, but have a cost of keeping the index in sync when the underlying data is updated. Typically, a secondary index on a field works best for high query/update ratio and high selectivity of the index field.
In Aerospike Database 6.0+, the application can boost access throughput with hyper-parallel “partition-grained” secondary index queries, in addition to primary index queries from prior releases.
Prior to Aerospike Database 6.0, “read” or “exists” operations on multiple records could be batched in a single request for efficiency and speed. In 6.0, batch operations for write, delete, and UDF operations are also supported. Fast ingests, for example, for IoT streams, can get better throughput with batch writes.
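On the application side, batching usually starts with grouping keys into fixed-size chunks. The helper below is a minimal sketch of that grouping step only; the name `batches` and the chunk size are assumptions, and the client call that would send each chunk as one batch request is deliberately left out.

```python
from itertools import islice


def batches(keys, size):
    """Yield successive lists of at most `size` keys."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(keys)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

Each yielded chunk would then be passed to the client's batch read or batch write call, so that many records travel in one request instead of one request per record.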
Expressions and UDFs allow complex operations to be performed on the server, without having to retrieve data to the client first.
Expressions: Expressions offer a powerful way to define complex logic for server-side evaluation - either to filter records, or to retrieve data and store results.
UDF: User Defined Functions (UDFs) are defined in Lua for record- and stream-oriented computations. They are invoked through a client request, and executed on the server.
Sets and Set Indexes
Related records can be organized in sets. To enable fast scans on a set, a set index can be defined. A set index can provide a big performance boost to small set scans as compared to the alternative of having to scan the entire namespace in the absence of a set index.
It is also a lot more efficient to truncate a set as opposed to deleting individual records when the data is no longer needed, and therefore such deletion cohorts may be organized in sets.
Additional Data Design Considerations
In addition to the data modeling aspects described above, there are Aerospike cluster design aspects that overlap with data design, and affect application performance, reliability, and ease of development. They are briefly described below.
Replication for Reliability and Performance
An Aerospike cluster holds a Replication Factor (RF) number of copies of data for reliability and performance. An RF of 2 is typical, and it can be 3 for higher resilience, but a larger RF adversely impacts both speed and scale.
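The storage cost of a larger RF is easy to estimate. The sketch below is back-of-the-envelope arithmetic under simplifying assumptions: data is spread evenly across nodes, and index space, per-record overhead, and free-space headroom are ignored. The function name and the sample numbers are illustrative.

```python
def storage_per_node_tb(unique_data_tb: float, replication_factor: int, nodes: int) -> float:
    """Approximate data stored on each node, in TB, for a given RF."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if nodes < replication_factor:
        # Simplifying check: each copy is assumed to live on a different node.
        raise ValueError("need at least as many nodes as copies")
    return unique_data_tb * replication_factor / nodes
```

With 100 TB of unique data on 10 nodes, each node holds about 20 TB at RF 2 and about 30 TB at RF 3, so moving from 2 to 3 copies raises the per-node storage requirement by half.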
Synchronous and Asynchronous Replication
An important design decision is whether the data is held in one tightly synchronized cluster spanning multiple sites or racks, or in multiple clusters loosely synchronized through XDR (Cross Datacenter Replication). The decision depends on the application's need for consistency, site autonomy, and data regulation requirements.
For fast local reads and availability, data is replicated in a rack aware fashion where all sites are similar and each site holds its own copy of the entire data.
The client connects directly to all server nodes, and there is no coordinator node to coordinate processing. As a result, operations like sorting and aggregation involve client-side processing.
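A minimal sketch of such client-side processing follows. It assumes the application has already collected one list of result records per node; the record shape, the field name `score`, and the helper names are hypothetical, and only the Python standard library is used.

```python
import heapq
from itertools import chain


def top_n(per_node_results, n, field):
    """Return the n records with the largest `field` across all nodes."""
    all_records = chain.from_iterable(per_node_results)
    return heapq.nlargest(n, all_records, key=lambda rec: rec[field])


def total(per_node_results, field):
    """Sum `field` over the records returned by every node."""
    return sum(rec[field] for rec in chain.from_iterable(per_node_results))


node_a = [{"user": "u1", "score": 10}, {"user": "u4", "score": 40}]
node_b = [{"user": "u2", "score": 20}, {"user": "u3", "score": 30}]
```

Here `top_n([node_a, node_b], 2, "score")` returns the records for `u4` and `u3`, and `total([node_a, node_b], "score")` returns 100. Using `heapq.nlargest` avoids sorting the full combined result when only the first few records are needed.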
Data modeling is the exercise of mapping application objects and access patterns onto the database’s native data model and mechanisms for optimal performance, efficiency, and consistency. Aerospike Database is purpose built for speed at scale, and provides a path for companies of any size to leverage large data for real-time decisions without incurring huge operational cost, and to scale in the future. This blog post described data modeling considerations when designing speed-at-scale applications with the Aerospike Database. In a future post, we will describe how CDTs can be used for data modeling.