In this article we focus on establishing connectivity between two Aerospike clusters. The goal is to use Aerospike's Cross Datacenter Replication (XDR) feature to seamlessly send data from a source cluster to a destination cluster. The source cluster needs network visibility of all Aerospike service ports in the remote cluster, which can present problems, particularly in a Kubernetes environment. Placing a proxy server in front of the private Kubernetes destination cluster overcomes this problem. To demonstrate the solution, we start by installing the Kubernetes Operator that will schedule our source and destination databases. In this example, we set up replication in one direction only. Aerospike is also capable of 'master/master' replication and provides a conflict resolution mechanism in the event of update clashes. This, too, could be supported using the XDR proxy.
14 posts tagged with "aerospike"
Parallelism with Fine-Grained Streams (Part 2)
Source: Photo by Clem Onojeghuo on Unsplash
While it is possible to process a data set using a large number of parallel streams, a higher degree of parallelism may not necessarily be optimal, or even possible. This article explores how to think about parallelism and discusses the bottlenecks that limit the achievable level of parallelism. It also highlights the need to take measurements in the target setup, because many of the factors involved cannot be easily quantified.
Processing Large Data Sets in Fine-Grained Parallel Streams
Source: Photo by Dan Gold on Unsplash
Aerospike provides several mechanisms for accessing large data sets over parallel streams to match worker throughput in parallel computations. This article explains the key mechanisms, and describes specific schemes for defining data splits and a framework for testing them.
Block and Filesystem side-by-side with K8s and Aerospike
In this article we focus on side-by-side block and filesystem requests using Kubernetes. The motivation is that this allows us to deploy Aerospike in its All-Flash mode.
Storing database values on a raw block device with index information on file can bring significant cost reductions, especially when considering use cases for Aerospike’s All-Flash. To support such a workload, you can configure Aerospike to use NVMe Flash drives as the primary index store.
The REST Gateway 2.0 Gets a Re-Model (formerly, the REST Client)
Source: Photo by benjamin lehman on Unsplash
The REST Gateway provides you with a well-known interface to your Aerospike Database, and a Swiss Army knife-like solution to a variety of architectural problems where you might not want to integrate a full-fledged Aerospike client into every application.
Accelerating SQL on Aerospike
Source: Photo by Julian Hochgesang on Unsplash
Aerospike Database is deployed by large-scale real-time applications in a wide range of verticals. Businesses need “as it happens” visibility into these systems, sometimes in near real time, via notifications, ad-hoc queries, dashboards, and reports.
SQL is broadly used as a data access language for analytics, and Trino provides a powerful engine for SQL access to multiple data sources. The Aerospike Trino Connector enables SQL access to Aerospike data through Trino and, more broadly, allows Aerospike to expand the data available for fast analytics from Trino.
Aerospike k8s Volume Cleanup
When using the Aerospike Kubernetes Operator, the complexity of configuring a high-performance distributed database is abstracted away, making the task of instantiating an Aerospike database incredibly easy. However, even though Kubernetes leads us to expect equivalent results regardless of platform, we need to be mindful of the peculiarities of individual frameworks, particularly if we are repeatedly iterating processes. This article focuses on AWS EKS-provisioned storage, which is created dynamically when using the Aerospike Kubernetes Operator. Ensuring that this storage has been fully deleted and that other redundant resources have been removed is a necessary housekeeping step if you are to avoid unwelcome AWS charges.
Of Queries and Indexes
Source: Photo by Jan Antonin Kolar on Unsplash
Queries, scans, indexes, pagination, and parallelism are common concepts in databases, but each database differs in specifics. It is vital to understand the specifics in order to get the most out of a database. In Aerospike, queries and indexes play a key role in realizing its speed-at-scale objective. The goal of this post is to help developers better understand the Aerospike capabilities in these areas.
Building Large-Scale Real-Time JSON Applications
Source: Photo by Wilhelm Gunkel on Unsplash
“Real-time describes various operations or processes that respond to inputs reliably within a specified time interval (Wikipedia).”
Real-time data must be processed soon after it is generated; otherwise, its value is diminished. Likewise, real-time applications must respond within a tight timeframe; otherwise, the user experience and business results are impaired. It is critical for real-time applications to have reliably fast access to all data, real-time or otherwise.
Query JSON Documents Faster (and More) with New CDT Indexing
Source: Photo by Cameron Ballard on Unsplash
The Collection Data Types (CDTs) in Aerospike are List and Map. They offer powerful capabilities to model and access your data for speed-at-scale. A major use of the CDTs is to store and process JSON documents efficiently. In the recent Aerospike Database 6.1 release, secondary index capabilities over the CDTs were enhanced, making the CDTs even more useful and powerful for JSON documents as well as for other uses.