Chief Data Officers, heads-up: Astra Streaming is a data gamechanger
"There's a lot of valuable data in these messages and today that's just evaporating off..."
SPONSORED -- Chief Data Officers continue to struggle with blind spots in their data ecosystem.
“They've often got a good handle on relational data in their databases, they've got data warehousing technologies to get that into a cohesive view. But they're really missing a lot of the contextual information about actions that are taking place across their estate,” Chris Latimer tells The Stack.
This issue is exacerbated by legacy data and application plumbing based on message queuing rather than streaming. Put simply, both architectures pipe data to applications so that they can act on it. Queuing furnishes a sequential list of data blocks to be processed, whereas streaming lets users apply complex operations to multiple input streams of data at the same time; it’s richer, faster and more sophisticated.
While message queue-based architectures destroy messages after they are used (to make way for new ones), stream-based systems retain them for real-time or retrospective analysis.
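To make that difference concrete, here is a minimal Python sketch. It is not taken from Pulsar or any real broker; the class names `MessageQueue` and `StreamLog` and the event names are invented for illustration. It contrasts a destructive queue read with a retained, replayable log.

```python
from collections import deque


class MessageQueue:
    """Classic queue: a message is gone once a consumer has taken it."""

    def __init__(self):
        self._items = deque()

    def send(self, message):
        self._items.append(message)

    def receive(self):
        # Destructive read: the message is removed from the broker.
        return self._items.popleft() if self._items else None


class StreamLog:
    """Stream: an append-only log; each reader tracks its own position."""

    def __init__(self):
        self._log = []
        self._offsets = {}

    def append(self, message):
        self._log.append(message)

    def read(self, reader):
        # Non-destructive read: only this reader's offset moves forward.
        offset = self._offsets.get(reader, 0)
        if offset >= len(self._log):
            return None
        self._offsets[reader] = offset + 1
        return self._log[offset]

    def replay(self, start=0):
        # Retrospective analysis: re-read history from any offset.
        return list(self._log[start:])


events = ["order-created", "order-paid", "order-shipped"]

queue, stream = MessageQueue(), StreamLog()
for event in events:
    queue.send(event)
    stream.append(event)

queue_seen = [queue.receive() for _ in events]
stream_seen = [stream.read("billing") for _ in events]

queue_left = queue.receive()                 # nothing left to analyse later
stream_history = stream.replay()             # full history is still there
analytics_first = stream.read("analytics")   # a new reader starts at the beginning
```

Once the queue has been drained its contents are gone, while the log can still serve a second reader or a full replay.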
Rip-and-replace of queuing-based applications is not the answer
“Historically, when people have looked at messaging, they've thought of it as an enterprise application integration concern, i.e. ‘I just want to sort of exchange a message between two systems asynchronously’,” says Latimer, VP of Product at open data stack specialist DataStax. “What they're realising now is there's a lot of valuable data in these messages -- and today that's just evaporating off, because most of these systems are ephemeral,” he notes, adding emphatically: “We're now seeing people leverage the data that's trapped within these sclerotic message queue (MQ)-based systems, by moving to streaming in general”.
DataStax’s Chris Latimer was speaking after the company launched Astra Streaming – a managed streaming service built on Apache Pulsar that provides streaming, pub/sub and queuing capabilities with low latency and the ability to operate all the way from developer sandbox through to massive enterprise scale.
DataStax works with some of the biggest companies out there, supporting mission-critical applications and architectures, and Astra Streaming has spent a year in beta undergoing rigorous stress-testing.
But you don’t need to be building new streaming applications or architectures from scratch to use it…
The Frankenstack conundrum: Enter Starlight
Every CIO, CTO, CDO knows through painful experience – infrastructure and operations professionals know it viscerally too – that most businesses run on heterogeneous technology stacks loosely plumbed together after years of acquisition and rounds of procurement by often siloed IT teams; every stack is a Frankenstack.
For those seeking to use streaming – use cases can be as diverse as the underpinnings of equities trading, banking fraud analytics, multimedia content, advertising placement, industrial equipment monitoring, or simple newsletter subscription alerts – legacy technologies can be a particular drawback. Amidst this world of legacy data plumbing, Astra Streaming aims to offer a “really fast route to modernisation”.
As DataStax notes, many organisations rely on traditional message brokers like RabbitMQ, an open-source message queueing platform that is struggling to keep up with today’s scale and performance requirements. Creaking deployments of JMS (the standard messaging API for passing data between often legacy Java application components) and Kafka are also widespread.
Astra Streaming comes with JMS, Kafka, RabbitMQ integrations...
Cleverly, DataStax’s Astra Streaming uses its “Starlight” framework to let users migrate existing JMS, Kafka and RabbitMQ applications and services to Pulsar without modifying their code. As Chris Latimer notes: “This allows us to support things like JMS, Kafka and RabbitMQ all on one platform.
"If you have an ageing legacy messaging platform or deployment, and you want to be able to modernise that, you don't necessarily want to sign up for a big multi-year rip-and-replace of hundreds of applications that are using JMS.
“But at the same time, you also are going to be struggling because those systems aren't keeping up anymore. And honestly, infrastructure and operations are feeling a lot of pain from legacy messaging platforms that are not keeping up.
“They’re getting yelled at when there are outages, when things are breaking or not performing, and criticised for being bottlenecks, for slowing down development. With Astra Streaming you can essentially point your JMS applications at Astra and you can have a drop-in replacement that's going to give you all of the performance and throughput and cloud-native, Kubernetes-friendly architecture that comes with Pulsar.”
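The general idea behind a drop-in replacement can be sketched in a few lines of Python. This is an illustration of the adapter pattern only, not a description of how Starlight is implemented; every class and function name below is hypothetical. The point is that `existing_application` is written against the old queue-style API and runs unchanged against the new backend.

```python
class LegacyQueueBroker:
    """Hypothetical old broker exposing a queue-style API."""

    def __init__(self):
        self._queues = {}

    def send(self, queue, body):
        self._queues.setdefault(queue, []).append(body)

    def receive(self, queue):
        items = self._queues.get(queue, [])
        return items.pop(0) if items else None


class StreamingBackend:
    """Hypothetical streaming platform: append-only topics, no deletes."""

    def __init__(self):
        self.topics = {}

    def append(self, topic, body):
        self.topics.setdefault(topic, []).append(body)


class QueueCompatibleAdapter:
    """Offers the legacy send/receive API on top of the streaming backend."""

    def __init__(self, backend):
        self._backend = backend
        self._cursor = {}  # per-queue read position

    def send(self, queue, body):
        self._backend.append(queue, body)

    def receive(self, queue):
        log = self._backend.topics.get(queue, [])
        position = self._cursor.get(queue, 0)
        if position >= len(log):
            return None
        self._cursor[queue] = position + 1
        return log[position]


def existing_application(broker):
    """Application code written against the legacy API; never edited."""
    broker.send("payments", "invoice-42")
    return broker.receive("payments")


backend = StreamingBackend()
before = existing_application(LegacyQueueBroker())
after = existing_application(QueueCompatibleAdapter(backend))
```

The application sees the same result either way, but with the adapter the message also remains in the backend's topic for later analysis.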
A side note: unlike Kafka, Apache Pulsar delegates “persistence”, or data storage, to a separate low-latency storage service designed for real-time workloads (Apache BookKeeper). Its “brokers” are stateless: they are not responsible for storing messages on their local disk, so they can be spun up or wound down flexibly, and compute and storage can be scaled separately. With Kafka, adding more compute typically means extending storage too.
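A toy Python model shows why stateless brokers make this possible. It is a sketch under simplifying assumptions, not Pulsar's actual design: `StorageLayer` and `StatelessBroker` are invented names, and the "nodes" counter merely stands in for storage capacity.

```python
class StorageLayer:
    """Stands in for the separate storage service; it owns every message."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._topics = {}

    def append(self, topic, message):
        self._topics.setdefault(topic, []).append(message)

    def read(self, topic):
        return list(self._topics.get(topic, []))


class StatelessBroker:
    """Serves clients but keeps no messages of its own."""

    def __init__(self, name, storage):
        self.name = name
        self._storage = storage

    def publish(self, topic, message):
        self._storage.append(topic, message)

    def fetch(self, topic):
        return self._storage.read(topic)


storage = StorageLayer(nodes=3)
brokers = [StatelessBroker("broker-0", storage)]
brokers[0].publish("trades", "buy 10 ACME")

# Scale compute up: add a broker. No data is copied, storage is untouched.
brokers.append(StatelessBroker("broker-1", storage))
seen_by_new_broker = brokers[1].fetch("trades")

# Scale compute down: dropping a broker loses nothing.
brokers.pop(0)
still_available = brokers[0].fetch("trades")

# Scale storage on its own, without changing the broker count.
storage.nodes += 2
```

Because a broker holds no messages, adding or removing one never requires moving data, and storage can grow on its own schedule.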
As Astra Streaming customer Michael Smith, Director of Engineering at investment platform Commonstock, tells The Stack, he found “a lot of things that Kafka didn’t do out of the box that we’d have to think about... [Unlike Kafka] in Pulsar I had out-of-the-box encryption between the client and storage.
“When we looked into Pulsar the immediate thing that caught my attention though was the ability to scale brokers and storage independently – as well as tiered storage where your old data gets shuffled off to S3 but it’s still available through the same API; so seamless.
“We found out about Astra Streaming as a managed solution and we’re like, ‘yeah, that’s perfect’. Getting it up and running was super straightforward. It didn’t take a ton of developer time – because that’s the most important thing; the mantra really is ship or die – and they really went above and beyond to accommodate us, you know, VPC peering, posting it in a region that was flexible for us; they helped us out immensely with a cluster migration – the customer service aspect of it was super important.”
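The tiered storage behaviour Smith describes, with old data moved to cheaper storage but read through the same API, can be illustrated with a short Python sketch. It assumes a simple size-based offload rule and uses a plain dictionary in place of an S3 bucket; `TieredTopic` and its parameters are hypothetical and do not reflect Pulsar's real offload configuration.

```python
class TieredTopic:
    """Keeps recent messages in a hot tier and offloads older ones."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = {}    # offset -> message, stands in for fast storage
        self.cold = {}   # offset -> message, stands in for an S3 bucket
        self._next_offset = 0

    def append(self, message):
        offset = self._next_offset
        self.hot[offset] = message
        self._next_offset += 1
        self._offload()
        return offset

    def _offload(self):
        # Move the oldest messages to the cold tier once the hot tier is full.
        while len(self.hot) > self.hot_capacity:
            oldest = min(self.hot)
            self.cold[oldest] = self.hot.pop(oldest)

    def read(self, offset):
        # Same call for the caller, whichever tier holds the message.
        if offset in self.hot:
            return self.hot[offset]
        return self.cold[offset]


topic = TieredTopic(hot_capacity=2)
for i in range(5):
    topic.append(f"event-{i}")

oldest = topic.read(0)   # served from the cold tier
newest = topic.read(4)   # served from the hot tier
```

The caller uses `read` in both cases and never needs to know which tier answered.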
DataStax’s Chris Latimer is rightly proud of that focus on customer support and indeed compatibility.
As he puts it: “When we say we have compatibility we take it personally. There have been plenty of examples in the past of companies that claim compatibility and when you pull back the covers a little bit it's not really there. When we say we have a drop-in replacement we mean it; we can stand behind it.
“With our ‘Starlight for JMS’ for example we can push that up to around a million messages per second on Astra Streaming – which compared to what most enterprises are using JMS for is going to be more than enough,” he says, hastening to add that “we can push beyond that: it was just our goal to get it up to a million per second with single-digit latency. So you're talking about a drastic performance improvement for anyone that's using, you know, these legacy messaging platforms.
“In all, Astra Streaming is just a really fast lane to modernisation.”
Sponsored by DataStax