Elastic’s newest query language deserves closer attention
A “new foundational technology”, says CPO Ken Exner
Elastic has a new query language, ES|QL, and a new query engine. With four billion downloads of the company’s search analytics software and 20,000+ subscription customers, it could prove transformational for teams with growing observability and security requirements.
As a search analytics provider used by over 20,000 enterprise customers, Elastic has a potent reputation for search – across use cases from capital markets to ecommerce, telecommunications to technology companies.
Its powerful search API is easy to integrate and formidably fast. It’s a recognised “Leader” in a Magic Quadrant for search (“best-in-class”; “easy-to-understand pricing”). Yet Elastic, which reported revenues of $328 million in its last quarter, does more than just search.
Elastic is also a Magic Quadrant “Visionary” for Application Performance Monitoring and Observability (“can transform, analyze, visualize and gain insights from increasingly complex, heterogeneous datasets while owning data”) – and what CISA describes as an “Advanced” provider of visibility, threat hunting, automated detection, and Security Operations Center (SOC) workflows via its increasingly popular Elastic SIEM platform.
All highly positive stuff. Yet as Chief Product Officer (CPO) Ken Exner acknowledges frankly in an interview with The Stack, this expansion of capabilities has led to some sub-optimal design choices: “When we moved into use cases where people wanted to create complex queries, like compound queries, that was all bolted on to the original search API.
“We created a query DSL, which was a JSON representation of a complex query that then fed into the Search API. Then, because customers wanted more expressiveness in their language, we started supporting several third-party query languages. We had EQL, KQL, and some limited forms of SQL, which translated down to query DSL and then into Search API.
"It worked brilliantly. But it wasn't blazingly fast,” he admits.
“If you were trying to create a query that went across metrics, and traces, and logs, you had to use different query languages… this is true not just for Elastic, but pretty much anywhere you go, like Grafana, for example: there's a query language for metrics, there's a query language for logs…”
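For readers unfamiliar with the layering Exner describes, the following is a minimal sketch of a compound query written in the JSON query DSL and built here as a Python dictionary. The `bool`, `must`, `filter`, `match`, `term` and `range` clauses are standard query DSL building blocks; the field names (`message`, `service.name`, `@timestamp`) and the helper function are hypothetical and chosen only for illustration.

```python
import json


def build_compound_query(text, service, since):
    """Build a bool query: one full-text clause plus two non-scoring filters.

    Field names are illustrative assumptions, not taken from the article.
    """
    return {
        "query": {
            "bool": {
                # Scored full-text match on the log message
                "must": [{"match": {"message": text}}],
                # Exact-value and time-range filters (no effect on scoring)
                "filter": [
                    {"term": {"service.name": service}},
                    {"range": {"@timestamp": {"gte": since}}},
                ],
            }
        }
    }


body = build_compound_query("connection refused", "checkout", "now-3h")
print(json.dumps(body, indent=2))
```

Even a simple compound condition nests several levels deep in this form, which helps explain the appetite for more expressive languages layered on top.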
New Elastic query language: Rebuilt from the ground up
Elastic CPO Ken Exner, who joined the company in 2022 after 16 years at AWS, including three years as General Manager, AWS Developer Tools, describes Elastic’s new query language and new query engine as “really good foundational technology for working across distributed datasets.”
As he explains: “[We asked ourselves if we're going to] create a really fast query language and query engine that works across different data sets from the ground up, how would that work?
"ES|QL allows you to pull different datasets, and do joins and unions on the dataset, whether it's structured or unstructured data.
“This is important: if you look at SQL, and a lot of other analyst languages, they work on structured data only. But we wanted to create something that could also work on unstructured data, like logs; to be able to pull different datasets, do aggregations on them, create new fields on the fly, be able to do maths operations, do a bunch of different capabilities across disparate datasets at blazing fast speed. So that was what we delivered!”
Exner adds: “There are a number of reasons why this matters. I’ll give you an example: If you're a security analyst, and you are doing threat hunting, you are typically pulling from various different datasets and creating complex queries like 'I am looking for all login attempts from the past three hours from a particular IP range, from accounts that were created in the last two days…’
“That example would require a collection of work across different datasets: Some might be logs, some might be metrics, some might be business systems. To be able to do this on the fly and generate the resulting set in milliseconds is what we've aimed for with ES|QL. That's why our observability and security customers have loved the new query language, and we're really excited about it!”
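As a rough illustration of the threat-hunting example, the sketch below composes part of it as an ES|QL query: a series of commands joined by pipes. It is a sketch under stated assumptions, not Elastic's reference query. The index pattern, the field names (`event.action`, `source.ip`, `user.name`) and the helper function are hypothetical, and the "accounts created in the last two days" condition is left out because it depends on how account data is brought in from a second dataset.

```python
import json


def build_login_hunt(index, cidr, hours):
    """Compose an ES|QL query string from piped processing stages.

    Index and field names are illustrative assumptions.
    """
    stages = [
        f"FROM {index}",
        # Keep only events from the last N hours
        f"WHERE @timestamp > NOW() - {hours} hours",
        # Keep login events whose source address falls inside the CIDR block
        f'WHERE event.action == "login" AND CIDR_MATCH(source.ip, "{cidr}")',
        # Aggregate: count attempts per account
        "STATS attempts = COUNT(*) BY user.name",
        "SORT attempts DESC",
        "LIMIT 20",
    ]
    return " | ".join(stages)


esql = build_login_hunt("logs-auth-*", "203.0.113.0/24", 3)

# Assumption: the query is sent as a JSON body with a single "query" string.
request_body = json.dumps({"query": esql})
print(esql)
```

Each stage consumes the table produced by the previous one, so filtering, aggregation and sorting read top to bottom in a single language rather than being nested inside one another.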
The new Elastic query language, ES|QL, was released as a technical preview in November 2023. General Availability (GA) is slated for later in Q2 2024, subject to ongoing collaborative engagement with customers.
Elastic and Generative AI
The foundational transformation that generative AI offers has huge potential for observability and security use cases. Meanwhile, much of the recent growth in Elastic’s search offering has been driven by generative AI.
The Elasticsearch Relevance Engine, or ESRE (a collection of relevance tools for developing advanced search applications using ML/AI), has been increasingly widely adopted by customers looking to combine keyword matching with semantic search and integrations with generative AI.
As a highly mature search and NoSQL database provider, Elastic is well-equipped for generative AI demands. Its vector search capabilities let customers store and index embeddings in Elasticsearch natively using the dense_vector field type and efficiently query those vector embeddings using modern kNN algorithms, like HNSW.
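A minimal sketch of what that looks like in practice: an index mapping with a dense_vector field, and a kNN search body that queries it. The field names (`embedding`, `text`), the three-dimensional vectors and the helper functions are hypothetical; real embeddings typically have hundreds of dimensions, and the exact options available depend on the Elasticsearch version in use.

```python
def build_mapping(dims):
    """Index mapping with one text field and one indexed dense_vector field."""
    return {
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,           # build an ANN index for kNN search
                    "similarity": "cosine",  # metric used to compare vectors
                },
            }
        }
    }


def build_knn_search(query_vector, dims, k=5, num_candidates=50):
    """Approximate kNN search body; rejects vectors of the wrong length."""
    if len(query_vector) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(query_vector)}")
    return {
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,                            # neighbours to return
            "num_candidates": num_candidates,  # candidates examined per shard
        }
    }


mapping = build_mapping(dims=3)
search = build_knn_search([0.12, -0.53, 0.88], dims=3)
```

The `num_candidates` setting is the usual trade-off knob in approximate search: a larger value examines more of the graph and improves recall at the cost of latency.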
As CPO Ken Exner notes: “We prompted customers to use Elasticsearch as a vector database years ago. Last year, we introduced support for bringing your own transformer models. People want to connect to an LLM – finding and passing the most relevant information under a security model to a large language model [using Retrieval Augmented Generation or RAG]; we're seeing tonnes of customers wanting to [understand] how you pass the most relevant information to an LLM to ground it, to give it context. So if a customer has already indexed their data, and used Elastic for lexical search or text-based search, it should be push-button simple for them to turn on semantic search; push-button simple to turn on RAG.
“That’s also a big focus for us,” he says.
RAG worries many CIOs, who must work out how to ensure that sensitive proprietary data is not inadvertently exposed.
As Exner notes: “Because we've been doing search for so long, we have a highly advanced maturity model; we have really robust permission systems with role-based access control and document-level permission.”
The Elastic CPO adds: “Lots of native vector databases just try to index and query everything; they don't have the mature built-in permission systems. This is one of Elastic's core advantages: We’ve been doing secure search for so long... Finding the most relevant documents and applying the security at query time, at search time is central to what we do.”
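The idea of applying security at query time can be illustrated with a toy, in-memory sketch: documents the user is not permitted to see are filtered out before relevance ranking, so they can never reach the prompt passed to an LLM. This is not Elastic's implementation; the documents, roles, two-dimensional vectors and function names are all hypothetical.

```python
import math

# Toy corpus: each document carries the roles allowed to read it.
DOCS = [
    {"id": "hr-1", "text": "Salary bands for 2024", "roles": {"hr"}, "vec": [0.9, 0.1]},
    {"id": "kb-1", "text": "How to reset a password", "roles": {"hr", "support"}, "vec": [0.1, 0.9]},
    {"id": "kb-2", "text": "VPN setup guide", "roles": {"support"}, "vec": [0.2, 0.8]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def retrieve(query_vec, user_roles, k=2):
    """Filter by permission first, then rank what remains by similarity."""
    visible = [d for d in DOCS if d["roles"] & user_roles]
    ranked = sorted(visible, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:k]


def build_prompt(question, docs):
    """Ground the question in the retrieved, permitted documents only."""
    context = "\n".join(f"- {d['text']}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


hits = retrieve([0.1, 0.9], user_roles={"support"})
prompt = build_prompt("How do I reset my password?", hits)

# A query vector closest to the HR document still cannot surface it
# for a user who lacks the "hr" role.
blocked = retrieve([0.9, 0.1], user_roles={"support"})
```

Filtering before ranking is the key design choice here: a document excluded by permissions cannot leak into the context however relevant it is to the query.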
Delivered in partnership with Elastic.