
OpenTelemetry promises run-time "profiling" as it guns for graduation

eBPF ftw, as Elastic, Splunk donate key technology...

In a project update at KubeCon in Paris this week, the OpenTelemetry project promised that a stable implementation of its profiling data model would become an official part of its spec. (Profiling is a way to dynamically inspect the behavior of application code at run-time.)

The organization also confirmed it was applying for full graduated project status, and that as part of this it would be kicking off third-party security audits with the CNCF over the coming months. (OpenTelemetry is a set of fully open-source APIs, SDKs, and integrations designed to create and manage telemetry data such as traces, metrics, and logs.)

In a statement, the project said its profiling signal “as a first for the industry, connects profiles with other telemetry signals from applications and infrastructure.” Elastic is donating its “proprietary” eBPF profiling agent, while Splunk is donating its .NET profiler to the profiling effort.

This means engineers will be able “to correlate resource exhaustion or poor user experience across their services with not just the specific service or pod being impacted, but the function or line of code most responsible for it.” In other words, they won't just know when something falls down, but why, a capability that commercial offerings can provide but the project has lacked.


OpenTelemetry governance committee member Daniel Gomez Blanco, principal software engineer at Skyscanner, added that the advances in profiling raised new challenges, such as how to represent user sessions and how they are tied into resource attributes, as well as how to propagate context from the client side to the back end and back again. As a result, the project has formed a new special interest group to tackle these challenges.

Honeycomb.io director of open source Austin Parker said: “We're right along the glide path in order to continue to grow as a mature project.”

As for the graduation process, he said the security audits will continue over the summer, along with work on best practices and remediation. They should complete in the fall: “We'll publish results along these lines, and fixes, and then we're gonna have a really cool party in Salt Lake City probably.”

He added: “Graduation is a strong signal to the rest of the world that this project is here for the long haul, it's stable and will be something that [you] can rely on for many years into the future.”


In the update session, the team also outlined other plans for the project. Trask Stalnaker, principal software engineer at Microsoft and a member of the governance committee, said that “now that logs, traces, metrics [are] declared stable, one of the next big frontiers is semantic conventions, which is defining the shape of the telemetry within those signals.”

He said, “A lot of work is going into semantic convention definitions. And finally, moving from into conventions to stability.”

The debut of profiling came as observability firm New Relic announced it would provide native support for OpenTelemetry and Prometheus-instrumented hosts and Kubernetes clusters.

The company said this means that organizations could instrument Kubernetes clusters and hosts with the OpenTelemetry collector and Prometheus Node Exporter “in a single step”. At the same time, it said, they would get access to dashboards and native UIs with standardized “golden metrics”.

New Relic said the move would massively simplify observability for engineers, reducing the amount of time they have to spend on instrumentation and setup, and the inevitable troubleshooting.

New Relic EMEA CTO Greg Ouillon said the company had open sourced its own agents four years ago, “and three years ago, we really decided that we would treat OpenTelemetry as a first class citizen on the platform.”

The updates meant that “For engineers, they just deploy their Prometheus and OTEL on their clusters and their hosts, no New Relic agent on their side.”

“The data will get instantly connected to the right dashboarding, user experience, alerting and get the same clean 100% curated UI of New Relic, not a small, small zoo on the side for OpenTelemetry.”

He added that New Relic had no intention of dropping its own agent technology, for now at least, as many customers loved the "out of box" approach, or weren't ready to switch to open architecture.

“Our idea is that today, the OpenTelemetry standards are not completely ready,” but, he continued, New Relic “absolutely acknowledges that OpenTelemetry will become more mature, will become the de facto way of instrumenting. And rather than trying to escape that we are absolutely committing to it.”
