The birth of "Big Code" and how to deliver it with remote execution
"Historically the build phase of the cycle has been one of the slowest - and it’s also not easily parallelisable"
Big Data is big, obviously. But we only use the expression because it is intended to convey the now mostly-forgotten notion that some datasets are so massive that they fail to fit comfortably into standard or traditional relational database management systems, writes Adrian Bridgwater.
In truth, the IT industry also quite liked to use the term Big Data because it sat nicely in line with the massiveness of the webscale cloud era… and perhaps because it sounded quite cool.
Big Code bottlenecks...
Now that we have progressed to a point where we think about big data as a fairly normal element of cloud, we can ‘worry’ about new system requirements like real-time data streaming and container orchestration. We can also look at the scale and scope of applications in this space and consider them as megaliths in their own right. This is big code.
When applications get so huge that they become candidates for big code status, the developers running them start to find themselves in something of a bottleneck. Software engineers working at this level spend their days running and re-running ‘builds’ to compile new code commits and extensions alongside essential patches and other maintenance functions.
Quite suddenly, the big code application itself becomes something of a couch potato - big and weighty, but torpid and sluggish with an unhealthy predilection for too much popcorn.
Okay, not the popcorn, but you get the point. What all of this has led to is the development of so-called remote execution services. Specialist technology firms working in this subset of the IT firmament are predominantly cloud-native and offer capabilities to distribute builds and tests across a cluster of machines, while caching the results remotely to make subsequent runs faster.
Players in this space include EngFlow, Gradle and BuildBuddy to name a few.
Build automation
According to Helen Altshuler, CEO and co-founder at EngFlow, how fast an organisation can ship new features depends on various factors, including how fast the code can be compiled and tested. This is why firms in this space talk about ‘build automation’, i.e. the combined processes of compiling, packaging, testing, deploying and publishing code.
“Historically the build phase of the cycle has been one of the slowest - and it’s also not easily parallelisable. As code bases grew (organically and as a result of using large amounts of open source code), organisations started looking into ways to improve developer productivity and improvements in the developer experience. This is often described as big code and it’s why the term is now being popularised.”
Thanks, Bazel...
Google anticipated this growth internally and implemented Bazel, a multi-language build system that analyses code dependencies and creates a graph, which ‘reset’ developer expectations about the build system.
Happily, Bazel is engineered to deliver advanced local and distributed caching, optimised software code dependency analysis and parallel execution. This is big code evolving and getting faster at the same time.
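Those three ideas - a dependency graph, content-addressed caching and parallel execution - can be sketched in a few lines of Python. This is purely an illustrative model of how a Bazel-style build system works, not Bazel itself: the graph, the hashes and the stand-in ‘compile’ step are all invented for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Toy dependency graph: target -> (source text, direct dependencies).
GRAPH = {
    "lib_a": ("source of a", []),
    "lib_b": ("source of b", []),
    "app":   ("source of app", ["lib_a", "lib_b"]),
}

CACHE = {}  # content hash -> build output (the role a remote cache plays)

def build(target):
    src, deps = GRAPH[target]
    # Build dependencies in parallel: independent subtrees don't block each other.
    with ThreadPoolExecutor() as pool:
        dep_outputs = list(pool.map(build, deps))
    # Key the cache on a hash of the source plus all dependency outputs, so any
    # upstream change invalidates exactly the targets that need rebuilding.
    key = hashlib.sha256((src + "".join(dep_outputs)).encode()).hexdigest()
    if key not in CACHE:
        CACHE[key] = f"built({target})"  # stand-in for an actual compile step
    return CACHE[key]

print(build("app"))  # first build runs every step
print(build("app"))  # second build is served entirely from cache
```

The second call does no ‘compilation’ at all: every key is already in the cache. Replace the dictionary with a shared service over the network and you have, in miniature, the remote caching that these platforms sell.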
Google also created a remote execution API allowing third-party platforms like EngFlow to parallelise builds and tests in the cloud. Yes, faster still.
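From a developer's point of view, opting in to a remote execution service is largely a matter of configuration. The flags below are genuine Bazel options, but the endpoint shown is a placeholder - a real service such as EngFlow or BuildBuddy supplies its own address and authentication details. A hypothetical `.bazelrc` fragment:

```
# Send build actions to a remote execution cluster (placeholder endpoint).
build --remote_executor=grpcs://remote.example.com

# Share a remote cache of action results across developer machines and CI.
build --remote_cache=grpcs://remote.example.com

# Allow far more parallel actions than local cores, since the work runs remotely.
build --jobs=100
```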
Altshuler says the sum result of these developments means build jobs that took hours can now run in minutes… and in some cases in seconds. “This speeds up the code to the production cycle, enables companies to iterate faster and ship well-tested products,” she said. “It also significantly improves developer velocity and happiness.”
As we know, developers hate wasting time and waiting for slow processes to complete. EngFlow is working on implementing its remote execution concept for other build systems and today provides it for Bazel, Chromium/Goma and Android Platform/Soong.
Gradle Enterprise is another solution that improves productivity and the developer experience by using a portfolio of modern performance acceleration technologies and machine learning to speed up build and test times. Its ‘open’, build-system-agnostic platform supports Gradle Build Tool, Apache Maven, Bazel and, shortly, sbt (the most popular build system used to build Scala applications).
“It’s clear that anything you can do to get developers back to coding and spending less time waiting on slow build and test cycles or troubleshooting failures without adequate tooling, will make developers more productive and happy,” said Gradle founder and CEO, Hans Dockter.
“We take a ‘no dev team left behind’ approach to big code build automation by ensuring that all teams, regardless of build and language ecosystem, can avail themselves of a full range of developer productivity capabilities, such as build and test performance optimization technologies, efficient failure troubleshooting tools and robust analytics for improving toolchain reliability.”
Dockter is an advocate for the emerging practice of Developer Productivity Engineering (DPE). DPE focuses on addressing the build and test process bottlenecks and points of friction (locally and on CI) that most negatively impact the developer experience.
DPE is practiced by big code companies like Airbnb, Apple, Google, Microsoft, Netflix, LinkedIn and now several of the largest global banking institutions.
The rise of platform engineering
Allied to DPE, a growing trend in the industry is the creation of Developer eXperience (DX) metrics among platform engineering teams. These teams are implementing a consistent, scalable and performant platform for software developers in their company, so that they can focus on product features rather than developer infrastructure. Big code build automation (arguably) fits pretty well into these concepts and precepts.
"The build [process] is one of the most critical aspects of creating software.
"Traditionally it's been a massive time and cost sink for companies to get right. [Looking at EngFlow] this solution is able to tackle the most complex code bases and large infrastructure environments and offer savings in development time and costs,” said Martin Casado, general partner at Andreessen Horowitz.
Casado’s rosy-glow comments are no doubt influenced by the fact that his venture capital firm is an investor in this product, but we can see parallel developments happening elsewhere. Snap, Airbnb and BMW are some of the early pioneers of this practice, as is Israeli cloud software development company Wix.
Wix adopted Bazel in 2016, and was one of the earliest contributors, starting with rules_scala, a set of Bazel rules for Scala language programmers. Rules are important here because open source Bazel builds applications from source code using rules and macros written in Starlark, which is itself a dialect of the Python programming language.
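Because Starlark deliberately keeps Python's syntax, a Bazel BUILD file reads much like ordinary Python. The snippet below is a hypothetical example - the target names and source paths are invented - though `scala_library` and `scala_binary` are real rules provided by rules_scala.

```python
# BUILD.bazel -- hypothetical targets; rule names come from rules_scala.
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_library", "scala_binary")

scala_library(
    name = "core",
    srcs = glob(["src/main/scala/**/*.scala"]),
)

scala_binary(
    name = "server",
    main_class = "com.example.Main",  # invented entry point for illustration
    deps = [":core"],
)
```

Declaring dependencies explicitly, as `deps = [":core"]` does here, is what lets Bazel build its dependency graph and decide which targets can be built in parallel or pulled from cache.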
Scaling while saving
Since Bazel was open sourced, software development teams have adopted this build system and looked for additional tools to scale - that's where remote execution comes in. The EngFlow remote execution platform claims to be able to save on cloud costs, because of its more efficient use of compute resources.
“Remote execution is only one part of a comprehensive developer platform and combined with caching and observability, allows for platform engineering and Developer eXperience teams to proactively analyse and improve their developer processes with better tools and practices,” said Altshuler.
The wider enterprise technology trends on show here are fairly clear: they run in line with the general push to balance cloud-driven complexity and scale with a commensurate dose of automation and, where possible, to replicate that power across parallelised workloads for the greater good. There’s even a touchy-feely dose of user and developer experience (UX/DX) in here, along with a healthy portion of open source and user empowerment.
First we had big data, then big code; now various big AI bots are spawning too.