Gitar

February 8, 2024
8 min read
Gitar, Inc

Fantastic build speeds and where to find them

Gautam's journey that led him to start Gitar

Act I: Let the journey begin

Very early in my professional career, I learnt that build speeds are important for feeling productive. I also learnt that exploiting parallelism is often the easiest path to improving build times.

My first introduction to any sort of professional software engineering was during college. We had a course on Java where we had to implement a library system to check out and manage book inventory. The course was about Object-Oriented Programming, and we were asked to use JEdit as the IDE. It felt great to iterate since build times were really quick due to the simple nature of the software. However, the class names were very long and there was a lot of boilerplate. It made me realize how much I disliked programming in Java, so much so that I told myself I would never work as a Java programmer after I graduated.

Fast forward, fresh out of grad school, my first job was as an Android developer on the growth team at Lookout. I didn't realize Android development meant working with Java. I was bamboozled! My first task was to create an Android live wallpaper to delight our users when they were not using the app. I am still not sure how this led to customers upgrading to the premium version, but as growth hackers we tried everything.

To try so many things, we had to iterate on features, A LOT. However, the tools at our disposal to make new changes in code and check how the app behaved were very slow. We used Eclipse, and Android Studio/Gradle would not be announced till the next year at Google I/O. Our CI builds running on Jenkins used to take almost an hour to process a change request on Gerrit, our code review system.

This was very frustrating as an engineer whose purpose was to experiment on product features rapidly. I decided to figure out if I could speed up our CI builds. I hacked up a way to run our unit tests in parallel (and there were a lot of them). This brought CI time down to 15 min. A 4x speedup. The gratification I felt was immense. More than when my growth hack led to a slight increase in Avg Revenue Per User. I had found my calling.

I want to make developer tools fast

I learnt several things in the process

  • Tests can be parallelized at various levels - test target, class, or method. The typical caveat is to avoid shared mutable state that causes flakiness: shared system ports, I/O to shared files (always use temp files). A minimal config sketch follows this list.

  • Linters can be used to enforce code patterns in tests that make it possible to parallelize them - for example, using a single executor thread pool and avoiding sleeps.

  • If tests spin up external processes (containers, etc.), their resource usage should be accounted for when deciding how many parallel threads to run tests with.

  • If a test becomes flaky after being run in parallel, it is easier to run it serially and let the others run in parallel as a quick way to make progress.
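For context, here is a minimal sketch of what these ideas look like with today's JVM tooling (Gradle plus JUnit 5). Our Jenkins-era setup was far more hand-rolled, so treat the fork-count heuristic and configuration below as illustrative assumptions, not what we actually ran.

```kotlin
// build.gradle.kts - a minimal sketch of JVM test parallelization.
tasks.withType<Test>().configureEach {
    // Fork roughly one test JVM per couple of cores; leave headroom if
    // tests spin up containers or other external processes.
    maxParallelForks = (Runtime.getRuntime().availableProcessors() / 2).coerceAtLeast(1)

    // JUnit 5 can additionally run test classes and methods concurrently
    // within a single JVM - this is where shared mutable state bites.
    useJUnitPlatform()
    systemProperty("junit.jupiter.execution.parallel.enabled", "true")
    systemProperty("junit.jupiter.execution.parallel.mode.default", "concurrent")
}
```

A test that cannot be made safe under this mode can be pinned back to serial execution (JUnit 5's `@Execution(ExecutionMode.SAME_THREAD)`) while the rest continue to run in parallel.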

Act II: Android Gradle Plugin - Electric Boogaloo

Working on the Uber Android codebase taught me that faster builds should not come at the expense of usability and developer experience.

I joined Uber with a passion to make developers productive and build tools for them. Shortly after, we decided to rewrite the Uber rider app using a new architecture. This decision brought to light a lot of gaps in the Android Gradle Plugin when building with many Gradle modules. To unblock the app rewrite, I reported and fixed a number of issues across various components of the plugin, from the manifest merger (my personal favorite) to the D8 dexer (where I was the first contributor outside Google). The great folks on the Android Studio team even sent me a nice gift for all the effort!

This is when I realized that building software reliably and fast at scale required a different approach. After looking at a few alternatives, I decided to migrate our builds to Buck from Meta (due to its superior support for the Android ecosystem at the time) and move all Android repositories into a single monorepo. This gave us the benefit of a network cache of build artifacts and a rule-based approach that promised to exploit all available cores to build things fast.

This was not an easy engineering effort, and it was a hard sell to leadership. If there had been any way to build the app with off-the-shelf tools, I would not have pursued this approach. To get past this hurdle, I used the metrics we were already collecting to clearly show the benefits in developer time saved. I was also the primary author of OkBuck, which automatically migrated projects from Gradle to Buck with minimal effort, making the transition a lot easier (check out my talk from DroidCon to learn more). My proudest accomplishment was keeping the interaction surface with the build system simple for the end user, hiding away the complexity so we could keep innovating under the hood. That was when I realized…

I want developers to feel good about using their tools

In this process, I learnt that

  • After parallelization, caching is the second-best lever for improving build speeds (a minimal sketch follows this list)

  • A wrapper layer for developer tools is crucial for many use cases - capturing analytics, deprecating old features, changing infrastructure underneath and migrating between systems

  • Performance monitoring tools (Java Flight Recorder, strace, etc.) are a great investment and can point to the right areas to invest in to improve build speed

  • Observability into the developer lifecycle is necessary to justify large-scale migrations. Gather metrics that show return on investment if such migrations are to be successful.
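As an illustration of the caching point above, here is a minimal sketch of how a shared network cache is wired up in Gradle today; the Buck cache we used worked on the same principle. The cache URL and the CI environment check are placeholder assumptions.

```kotlin
// settings.gradle.kts - a minimal sketch of a shared (network) build cache.
buildCache {
    local {
        isEnabled = true
    }
    remote<HttpBuildCache> {
        // Placeholder URL: point this at whatever cache backend you run.
        url = uri("https://build-cache.example.com/cache/")
        // Typically only CI pushes artifacts; developer machines just pull.
        isPush = System.getenv("CI") == "true"
    }
}
```

With a setup like this, a task whose inputs were already built by CI is restored from the cache instead of being rebuilt locally.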

Act III: Monorepo, I believe in thee

I learnt that a monorepo can unlock a huge amount of developer productivity and improve developer sentiment when done right.

I was approached by a few other teams at Uber who saw the success we had with the migration to a new build system in the Android monorepo. They wanted a piece of the same delicious pie. Now I had to figure out how I could bring these benefits to everyone at Uber. I quickly realized that replicating the same setup came with some drawbacks - new repositories kept springing up and it was hard to keep everyone using our tools up-to-date. A monorepo was one way to solve this problem, but it required a lot of effort upfront. There was much debate about monorepo vs. micro-repo, but we realized that whichever way we went, we had to build tooling to support development at Uber’s scale.

At this juncture, Ali joined Uber with the vision to unify the developer experience and brought the teams working on developer tools and frameworks under one organization - Developer Platform. It was a much-needed change to rein in the chaos of every team developing their own standards, which led to inconsistency and inefficiency. This was when the concept of an “Internal Developer Platform” was starting to become more widespread in the industry, and we didn’t realize we were trailblazing. With Ali’s blessing, we migrated most code at Uber to a set of monorepos - one per platform (Android, iOS, Java backend, Go backend, and Python for ML). To do this seamlessly, we built a lot of automated tooling to preserve version control history, update build files automatically to match the golden path we wanted developers to adopt, and resolve conflicts in external dependency versions. These tools enabled us to move 600+ Java repositories into a single monorepo in 3 weeks without a code freeze or a pause in deployments.

Moving repositories en masse had some side effects. Because there were no standards around CI, testing, and enforcement of quality gates (code reviews, linters, code coverage, test failures, etc.), we uncovered many latent problems in the code. The monorepo enforcing code quality checks by default meant a lot of failures and flaky tests. To deal with the post-migration blues, we developed many different engineering systems - incremental CI on every change, a merge queue, flaky test management, remote developer environments, etc. I also worked very closely with Raj and his team to integrate their amazing research tools, which used program analysis to find defects earlier in the developer lifecycle and reduce tech debt by cleaning up code autonomously. Over time, we improved developer sentiment all the way from -50 to +8!

Using the monorepo as the central platform, we were able to ship improvements to all Uber developers seamlessly and rapidly. For example, upgrading log4j due to a security vulnerability was a single change that was approved and merged in less than an hour for the entire company! Previously, a similar upgrade had taken us a couple of months to coordinate across many repositories with different tools and standards. This led me to believe

I can deliver a great developer experience efficiently with a Monorepo

Working with a monorepo taught me that

  • Building a great developer experience with fast builds in a monorepo is a hard problem. Automated tooling is essential to make moving to a monorepo painless.

  • All CI/CD operations have to be done incrementally, based on which build targets were affected by a change (see the sketch after this list)

  • It requires long-term investments in tooling, processes, and developer training. As the code size grows, the build system, IDE, and local developer experience need to adapt to stay fast.

  • A monorepo enforces a high minimum standard across the entire code base. Flaky tests, however, impact everyone, so it's important to manage flakiness automatically.
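To make the incremental CI point concrete, here is a minimal, hypothetical sketch of the core idea: map changed files to modules, expand to everything that depends on them, and build and test only that set. The real systems were built on the build graph of Buck and were far more involved; the function and parameter names below are illustrative, not actual tooling.

```kotlin
// A hypothetical sketch of selecting affected modules for incremental CI.
// `moduleOfFile` and `reverseDeps` would come from the build graph
// (Buck, Bazel, and Gradle can all export this information).
fun affectedModules(
    changedFiles: List<String>,
    moduleOfFile: (String) -> String?,    // e.g. "rider/app/src/Foo.kt" -> ":rider:app"
    reverseDeps: Map<String, Set<String>> // module -> modules that depend on it
): Set<String> {
    val seeds = changedFiles.mapNotNull(moduleOfFile).toSet()
    val affected = seeds.toMutableSet()
    val queue = ArrayDeque(seeds)
    // Breadth-first traversal over reverse dependencies.
    while (queue.isNotEmpty()) {
        val module = queue.removeFirst()
        for (dependent in reverseDeps[module].orEmpty()) {
            if (affected.add(dependent)) queue.add(dependent)
        }
    }
    return affected
}
```

CI then builds and tests only the returned modules instead of the whole monorepo, which is what keeps pipeline times roughly proportional to the size of a change rather than the size of the repository.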

Act IV: Building Gitar

Ali, Raj and I want to provide a unified developer experience as a reliable service so developers can benefit from fast builds without having to invest the effort in finding them. I am excited to start this new act and hope to write a follow up in the future with new lessons learned along the way!

We invite you to join our Slack community, where we continue to explore and discuss these topics further.
