We’ve been using Scala at Asana since 2013. Since then, Scala has gone from a single team to almost every engineer at Asana (and Asana has grown, too!). As we’ve scaled, we’ve had to carefully consider our tooling and best practices, and we thought it would be useful to share what we’ve learned.
Our first use of Scala at Asana, in 2013, was a simple web server that maintains a pool of Node.js processes and routes incoming requests to them. This application consists of about 3,000 lines of code and, with only minor changes, is still in use today whenever you use the Asana web application. We originally built this web server using sbt, which worked great for a project of this size.
By 2014, we were increasingly feeling the pain of our architecture. We paired each client with a dedicated Node.js process on the backend, which simulated the behavior of the client to determine what data to send it. As our application grew more complex, performance got worse and worse, and it became harder for developers to reason about our code. We decided to do a full rewrite of our application, powered by a new data loading service.
We needed to choose a language for the new service, and we wanted one we could reuse for other backend services in the future. We wanted to be on the JVM for its wide adoption, native threading model, and maturity as a production platform. We chose Scala over Java for its concise syntax and flexible language design, which combines functional and object-oriented paradigms.
Originally, we used sbt to build Scala code. This worked great for the early days of Scala at Asana: only a single team building our new data loading service needed to write Scala, and our codebase was a manageable size.
While we rewrote our backend in Scala, we also began rewriting our frontend in TypeScript. We wanted to find a build system that could build both languages (including cross-language dependencies like Scala bindings for data model types defined in TypeScript). So in 2015, we adopted Bazel and wrote our own Bazel rules for Scala. We also built a custom dependency tool that we used to generate the dependencies for each project.
We’ve since switched to rules_scala, which lets us take advantage of Bazel persistent workers, and bazel-deps, which generates a single set of third-party dependencies that all our Scala projects share. We wrap the Scala rules in Bazel macros that choose appropriate options to pass to scalac. We default to strict compiler behavior, enabling many compiler warnings and treating warnings as errors. When it makes sense, we opt out of this strictness; for example, many common Scala testing patterns are incompatible with the -Ywarn-value-discard flag, so we don’t use it for test code. We believe that strictness by default, with the option to opt out, strikes the right balance between safety and ergonomics.
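To make this concrete, here is a rough sketch of what such a macro can look like, assuming rules_scala’s scala_library rule; the file path, macro name, and exact flag list are illustrative rather than our actual configuration:

```python
# tools/build/scala.bzl (illustrative path)
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_library")

# Strict defaults: turn on extra warnings and fail the build on any warning.
STRICT_SCALACOPTS = [
    "-deprecation",
    "-unchecked",
    "-Ywarn-unused",
    "-Ywarn-value-discard",
    "-Xfatal-warnings",
]

def asana_scala_library(name, scalacopts = [], strict = True, **kwargs):
    """scala_library with strict compiler flags applied by default.

    Callers (for example, test targets) can pass strict = False to opt out.
    """
    opts = (STRICT_SCALACOPTS if strict else []) + scalacopts
    scala_library(
        name = name,
        scalacopts = opts,
        **kwargs
    )
```

BUILD files then call this macro instead of scala_library and get the strict flags for free, opting out only where they have a reason to.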
When it came time to upgrade from Scala 2.11 to 2.12, our tooling made it easy to cross-compile and incrementally convert our code. We expect to do the same to upgrade to Scala 2.13 soon.
For our core services, we maintain a single list of third-party libraries. We use bazel-deps to generate Bazel targets for these libraries and to resolve version conflicts in their transitive dependencies. This means that all of our Scala projects use the same version of each dependency, which lets us reuse code across different services without depending on multiple versions of the same library. On the other hand, the versions of all of our dependencies need to be compatible with each other, which can make upgrading dependencies difficult. This is manageable today, but we might need to do something different here in the future.
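Concretely, bazel-deps generates targets under a single third-party package (//3rdparty/jvm by default), and every project depends on those shared targets. A service’s BUILD file ends up looking roughly like this, with illustrative target and library names and the hypothetical macro from the sketch above:

```python
# BUILD file for one of our services (names are illustrative).
load("//tools/build:scala.bzl", "asana_scala_library")  # hypothetical macro from above

asana_scala_library(
    name = "data_loader",
    srcs = glob(["src/main/scala/**/*.scala"]),
    deps = [
        # bazel-deps generates exactly one target per third-party library,
        # so every service builds against the same version.
        "//3rdparty/jvm/com/typesafe/akka:akka_actor",
        "//3rdparty/jvm/com/fasterxml/jackson/core:jackson_databind",
    ],
)
```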
One of the things we exclude from our main set of third-party dependencies is the set of provided dependencies for Spark. We run Spark on Amazon Elastic MapReduce, where the host provides the Spark library. To be able to build Spark code with Bazel, we declare these provided dependencies as a java_library with neverlink=True, which targets that will eventually run on Spark depend on; this makes the Spark jars available at compile time without bundling them into the deployed code. The corresponding tests depend on the normal, non-provided version of the dependencies. The Spark library also has dependencies that conflict with our canonical dependencies, so we have a custom Bazel rule that uses jarjar to remap package names (“shade” our dependencies) so that our code can depend on multiple versions of the same library.
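In BUILD-file terms, the provided-dependency setup looks roughly like the sketch below; the external labels, target names, and the asana_scala_library macro are illustrative and carried over from the earlier example:

```python
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_test")
load("//tools/build:scala.bzl", "asana_scala_library")  # hypothetical macro from above

# Spark jars resolved outside our main third-party list (labels are illustrative).
SPARK_JARS = [
    "@spark//:spark_core",
    "@spark//:spark_sql",
]

# "Provided" Spark: neverlink = True keeps these jars on the compile classpath
# but out of the runtime classpath, because the EMR hosts supply Spark.
java_library(
    name = "spark_provided",
    neverlink = True,
    exports = SPARK_JARS,
)

# Code that will run on the cluster compiles against the provided jars only.
asana_scala_library(
    name = "reporting_job",
    srcs = glob(["src/main/scala/**/*.scala"]),
    deps = [":spark_provided"],
)

# Tests run locally rather than on EMR, so they link the real Spark jars.
scala_test(
    name = "reporting_job_test",
    srcs = glob(["src/test/scala/**/*.scala"]),
    deps = [":reporting_job"] + SPARK_JARS,
)
```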
As we’ve grown, more and more engineers have needed to write Scala. Many of them are primarily experienced with writing TypeScript for our web app, and they move more slowly when writing server code in an unfamiliar language. We want to make it quick and easy for these engineers to write server code, so that they can choose whether their code is best placed on the client or the server without worrying about how hard it will be to write.
To make this possible, we’re planning to let engineers write server-side business logic in TypeScript, while continuing to use Scala for our core server infrastructure. This lets us keep the benefits of Scala for server-side development (such as performance and ease of refactoring) while allowing most product engineers at Asana to write code in a single language. This is an ambitious project and a big framework investment, but we believe it will substantially increase our product development velocity.
Today, Scala is our language of choice for writing backend services. As our engineering team has grown and we’ve learned from our mistakes, we’ve changed how we use Scala, and we will need to keep doing so in the future. If you would be excited to help with that, we’re always looking to hire engineers who can improve our Scala codebase further.
Special thanks to Mark Chua, Andrew Budiman, Kate Reading, and Steve Landey.