December 13, 2021
It’s easy to take continuous integration (CI) and continuous delivery/deployment (CD) for granted these days, but these transformational concepts have drastically changed the face of software development over the past thirty years. While many have called for stronger adherence to software development best practices in machine learning (ML) and artificial intelligence (AI) as well, today’s ML practitioners still lack simple tools and workflows to operate the ML deployment lifecycle on par with software engineers. This article takes a trip down memory lane to explore the benefits of the CI/CD toolset and the detriment of its unfortunate absence in today’s ML development lifecycle.
Continuous integration is largely credited to the work of Grady Booch (the Booch Method) and Kent Beck (Extreme Programming -- sadly, it never made it into the X Games) in the early '90s. If we could teleport back to that time period, strap on some flannel and ripped Levi's, chug some Crystal Pepsi, and jam out to Nirvana’s Nevermind, we’d find a burgeoning tech industry primed for explosive growth due to the onset of the World Wide Web and affordable personal computers. It’s a great time to be a software developer: your skills are in high demand, your employer is flush with cash (let us not forget the fun times leading up to the dot-com bubble), and there’s very little penetration of software into most industries, so stumbling upon a million-dollar idea is only slightly more challenging than shooting fish in a barrel. What is quite challenging, however, is writing the software. Or, at least, writing software that works.
The life of a software developer before CI/CD was extremely fraught due to a large gap between development and production. In particular, one could easily develop code locally in an IDE, but getting the code into “production” was often a time-consuming and patience-testing experience. Software is, to date, the most complicated thing that mankind has created, and when multiple people simultaneously collaborate on the most complicated thing ever created, plenty of issues are bound to arise: merge conflicts, dependency mismatches, broken builds, and long, error-prone release processes, to name a few.
In reality … it’s actually a miracle we ever got any software running in this era. This doesn’t even get into the nightmare of deploying updates to the software. It was not unusual for software to only get updates once a year, or even more infrequently, and contain so much code change that it was inevitably full of bugs. We’re also assuming working with the source repository is an easy, error-free experience, which also hasn’t always been the case.
Booch and Beck developed more sensible practices that laid the foundation for CI. A key observation is that the longer you develop your branch away from the main, or “master,” branch, the more difficult it becomes to merge in your changes. As such, it is better to focus on much smaller changes, where each one barely affects the system and is easy to merge. Developers should do this many times a day, or “continuously.” The CI server is then responsible for compiling code, running all tests, and, in continuous deployment, building artifacts and deploying them to production. Once software engineers have committed their code to the source repository, the CI/CD system automates everything else. Gone are the days of laborious manual processes; software engineers can now spend much more of their time just writing code.
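The pipeline described above can be sketched in a few lines. This is an illustrative toy, not the implementation of any real CI product: each stage (build, test, deploy) is just a callable, and the first failure stops the run, exactly the behavior a CI server enforces on every commit.

```python
# A minimal sketch of the control flow a CI/CD server applies to each commit.
# Stage names and steps are illustrative, not tied to any real CI product.

def run_pipeline(stages):
    """Run stages in order; the first failing stage stops the pipeline."""
    completed = []
    for name, step in stages:
        if not step():
            return completed, f"FAILED at '{name}'"
        completed.append(name)
    return completed, "SUCCESS: artifact deployed"

# Each stage is a callable returning True on success.
stages = [
    ("build", lambda: True),   # compile the code
    ("test", lambda: True),    # run the full test suite
    ("deploy", lambda: True),  # package and ship artifacts (continuous deployment)
]
completed, status = run_pipeline(stages)
```

The key design point, mirrored by real CI systems, is that a red test gates everything downstream: a broken build never reaches the deploy stage.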
It took roughly two decades for CI/CD to really bake into the ecosystem. CruiseControl (2001) was an early open source CI tool, and by 2011 things were pretty mainstream, with the release of Jenkins and CircleCI and a rough convergence upon accepted best practices.
In the end, CI/CD gives software engineering teams a sane way to build product that minimizes the amount of “not fun” stuff you have to do: merge conflicts, debug dependency issues, fix broken builds, etc. It’s (relatively) easy to pinpoint what is wrong and why, resolutions are (relatively) quick, and this also opens up huge productivity and quality of life improvements for developers. So much so that any experienced developer would probably not even bother accepting a job at a company that didn’t have these practices already in place (unless, of course, they are engineer #1 at a startup and will be building it out themselves).
If you’re a practitioner of the ML arts, you can likely sympathize with the difficulty of getting something from development into production. The ML ecosystem is rife with tools geared towards enhancing the development process, but few make a real impact on simplifying the end-to-end production process. In a recent roundtable discussion at dbt’s Coalesce conference, Sarah Catanzaro mused on the current state of ML tooling:
“When I look at the ML ecosystem … there are tools and platforms for everything ranging from distributed training to model monitoring to experiment tracking … and you talk to ML teams they are still really struggling … the only answer can be that the tools are crap … They were not built with developer ergonomics in mind.”
And Sarah is absolutely correct. (In the context of the discussion, dbt is mentioned as the analytics tool that has elevated the analytics workflow into a robust production-grade workflow, and the panel speculates on whether ML will be the next discipline to follow suit -- we think so!) As discussed in a previous article, data scientists are not skilled in production, and software vendors have long pandered to data scientists by giving them cool development tools but not challenging them to actually adopt production best practices. High-tech companies have gone off and built their own systems, but this is not a sustainable strategy for most companies, and finding and hiring unicorn ML engineers to build and maintain such systems can be a tall order.
Which isn’t to say the solution is staring us right in the face. Over the last half-decade I’ve had dozens of conversations with data and ML leaders at Fortune 500 companies who have wondered why their ML workflows are so complicated and why no one has built a system that makes ML compatible with CI/CD (“I’d love to be able to commit changes to GitHub and be done with it. This works for our software engineers, why can’t it work for our ML engineers?”). I’ve seen several try to force other tools into this paradigm, mostly with poor results. The fact of the matter is that ML and AI are more complex than pure software systems, and you can’t simply force notebooks into CI/CD scripts. The missing component is a platform that easily controls the full ML lifecycle and allows the CI/CD system to automate common ML tasks.
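To make “automate common ML tasks” concrete, here is one hedged sketch of the kind of step an ML-aware CI/CD system could run on each commit: retrain, evaluate, and promote the new model only if it beats the one in production. Every name and the promotion rule here are hypothetical, standing in for whatever your stack provides; this is not Continual’s API or any specific platform’s.

```python
# Hypothetical sketch of an ML-aware CI/CD step triggered by a commit:
# retrain, evaluate, and promote only on improvement. None of these names
# come from a real ML platform; they stand in for your own stack's pieces.

def ml_cd_step(train, evaluate, production_score, deploy):
    """Return ("deployed", score) if the fresh model beats production."""
    model = train()               # retrain on the latest data/config
    score = evaluate(model)       # e.g. accuracy on a holdout set
    if score > production_score:  # promote only if the model improves
        deploy(model)
        return "deployed", score
    return "kept existing model", score

# Toy usage: the "model" is a dict and the metric a fixed number.
deployed_models = []
status, score = ml_cd_step(
    train=lambda: {"weights": [0.1, 0.2]},
    evaluate=lambda model: 0.91,
    production_score=0.88,
    deploy=deployed_models.append,
)
```

The point is not the ten lines of code but the gate: just as CI refuses to ship a build with failing tests, an ML pipeline should refuse to ship a model that regresses its evaluation metric.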
At Continual, our goal from Day 1 has been to provide a simple path to production with a new approach to operationalizing ML. Learn more and get started today at our blog, Getting started with CI/CD and Continual.