Operationalizing AI: Lessons from the Field

Machine Learning

October 20, 2021

This article is based on Jordan’s talk at DataOps Summit 2021. If you’re more inclined to watch a video than read an article, you can catch it on-demand here (free registration required).

A casual stroll through tech headlines from the past few years makes two things abundantly clear: investment in AI is at an all-time high, and companies really struggle to get value out of AI technology. At first glance, these ideas seem to be at odds with each other: why invest in a field that hasn’t lived up to the hype? If you dig into the details, you’ll notice that in many companies a gap exists between the development and the production use of AI. Simply put, few actually doubt the transformative power of AI, but it’s still very rare to find data teams that have pieced together a production strategy that makes the adoption of AI easy, straightforward, and impactful. This large hurdle is what we’ve come to know as ‘operationalizing AI’. In this blog post, we’ll look at some of the most common reasons why companies fail in their efforts to operationalize AI and propose a path forward for those looking for a light at the end of the tunnel.

Why Companies Struggle to Operationalize AI

1. Not Focusing on Results

A few weeks ago I published a tongue-in-cheek article detailing the ML journey of an average company. A key theme in that piece is that companies struggle to separate research-oriented ML from production and, as a result, the road to operationalizing their AI use cases is long and arduous. In these instances, leadership is very reactive to the needs of their data scientists rather than being prescriptive about their AI strategy. When the team needs better collaboration, leadership buys an enterprise notebook tool; when they need experiment tracking, they evaluate experiment trackers; when they want more automation, an AutoML tool is acquired; when they need help with deployment, it’s MLOps; and so on. This is a pattern I’ve seen in a lot of companies, and it creates a fire-fighting mentality that lacks the long-term strategic vision around architecture, users, and use cases that helps a business scale AI innovation without restraint. These companies slowly move from one step to the next along their ML journey, but they are rarely focused on the results of the process. Needless to say, the results are the primary thing the business cares about! Operationalization is often considered the desired outcome rather than a requirement for entry, and I believe this misstep is what dooms a lot of ML organizations from the start.

The ML journey of a typical enterprise. Operationalization should be inherent to the process, not a desired end state.

It is hard to fault ML leaders for this mistake. A decade ago, few of us were working in this field in earnest, and today there is a need for thousands upon thousands of leaders. Few have the experience and requisite background to make strategic decisions, so they end up leaning heavily on their teams, the field experts working day in and day out in the trenches. There’s also no shortage of vendors or consultants with conflicting viewpoints, and there’s definitely no well-trodden path to follow either. ML leaders have often had to forge their own path, but in the process I see many of them losing sight of the end goal: providing value back to the business.

2. Building Your Own System is Challenging

Next, as companies begin to build out their ML organization, many decisions have to be made about what tooling and processes to use. The natural thing to do is to adopt several open-source systems and try to glue together a coherent workflow for the data science team to follow. This makes sense during the Cambrian explosion of ML tools, where experimentation is the name of the game. The issue with this approach is that many of these systems have complex requirements, whether in infrastructure, integrations, coding, or all of the above. As a result, ML organizations need to either employ teams of experts to run and manage these systems or rely on other teams in the company to help out. Additionally, despite the plethora of open-source projects in the ML/AI ecosystem (or perhaps because of it), there’s no real consensus on the best approach or tooling to take when building your production AI processes. This means it is less of a paint-by-numbers exercise and more of a choose-your-own-adventure. Results vary wildly. The ecosystem also evolves so rapidly that any solution fashioned upon it is in constant danger of becoming obsolete as newer tech rises into favor.

In any case, it’s really only the most high-tech of companies that are able to make a bespoke system truly work. For your average enterprise, the complexity, time, cost, and expertise required to build and maintain a system completely undermines the potential of AI to drive business impact.  

3. Vendor ML Platforms Often Don’t Help 

We could consider buying all the tooling we need instead of building it ourselves, but as I discussed in my overview of ML platforms, historically Gen 1 and Gen 2 ML platforms have not proven to be significantly more productive from an operational standpoint than building your own system. I’m not one to beat a dead horse, so I’ll refer you to that article if you want a deep dive into the analysis. We sincerely hope that Gen 3 tooling focused on operationalizing AI will deliver drastically different ROI from its Gen 1 and Gen 2 counterparts.

4. Data Scientists Aren’t Skilled in Production

The majority of people working in data science have little experience with production systems. As such, expecting the team to operationalize their work is perhaps a little unrealistic. A quick look at the coursework of data science or machine learning programs reveals that there’s often little to no focus on working outside a notebook, let alone on productionalizing one’s work. Similarly, a scan of data science or machine learning job postings usually shows that the top skills companies look for are research-oriented, not focused on using or maintaining production systems.

On the other hand, software engineers and DevOps engineers, who are whizzes at building and running production systems, generally have little experience with machine learning itself. Putting a model in a container might not be too hard, but operationalizing the end-to-end workflow is another matter. What results is a fragmented dynamic where parts of the production ML workflow are locked behind different domains of expertise, and it’s rare to find individuals who can grasp the entirety of the solution. Any workflow requires a dance between different roles in an organization to make it into production, and this fragmentation adds unnecessary barriers that make operationalizing work slower and more complicated than it needs to be. It’s probably unreasonable to expect data scientists to be skilled in production workflows, or software developers to be ML experts, which means that many organizations are stuck with inefficient processes.

A Path for Succeeding at Operational AI 

Despite all the various land mines set for ML practitioners, there is compelling evidence that some have managed to streamline their ML workflows and hit a very good stride in reliably getting ML use cases into production. Below we capture several insights from the field that are crucial for finding operational success with ML.

1. Empower Those Who Understand Your Business 

A common trait of every system I’ve encountered that has succeeded at moving ML use cases into production is that it brings people into the ML workflow who understand the company’s business and data well. I think this will intuitively make sense to anyone who has worked in the data science field. Data scientists tend to be earlier in their careers, with less knowledge of the inner workings of the company, less experience with the company’s data sets, and more concern for the quality of their data science work than for the outcomes it drives for the business. Meanwhile, the company has likely employed other data workers -- business analysts, data analysts, data engineers, analytics engineers, etc. -- for years or decades. These people understand the business and the company’s data well, and they have likely long understood that the key to success is doing something that impacts the business.

The goal in constructing an operational AI system is to empower the users who understand our data the best. By doing so, we’re democratizing the ML workflow and allowing more users to come in and contribute their knowledge. When you design a system that allows participants to use it at their level, with their skill set, and to benefit from automation to accelerate time to value, you'll find the system has a lot more participants than expected. By lowering the complexity of the system we’re actually increasing the availability of the system to a wider audience who are able to manifest their years of experience and expertise into business insights. 

By lowering the skills necessary to participate in the ML workflow, you’ll find more users who are able to contribute. 

This is not to say that data scientists are not useful. Far from it. In an ideal system, data scientists are able to build a foundation for others to leverage and to oversee and approve work from less ML-savvy peers. Anyone who has worked on an ML use case knows that there are numerous trap doors along the path, from fighting data leakage and handling imbalanced data sets to removing bias and multicollinearity. Data scientists understand these problems well and should be able to remedy them easily and refresh workflows quickly. From the data scientist’s standpoint, the goal of the operational system should be to automate all the work they would do to evaluate the performance and validity of models and data in the system, allow them to easily review and put their stamp of approval on a product, and refresh workflows with minimal intervention.
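To make that kind of automation concrete, here is a minimal sketch of two pre-production checks a data scientist might codify so that less ML-savvy contributors get guardrails by default. The function names and thresholds are hypothetical illustrations, not any specific tool’s API:

```python
from collections import Counter

def check_class_balance(labels, min_ratio=0.1):
    """Flag severely imbalanced targets before training proceeds.

    Returns True when the rarest class is at least `min_ratio` (an
    assumed threshold) of the most common class.
    """
    counts = Counter(labels)
    return min(counts.values()) / max(counts.values()) >= min_ratio

def check_train_test_overlap(train_ids, test_ids):
    """Flag an easy form of data leakage: the same row id appearing
    in both the training and the test split."""
    return len(set(train_ids) & set(test_ids)) == 0

# Example: a 1:19 class ratio fails the 10% threshold, and a shared
# row id between splits is flagged as leakage.
balanced = check_class_balance([0] * 19 + [1])             # False
no_leak = check_train_test_overlap([1, 2, 3], [3, 4, 5])   # False
```

Checks like these run automatically on every workflow refresh, so the data scientist only has to review the cases that fail.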

2. Build for the 95/5 Rule

Not to be confused with the Pareto Principle, the 95/5 Rule states that you can categorize 95% of an organization’s ML use cases as known or solvable, and 5% as difficult or impossible. The 95% of use cases also encompasses many of the business-critical use cases like churn, forecasting, fraud, market optimization, personalization, etc. These are use cases nearly every business has and should have a strategy for operationalizing. 

The “hard” part of these use cases is not figuring out which ML technique to apply – there are literally hundreds, if not thousands, of articles online that will walk you through the process – but rather, how to apply it to your business’s data. While the majority of ML systems focus on research-oriented development work, this manual approach to ML is ill-suited for the 95%. The goal should be to reduce friction in operationalizing these use cases as much as possible. Not only does this deliver value to the business faster, but it also frees up the data science team so they have more time to focus on the difficult 5% of problems, which actually do require a more hands-on approach. 

The key takeaway here is that companies often spend most of their time/resources building AI systems that are actually best suited to tackling a mere 5% of their problems. It’s either inefficient or overkill for the vast majority of their use cases, and they should instead think more about how best to operationalize the 95% of use cases that are already pretty well known. 

3. Avoid Frankenstein Solutions

Our next tip is something to avoid instead of something to do: avoid creating Frankenstein solutions. Frankenstein solutions stitch together open-source and commercial tools to build a highly customized platform. These can look attractive at first, as they appear to cross off many checkboxes in your list of requirements, but in reality the technical debt associated with these, in terms of cross component integration and pipeline construction, is too high for most companies. Your business should be focused on solving mission-critical use cases as quickly as possible, not troubleshooting Spark-on-Kubernetes jobs. Do more with less, and expect your data tooling to make your life easy -- this is essentially the credo of the modern data stack.  

It is also becoming more difficult to work in the field and ignore what’s happening around compliance and regulation. If we read the tea leaves, it’s sensible to conclude that the industry will become more regulated as time goes on, and there are probably not many (any?) modern countries that will not impose some sort of restrictions on the use of AI. As someone who has previously worked with technology in highly regulated industries, I can confidently say that Frankenstein solutions are an absolute nightmare to work with once regulations go into effect. The last thing you want to worry about is whether you’ll be able to meet the demands of regulators when they come to audit your systems. Imagine being unable to reliably reproduce a model that was used in the past, or not fully understanding what data was used as input for certain predictions. This is just the tip of the iceberg, but when your operational stack is loosely stitched together, you can be subject to hefty penalties, not to mention bad press. It’s no coincidence that regulated industries often stick with end-to-end technologies, and I think the forward-looking data leader will recognize what’s coming down the pipe.

4. Learn from Software Engineering Successes

As a discipline, software development has been around for decades, and there are many great lessons from the field that you can apply to your data team. Too many to list in a short article, in fact, so I’ll provide a few that I find particularly relevant here. Note: this is not to advocate that people should treat machine learning as a software engineering problem. The two fields differ in many ways, and conflating them is a common mistake. Nonetheless, there are lots of things that software engineers do well that we can also apply to machine learning teams, such as:

  1. Agree on tooling: It’s not uncommon to find a variety of viewpoints on the best languages and technologies within a data science team (Python vs. R, PyTorch vs. TensorFlow, etc.), and I have frequently seen enterprise companies try to break down data science teams by their skill sets. Although this approach seems like it encourages creativity and problem-solving, it arguably hinders more than it helps, leading to greater dysfunction when it comes time to operationalize ML work. Software engineering teams agree on a common set of tools and technologies because the advantages of using a common stack greatly outweigh those gained by personal preference.
  2. Agree on internal processes & deliverables: It would be utter insanity to join anything but the earliest of software engineering teams and discover that there was no process for versioning, testing, and deploying code to production. This is one of the first things a tech lead will set up, as it’s crucial to have a good foundation for developers to work upon so they can focus on other tasks. Currently, it’s still pretty common for ML teams to lack well-defined production processes or an understanding of deliverables outside of notebooks or model artifacts. It’s no wonder that so many companies fail to get value out of their ML efforts when neither the goal nor the path to get there is well known.
  3. Separate development and production: Implicit in the last point is that software engineering teams make a hard distinction between development and production. In data science, I often see a blurring of the two, where teams don’t clearly distinguish between research and results. For example, in development we may use something like a notebook to build a model. Our goal should not be to operationalize the notebook (nor, really, even the model itself), but rather the model and prediction maintenance process. This is what is actually important. The analog in software engineering would be to base production processes on operationalizing an IDE, which is rather silly. If a software engineer is building a user-facing application, their development process will surely involve using an IDE, running things locally, and iterating quickly. In production, they will likely do no more than commit their code into a repository and let automated processes build, test, and update the application.
This is funny because it’s absurd. Don’t do this if you like having a job. 
  4. Focus on automation: For many software development teams, production is automated. I’ve known companies with rules in place that no human can log into production systems and everything must happen via automated scripting. For data scientists, automation is sometimes a scary concept: they hate the idea of their jobs being automated away, and they can’t trust that good decisions are happening without their approval. For software engineers, automation is essential because it produces reliable, consistent results. Data scientists need to take the same approach to operationalizing their work. Any review or validity checks need to be done prior to production (i.e., in development), and the production environment needs to focus on getting the system updated as quickly as possible.
  5. Make things easy (so teams can focus on what they are good at): You’d be hard-pressed to find software engineers who love spending time maintaining production environments. What they love doing is building new stuff (let builders build). It’s no surprise that they automate away most of the parts of the system that they don’t want to deal with so they can spend more time doing what they love. Data scientists should take a similar approach to operationalizing ML. Having a robust operational system at your fingertips means saving hours every week on these tasks, hours you can now allocate to devising a new predictive feature or starting new ML use cases, which is hopefully the fun part of being in data science.

Declarative AI with Continual

We’ve provided a lot of tips on operationalizing AI in your organization without getting too prescriptive about the exact solution. At Continual, we believe this all leads to a future where declarative approaches to operational AI replace the pipeline jungles of today. It’s still early days, but declarative operational AI has transformative potential similar to that of declarative approaches to analytics (SQL and dbt), infrastructure (Terraform and Kubernetes), and data integration (Fivetran, Census, or Hightouch). We’ve written more about this in a previous blog post and launch post.

So what’s Continual, exactly? Continual offers the first declarative operational AI platform co-designed with the modern data stack and built specifically for modern data and analytics teams. Unlike traditional ML engineering platforms, which require complex engineering to operationalize even simple models, Continual sits directly on top of your existing cloud data warehouse and provides a declarative workflow that empowers any data professional to go from ideation to production in a day or less. Unlike point-and-click AI tools, Continual is a system that understands the intricacies of operationalizing AI and doesn’t sacrifice any of the rigor of traditional data science practices. If you’re looking for a new approach to operationalizing AI that empowers your entire team, get in touch to start a trial of Continual and see exactly what we’re talking about.
