Machine Learning
October 20, 2021
This article is based upon Jordan’s talk at DataOps Summit 2021. If you’re more inclined to watch a video than read an article, you can catch it on-demand here (Free registration is required).
A casual stroll through tech headlines from the past few years makes two things abundantly clear: investment in AI is at an all-time high, and companies really struggle to get value out of AI technology. At first glance, these ideas seem to be at odds with each other: why keep investing in a field that hasn’t lived up to the hype? If you dig into the details, you’ll notice that in many companies a gap exists between the development of AI and its use in production. Simply put, few actually doubt the transformative power of AI, but it’s still very rare to find data teams that have pieced together a production strategy that makes the adoption of AI easy, straightforward, and impactful. This large hurdle is what we’ve come to know as ‘operationalizing AI’. In this blog post, we’ll look at some of the most common reasons why companies fail in their efforts to operationalize AI and propose a path forward for those looking for a light at the end of the tunnel.
A few weeks ago I published a tongue-in-cheek article detailing the ML journey of an average company. A key theme in that piece is that companies struggle to separate research-oriented ML from production, and as a result, the road to operationalizing their AI use cases is long and arduous. In these instances, leadership is reactive to the needs of their data scientists rather than prescriptive about their AI strategy. When the team needs better collaboration, leadership buys an enterprise notebook tool; when they need experiment tracking, they evaluate experiment trackers; when they want more automation, an AutoML tool is acquired; when they need help with deployment, it’s MLOps; and so on. This is a pattern I’ve seen in a lot of companies, and it creates a fire-fighting mentality that lacks a long-term strategic vision around architecture, users, and use cases -- the kind of vision that lets a business scale AI innovation without restraint. These companies slowly move from one step to the next along their ML journey, but they are rarely focused on the results of the process. Needless to say, the results are the primary thing the business cares about! Operationalization is often treated as the desired outcome rather than a requirement for entry, and I believe this misstep is what dooms a lot of ML organizations from the start.
It is hard to fault ML leaders for this mistake. A decade ago, few of us were actually working in this field in earnest, and today there is a need for thousands upon thousands of leaders. Few have the experience and requisite background to make strategic decisions, and so they end up leaning heavily on their team -- the field experts working day in and day out in the trenches. There’s also no shortage of vendors or consultants with conflicting viewpoints, and there’s definitely no well-trodden path to follow either. ML leaders have often had to forge their own path, but in the process I see many of them losing sight of the end goal: providing value back to the business.
Next, as companies begin to build out their ML organization, many decisions have to be made about what tooling and processes to use. The natural thing to do is to adopt several open-source systems and try to glue together a coherent workflow for the data science team to follow. This makes sense during the Cambrian explosion of ML tools, where experimentation is the name of the game. The issue with this approach is that many of these systems have complex requirements -- infrastructure, integrations, coding, or all of the above. As a result, ML organizations need to either employ teams of experts to run and manage these systems or rely on other teams in the company to help out. Additionally, despite the plethora of open-source projects in the ML/AI ecosystem (or perhaps because of it), there’s no real consensus on the best approach or tooling for building your production AI processes. This makes it less of a paint-by-numbers exercise and more of a choose-your-own-adventure, and results vary wildly. The ecosystem also evolves so rapidly that any solution fashioned upon it is in constant danger of becoming obsolete as newer tech comes into favor.
In any case, it’s really only the most high-tech of companies that are able to make a bespoke system truly work. For your average enterprise, the complexity, time, cost, and expertise required to build and maintain a system completely undermines the potential of AI to drive business impact.
We could consider buying all the tooling we need instead of building it ourselves, but as I discussed in my overview of ML Platforms, Gen 1 and Gen 2 ML platforms have historically not proven to be significantly more productive from an operational standpoint than building your own system. I’m not one to beat a dead horse, so I’ll refer you to the previous article if you want a deep dive into the analysis. We sincerely hope that the Gen 3 tooling focused on operationalizing AI will deliver drastically different ROI than its Gen 1 and Gen 2 counterparts.
The majority of people working in data science have little experience with production systems. As such, expecting the team to operationalize their work is perhaps a little unrealistic. A quick look at the coursework of data science or machine learning programs reveals that there’s often little to no focus on working outside a notebook, let alone learning how to productionalize one’s work. Similarly, a scan of data science or machine learning job postings usually shows that the top skills companies look for are research-oriented, not focused on using or maintaining production systems.
On the other hand, software engineers and DevOps engineers, who are whizzes at building and running production systems, generally have little experience with machine learning itself. Putting a model in a container might not be too hard, but operationalizing the end-to-end workflow is another matter. What results is a fragmented dynamic where parts of the production ML workflow are locked behind different domains of expertise, and it’s rare to find individuals who grasp the entirety of the solution. Any workflow requires a dance between different roles in an organization to make it into production, and this approach adds unnecessary barriers that make operationalizing work more complicated and slower than it needs to be. It’s likely unreasonable to expect data scientists to be skilled in production workflows or software developers to be ML experts, which means that many organizations are stuck with inefficient processes.
Despite all the various land mines set for ML practitioners, there is compelling evidence that some have managed to streamline their ML workflows and hit a reliable stride in getting ML use cases into production. Below we capture several insights from the field that are crucial for finding operational success with ML.
A common trait of every system I’ve encountered that has touted success at moving ML use cases into production is that it brings people into the ML workflow who understand the company’s business and data well. I think this will make intuitive sense to anyone who has worked in the data science field at any company. Data scientists tend to be earlier in their careers, with less knowledge of the inner workings of the company, less experience with the company’s data sets, and more concern over the quality of their data science work than the outcomes it drives for the business. Meanwhile, the company has likely employed other data workers -- business analysts, data analysts, data engineers, analytics engineers, etc. -- for years or decades. These people understand the business and the company’s data well, and they understand that the key to success is doing something that impacts the business.
The goal in constructing an operational AI system is to empower the users who understand our data the best. By doing so, we’re democratizing the ML workflow and allowing more users to come in and contribute their knowledge. When you design a system that lets participants work at their level, with their skill set, and benefit from automation to accelerate time to value, you’ll find it has a lot more participants than expected. By lowering the complexity of the system, we’re increasing its availability to a wider audience who can turn their years of experience and expertise into business insights.
This is not to say that data scientists are not useful. Far from it. In an ideal system, data scientists build a foundation for others to leverage, and they oversee and approve the work of less ML-savvy peers. Anyone who has worked on an ML use case knows that there are numerous trap doors along the path, from fighting data leakage and handling imbalanced data sets to removing bias and multicollinearity. Data scientists understand these problems well and should be able to remedy them easily and refresh workflows quickly. From the data scientist’s standpoint, the goal of the operational system should be to automate the work they would otherwise do to evaluate the performance and validity of models and data, to let them easily review and put their stamp of approval on a product, and to refresh workflows with minimal intervention.
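As a rough illustration of what that foundation can look like, here is a minimal sketch of automated training-data checks a data scientist might codify once so that less ML-savvy colleagues benefit from them on every run. It uses plain Python with pandas and NumPy; the function name and thresholds are hypothetical choices for this example, not any particular platform’s API.

```python
# Illustrative sketch only: generic training-data checks a data scientist might
# codify for their team. Thresholds and names are hypothetical, not any
# platform's API.
import numpy as np
import pandas as pd


def basic_training_data_checks(df: pd.DataFrame, target: str) -> list[str]:
    """Return warnings about common ML pitfalls found in a training table."""
    warnings = []

    # Class imbalance: flag targets where the minority class is very rare.
    class_shares = df[target].value_counts(normalize=True)
    if class_shares.min() < 0.05:
        warnings.append(
            f"Severe class imbalance: minority class is {class_shares.min():.1%} of rows"
        )

    # Possible target leakage: a feature that correlates almost perfectly with
    # the target usually means the answer has leaked into the inputs.
    numeric = df.select_dtypes("number").drop(columns=[target], errors="ignore")
    target_codes = pd.factorize(df[target])[0]
    for col in numeric.columns:
        corr = abs(np.corrcoef(target_codes, numeric[col].to_numpy())[0, 1])
        if corr > 0.98:
            warnings.append(
                f"Possible target leakage: '{col}' correlates {corr:.2f} with '{target}'"
            )

    # Multicollinearity: near-duplicate feature pairs destabilize many models.
    feature_corr = numeric.corr().abs()
    for i, a in enumerate(feature_corr.columns):
        for b in feature_corr.columns[i + 1:]:
            if feature_corr.loc[a, b] > 0.95:
                warnings.append(f"Highly correlated features: '{a}' and '{b}'")

    return warnings
```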
Not to be confused with the Pareto Principle, the 95/5 Rule states that you can categorize 95% of an organization’s ML use cases as known or solvable, and 5% as difficult or impossible. The 95% of use cases also encompasses many of the business-critical use cases like churn, forecasting, fraud, market optimization, personalization, etc. These are use cases nearly every business has and should have a strategy for operationalizing.
The “hard” part of these use cases is not figuring out which ML technique to apply – there are literally hundreds, if not thousands, of articles online that will walk you through the process – but rather, how to apply it to your business’s data. While the majority of ML systems focus on research-oriented development work, this manual approach to ML is ill-suited for the 95%. The goal should be to reduce friction in operationalizing these use cases as much as possible. Not only does this deliver value to the business faster, but it also frees up the data science team so they have more time to focus on the difficult 5% of problems, which actually do require a more hands-on approach.
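To underline how well-trodden the modeling step itself is, here is a minimal sketch of a churn model, assuming a flat customer table with a binary churn label already exists. The file and column names are hypothetical; the point is that the snippet below is the easy part, and everything it skips is the hard part.

```python
# Illustrative sketch only: the well-documented modeling step for churn,
# assuming a flat customer table with a binary `churned` column already exists.
# File and column names are hypothetical. The company-specific work --
# assembling and refreshing that table from your warehouse and shipping
# predictions back -- is what this snippet conveniently skips.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

customers = pd.read_csv("customer_features.csv")
X = customers.drop(columns=["customer_id", "churned"])
y = customers["churned"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
model = GradientBoostingClassifier().fit(X_train, y_train)
print("Holdout AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```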
The key takeaway here is that companies often spend most of their time/resources building AI systems that are actually best suited to tackling a mere 5% of their problems. It’s either inefficient or overkill for the vast majority of their use cases, and they should instead think more about how best to operationalize the 95% of use cases that are already pretty well known.
Our next tip is something to avoid instead of something to do: avoid creating Frankenstein solutions. Frankenstein solutions stitch together open-source and commercial tools to build a highly customized platform. These can look attractive at first, as they appear to cross off many checkboxes in your list of requirements, but in reality the technical debt associated with them, in terms of cross-component integration and pipeline construction, is too high for most companies. Your business should be focused on solving mission-critical use cases as quickly as possible, not troubleshooting Spark-on-Kubernetes jobs. Do more with less, and expect your data tooling to make your life easy -- this is essentially the credo of the modern data stack.
It is also becoming more difficult to work in the field and ignore what’s happening around compliance and regulation. If we read the tea leaves, it’s sensible to conclude that the industry will become more regulated over time, and there are probably not many (any?) modern countries that won’t impose some sort of restrictions on the use of AI. As someone who has previously worked with technology in highly regulated industries, I can confidently say that Frankenstein solutions are an absolute nightmare to work with once regulations go into effect. The last thing you want to worry about is whether you’ll be able to meet the demands of regulators when they come to audit your systems. Imagine being unable to reliably reproduce a model that was used in the past, or not fully understanding what data was used as input for certain predictions. This is just the tip of the iceberg, but when your operational stack is loosely stitched together, you can be subject to hefty penalties, not to mention bad press. It’s no coincidence that regulated industries often stick with end-to-end technologies, and I think the forward-looking data leader will see that and recognize what’s coming down the pipe.
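To make the reproducibility concern concrete, here is a minimal sketch of the kind of lineage metadata an auditable system should capture automatically for every training run. The field names here are hypothetical, and a real system would record far more.

```python
# Illustrative sketch only: the minimum lineage you'd want recorded for every
# training run so a past model can be reproduced and audited later. Field names
# are hypothetical; a real system would capture far more (environment, feature
# definitions, upstream data versions, approvals, and so on).
import hashlib
import json
import subprocess
from datetime import datetime, timezone


def training_run_record(data_path: str, params: dict, metrics: dict) -> dict:
    with open(data_path, "rb") as f:
        data_hash = hashlib.sha256(f.read()).hexdigest()
    code_version = subprocess.run(
        ["git", "rev-parse", "HEAD"], capture_output=True, text=True
    ).stdout.strip()
    return {
        "trained_at": datetime.now(timezone.utc).isoformat(),
        "training_data_sha256": data_hash,
        "code_version": code_version,
        "hyperparameters": params,
        "evaluation_metrics": metrics,
    }


# Hypothetical usage: persist the record alongside the model artifact.
record = training_run_record("customer_features.csv", {"n_estimators": 100}, {"auc": 0.91})
with open("training_run.json", "w") as f:
    json.dump(record, f, indent=2)
```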
As a discipline, software development has been around for decades, and there are many great practices from that field you can apply to your data team -- too many to list in a short article, in fact. Note that this is not to advocate treating machine learning as a software engineering problem; the two fields differ in many ways, and conflating them is a common mistake. Nonetheless, there is plenty that software engineers do well that machine learning teams can adopt as well, such as version control, code review, automated testing, and CI/CD.
We’ve provided a lot of tips on operationalizing AI in your organization without getting too prescriptive about the exact solution. At Continual, we believe this all leads to a future where declarative approaches to operational AI replace the pipeline jungles of today. It’s still early days, but declarative operational AI has transformative potential similar to that of declarative approaches to analytics (SQL and dbt), infrastructure (Terraform and K8s), and data integration (Fivetran, Census, or Hightouch). We’ve written more about this in a previous blog and launch post.
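To make “declarative” a little more concrete, here is a hypothetical sketch, expressed as plain Python data rather than Continual’s actual configuration format, of what declaring a model rather than scripting a pipeline could look like.

```python
# Hypothetical sketch of a declarative model definition, expressed here as plain
# Python data. This is NOT Continual's actual configuration format; it only
# illustrates the idea: declare *what* should be predicted and maintained, and
# let the platform own *how* -- feature retrieval, training, validation,
# deployment, and writing predictions back to the warehouse.
churn_model = {
    "entity": "customers",                 # table in the cloud data warehouse
    "target": "churned",                   # column to predict
    "exclude_columns": ["customer_id", "email"],
    "refresh_schedule": "@daily",          # retrain and re-score on a schedule
    "promotion_policy": "only_if_better",  # promote a new model only if it beats the current one
}
```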
So what’s Continual, exactly? Continual offers the first declarative operational AI platform co-designed with the modern data stack and built specifically for modern data and analytics teams. Unlike traditional ML engineering platforms, which require complex engineering to operationalize even simple models, Continual sits directly on top of your existing cloud data warehouse and provides a declarative workflow that empowers any data professional to go from ideation to production in a day or less. Unlike point-and-click AI tools, Continual understands the intricacies of operationalizing AI and doesn’t sacrifice any of the rigor of traditional data science practices. If you’re looking for a new approach to operationalizing AI that empowers your entire team, get in touch to start a trial of Continual and see exactly what we’re talking about.