Common reasons data science projects fail and how PMs save the day.
In 2019, Gartner predicted that 80% of AI projects wouldn’t scale to provide business value; in other words, that most data science projects would fail. While numerous other posts have offered differing reasons for data science failure, they mask the underlying cause: the lack of a data science product manager.
Most project failures in any business can be boiled down to two main causes: a failure of personnel or a failure of process (although failures of process are typically caused by failures of personnel). As Ray Dalio stated in Principles, “Getting the right people in the right roles in support of your goal is the key to succeeding at whatever you choose to accomplish.” Data science projects are no different. In this article I discuss some of the common reasons data science projects fail and how data science product managers are key to preventing these failures.
You Can’t Fail if You Don’t Try
Often business leaders in large organizations don’t have the technical expertise needed to identify valid data science projects. An overworked, overtired business leader identifies a semi-technical problem with a good amount of data behind it. Or, maybe they’re looking to automate an extremely repetitive process like visual quality control, and they immediately think that it’s a good data science project.
In projects that fail, the business leader asks the data science team to work on it, and the team jumps at the opportunity to be useful without appropriately scoping the project. Or, the data science team scopes the project, but since they lack a clear understanding of the business problem and the value they’re trying to unlock, they agree to a project with a much lower-value solution in mind. Either way, the project produces underwhelming results, and everyone is upset.
A good data science product manager (DSPM) prevents project failure by rejecting data science projects that are doomed before they even start. The DSPM can explain to the business leader, without being overly technical, why a problem is technically difficult to solve. If a data science solution is possible, the DSPM can also clarify what business value the business leader expects to unlock and ensure that the solution actually provides that value. In this way, the DSPM heads off failures before any work begins.
Keep the Scope from Creeping
So your team managed to align with the business leader on a project that is viable. You spend a month or two working on it and are nearing the finish line when the business leader says, “Wait, just one more thing…” Before you know it, another few months have gone by and the end is nowhere in sight. The problem you’re trying to solve now looks nothing like the original problem, and chances are any progress you’ve made reminds you a little bit of a Jenga tower: haphazard and ready to collapse on itself at any second.
What happened? Your scope just crept. A good data science product manager prevents this from happening by setting up a strict scope of work and holding the business leader accountable to that scope. If the requirements change, the DSPM decides whether or not the project is still viable. Good scoping and consensus on project outcomes up front, another critical part of the DSPM’s role, should prevent most scope creep in the first place.
Where’s the Data?
Great, so you actually have a well-scoped problem that, at least from a technical standpoint, isn’t too difficult. You start to work on it… until you realize that there’s no labeled training data, or if there is, it’s tucked away in a dozen different databases and even Excel files. Yikes. Are you really going to ask data scientists to run around the office to various departments, asking who owns which databases and what the best way to pull X is? Do the data scientists even know where to look or who to ask? Again, this is where a good data science product manager comes in to save the day. Since the DSPM is connected to the business, they already have a good idea of where to look and who to ping. The DSPM can schedule the meetings and wrangle the data, keeping the data scientists focused on coding, training algorithms, and creating value for the business.
Show Me the Money
In all of these examples, the data science product manager’s underlying goal is to ensure that the data science team is providing business value. Focusing on and prioritizing business value is a skill, and often one that data scientists (particularly junior data scientists) lack.
When companies recruit data scientists, they typically look for applicants coming straight from PhD and Master’s programs. While academics want to work on tough, intellectually and technically challenging problems, they are rarely faced with answering the question: what underlying value makes this problem worth solving? In fact, academics are often more likely to value a problem based on its intrinsic difficulty than on any sort of extrinsic value. (I’m not saying this is a problem with academia, only a reality of the difference between academia and business.)
To put it simply, the people most companies hire to be data scientists are more interested in solving hard problems than actually providing business value.
Data science product managers, on the other hand, are more interested in solving valuable problems, and prioritize easy, ‘boring’ problems that are valuable over high-risk, ‘sexy’ moonshots. Your data science projects are much less likely to fail when you have a data science product manager ensuring that you’re actually working on valuable, solvable problems rather than a theoretical proof of concept that your team wants to present at a conference next year.*
*That’s not to say that I don’t think presenting at conferences is valuable too, only that you should focus on providing business value and then write a paper about it, instead of working on an academic pursuit and then trying to justify it by finding business value.
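As a rough illustration of the prioritization described above, here is a minimal Python sketch that ranks a backlog by expected value per month of effort. The scoring formula, the `Project` fields, and every number in the backlog are hypothetical assumptions made up for this example, not a method prescribed in this article; a real DSPM would substitute their own estimates and criteria.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    annual_value: float         # estimated business value per year (made-up units)
    success_probability: float  # rough chance the project delivers, between 0 and 1
    effort_months: float        # estimated team effort in months

    @property
    def expected_value_per_month(self) -> float:
        # Assumed scoring rule: risk-adjusted value divided by effort.
        return self.annual_value * self.success_probability / self.effort_months


def prioritize(projects):
    """Return projects sorted from highest to lowest expected value per month."""
    return sorted(projects, key=lambda p: p.expected_value_per_month, reverse=True)


# Hypothetical backlog: one moonshot and two 'boring' but valuable projects.
backlog = [
    Project("moonshot research prototype", 2_000_000, 0.1, 18),
    Project("automate weekly churn report", 150_000, 0.9, 1),
    Project("visual quality-control classifier", 600_000, 0.5, 6),
]

ranked = prioritize(backlog)
for project in ranked:
    print(f"{project.name}: {project.expected_value_per_month:,.0f} per month")
```

Under these made-up numbers the small reporting project ranks first and the moonshot last, even though the moonshot has the largest headline value. The point of the sketch is only that writing the estimates down makes the trade-off explicit.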
Are We Even Speaking the Same Language?
Data science product managers make data science projects more successful by acting as translators between the data science team and business stakeholders. When data science projects fail, it’s often due to a breakdown in communication between the data science team and its stakeholders: just look at the examples above. If the data science team doesn’t properly understand the problem they’re trying to solve, and the business value they’re trying to unlock, then they’re setting themselves up for failure. Similarly, if the business leader doesn’t understand how difficult the problem is from a technical perspective, or can’t explain the business problem in a way the data science team can understand, then the project is doomed.
A good data science product manager is a translator: explaining difficult technical concepts in a way business stakeholders can understand, while also being able to convey business goals and problems to someone who has very little domain knowledge.
Who Makes a Good Data Science Product Manager?
So, you should start writing a job description for a data science product manager immediately, right? Well, maybe. The fact is that if you’re working on data science projects, you already have someone (or multiple people) performing the duties of a DSPM, just without the credit, accountability, or expertise. Often a data science manager or business analyst will play the role of the DSPM, but particularly in smaller companies or startups the data scientists end up pulling double duty.
If you are at a smaller company, being a data science product manager isn’t necessarily a full-time role, but it should be an explicit part of someone’s responsibilities, with clear expectations for performing duties related to data science product management.
If you decide you should hire a full-time DSPM, here are some of the characteristics you should look for:
- A technical understanding of data science concepts. They don’t need to be able to code in Python, but they should understand which models are most applicable to certain situations (e.g., CNNs for image classification). They should have a reasonable intuition for the quantity and quality of training data needed for the problems their team is working on.
- Enough knowledge to propose workarounds or alternatives.
- The ability to articulate technical concepts to a nontechnical audience.
- The ability to prioritize opportunities based on business value.
- The ability to communicate why projects are important or valuable to technical teams that might not have the immediate business context.
- Intermediate SQL skills so that they’re able to pull and evaluate datasets on their own.
- An understanding of how the technical problems the data science team is solving will solve business problems.
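To make the SQL bullet above concrete, here is a small sketch of the kind of query a DSPM might run to check how much labeled data exists before committing the team to a project. It uses Python’s built-in `sqlite3` module with an in-memory database so it runs anywhere; the `inspections` table, its columns, and the rows are all hypothetical stand-ins for whatever your company’s data actually looks like.

```python
import sqlite3

# In-memory database with a hypothetical table of quality-control inspections.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inspections (id INTEGER PRIMARY KEY, plant TEXT, label TEXT)"
)

# Made-up rows: a NULL label means nobody has labeled that inspection yet.
rows = [
    ("A", "pass"), ("A", "fail"), ("A", None),
    ("B", None), ("B", None), ("B", "pass"),
]
conn.executemany("INSERT INTO inspections (plant, label) VALUES (?, ?)", rows)

# COUNT(*) counts every row; COUNT(label) counts only rows with a non-NULL label.
query = """
SELECT plant,
       COUNT(*)                                  AS total_rows,
       COUNT(label)                              AS labeled_rows,
       ROUND(100.0 * COUNT(label) / COUNT(*), 1) AS pct_labeled
FROM inspections
GROUP BY plant
ORDER BY plant
"""
coverage = conn.execute(query).fetchall()

for plant, total_rows, labeled_rows, pct_labeled in coverage:
    print(f"plant {plant}: {labeled_rows}/{total_rows} labeled ({pct_labeled}%)")
```

A grouped count like this is about the level of SQL the bullet has in mind: enough to answer “do we have the training data?” without pulling a data scientist off their work.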
In summary, a data science product manager needs to be able to straddle data science and the rest of the business, to have a foot in each camp and live in both worlds simultaneously.
Data science projects can fail for many reasons; solving the wrong problems, failing to gather the appropriate training data, and scope creep are just some of them. Luckily, all of these problems are preventable if you have your data science team set up correctly, and if you have the right person successfully performing the duties of a data science product manager.