The responsibilities of a data scientist or machine learning engineer can vary tremendously depending on your industry, the company you work for, the type of projects you typically work on and what stage of your career you are at. However, an important and, I believe, commonly overlooked skill that is key to master if you want to progress in your data science career is ‘teaching the client’.
I know that there will be many ways of interpreting what I mean by this, so I am going to focus on a very specific challenge often faced by data scientists who have to map out the problem with clients (either internal or external):
How do you communicate the ideas, concepts and potential utility of machine learning and AI to non-experts in a way that:
- Empowers them to make decisions based on fact and not hype,
- Helps them understand what is required to successfully implement machine learning and AI,
- Manages their expectations.
This is no easy task, but in this article I am going to share some of the things I have learned from discussing, designing and executing machine learning projects with a variety of clients and managers. Hopefully there will be something in my experience that you can apply to your own work!
WTF is AI, WTF is ML?
The best place to start with someone who has heard only tangentially about machine learning or AI is to try and define it for them. The key is not to go heavy on jargon, like these definitions from Wikipedia:
“In computer science AI research is defined as the study of “intelligent agents”: any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals. Colloquially, the term “artificial intelligence” is applied when a machine mimics “cognitive” functions that humans associate with other human minds, such as “learning” and “problem solving”.”
“Machine learning is a field of computer science that uses statistical techniques to give computer systems the ability to “learn” (e.g., progressively improve performance on a specific task) with data, without being explicitly programmed.”
That’s not to say there is anything wrong with these descriptions; they are just a bit … academic. One way I’ve tried to explain machine learning and AI to clients is by saying something like the following:
“You can think of the field of AI as simply the study of how we make computers behave, act and solve problems like humans or animals. Machine learning is a subset of AI where algorithms don’t have to be programmed with hard coded rules (‘if this is the case, then do that’) in order to solve a problem but they work out how to solve specific types of problem through exposure to data.”
Now admittedly this is quite similar to the above, but I feel that it’s a bit less intimidating for a non-scientist or someone new to the field (as most business managers and clients are likely to be!).
Another great tactic is to give examples or concrete use cases. For example, imagine you want to predict the number of umbrellas that you are going to sell in a shop because you want to make sure you have enough stock. You have a look at the data and it’s clear that when it rains, more umbrellas are sold. So you hard code a rule in your system that says ‘if weather forecast predicts rain, order 100 umbrellas’. That’s a very basic way to solve the problem, but a client will clearly understand this as your “baseline”.
You can then tell them that if you wanted to do this with machine learning, you would take the data and tell a machine learning algorithm that the number of umbrellas sold is the “target” and the other data are what can be used for predicting this target. The algorithm then takes in the data and produces a model, so that when you feed it data similar to what you’ve fed it before (for example the day of the week, the weather forecast and how many umbrellas you sold over the past week) it makes a prediction with a given accuracy. The first is you hard coding a solution; the second is a machine learning or data science solution.
This is a bit contrived, but I feel that a concrete example like this covers a lot of technical topics, like “targets”, “covariates/features”, “training/testing”, “lift” and even “autoregression” without splattering jargon everywhere. An example like this can be used in many contexts, and most people will get the transferability to any other prediction problem.
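If the person across the table is a little more technical, the same contrast fits in a few lines of code. What follows is only a minimal sketch: the sales figures are made up, the function names are hypothetical, the "order nothing when no rain is forecast" branch is my own assumption, and the "model" is deliberately the simplest thing that can be learned from data (the average sales for each forecast), not a real forecasting algorithm.

```python
from collections import defaultdict
from statistics import mean

# Made-up sales history: (forecast_says_rain, umbrellas_sold).
# The second value is the "target"; the first is the only "feature".
history = [
    (True, 90), (True, 110), (True, 100),
    (False, 10), (False, 20), (False, 30),
]


def rule_based_order(forecast_says_rain):
    """Baseline: a hand-written rule, as in the shop example.

    Ordering 0 umbrellas on dry days is an assumption made for this sketch.
    """
    return 100 if forecast_says_rain else 0


def fit(rows):
    """'Training': learn the average sales for each feature value."""
    sales_by_feature = defaultdict(list)
    for feature, target in rows:
        sales_by_feature[feature].append(target)
    return {feature: mean(sales) for feature, sales in sales_by_feature.items()}


def predict(model, forecast_says_rain):
    """'Prediction': look up what the model learned for this forecast."""
    return model[forecast_says_rain]


model = fit(history)
print("rule, dry day: ", rule_based_order(False))
print("model, dry day:", predict(model, False))
```

The point to draw out is that nobody typed the dry-day figure into `fit`: it came from the history. The hand-written rule orders nothing on a dry day, while the learned model has noticed that a few umbrellas still sell. With more features (day of the week, last week's sales) the lookup would be replaced by a proper learning algorithm, but the shape of the idea stays the same.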
Can we get one of them AlphaGo’s? Managing expectations
The next thing you have to do is educate your client or partner on the reality of all these cool capabilities. A lot of people will have seen AlphaGo beat Lee Sedol, or computer vision software successfully count the number of faces in a crowd, and think that their own problem must be easy. That is very rarely the case.
To manage expectations, just tell the client what you think is feasible given what you know about their data and setup. If you have successfully completed projects with similar starting points before then you are onto a winner, since you can draw quite heavily on this experience and use it as an excellent example. Don’t panic if you haven’t, though; managing expectations is still very doable.
First, always make sure people are aware that machine learning is really centred around doing one of the following:
- Classifying something — telling you what something is,
- Predicting something — telling you what is likely to happen,
- Grouping something — pointing out things which are similar.
The optional 4th point to add to this list is ‘Solving something — acting intelligently to achieve a goal’ (read ‘reinforcement learning’), but if the client is new to ML this could create more confusion than is necessary at this point.
Using these stripped-back and simplified definitions of classification, regression and clustering, the client can hopefully see a bit more clearly what ML is actually useful for, and rein in their expectations accordingly. They can also then hopefully see (with your guidance) that if an algorithm is good at classifying something (e.g. computer vision counting faces), it doesn’t mean the same algorithm can ‘predict something’ (forecast umbrella sales). This helps to highlight the fact that it’s ‘horses for courses’ when it comes to ML and there is ‘no free lunch’.
Secondly, always bring it back to the ultimate goal, which is to solve a given (business) problem for the minimum amount of investment (time, energy, money). If it is not going to be necessary to predict the number of umbrellas to 99% accuracy every day, then that level of accuracy isn’t even worth thinking about. Reiterate the Pareto principle that in many cases ‘80% of the effects arise from 20% of the causes’, so past a certain point you’ll only get small gains for a lot more effort.
Finally, to help rein in expectations, it is sometimes important to highlight that when some company makes a big announcement about some new amazing machine learning system, they are only showing you the sparkling whites and not their dirty laundry. Any of these highly publicised systems (computer vision as a service systems from cloud providers are one particular case) will have areas where it doesn’t apply, can produce erroneous results, and will often have been the result of an army of people with a lot of resources focussed on producing this one particular tool. This doesn’t mean you can’t solve your client’s problem; it just means that they have to be aware that machine learning projects are like any other project: more ambitious goals will require more resources. It’s that simple.