After many months of writing, developing, designing and researching after work and at weekends, my book is now available for pre-order on Amazon!
I'm really excited as I've always wanted to write books and I had set myself a goal of writing at least one book by the time I was 30. So as long as the book is fully available for order in October as planned I would say challenge completed (my birthday is at the end of November).
Why did I write this book?
I have been working for a few years in a variety of roles in data science and machine learning (ML), and my career has now strongly went in the direction of focussing on productionization. A totally made up word that we all now use to mean 'taking ML proof-of-concepts and making them into working software solutions'. I've found this to be the hardest problem in industrial data science and machine learning, or at least the hardest problem that comes up often enough to justify focussing on it.
So given this focus and how much I know teams can struggle to understand some of the things they need to go from cool idea or draft model through to working solution, I decided to write this book. It's by no means perfect, but I hope it's a good collection of some of the ideas, tools and techniques that I think most important when it comes to ML engineering.
What's in it?
The book consists of 8 chapters:
- Introduction to ML Engineering
- The Machine Learning Development Process
- From Model to Model Factory
- Packaging Up
- Deployment Patterns and Tools
- Scaling Up
- Building an Example ML Microservice
- Building an Extract Transform Machine Learning Use Case
The book will kick off with more strategic and process directed thinking. This is where I talk about what I think ML engineering means and what is so different about building ML solutions vs traditional programming.
We then move onto learning about how to create models again and again by building training services and then how to monitor models for important changes like concept or data drift. I then discuss strategies for triggering retraining of your models and how this all ties together.
Moving on there's more of an emphasis about some important foundational pieces like how to create good Python packages that wrap your ML functionality and what architecture patterns you can build against.
The last piece of the book focusses on deep dives on a specific topic covers mechanisms for scaling up your solution, with a particular focus on Apache Spark and serverless infrastructure on the Cloud.
Finally, the book finishes with 2 chapters on worked examples that bring together a lot of what's been discussed earlier in the book, with a particular focus on how to make the relevant choices to be successful when executing a real-life ML engineering project.
First of all, I'm just super excited this is real. As I mentioned at the top of the article this has been a dream of mine for a long time and I think the topics are important ones to discuss. Hopefully the book can help people in data science, software development, machine learning and analytics roles be successful. That would make me happy too.
In terms of what's next I'm thinking that it would also be beneficial to expand the topics of the book into an online course. That way, people who would like the material structured in a slightly different way can also get the benefit. It would also allow me to expand on some of the material a bit more, giving a more conversational flavour to the material. I like the sound of that, hopefully you do too!
All in all, I just hope people enjoy the book and get benefit from it. I benefited immensely from technical books in this space when I was starting out (and I still do) so I'm really glad I can make my own small contribution to that body of learning.