Boss n' Data Podcast Appearance - 28th Oct 2022
Book Review: Comet for Data Science
Another fun book review!
Shifa Ansari and Packt were kind enough to send me a copy of Angelica Lo Duca's 'Comet for Data Science' recently. I've been making my way through it and deep diving into the bits I'm particularly keen on. I think Comet is a great tool and a nice piece of the MLOps stack.
- Lots of detail, really helpful in reading through the examples
- Has some nice touches like the treatment of feature engineering steps in pipelines as their own 'models' you should version and track (I'm stealing this!).
- Chapters 1-5 are really great for classic data science workflows, including covering how Comet can help with those presentations on model performance. Chapter 5 is titled 'Building a narrative in Comet', which is really helpful.
- Chapters 6 and 7 have some really good sections on DevOps, MLOps and how you can use Comet with CI/CD processes in GitLab (I am also stealing this!). Also an introduction to Kubernetes which was great to see. To be honest these were my favourite chapters, lots of great stuff in here. Very good chapters for ML and MLOps engineers.
- Chapters 8-11 have nice worked-through examples to bring things to life, including examples with NLP and deep learning models.
Some other points
The way the chapters are split out (as I described above) does mean that if you are not a pure data scientist, you may not get much from the first parts of the book. Not a bad thing, just something to be aware of.
Taken together, I thought it was an extremely useful book to have on the shelf.
Original post here:
MLOps Live write up of episode
Neptune.AI did a cool write up of our episode from a while back, which you can see below.
Driven By Data Podcast Appearance - 10th May 2022
Machine Learning Engineering with Python - Published on 5th Nov!
Not long to go now until Machine Learning Engineering with Python is published on 5th November, which happens to be Bonfire Night in the UK. I look forward to seeing lots of fireworks and pretending it's to celebrate my book finally being out!
Ahead of publication I wanted to thank so many of you for your kind words of encouragement and for, of course, supporting the book. I really hope it does help people working in the machine learning space in some small way.
Ahead of publication, I wanted to share a little teaser from Chapter 1: Introduction to ML Engineering, where I talk about what I believe is important to consider when doing machine learning 'in the real world'. I hope you enjoy the snippet and that you enjoy the book!
"The majority of us who work in machine learning, analytics, and related disciplines do so for for-profit companies. It is important therefore that we consider some of the important aspects of doing this type of work in the real world.
First of all, the ultimate goal of your work is to generate value. This can be calculated and defined in a variety of ways, but fundamentally your work has to improve something for the company or their customers in a way that justifies the investment put in. This is why most companies will not be happy for you to take a year to play with new tools and then generate nothing concrete to show for it (not that you would do this anyway, it is probably quite boring) or to spend your days reading the latest papers and only reading the latest papers. Yes, these things are part of any job in technology, and especially any job in the world of machine learning, but you have to be strategic about how you spend your time and always be aware of your value proposition.
Secondly, to be a successful ML engineer in the real world, you cannot just understand the technology; you must understand the business. You will have to understand how the company works day to day, you will have to understand how the different pieces of the company fit together, and you will have to understand the people of the company and their roles. Most importantly, you have to understand the customer, both of the business and of your work. If you do not know the motivations, pains, and needs of the people you are building for, then how can you be expected to build the right thing?
Finally, and this may be controversial, the most important skill for you being a successful ML engineer in the real world is one that this book will not teach you, and that is the ability to communicate effectively. You will have to work in a team, with a manager, with the wider community and business, and, of course, with your customers, as mentioned above. If you can do this and you know the technology and techniques (many of which are discussed in this book), then what can stop you?"
Sun, Waves and Carbon - Quick Points on Energy in 2018
A quick look at some of the trends and issues to watch out for in 2018 in energy and environment:
Solar power has been expanding rapidly these past couple of years, with advances in technology and growing uptake meaning that it is now cheaper than traditional fossil fuels. In fact, the drop in price of solar (and wind) energy infrastructure has been such that a study in late 2017 pointed out that it is actually cheaper to install new solar and wind energy than it is to run (already built) coal and nuclear power plants.
Projections suggest that this positive trend will continue, with new global solar installations expected to exceed 100 GW for the first time ever in 2018 and China dominating demand.
I also have to mention that it looks like the first commercial distribution contract for perovskite solar cells (which my PhD is on) has been signed between Saule Technologies and Skanska group. This is an exciting step forward for a technology with huge potential which has continually had major questions asked about its commercialisation potential.
Although wind power often gets lumped into discussions about solar (I committed this same sin in the previous paragraph) it is important to note that the wind energy sector is different and has its own challenges and opportunities.
Among the opportunities are exciting developments like a report from the Global Wind Energy Council in Oct 2017, which suggested that wind could account for 20% of global energy capacity by 2030. There have also been some ambitious projects proposed like the super-sized North Sea wind farm Dutch Power are looking to build, with a possible generating capacity of 30GW and complete with its own artificial island.
I personally feel that the growth of offshore floating wind farms will be a very interesting trend. As explained in further detail in this report by Willis Towers Watson, a huge amount of potential wind energy is located above deep sea locations. For example, that report states that
"80% of the offshore wind resource in Europe is located in waters deeper than 60 meters and has a potential capacity of 4,000GW."
As a Scot, I'm also proud that the first offshore floating wind farm (a 30MW project run by Statoil) was opened off the North East coast of Scotland in Oct 2017. It will be exciting to see more projects like this come online in Scotland, Europe and the rest of the world in 2018.
Finally, as I am currently reading "Earth: The Sequel" by Fred Krupp and Miriam Horn, which is an interesting though slightly outdated look at different enterprises in renewable technology and argues heavily for a carbon cap and trade scheme, I thought I'd finish with a point on carbon trading in 2018.
As this brief article on the David Suzuki Foundation website highlights, there are two clear ways to incorporate environmental damage due to release of carbon dioxide (or indeed other greenhouse gas emissions) into our current economic model that do not require governments attempting to choose specific technologies to back through subsidies: a carbon tax or a so-called "cap and trade" scheme.
In a carbon tax model, every unit of CO2 emitted is taxed at some fixed rate. For example, in Sweden the rate is currently $150/tonne of CO2 emitted. This directly discourages the burning of fossil fuels and other activities which emit greenhouse gases and therefore stimulates the growth of the renewable sector. Of the two models, this is the easiest for governments to implement.
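To make the arithmetic concrete, here is a minimal sketch of a flat carbon tax. The function name and the example emissions figure are my own illustration; only the $150/tonne rate comes from the Swedish example above.

```python
def carbon_tax_bill(emissions_tonnes: float, rate_per_tonne: float = 150.0) -> float:
    """Tax owed under a flat carbon tax: every tonne of CO2 is charged at the same rate.

    The default rate is the $150/tonne Swedish figure quoted above; the function
    itself is a hypothetical illustration, not any government's actual formula.
    """
    if emissions_tonnes < 0:
        raise ValueError("emissions cannot be negative")
    return emissions_tonnes * rate_per_tonne


# An emitter releasing 1,000 tonnes of CO2 (an invented figure) at the default rate.
bill = carbon_tax_bill(1_000)
print(f"Tax owed: ${bill:,.0f}")
```

The point of the sketch is simply that the bill scales linearly with emissions, so every tonne avoided saves the same amount.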
In a cap and trade system, the government sets an 'emissions ceiling' which the entire economy must fall below. Emissions quotas are then divided out among potential polluters, for example through auction, and polluters cannot exceed these quotas. If they do, then they must buy pollution quotas from other parties who have spare quota to sell in order to cover the difference. As time progresses the government successively lowers the cap, thus reducing the overall emissions produced by the economy. This encourages polluters to reduce their carbon emissions so that they can potentially sell their remaining carbon quota on the market, the market of course also acting to set the price of these quotas. The cap and trade scheme allows businesses the freedom to choose how they reduce their emissions whilst also providing a direct economic incentive to do so.
Of these two I like the idea of cap and trade systems best, as the cap itself provides a certainty about the total amount of greenhouse gas emissions that will be emitted. To my mind this is the clearest way to lower carbon emissions in line with the reductions agreed to in the Paris Climate Agreement.
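The mechanics of cap and trade can be sketched in a few lines too. This is a toy model under my own assumptions: the function names, the linear cap reduction and all of the numbers are invented for illustration, and in a real scheme the quota price is set by the market rather than passed in as a fixed value.

```python
def compliance_cost(quota: float, emissions: float, price_per_tonne: float) -> float:
    """Cost of buying extra quota to cover any emissions above a polluter's own quota."""
    shortfall = max(0.0, emissions - quota)
    return shortfall * price_per_tonne


def spare_quota(quota: float, emissions: float) -> float:
    """Unused quota that a polluter could sell on to others."""
    return max(0.0, quota - emissions)


def lowered_caps(initial_cap: int, cut_per_year: int, years: int) -> list[int]:
    """Economy-wide cap for each year, assuming (hypothetically) a fixed annual cut."""
    return [max(0, initial_cap - cut_per_year * year) for year in range(years)]


# Two hypothetical polluters, each holding a 100 tonne quota, at an assumed $20/tonne price.
print(compliance_cost(quota=100, emissions=120, price_per_tonne=20))  # over quota: must buy
print(spare_quota(quota=100, emissions=80))                           # under quota: can sell
print(lowered_caps(initial_cap=1000, cut_per_year=100, years=3))
```

The over-quota polluter pays for its shortfall, the under-quota polluter has something to sell, and the shrinking cap is what guarantees total emissions fall over time.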
Due to my interest in this, it was exciting to hear that China are moving to introduce a carbon cap and trade scheme which it is believed could reduce their peak emissions timescale from 2030 to even earlier. Since China having its peak emissions by 2030 or earlier will be key to the world meeting the Paris Climate Agreement targets of less than 2 degrees warming above pre-industrial levels, I think this is great news.