Book Recommendation: Math for Deep Learning

Continuing to learn and grow is important for everyone. In the age of YouTube, podcasts, and other online material, formally written books can get pushed to the bottom of the list. I feel this happens most when a new technology trend grabs attention, as is currently the case with artificial intelligence (AI).

If you look up how to build an AI model, you will probably find a similar set of steps, from gathering data, through training a model, to the final fine-tuning of parameters before release to production. Almost every step can be done with a pre-canned set of libraries, quite possibly with a fully built model. Although you learn how to stand up an AI project efficiently, your understanding stays at the application level. The underlying knowledge of AI architecture is missing. That lower level of understanding becomes critical when you need to do more than follow the crowd.

The book I will be discussing is one volume that can help deepen an understanding of AI. In particular, it deals with Deep Learning, which uses multi-layered neural network models. With that, this is the book for review:

Math for Deep Learning

By Ronald T. Kneusel

ISBN-13: 978-1-7185-0190-4 (print)

ISBN-13: 978-1-7185-0191-1 (ebook)

Library of Congress Control Number: 2021939724

I found and read this book on the Apple Book Store. This allows me to use any of the iDevices and my laptop as a method to enjoy a good read.

Book Outline

Foreword
Acknowledgments
Introduction
Chapter 1: Setting the Stage
Chapter 2: Probability
Chapter 3: More Probability
Chapter 4: Statistics
Chapter 5: Linear Algebra
Chapter 6: More Linear Algebra
Chapter 7: Differential Calculus
Chapter 8: Matrix Calculus
Chapter 9: Data Flow in Neural Networks
Chapter 10: Backpropagation
Chapter 11: Gradient Descent
Appendix: Going Further
Index

This outline can look like a huge mountain of topics, and each chapter could be a lifetime of study on its own. Do not be put off by this, but know that you should have some introductory experience with each area of math referenced. The layout shows a progression from the core maths to the actual application where learning is done (for example, Gradient Descent uses derivatives from calculus to step toward a minimum of the loss, with the learning rate setting the size of each step). You might be able to skip some parts if you are an advanced mathematician, but I recommend you at least skim those pages.
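To make the Gradient Descent idea concrete, here is a minimal sketch in plain Python. It is not taken from the book; the function `f`, its derivative `df`, the starting point, and the learning rate are all hypothetical values picked for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the derivative to approach a minimum."""
    x = x0
    for _ in range(steps):
        # Move downhill: the learning rate scales the size of each step.
        x = x - learning_rate * grad(x)
    return x


def f(x):
    # Hypothetical example function with its minimum at x = 3.
    return (x - 3.0) ** 2


def df(x):
    # Derivative of f, worked out by hand with basic calculus.
    return 2.0 * (x - 3.0)


x_min = gradient_descent(df, x0=0.0, learning_rate=0.1, steps=100)
print(x_min)  # very close to 3.0
```

With these assumed values, each step shrinks the distance to the minimum by a constant factor. Setting the learning rate above 1.0 for this particular function makes the steps overshoot and grow instead, which is one reason the calculus behind the method is worth understanding.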

Beyond the Outline

Diving deeper into the material, the book certainly delivers on its title. Make no mistake, the different maths will be discussed. If that puts you off, you should be comforted to know that everything is presented with the purpose of understanding how to build your own learning model. This should actually excite many AI enthusiasts! How great is it to be able to discuss the topic in depth rather than rely on code you downloaded from a forum with a short explanation! Accomplishing that should build a bit of confidence in one’s own ability.

If you are someone who learns from example code, this is a great book for you! One very nice teaching instrument is the use of Python examples that demonstrate how the math works in practice. There are many code snippets, each with a respectable description, and many examples that use plotting libraries for useful visualizations. If you do follow along, keep a standard Python environment close by. Unless I was at my desk, I used Pythonista² on the iPad to do most of the exploration work.

*Example Plot from Book – Gradient Field*
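For readers who want to experiment, a gradient field can be approximated in a few lines of plain Python. This is my own rough sketch and not the book's code; the surface `f`, the grid range, and the step size `h` are assumptions chosen for illustration.

```python
def f(x, y):
    # Hypothetical surface chosen for illustration: a simple bowl.
    return x ** 2 + y ** 2


def numerical_gradient(func, x, y, h=1e-5):
    """Approximate (df/dx, df/dy) at (x, y) with central differences."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy


# Evaluate the gradient on a small 5 x 5 grid of points from -2 to 2.
points = [(float(x), float(y)) for x in range(-2, 3) for y in range(-2, 3)]
field = {p: numerical_gradient(f, *p) for p in points}

for (x, y), (gx, gy) in field.items():
    print(f"({x:+.0f}, {y:+.0f}) -> ({gx:+.2f}, {gy:+.2f})")
```

Each printed pair is the direction of steepest increase at that grid point. If a plotting library is available, the same values can be drawn as arrows, for example with Matplotlib's `quiver` function.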

Starting with the chapters on probability is an interesting choice. I applaud it, as probability is a topic where the student is already a practitioner without realizing it. Life is full of choices. We all want to be 100% right but rarely are. There are formulae in probability, but there is also an artistic aspect to using them properly. That ability to think in an abstract and mathematical way comes with experience.

The chapters build on one another, using just enough Linear Algebra, Differential Calculus, and Matrix Calculus to explain the topic of Deep Learning. The author seems to have thought this out nicely. As you get into the “Data Flow in Neural Networks” chapter, it starts to come together with neural network topics from the field of Computer Science. The final chapters show the purpose of building up the different maths first and then adding in the Comp Sci applications.

A very nice feature is the author's inclusion of references for further research. Even with the current level of advancement in AI, there is still a distance to go. There are numerous books to continue the study with, so do not skip those references!

What the Book is Not About

This book is not about the quick and dirty. That should seem obvious. You will need to spend some time with each chapter, sometimes reading it more than once. This is also not a book to get you up and running with your own version of a chatbot using a Large Language Model overnight. Nor will it show you the way using unproven approaches. There are other sources better geared toward that path.

Final Thoughts

The topic of AI is quite vast and goes back to numerous early scientific works³ and fantasy stories⁴. This area of study has great attention right now and garners quite a lot of investment money. The concern is a market bubble, where everyone with surface-level knowledge jumps in and does not grow. A lot of money is being spent with only small pockets of progress.

A book such as this should be on the reading list of every person interested in AI. You will need to spend some time reading, experimenting, running code, and thinking about what is happening. Like all exercise of the mind and body, you will benefit in some fashion from doing so. Never sit still and stagnate when it comes to learning and sharing your ideas. You’re too good to let that happen!

Thank you and Have a Happy New Year! May your 2024 be blessed!

References

[1] Ronald T. Kneusel. Math for Deep Learning. No Starch Press. 2022.

[2] OMZ Software. Pythonista. Retrieved from: http://omz-software.com/pythonista/

[3] Rockwell Anyoha. History of Artificial Intelligence. Harvard University. August 28, 2017. Retrieved from: https://sitn.hms.harvard.edu/flash/2017/history-artificial-intelligence/

[4] Wikipedia. Artificial intelligence in fiction. Retrieved from: https://en.wikipedia.org/wiki/Artificial_intelligence_in_fiction
