Learn mathematics for data science and machine learning.

Improve your skills by learning using code and visualizations.

Learn with code
Practical examples
Visualizations to get more insights
No math background? We start from the basics.
Look inside!

Buy before

29 February 2025

and benefit from a great reduction!

Use the offer code FEB2025

The Book

You get the book in various electronic versions

PDF/EPUB

€40

€25 with the code FEB2025!

Complete Code

You get the book and access to the private repo with all the code

PDF/EPUB
Access to the private repo with the complete code (notebooks)
Ask your questions as issues in the repo

€50

€35 with the code FEB2025!

SATISFACTION GUARANTEED

My content has helped thousands of people improve their skills and knowledge in data science and machine learning.

However, if the book is not a good fit for you, give me your feedback within 30 days of your purchase and you'll get a refund.

For Developers
Learn the math with code examples. Get your hands dirty and get more insights.
For Data Scientists
Boost your DS and ML skills as a practitioner by better understanding what's under the hood.
For Students
Get all the math you need for Data Science in one place.

Do you need math?

The great libraries in the data science and machine learning ecosystem allow you to dive into the field without knowing much about the theory. I think that this top-down approach is a great way to start: take real data and run some algorithms.

Then, a lot of questions arise like: "Why don't I get the expected prediction performance on this dataset?", "How can I adapt this algorithm to my specific use case?" At this point, the lack of theory can be a limit to your skill growth. The solution is to dive a bit more into the theory and sharpen your understanding.

In this book, I'll introduce you to math concepts specifically chosen to deepen your understanding of data science and machine learning. I can assure you that even a first exposure to mathematical thinking will clarify your view of the field.

Because this book is written for people without a deep math background (e.g. junior data scientists or developers moving into data science), the approach is simple: no jargon and more insights. You'll only need high school math, even if it's a bit rusty.

Some prior exposure to Python (ideally with the NumPy library) will allow you to make the most of this book. The level of programming skill needed is a bit higher in the hands-on projects at the end of each chapter.

Here are a few examples of why math is useful in the context of data science and machine learning:

- Understand the differences between algorithms and which tool is best in what situation.

- Make the most of machine learning libraries like scikit-learn and be able to understand their documentation.

- Avoid misinterpreting the results of your analyses.

- Debug models that are not converging by diagnosing the issue.

- Create custom prediction functions and cost functions, allowing you to adapt the algorithm to the problem you want to solve.

... and a lot more!

Learn the Math You'll Need

It is crucial for your learning path to target the math concepts that you'll use in data science and machine learning. The topics selected in this book will give you exactly that. And at the right level of detail.

If you try to read math textbooks about topics like linear algebra, you might find them difficult and not so motivating because the level of detail is not necessarily suited to data science and machine learning.

Calculus
Learn about the two core Calculus concepts: derivatives and integrals.
Statistics and Probability
From probability distributions to Bayesian statistics and information theory.
Linear Algebra
From scalars and vectors to eigendecomposition and Singular Value Decomposition.

Inside the Book

If you enter the field of data science and machine learning without a math background you might feel overwhelmed by the underlying concepts.

It is hard to find resources that target exactly the math you'll need in data science and machine learning: you don't want to become a mathematician, but you do want to better understand the concepts of data science.

In Essential Math for Data Science, I emphasize intuition over proofs and theorems. This is why visualizations and code are so useful in this context.

Inside. The book is designed to help you learn using code, visualizations and practical examples. The purpose is to give insights instead of proofs and theorems. You'll see that code is a great way to experiment and gain more intuition about the theory.
400+ pages
200+ figures
300+ code blocks
10 hands-on projects
Readers. You'll get the book in various versions (PDF/EPUB) with no DRM (Digital Rights Management), allowing you to read it anywhere. Since most e-readers don't support MathJax or MathML, the equations have been converted to images. Feel free to ask for a specific version compatible with your e-reader. I also recommend reading the book in color for the illustrations.
Notebooks. If you choose to get the complete code, you'll also have access to the notebooks containing all the code. This is a great way to run the code while you read the book and be sure that you follow along and get all the steps (e.g. check the shape of the matrices, do interactive plots in the notebook, etc.).
Hands-on. You'll find one hands-on project at the end of each chapter. The goal is to show you how the math concepts relate to practical applications and to illustrate the theoretical notions through examples.
Updates. You'll get all future releases and updates of the book and the associated content.

Table of Contents

Explore the complete table of contents to see exactly what's inside the book. You'll find here all the core concepts needed for data science and machine learning.

PART 1. Calculus

In this first chapter, you'll see the basics of calculus: derivatives and integrals. These notions are important to understand core machine learning concepts like gradient descent and model performance estimation (area under the curve).
Area under the curve
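
To give a flavor of how derivatives and integrals show up in code, here is a minimal sketch (not taken from the book). It assumes a toy function f(x) = (x - 3)^2, a hand-picked learning rate, and a simple trapezoid rule; the function names are illustrative only.

```python
import numpy as np

def f(x):
    # Toy function with its minimum at x = 3 (an assumption for this sketch)
    return (x - 3.0) ** 2

def df(x):
    # Derivative of f, computed by hand
    return 2.0 * (x - 3.0)

def gradient_descent(x0, learning_rate=0.1, n_steps=100):
    # Repeatedly step in the direction opposite to the derivative
    x = x0
    for _ in range(n_steps):
        x = x - learning_rate * df(x)
    return x

x_min = gradient_descent(x0=0.0)

# Area under f between 0 and 3, approximated with the trapezoid rule
xs = np.linspace(0.0, 3.0, 1001)
ys = f(xs)
area = np.sum((ys[:-1] + ys[1:]) / 2.0 * np.diff(xs))

print(x_min, area)
```

The derivative tells gradient descent which way is downhill, while the integral gives the area under the curve (here the exact value is 9, which the approximation should be close to).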

Ch01. Calculus: Derivatives and Integrals

1.1 Derivatives
1.2 Integrals And Area Under The Curve
1.3 Hands-On Project: Gradient Descent

PART 2. Statistics and Probability

To make sense of data, you have to deal with the uncertainty coming from the data itself, from data that you don't have, or from the inherent stochasticity of your system. The goal of statistics and probability is to provide a framework to deal with this uncertainty.
Gaussian Distributions
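
As a rough illustration of dealing with uncertainty through simulation, the sketch below (not from the book) draws many samples from a uniform distribution and looks at the distribution of their means, in the spirit of the central limit theorem. The sample sizes and the seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_experiments = 10_000  # number of repeated samples (arbitrary choice)
sample_size = 50        # observations per sample (arbitrary choice)

# Each row is one sample drawn from a uniform distribution on [0, 1)
samples = rng.uniform(0.0, 1.0, size=(n_experiments, sample_size))
sample_means = samples.mean(axis=1)

mean_of_means = sample_means.mean()
std_of_means = sample_means.std()

# Theory: a uniform(0, 1) variable has standard deviation sqrt(1/12),
# so the sample mean should have standard deviation sqrt(1/12) / sqrt(n)
expected_std = np.sqrt(1.0 / 12.0) / np.sqrt(sample_size)

print(mean_of_means, std_of_means, expected_std)
```

Plotting a histogram of `sample_means` should show a bell-shaped curve even though the underlying data is uniform.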

Ch02. Statistics and Probability Theory

2.1 Descriptive Statistics
2.2 Random Variables
2.3 Probability Distributions
2.4 Joint, Marginal, and Conditional Probability
2.5 Cumulative Distribution Functions
2.6 Expectation and Variance of Random Variables
2.7 Hands-On Project: The Central Limit Theorem

Ch03. Common Probability Distributions

3.1 Uniform Distribution
3.2 Gaussian Distribution
3.3 Bernoulli Distribution
3.4 Binomial Distribution
3.5 Poisson Distribution
3.6 Exponential Distribution
3.7 Hands-On Project: Waiting for the Bus

Ch04. Bayesian Statistics and Information Theory

4.1 Bayes’ Theorem
4.2 Likelihood
4.3 Information Theory
4.4 Hands-On Project: Bayesian Inference

PART 3. Linear Algebra

Linear algebra is a central topic in data science and machine learning. In these chapters, you'll learn the major concepts of vector spaces that you'll need to understand machine learning algorithms more deeply. We'll start from the basics of vectors and matrices and finish with matrix decompositions like eigendecomposition and Singular Value Decomposition.
L1 Regularization. Effect of Lambda.
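
For a taste of what matrix decomposition looks like in practice, here is a small sketch (not from the book) using NumPy's `np.linalg.svd` on a random matrix. The matrix size, the seed, and the rank `k` are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
A = rng.normal(size=(6, 4))  # arbitrary 6x4 matrix for the example

# Thin SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_rebuilt = U @ np.diag(s) @ Vt

# Keep only the k largest singular values to get a low-rank approximation
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Frobenius norm of the approximation error
error = np.linalg.norm(A - A_k)

print(A.shape, U.shape, s.shape, Vt.shape)
print(error, np.sqrt(np.sum(s[k:] ** 2)))
```

The two printed error values should match: by the Eckart-Young theorem, the Frobenius error of the rank-k truncation equals the square root of the sum of the squared discarded singular values.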

Ch05. Scalars and Vectors

5.1 What Are Vectors?
5.2 Operations and Manipulations on Vectors
5.3 Norms
5.4 The Dot Product
5.5 Hands-On Project: Regularization

Ch06. Matrices and Tensors

6.1 Introduction
6.2 Operations and Manipulations on Matrices
6.3 Matrix Product
6.4 Special Matrices
6.5 Hands-On Project: Image Classifier

Ch07. Span, Linear Dependency, and Space Transformation

7.1 Linear Transformations
7.2 Linear Combination
7.3 Subspaces
7.4 Linear Dependency
7.5 Basis
7.6 Special Characteristics
7.7 Hands-On Project: Span

Ch08. Systems of Linear Equations

8.1 System of Linear Equations
8.2 System Shape
8.3 Projections
8.4 Hands-On Project: Linear Regression Using Least Squares Approximation

Ch09. Eigenvectors and Eigenvalues

9.1 Eigenvectors and Linear Transformations
9.2 Change of Basis
9.3 Linear Transformations in Different Bases
9.4 Eigendecomposition
9.5 Hands-On Project: Principal Component Analysis

Ch10. Singular Value Decomposition

10.1 Nonsquare Matrices
10.2 Expression of the SVD
10.3 Geometry of the SVD
10.4 Low-Rank Matrix Approximation
10.5 Hands-On Project: Image Compression

About the Author


Hadrien Jean holds a Ph.D. in cognitive science and currently works as a machine learning scientist at iAudiogram (My Medical Assistant SAS).

He wrote a series of tutorials as notes on the Deep Learning Book by Ian Goodfellow, helping thousands of people learn math for machine learning.

He also works on speech processing and leads projects on biodiversity assessment using deep learning applied to audio recordings.

He also teaches machine learning and deep learning in data science bootcamps at Le Wagon.

Frequently Asked Questions

What are the prerequisites in terms of math and code to make the most of this book?

In terms of math, the book is designed for people without a deep math background. However, you should have some knowledge of basic algebra. For instance, I assume that you understand what equations and mathematical variables are.

In terms of code, this book relies on code to help you gain insights and mathematical intuition. If you don't have any programming experience, you'll have trouble following along with the examples. You could still skip these sections and focus on the plots and text, but it is better if you can understand a bit of Python and NumPy.

What will I find in the book that is not on your blog?

I will share around 25% of the book as excerpts on my blog. The goal is to allow readers to check if the book is a fit or not.

I found another book called "Essential Math for Data Science" on Amazon. Is it the same book?

Yes. I started this book with the publisher O'Reilly but our paths diverged and the project was aborted. I wanted this book to approach the theoretical concepts needed for data science. The good reviews from readers who got the early release convinced me to publish it myself.

Due to delays in updating their listings, some book retailers like Amazon (e.g. Amazon France) still offer pre-orders of "Essential Math for Data Science" even though it will not be available there. I'm very sorry about this confusion.

Can I get help if I have trouble understanding the material?

Yes, if you buy the "Complete Code" version you get access to the private GitHub repository and you'll be able to ask any questions as issues. I'll be there and will do my best to help you along your learning path.

Will you release a paperback version?

For now, there is no plan to release a paperback version of the book. However, tell me (contact@essentialmathfordatascience.com) if this is something you would like to have and I'll reconsider the option if many people are interested.

To what extent does the content relate to data science and machine learning?

The math that you need for data science and machine learning has been carefully selected in "Essential Math for Data Science". However, the goal of the book is not to explain the machine learning algorithms themselves but to introduce you to the math you'll need to understand them. These math topics are quite general but the approach is focused on data science: e.g. the hands-on projects at the end of each chapter, the practical examples, the visualizations, etc.

Will I get the next updated versions for free?

Yes. You'll keep access to the project at any time (the book that you can download on Gumroad and the notebooks on the private GitHub repository) and you'll get any future updates.

Is the payment secure?

Yes. I use Gumroad to process the payment and deliver the book and the access to the GitHub repo (if you buy the Complete Code version).

Do I need to have a GitHub account to buy the Complete Code version?

Yes. With the Complete Code version, you'll get access to a private GitHub repository where you'll find the book in the form of Jupyter notebooks. These notebooks (one per chapter) contain all the code and all the text of the book. To get access, you'll need a GitHub account. Be sure to provide your correct GitHub username (for instance, mine is "hadrienj") in the purchase form, and not your email address. If you don't have a GitHub account, you can create one easily!

What is Haliotis-Publishing?

I wrote this book as Jupyter notebooks and found how useful it is to have text and code blocks with outputs like plots. I thus developed a set of tools to convert notebooks into PDFs and ebooks and founded Haliotis-Publishing to allow people to write well-formatted content using notebooks.