Mathematics for Machine Learning

January 12, 2021

We need the equivalent of high school mathematics to understand the concepts used in Machine Learning (ML), which draw on linear algebra, probability, statistics, and multivariate calculus. A background in these areas is necessary to understand ML concepts and algorithms.

Introduction

Is it necessary to understand the mathematics behind ML?

Absolutely.

Machine learning is all mathematics. ML is built on mathematical prerequisites. The math helps you understand why some models are better than others.

The chart below shows the importance of each mathematical concept needed to master ML.

Figure: The importance of each mathematical concept needed for ML

The breakdown can be summarized as follows:

  • Linear Algebra (35.0%)
  • Probability and Statistics (25.0%)
  • Calculus (15.0%)
  • Algorithms (15.0%)
  • Others (10.0%)

This article is not intended to go into details about the mathematical calculations in the various topics. Instead, it’s meant to give you a general overview of the math topics behind ML and how they are used.

Let’s dive into it.

Linear algebra

Linear algebra covers a significant part of the mathematical concepts used for Machine Learning. It is a mathematics sub-domain dealing with linear systems of equations and the way they are represented in vector spaces and through matrices.

The essential topics in linear algebra for understanding the methods used in machine learning include:

Some applications of linear algebra in ML

  • Principal Component Analysis (PCA) is used in machine learning for dimensionality reduction. It reduces high-dimensional data to lower-dimensional data, which is often easier to analyze.
  • Singular Value Decomposition (SVD) is a matrix factorization commonly used in data processing. It is used in machine learning for data compression and dimensionality reduction.
  • Latent Semantic Analysis is used in the Natural Language Processing (NLP) domain.
  • In Convolutional Neural Networks (CNNs), linear algebra helps us apply transformations to inputs such as images. An image is represented as an array of pixel values, and convolution operations are performed on that array.
  • Eigendecomposition is used in Principal Component Analysis (PCA).
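To make the link between eigendecomposition and PCA concrete, here is a minimal sketch in plain Python. It assumes two-dimensional data so that the eigenvalues of the covariance matrix can be found in closed form. The function names and the sample points are hypothetical and chosen only for illustration; real projects would normally use a numerical library instead.

```python
import math

def covariance_matrix_2d(points):
    """Return the mean and the 2x2 sample covariance matrix of (x, y) points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    return (mean_x, mean_y), [[sxx, sxy], [sxy, syy]]

def top_eigenpair_2x2(m):
    """Largest eigenvalue and a unit eigenvector of a symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    # Eigenvalues of [[a, b], [b, d]] solve l^2 - (a + d) * l + (a * d - b^2) = 0.
    eigenvalue = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    if b != 0:
        # (b, eigenvalue - a) solves (M - eigenvalue * I) v = 0.
        vx, vy = b, eigenvalue - a
    else:
        # Diagonal matrix: the eigenvectors are the coordinate axes.
        vx, vy = (1.0, 0.0) if a >= d else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return eigenvalue, (vx / norm, vy / norm)

def pca_to_1d(points):
    """Project 2D points onto their first principal component."""
    mean, cov = covariance_matrix_2d(points)
    eigenvalue, direction = top_eigenpair_2x2(cov)
    scores = [
        (p[0] - mean[0]) * direction[0] + (p[1] - mean[1]) * direction[1]
        for p in points
    ]
    # Share of the total variance captured by the first component.
    explained = eigenvalue / (cov[0][0] + cov[1][1])
    return scores, direction, explained

# Hypothetical data lying roughly along the line y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
scores, direction, explained = pca_to_1d(data)
print("principal direction:", direction)
print("explained variance ratio:", explained)
print("1D coordinates:", scores)
```

The eigenvector with the largest eigenvalue points along the direction of greatest variance, so keeping only the coordinate along it turns each 2D point into a single number while losing little information.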

Multivariate calculus

Calculus is an important field in mathematics. It is used in machine learning to study the rate of change of quantities, such as the slope of a curve. Single-variable calculus deals with functions that take one input and return one output. Multivariate calculus deals with functions that take multiple input variables and return either a single output or multiple outputs.
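As a small illustration of a function with multiple inputs and a single output, the sketch below computes its partial derivatives (the gradient) in two ways: analytically and with a finite-difference approximation. The function f and the helper names are hypothetical examples, not taken from any library.

```python
def f(x, y):
    """A function of two inputs with one output: f(x, y) = x^2 * y + 3 * y."""
    return x ** 2 * y + 3 * y

def analytic_gradient(x, y):
    """Partial derivatives worked out by hand: df/dx = 2xy, df/dy = x^2 + 3."""
    return [2 * x * y, x ** 2 + 3]

def numerical_gradient(func, point, h=1e-5):
    """Approximate each partial derivative with a central difference."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h   # nudge only the i-th input up
        backward[i] -= h  # and down, holding the others fixed
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad

point = (2.0, 3.0)
print("analytic: ", analytic_gradient(*point))
print("numerical:", numerical_gradient(f, point))
```

Each partial derivative measures how the output changes when one input moves and the others stay fixed. Comparing the two results is a common way to check a hand-derived gradient.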

There are a couple of topics in calculus that are essential for ML.

They include:

Some applications of calculus in ML

  • Optimization techniques such as Gradient Descent, Adam, RMSProp, and AdaDelta use calculus to search for the minima or maxima of a function. They are guaranteed to find a global optimum only in special cases, such as convex problems.
  • Backpropagation is the main algorithm used to train neural networks. It relies on calculus concepts such as the chain rule and partial derivatives.
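The two ideas above can be sketched together on the smallest possible network: a single sigmoid neuron with one weight and one bias. This is an illustrative toy, not how a deep learning framework implements backpropagation. The input, target, learning rate, and function names are all assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    """Neuron output y = sigmoid(w * x + b)."""
    return sigmoid(w * x + b)

def loss(w, b, x, target):
    """Squared error between the neuron output and the target."""
    return (forward(w, b, x) - target) ** 2

def gradients(w, b, x, target):
    """Chain rule: dL/dw = dL/dy * dy/dz * dz/dw, and likewise for b."""
    y = forward(w, b, x)
    dL_dy = 2 * (y - target)   # derivative of the squared error
    dy_dz = y * (1 - y)        # derivative of the sigmoid
    dL_dz = dL_dy * dy_dz
    return dL_dz * x, dL_dz    # dz/dw = x, dz/db = 1

def train(x, target, w=0.0, b=0.0, learning_rate=0.5, steps=200):
    """Plain gradient descent: step against the gradient to reduce the loss."""
    for _ in range(steps):
        dw, db = gradients(w, b, x, target)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

x, target = 1.5, 0.8  # hypothetical training example
print("loss before training:", loss(0.0, 0.0, x, target))
w, b = train(x, target)
print("loss after training: ", loss(w, b, x, target))
```

Backpropagation in a real network applies the same chain rule layer by layer. Optimizers such as Adam and RMSProp change how the step is scaled, but they still rely on these gradients.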

Probability and Statistics

Probability is the study of the measure of uncertainty. There is a need to quantify uncertainty in the real world, as the information we work with is usually incomplete. Probability helps us model that uncertainty, e.g., the probability of a user paying back a bank loan based on past transaction information.

It is important to note that the probabilities of all possible, mutually exclusive outcomes always sum to 1. Statistics, on the other hand, is a discipline in applied mathematics that involves collecting, analyzing, interpreting, and presenting data.
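The loan example can be sketched with simple frequency counts. The records below are invented purely for illustration, and the function names are hypothetical. Fractions are used so that the sum-to-1 property can be checked exactly.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical past loans: (had_late_payments, outcome). Invented data.
past_loans = [
    (False, "repaid"), (False, "repaid"), (False, "repaid"), (False, "defaulted"),
    (True, "repaid"), (True, "defaulted"), (True, "defaulted"), (False, "repaid"),
]

def outcome_probabilities(records):
    """Estimate the probability of each outcome as its relative frequency."""
    counts = Counter(outcome for _, outcome in records)
    total = sum(counts.values())
    return {outcome: Fraction(count, total) for outcome, count in counts.items()}

def conditional_probabilities(records, had_late_payments):
    """Estimate outcome probabilities given the customer's payment history."""
    subset = [r for r in records if r[0] == had_late_payments]
    return outcome_probabilities(subset)

overall = outcome_probabilities(past_loans)
print("overall:", overall)
print("sum of probabilities:", sum(overall.values()))
print("given late payments:", conditional_probabilities(past_loans, True))
print("given no late payments:", conditional_probabilities(past_loans, False))
```

Conditioning on past behavior changes the estimate, which is the basic idea behind using transaction history to predict repayment.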

The topics in Probability and Statistics essential for understanding the methods used in machine learning include:

Some applications of Probability and Statistics

  • Maximum Likelihood Estimation (MLE) provides a framework for predictive modeling in machine learning. It finds the parameters of a probability distribution that maximize the likelihood function, that is, the parameters under which the observed data are most probable. It is commonly used to train models in ML techniques such as linear regression and logistic regression.
  • Sampling is a method used in probability and statistics. In machine learning, datasets often contain a lot of noise and bias. Sampling helps reduce this problem by drawing samples from different areas, for example, instead of only from one specific area, which may be biased and contain a lot of noise. Good sampling therefore gives better coverage of the problem domain.
  • Probability forms the foundation for specific algorithms such as the Naive Bayes classifier.
  • Standard distributions, such as the commonly used Gaussian distribution, make up a large part of the field of statistics. These distributions provide functions to calculate the probability (or probability density) of an observation in a sample space.
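As a hedged sketch of maximum likelihood estimation with a Gaussian distribution: for Gaussian data, the parameters that maximize the likelihood have a closed form, namely the sample mean and the variance computed with a divisor of n. The synthetic data, the seed, and the function names below are assumptions made for this example.

```python
import math
import random

def gaussian_pdf(x, mean, variance):
    """Probability density of a Gaussian distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)

def gaussian_log_likelihood(samples, mean, variance):
    """Log of the product of the densities of all samples."""
    n = len(samples)
    squared_error = sum((s - mean) ** 2 for s in samples)
    return -0.5 * n * math.log(2 * math.pi * variance) - squared_error / (2 * variance)

def gaussian_mle(samples):
    """Closed-form maximum likelihood estimates of the mean and variance."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n  # MLE divides by n, not n - 1
    return mean, variance

# Synthetic data drawn from a Gaussian with mean 5 and standard deviation 2.
rng = random.Random(0)
samples = [rng.gauss(5.0, 2.0) for _ in range(1000)]

mean_hat, variance_hat = gaussian_mle(samples)
print("estimated mean:", mean_hat)
print("estimated variance:", variance_hat)
print("log-likelihood at the estimate:", gaussian_log_likelihood(samples, mean_hat, variance_hat))
print("log-likelihood at a worse guess:", gaussian_log_likelihood(samples, 4.0, 9.0))
```

Any other choice of mean and variance gives a log-likelihood no higher than the one at the estimate, which is what "maximum likelihood" means.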

Algorithms

Algorithms are sets of instructions that enable a computer program to combine different sources of information and produce a result. Understanding how algorithms work is essential for finding the best ways to scale our ML algorithms and to exploit sparsity in our datasets, e.g., cases where most values are zero or some ranges of values have no data.
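As one small illustration of exploiting sparsity, the sketch below stores only the non-zero entries of a vector in a dictionary and computes a dot product by visiting only those entries. The representation and function names are assumptions chosen for this example. Libraries for scientific computing provide more efficient sparse formats.

```python
def to_sparse(dense):
    """Keep only the non-zero entries, keyed by their position."""
    return {index: value for index, value in enumerate(dense) if value != 0}

def sparse_dot(a, b):
    """Dot product that only visits the entries of the smaller vector."""
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b[index] for index, value in a.items() if index in b)

dense_a = [0, 0, 3, 0, 0, 0, 2, 0]
dense_b = [1, 0, 4, 0, 0, 0, 0, 5]

sparse_a = to_sparse(dense_a)
sparse_b = to_sparse(dense_b)
print("stored entries:", len(sparse_a), "and", len(sparse_b), "out of", len(dense_a))
print("dot product:", sparse_dot(sparse_a, sparse_b))
```

The work done depends on the number of non-zero entries rather than on the full length of the vectors, which is why sparse representations help algorithms scale to large datasets.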

Below is a list of the necessary algorithm topics needed to start with ML:

Others

This section includes topics that are not covered by the four main mathematical concepts, but are still essential to understand machine learning.

These topics include:

Resources

Here is a collection of books and videos that’ll get you started on your journey to understand the math used for machine learning.

3Blue1Brown is a great YouTube channel. It explains mathematical concepts in an entertaining way by using animations. It's easy to understand and follow, especially for beginners.

Feel free to visit their website: https://www.3blue1brown.com if you want to ask questions, share interesting mathematical concepts, or discuss videos.

The book Mathematics for Machine Learning covers the mathematical foundations of present-day machine learning. It presumes that the reader has mathematical knowledge at least equivalent to that of a high school graduate.

The book covers Linear Algebra, Analytic Geometry, Vector Calculus, and Probability and Distributions.

This course is hosted on YouTube by Coursera. It was designed to help you quickly build an intuitive understanding of the linear algebra required for standard machine learning techniques.

You can visit their YouTube channel for more information.

The course is offered by two instructors, Dr. Sam Cooper and Dr. David Dye.

The course offers an introduction to the multivariate calculus used in machine learning. It covers machine learning algorithms such as backpropagation, a standard algorithm that relies on calculus to train neural networks.

Wrapping up

That’s it! That’s the mathematics used for machine learning. Yes, you can do machine learning without the math, but you won’t understand what you’re doing. Spare some time and learn the math if you want to understand machine learning in depth. I’ve compiled a list of book and video resources that will help you explore the math further. I hope you’ve found this article helpful.

As a starter, this book is a great resource published by Cambridge University Press. The book is for people who don’t have a mathematics degree but want to understand enough math to build and deploy ML algorithms. The book will help you master the fundamentals of mathematics and how they are used in ML. And it’s free. Feel free to check it out.

References

  1. Mathematics for Machine Learning
  2. Linear Algebra
  3. 3Blue1Brown
  4. Mathematics for Machine Learning - Linear Algebra
  5. Mathematics for Machine Learning - Multivariate Calculus
  6. Principal Component Analysis (PCA)

Peer Review Contributions by: Lalithnarayan C


About the author

Willies Ogola

Willies Ogola is pursuing his Master’s in Computer Science at Hubei University of Technology, China. His research focuses on Artificial Intelligence and Embedded Systems. He likes researching during his free time and is passionate about technology.

This article was contributed by a student member of Section's Engineering Education Program. Please report any errors or inaccuracies to enged@section.io.