Role of Mathematics in Machine Learning

Wasif Ekbal
7 min read · Jun 28, 2021

Today, we are going to understand the significance of maths in machine learning.

Machine Learning is a field at the intersection of statistics, probability, computer science and algorithms, arising from learning iteratively from data and finding hidden insights that can be used to build intelligent applications. Despite the immense possibilities of Machine Learning, a thorough mathematical understanding of many of these techniques is necessary for a good grasp of the inner workings of the algorithms and for getting good results.

There are many reasons why the mathematics of Machine Learning is important and I will highlight some of them below:

  1. Selecting the right algorithm, which includes giving consideration to accuracy, training time, model complexity, number of parameters and number of features.
  2. Choosing parameter settings and validation strategies.
  3. Identifying under-fitting and over-fitting by understanding the bias-variance trade-off.
  4. Estimating the right confidence interval and uncertainty.
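As an illustration of point 3, here is a minimal synthetic sketch (NumPy; all data is invented here) of how training error keeps falling as model complexity grows, while test error exposes over-fitting:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a smooth function: y = sin(x) + noise
x_train = rng.uniform(0, 3, 20)
y_train = np.sin(x_train) + rng.normal(0, 0.1, 20)
x_test = rng.uniform(0, 3, 20)
y_test = np.sin(x_test) + rng.normal(0, 0.1, 20)

def mse(degree):
    # Fit a polynomial of the given degree; report train and test error
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

for d in (1, 4, 9):
    train_err, test_err = mse(d)
    print(f"degree {d}: train MSE {train_err:.4f}, test MSE {test_err:.4f}")
```

Degree 1 under-fits (high bias), while a high degree chases the noise (high variance); the gap between train and test error is what the trade-off is about.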

What Level of Maths Do You Need?

The main question when trying to understand Machine Learning is how much maths is necessary, and at what level, to understand these techniques. The answer is multidimensional and depends on the level and interest of the individual.

Research into mathematical formulations and the theoretical advancement of Machine Learning is ongoing, and some researchers are working on more advanced techniques. Below, I will state the minimum level of mathematics needed to be a Machine Learning Scientist/Engineer and the importance of each mathematical concept.

1. Linear Algebra: In ML, Linear Algebra comes up everywhere and plays the most important role. Its concepts are a crucial prerequisite for understanding the theory behind Machine Learning, and they will help you make better decisions during a Machine Learning system’s development. So if you really want to be a professional in this field, you will have to master the parts of Linear Algebra that are important for Machine Learning. In Linear Algebra, data is represented by linear equations, which are written in the form of matrices and vectors. Therefore, you are mostly dealing with matrices and vectors rather than with scalars.

In ML, both matrices and vectors are used to represent the data you work with. Take tabular data as an example: you can represent the whole table as a matrix, and a single column as a vector. A colour photograph can be represented as three stacked matrices, one each for the red, green and blue pixel values, and a video is simply a sequence of such images. The main goal is to make it easy for an analyst to deal with the data.
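The idea above can be made concrete with NumPy (the shapes and values below are purely illustrative):

```python
import numpy as np

# A single feature column (e.g. heights) is a vector
heights = np.array([1.70, 1.82, 1.65])           # shape (3,)

# A table of rows x columns is a matrix
table = np.array([[1.70, 68.0],
                  [1.82, 77.5],
                  [1.65, 60.2]])                  # shape (3, 2)

# A colour photo is three stacked matrices: one per channel (R, G, B)
image = np.zeros((128, 128, 3), dtype=np.uint8)   # height x width x channels

# A video adds a time axis: a sequence of such images
video = np.zeros((24, 128, 128, 3), dtype=np.uint8)

print(heights.shape, table.shape, image.shape, video.shape)
```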

2. Probability: Most people have an intuitive understanding of degrees of probability and know that the probability of an event is a value between 0 and 1 that indicates how likely the event is to occur. Seems simple enough. Some of the fundamental Probability Theory needed for ML includes probability rules and axioms, Bayes’ Theorem, random variables, variance and expectation, conditional and joint distributions, standard distributions (Bernoulli, Binomial, Multinomial, Uniform and Gaussian), moment generating functions, Maximum Likelihood Estimation (MLE), and prior and posterior distributions.

If you are really interested in ML, then you already know that the Naive Bayes algorithm is built directly on Bayes’ Theorem. For example, suppose you have to say whether a person earning between 4L and 5L is a man or a woman, based on previous data where, in that income range, 60% are men and 40% are women; then you use probability to predict the result.
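A minimal sketch of Bayes’ Theorem in plain Python. The conditional probabilities and base rates below are hypothetical numbers invented for illustration:

```python
# Hypothetical inputs: P(income band | sex) and base rates P(sex)
p_band_given_man = 0.30
p_band_given_woman = 0.20
p_man, p_woman = 0.5, 0.5

# Law of total probability: overall chance of observing the income band
p_band = p_band_given_man * p_man + p_band_given_woman * p_woman

# Bayes' theorem: P(man | band) = P(band | man) * P(man) / P(band)
p_man_given_band = p_band_given_man * p_man / p_band
print(round(p_man_given_band, 3))  # 0.6
```

Naive Bayes applies exactly this inversion, feature by feature, under an independence assumption.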

3. Statistics: Statistics is a field of mathematics that is universally agreed to be a prerequisite for a deeper understanding of machine learning. Although statistics is a large field with many esoteric theories and findings, the nuts-and-bolts tools and notation taken from the field are required by machine learning practitioners. With a solid foundation in what statistics is, it is possible to focus on just the relevant parts.

Statistics in Data Preparation: Statistical methods are required in the preparation of train and test data for your machine learning model. This includes techniques for:

  • Outlier detection.
  • Missing value imputation.
  • Data sampling.
  • Data scaling.
  • Variable encoding.
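A small sketch of some of these preparation steps in NumPy. The toy feature values, the median-imputation choice and the IQR outlier fences are illustrative conventions, not fixed rules:

```python
import numpy as np

# Toy feature with a missing value (NaN) and one extreme value
x = np.array([4.0, 5.0, np.nan, 6.0, 50.0])

# Missing value imputation: replace NaN with the median of observed values
x_imputed = np.where(np.isnan(x), np.nanmedian(x), x)

# Outlier detection: flag points outside the 1.5 * IQR fences
q1, q3 = np.percentile(x_imputed, [25, 75])
iqr = q3 - q1
outliers = (x_imputed < q1 - 1.5 * iqr) | (x_imputed > q3 + 1.5 * iqr)

# Data scaling: min-max scale the remaining points to [0, 1]
inliers = x_imputed[~outliers]
scaled = (inliers - inliers.min()) / (inliers.max() - inliers.min())
print(scaled)
```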

Statistics in Model Evaluation: Statistical methods are required when evaluating the skill of a machine learning model on data not seen during training. This includes techniques for:

  • Data sampling.
  • Data re-sampling.
  • Experimental design.
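A minimal NumPy sketch of hold-out splitting and k-fold re-sampling (the dataset size, split ratio and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10
indices = rng.permutation(n)  # shuffle before splitting

# Hold-out split: 80% train, 20% test
split = int(0.8 * n)
train_idx, test_idx = indices[:split], indices[split:]

# k-fold re-sampling: each point serves as test data exactly once
k = 5
folds = np.array_split(indices, k)
for i, fold in enumerate(folds):
    train = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # fit the model on `train` and score it on `fold` here
```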

Statistics in Model Selection: Statistical methods are required when selecting a final model or model configuration to use for a predictive modeling problem. These include techniques for:

  • Checking for a significant difference between results.
  • Quantifying the size of the difference between results.

This might include the use of statistical hypothesis tests.
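For example, a paired t-statistic can be computed by hand to compare two models evaluated on the same folds (the accuracy numbers below are hypothetical):

```python
import numpy as np

# Hypothetical cross-validation accuracies of two models on the same 5 folds
model_a = np.array([0.81, 0.79, 0.84, 0.80, 0.82])
model_b = np.array([0.78, 0.77, 0.80, 0.79, 0.78])

# Paired differences: same folds, so compare fold by fold
d = model_a - model_b
effect_size = d.mean()  # quantifies the size of the difference

# Paired t-statistic: mean difference relative to its standard error
t = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
print(f"mean difference {effect_size:.3f}, t-statistic {t:.2f}")
```

A large t-statistic (here compared against the t distribution with 4 degrees of freedom) suggests the difference is unlikely to be noise.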

Statistics in Model Presentation: Statistical methods are required when presenting the skill of a final model to stakeholders. This includes techniques for:

  • Summarising the expected skill of the model on average.
  • Quantifying the expected variability of the skill of the model in practice.

This might include estimation statistics such as confidence intervals.
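A sketch of a normal-approximation confidence interval for average skill (the scores are simulated here purely for illustration):

```python
import numpy as np

# Hypothetical accuracies from 30 repeated evaluation runs
rng = np.random.default_rng(1)
scores = rng.normal(0.85, 0.02, 30)

mean = scores.mean()
se = scores.std(ddof=1) / np.sqrt(len(scores))  # standard error of the mean

# Approximate 95% confidence interval (normal approximation, z = 1.96)
lo, hi = mean - 1.96 * se, mean + 1.96 * se
print(f"accuracy {mean:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```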

Statistics in Prediction: Statistical methods are required when making a prediction with a finalised model on new data. This includes techniques for:

  • Quantifying the expected variability for the prediction.

This might include estimation statistics such as prediction intervals.
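A sketch of a rough prediction interval built from held-out residuals (all numbers are invented, and a normal approximation is assumed):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical residuals of a finalised regression model on held-out data
residuals = rng.normal(0.0, 1.5, 200)
sigma = residuals.std(ddof=1)

# A new point prediction from the model
y_hat = 10.0

# Approximate 95% prediction interval: estimate +/- 1.96 * residual std
lo, hi = y_hat - 1.96 * sigma, y_hat + 1.96 * sigma
print(f"prediction {y_hat:.1f}, 95% interval [{lo:.2f}, {hi:.2f}]")
```

Note the difference from a confidence interval: this quantifies uncertainty about a single new prediction, not about the average skill.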

4. Calculus: Calculus is an important field in mathematics and it plays an integral role in many machine learning algorithms. If you want to understand what’s going on under the hood in your machine learning work as a data scientist, you’ll need a solid grasp of the fundamentals of calculus. Calculus is divided into differential and integral calculus.

Differential Calculus cuts something into small pieces to find how it changes.

Integral Calculus joins (integrates) the small pieces together to find how much there is.
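Both ideas can be approximated numerically in a few lines of plain Python (a central difference and a midpoint Riemann sum; the step sizes are arbitrary choices):

```python
def derivative(f, x, h=1e-6):
    # Differential calculus: rate of change via a central difference
    return (f(x + h) - f(x - h)) / (2 * h)

def integral(f, a, b, n=10_000):
    # Integral calculus: total amount via a midpoint Riemann sum
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

f = lambda x: x ** 2
print(derivative(f, 3.0))       # close to 6.0 (exact: 2x = 6)
print(integral(f, 0.0, 3.0))    # close to 9.0 (exact: x^3 / 3 = 9)
```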


I hope you have understood the basics of differentiation and integration. A basic but very instructive example of calculus in Machine Learning is Gradient Descent.

Let us consider that we have a dataset of users with their marks in some subjects and their occupations. Our goal is to predict the occupation of a person given their marks.

In this dataset we have the data of John and Eve. With John’s and Eve’s data as a reference, we have to predict the profession of Adam.

Now think of the marks in each subject as defining the slope, and the profession as the target at the bottom. You have to optimise your model so that the result it predicts at the bottom is accurate. Using John’s and Eve’s data, we run gradient descent and tune our model so that if we enter John’s marks it predicts Doctor at the bottom of the descent, and likewise for Eve. This is our trained model. Now, if we give our model the marks in each subject, we can easily predict the profession.

Modelling and computing gradient descent requires calculus, and now we can see the importance of calculus in machine learning.
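A minimal gradient-descent sketch in plain Python, fitting a toy line (the data, learning rate and iteration count are invented for illustration); the update rule uses exactly the derivatives that calculus provides:

```python
# Toy data with true relationship y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b, lr = 0.0, 0.0, 0.05  # start at zero, small learning rate
for _ in range(2000):
    # Gradients of MSE = mean((w*x + b - y)^2) with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    # Step downhill, against the gradient
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # converges near 2.0 and 1.0
```

Each step moves the parameters a little way down the error surface, which is why the derivative (the slope of that surface) is indispensable.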

I hope you have understood the importance of mathematics in Machine Learning. If you have anything to say or run into any issue, just post in the comments below. I will be back with another interesting blog. ✌✌

Till then…. Happy coding :)

And Don’t forget to clap clap clap…
