Bias and variance are among the most fundamental concepts in machine learning. Before trying to understand them, it is important to go through how training is done in machine learning. The balance between bias and variance is critical, as achieving it is key to building models that generalize well to new data.
Let us paint a scenario: you want to predict your package (CTC) after you get placed as a Machine Learning engineer, given a dataset of your alumni, which consists of the following features:
Now we would select an algorithm (let us say Linear Regression, for the sake of simplicity) and try to fit a line to all these data points: a model (the line) that understands the patterns and, when provided with a new data point, gives us a rough idea of our CTC.
The algorithm will try different lines until the MSE (Mean Squared Error), also known as the training loss, is at its minimum. This training error is one part of the total error rate of the model. (There are other types of errors and loss functions as well.) In this case, the MSE would be calculated as:

MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²

where n is the number of data points, yᵢ is the actual package, and ŷᵢ is the package predicted by the line.
In each pass the MSE is calculated, and in the next pass the line's parameters are updated to reduce it; for how this optimization is done, visit this link.
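As a minimal sketch of this pass-by-pass loop (using made-up alumni numbers and a hand-picked learning rate, not a real dataset), gradient descent repeatedly nudges the line's slope and intercept to shrink the MSE:

```python
import numpy as np

# Illustrative alumni data (invented for this example): CGPA vs package in LPA
cgpa = np.array([6.5, 7.0, 7.8, 8.2, 8.9, 9.4])
package = np.array([4.0, 5.5, 6.0, 8.0, 9.5, 12.0])

def mse(y_true, y_pred):
    """Mean Squared Error: average squared distance of points from the line."""
    return np.mean((y_true - y_pred) ** 2)

w, b = 0.0, 0.0                       # start with a flat line at zero
mse_before = mse(package, w * cgpa + b)

lr = 0.005                            # learning rate, chosen by hand here
for _ in range(20000):                # each iteration is one "pass"
    y_pred = w * cgpa + b
    w -= lr * (-2 * np.mean((package - y_pred) * cgpa))  # dMSE/dw
    b -= lr * (-2 * np.mean(package - y_pred))           # dMSE/db

mse_after = mse(package, w * cgpa + b)
print(mse_before, mse_after)          # the training loss drops substantially
```

The exact learning rate and iteration count are arbitrary; in practice libraries handle this tuning for you.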
Now that we have a rough idea of how training is done, let us understand what bias is. Taking the same problem as above, let us use only one input feature, CGPA (because I am human and can only think in 3D and draw in 2D). This is what the CGPA vs. Package graph might look like:
Now we apply Linear Regression to this data and get an accuracy of only 50%. Why did this happen? The model was simply unable to capture the patterns in the data (its complexity was too low). This is called underfitting, and it is why we did not get accurate predictions. An underfitted linear regression would look like:
Bias measures how well the model matches the training data. A high bias means the model makes strong assumptions about the data, leading to poor performance on the training data. This is often due to an overly simple model. Conversely, a low bias indicates the model closely matches the training data.
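A quick sketch of high bias in action (with a synthetic curved CGPA-package relationship, not real alumni data): a straight line fit to curved data leaves a large training error, while a model whose complexity matches the pattern does not:

```python
import numpy as np

# Synthetic curved relationship: package grows faster at higher CGPA
cgpa = np.linspace(5, 10, 20)
package = 0.5 * cgpa ** 2 - 2 * cgpa + 3

# A straight line (degree 1) underfits this curve: high bias
line_fit = np.polyval(np.polyfit(cgpa, package, 1), cgpa)
# A quadratic (degree 2) matches the underlying pattern
curve_fit = np.polyval(np.polyfit(cgpa, package, 2), cgpa)

mse_line = np.mean((package - line_fit) ** 2)
mse_curve = np.mean((package - curve_fit) ** 2)
print(mse_line, mse_curve)  # the line's training error is far larger
```

No matter how long you train the straight line, its simplistic assumption caps how well it can do; that irreducible gap on the training data is the bias.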
For a more statistical explanation of bias, let Y be the true value of a parameter, and let Ŷ be an estimator of Y based on a sample of data. Then, the bias of the estimator Ŷ is given by:
Bias(Ŷ) = E(Ŷ) - Y
where E(Ŷ) is the expected value of the estimator Ŷ over different samples of data. This equation captures the systematic error introduced by the model's assumptions.
Low Bias: Low bias value means fewer assumptions are taken to build the target function. In this case, the model will closely match the training dataset.
High Bias: High bias value means more assumptions are taken to build the target function. In this case, the model will not match the training dataset closely.
Variance measures the spread of data from its mean. In machine learning, variance is the amount by which a predictive model's behaviour changes when it is trained on different subsets of the training data; in other words, it measures how sensitive the model is to the particular sample it was trained on.
Let us go back to our example. Suppose the model we just trained gets an accuracy of 99%; do not jump out of your chair yet! That accuracy is on the training set. If we then evaluate the model on data it has not seen and the accuracy comes out to, say, 30%, this is called overfitting: the model fit the training data so well that it failed to generalize the patterns, and so it fails on new data (model complexity is too high). The delicate balance between bias and variance is crucial to avoid such scenarios. In our case, an overfitted regression model would look like:
High variance means the model is very sensitive to changes in the training data, so the estimated target function can change significantly when the model is trained on different subsets drawn from the same distribution. This is overfitting: the model performs well on the training data but poorly on new, unseen test data, because it fits the training data so closely that it fails to generalize. The resulting gap between training and test performance is called generalization error, and it leads to inaccurate predictions.
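A minimal sketch of this train-versus-test gap (synthetic data and an arbitrary seed): a degree-14 polynomial can pass through every one of 15 noisy training points, yet it performs far worse on held-out points from the same distribution than a simple line does:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 30)
y = 1.5 * x + rng.normal(0.0, 0.3, 30)   # noisy linear ground truth
x_tr, y_tr = x[:15], y[:15]              # training half
x_te, y_te = x[15:], y[15:]              # held-out half

def train_test_mse(degree):
    model = Polynomial.fit(x_tr, y_tr, degree)  # rescales x internally
    tr = np.mean((y_tr - model(x_tr)) ** 2)
    te = np.mean((y_te - model(x_te)) ** 2)
    return tr, te

tr_simple, te_simple = train_test_mse(1)    # low complexity
tr_wiggly, te_wiggly = train_test_mse(14)   # interpolates the noise
print(tr_wiggly, te_wiggly)  # near-zero training error, large test error
```

The wiggly model "memorizes" the noise in its 15 training points; the same wiggles are exactly what ruin it on the held-out half.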
Low variance means the model is less sensitive to changes in the training data and produces consistent estimates of the target function across different subsets drawn from the same distribution. When paired with high bias, this is the underfitting case, where the model fails to generalize on both training and test data. Low variance by itself simply indicates consistent model performance across different training sets.
Var[f(x)] = E[f(x)²] − (E[f(x)])²
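This identity is easy to verify numerically on an arbitrary toy dataset:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Var[X] = E[X^2] - (E[X])^2
var_by_formula = np.mean(x ** 2) - np.mean(x) ** 2
var_by_numpy = np.var(x)   # population variance (divides by n)
print(var_by_formula, var_by_numpy)  # both give 4.0 for this dataset
```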
The goal is to find the right level of model complexity that minimizes both bias and variance, achieving good generalization to new data. This balance, where the model reaches its lowest possible total error, is known as the Bias-Variance Tradeoff.
In terms of model complexity, we can use the following diagram to decide on the optimal complexity of our model.
To reduce high bias, one can use a more complex model, increase the number of features, reduce regularization, or increase the size of the training data. These methods aim to better capture the underlying patterns in the data, thereby improving model performance.
To reduce high variance, techniques such as cross-validation, feature selection, regularization, ensemble methods, simplifying the model, and early stopping can be employed. These methods help in building a model that generalizes well to new, unseen data.
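One of those techniques, regularization, can be sketched directly in terms of the definition of variance above: refit a model on many training sets drawn from the same distribution and measure how much the learned weights move. The ridge penalty value λ = 10 below is hypothetical, chosen only for illustration:

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge regression: w = (X'X + lam*I)^-1 X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def weight_spread(lam, trials=500):
    """Total variance of learned weights across resampled training sets."""
    rng = np.random.default_rng(2)   # same datasets for every lam value
    weights = []
    for _ in range(trials):
        X = rng.normal(size=(20, 5))
        y = X @ np.ones(5) + rng.normal(size=20)  # true weights are all 1
        weights.append(fit_ridge(X, y, lam))
    return np.var(weights, axis=0).sum()

# Stronger regularization -> model is less sensitive to the training sample
print(weight_spread(0.0), weight_spread(10.0))
```

With λ = 0 this reduces to ordinary least squares; as λ grows, the weights are pulled toward zero, trading a little extra bias for a drop in variance.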
This article was written by Zohair Badshah, a former member of our software team, and edited by our writers team.