Most Commonly Used Metrics for Linear Regression Models
Imagine a class taking a test. Suppose you are in a class and your teacher gives you and your friends a test. Afterwards, the teacher wants to know how well the class actually did. To do so, she first predicts how well everybody will perform based on their study time, and then compares the predicted scores to the actual scores you all attained.
The teacher wants to know how far off the predictions were from the real scores. This is where MSE, RMSE, and MAE come in: they are ways of measuring the errors, or mistakes, between predicted scores and real scores.
1. MSE (Mean Squared Error)
Suppose your teacher predicted that you would score 90 on
the test, but you actually scored 80. The error here is:
90 - 80 = 10
Now, instead of using the error itself, we square it. That means we multiply the error by itself. So:
10 * 10 = 100
Why square it? Squaring makes sure that both positive and negative errors (suppose your teacher predicts 80 and you scored 90; then the error would be -10) count in the same way, and it also gives more weight to bigger errors.
MSE is just the average of all those squared errors for all
the students. So, if there were many students, your teacher would:
1. Find the error for each student.
2. Square each of those errors.
3. Take the average of all those squared errors.
It's a way to measure, on average, how far off the predictions are, but it punishes larger errors more, because squaring makes bigger errors even bigger!
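The three steps above can be sketched in plain Python. The scores here are made up for illustration, a class of four students:

```python
# Hypothetical predicted and actual test scores for four students.
predicted = [90, 75, 85, 60]
actual = [80, 70, 90, 65]

# Step 1: find the error for each student.
errors = [p - a for p, a in zip(predicted, actual)]  # [10, 5, -5, -5]

# Step 2: square each error (negative errors become positive).
squared_errors = [e ** 2 for e in errors]  # [100, 25, 25, 25]

# Step 3: take the average of the squared errors.
mse = sum(squared_errors) / len(squared_errors)

print(mse)  # (100 + 25 + 25 + 25) / 4 = 43.75
```

Notice that the student who was off by 10 contributes 100 to the sum, four times as much as a student who was off by 5: that is the "punishing big mistakes" effect in action.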
2. RMSE (Root Mean Squared Error)
RMSE is just the square root of MSE. It sounds like a weird name, but it's simple! Here's why we do it:
If we didn't take the square root, the error would be in squared units (since we squared each error). So, to get things back on the same scale as the actual scores, we just take the square root of MSE.
Example:
- If MSE is 100, the RMSE would be the square root of 100, which is 10.
The RMSE is still a measure of error, but it's now in the
same scale as the original scores. It's just easier to understand.
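Continuing the same made-up class scores from before, RMSE is one extra line on top of MSE:

```python
import math

# Same hypothetical scores as before.
predicted = [90, 75, 85, 60]
actual = [80, 70, 90, 65]

# MSE, as computed in the previous section.
mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# RMSE: square root of MSE, back in "test score points".
rmse = math.sqrt(mse)

print(round(rmse, 2))  # sqrt(43.75) ≈ 6.61
```

An RMSE of about 6.61 reads naturally as "the predictions were off by roughly 6.6 points on average", which is easier to interpret than the MSE of 43.75.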
3. MAE (Mean Absolute Error)
MAE is an easier way to calculate errors. It's just the
average of absolute errors. So we do not square the errors like in MSE and we
do not take the square root, like in RMSE. Here's what we do instead:
1. Find the error for each student
2. Calculate the absolute value of each error. There are no negative signs, only positive values.
3. Then we calculate the mean of all the absolute errors.
For instance:
If you predicted 90 but you had a score of 80, the absolute
error is:
|90 - 80| = 10
MAE gives you a much easier way to know, on average, how far off the predictions were, without having to worry about squaring your errors or taking square roots.
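Using the same hypothetical scores once more, the MAE steps look like this:

```python
# Same hypothetical scores as in the earlier sketches.
predicted = [90, 75, 85, 60]
actual = [80, 70, 90, 65]

# Steps 1 and 2: absolute value of each error.
abs_errors = [abs(p - a) for p, a in zip(predicted, actual)]  # [10, 5, 5, 5]

# Step 3: the mean of the absolute errors.
mae = sum(abs_errors) / len(abs_errors)

print(mae)  # (10 + 5 + 5 + 5) / 4 = 6.25
```

Compare this 6.25 with the RMSE of about 6.61 on the same data: RMSE is slightly larger because the one big error (10) gets extra weight when squared, while MAE treats every point of error equally.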
4. R Squared Score
Imagine You’re Predicting Mangoes
Let’s say you’re trying to predict how many mangoes your friend will get from his mango tree based on how much he waters the tree. You’ve been keeping track of how much water he gives the tree and how many mangoes he gets each year.
What is R-squared (R²)?
- R-squared tells you how well you can predict the number of mangoes your friend gets based on how much water he gave the tree.
Example:
- If your R-squared is 0.8 (or 80%), then 80% of the variation in how many mangoes your friend gets can be explained by how much water he gives the tree. The other 20% might be because of other things like the tree's health, the weather, or the type of soil.
To put it in one sentence: R-squared is saying, "I can explain 80% of how many mangoes your tree will produce by knowing how much water you give it."
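Under the hood, R-squared compares your prediction errors to the errors you'd get by just guessing the average every time. A minimal sketch, with made-up mango counts and predictions:

```python
# Hypothetical mango yields over four years, and predictions
# made from how much water the tree got each year.
actual = [50, 60, 70, 80]
predicted = [52, 58, 73, 77]

mean_actual = sum(actual) / len(actual)  # 65

# Total squared error of the predictions (residual sum of squares).
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # 26

# Total squared error of just guessing the mean (total sum of squares).
ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # 500

# R-squared: fraction of the variation the predictions explain.
r2 = 1 - ss_res / ss_tot

print(round(r2, 3))  # 0.948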
5. Adjusted R-squared
Now, that you want to add some other variable to your prediction. Perhaps you start looking at how much sunlight the tree gets, or how much fertilizer your friend uses. But more does not always mean better!
- Adjusted R-squared is an improved version of R-squared. It reduces the score if you add more irrelevant information that doesn't really make any difference to predict mangoes.
- If you add too many things that don't matter, like actually looking at the color of the leaves on that tree, Adjusted R-squared would discount that by lowering the score to let you know you're overcomplicating things but not giving you better predictions.
In short: Adjusted R-squared helps you know if adding more things, like sunlight or fertilizer, really makes your prediction more accurate, or if you're just making it more complicated.
Summary:
- MSE (Mean Squared Error): It is the average of all the
squared errors. Good for punishing big mistakes.
- RMSE (Root Mean Squared Error): Like the MSE, but we take
a square root to help us make sense of it since it is back in the same unit as
the scores.
- MAE (Mean Absolute Error): This is the average of all the absolute errors (no squaring, no square root), and it is the most straightforward method of measuring error
-R-squared tells you how much of the mango production can be explained by things you know (like water).
- Adjusted R-squared shows you if adding a lot of information such as sunshine or fertilizer really made a difference or if you're just overcomplicating stuff.
Happy using metrics!
Comments
Post a Comment