How to Calculate Mean Squared Error in Python

09/28/2021

In this article, you will learn how to calculate mean squared error in Python.

Calculating mean squared error

Mean Squared Error (MSE) is one of the most common evaluation metrics for regression models: it measures the average squared difference between the predicted and actual values, so lower values indicate a better fit. In Python, MSE can be calculated easily with the NumPy and scikit-learn libraries.
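
Formally, for n observations with actual values y_i and predictions ŷ_i, the MSE is defined as:

\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2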

First, import the necessary libraries:

import numpy as np
from sklearn.metrics import mean_squared_error

Assume that you have two arrays, y_true and y_pred, where y_true contains the actual target values, and y_pred contains the predicted target values. You can calculate the MSE as follows:

mse = mean_squared_error(y_true, y_pred)

This will return the MSE value, which is a single number representing the average squared difference between the predicted and actual values.
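
As a side note, mean_squared_error also accepts an optional sample_weight argument for cases where some observations should count more than others. The weights below are made up purely for illustration:

# Hypothetical per-sample weights: the last two samples count twice as much
weights = [1, 1, 2, 2]
mse_weighted = mean_squared_error(y_true, y_pred, sample_weight=weights)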

If you want to calculate the MSE manually, you can use the following formula:

mse = np.mean((y_pred - y_true) ** 2)

This formula subtracts the actual values from the predicted values and squares the differences. The mean of these squared differences gives the MSE.
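
As a sanity check, here is a minimal sketch of the same computation using only plain Python lists, with no third-party libraries (the values match the complete example below):

# Manual MSE with plain Python lists (no NumPy required)
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Square each difference, then average over the number of samples
squared_errors = [(p - t) ** 2 for p, t in zip(y_pred, y_true)]
mse = sum(squared_errors) / len(squared_errors)
print(mse)  # 0.375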

Here’s a complete example:

import numpy as np
from sklearn.metrics import mean_squared_error

# Define some example data
y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])

# Calculate MSE using scikit-learn
mse = mean_squared_error(y_true, y_pred)
print("MSE:", mse)

# Calculate MSE manually
mse_manual = np.mean((y_pred - y_true) ** 2)
print("MSE (manual):", mse_manual)

Output:

MSE: 0.375
MSE (manual): 0.375

In this example, the MSE is 0.375. Its square root, the root mean squared error (RMSE), is about 0.61, meaning the predictions deviate from the actual values by roughly 0.61 units on average.
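
If you need the RMSE directly, the simplest version-independent approach is to take the square root of the MSE yourself; here is a small sketch of that step:

import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])

# RMSE is simply the square root of the MSE
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print("RMSE:", rmse)  # ~0.612

Depending on your scikit-learn version, mean_squared_error(y_true, y_pred, squared=False) or the newer root_mean_squared_error function may also return the RMSE directly.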