How to Calculate Moving Average with Python Pandas

09/14/2021

In this article, you will learn how to calculate moving average with Python Pandas.

Calculate moving average with Python Pandas

A moving average is a commonly used statistical calculation that smooths out short-term fluctuations in time series data. It is calculated by sliding a window of fixed width along the series and, at each position, averaging the values that fall inside the window.
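To make the definition concrete, here is a minimal sketch in plain Python, without Pandas. The function name moving_average and the sample values are illustrative assumptions, and the window is taken to end at the current position (a trailing window).

```python
def moving_average(values, window):
    """Trailing moving average; positions without a full window are None."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            # not enough values yet to fill the window
            result.append(None)
        else:
            # the current value plus the (window - 1) values before it
            chunk = values[i + 1 - window : i + 1]
            result.append(sum(chunk) / window)
    return result

values = [4, 3, 5, 6, 4]
averages = moving_average(values, 2)
print(averages)  # [None, 3.5, 4.0, 5.5, 5.0]
```

Each result averages the current value and the one before it, so the first position has no result.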

In Pandas, you can calculate moving averages using the rolling() method. rolling() can be applied to a Pandas Series or DataFrame and returns a Rolling object that describes the sliding window. Calling mean() on that object returns a new Series or DataFrame with the same shape as the original, containing the moving average over the specified window.
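The two steps can be seen separately in a short sketch. The Series values and variable names below are made up for illustration.

```python
import pandas as pd

s = pd.Series([4, 3, 5, 6, 4])

# rolling() only describes the window; nothing is averaged yet
roller = s.rolling(window=2)
print(type(roller).__name__)

# mean() performs the calculation and returns a Series as long as the input
ma = roller.mean()
print(ma.shape == s.shape)
print(ma)
```

Other aggregations such as sum(), min(), and max() can be called on the same Rolling object in place of mean().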

The rolling() method takes several parameters, including:

  • window: the number of values to include in the moving average window.
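  • min_periods: the minimum number of non-null values a window must contain to produce a result. The default is None, which for an integer window means min_periods equals the window size, so every value in the window must be non-null. For a window given as a time offset, the default is 1.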
  • center: a boolean flag indicating whether to align the moving average with the center of the window or the right edge. The default is False, which means that the moving average is aligned with the right edge of the window.
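The effect of min_periods and center is easiest to see side by side. The following is a small sketch on made-up values with a window of 3; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([4, 3, 5, 6, 4])

# default: a result only once the window holds 3 values (first two rows are NaN)
default = s.rolling(window=3).mean()

# min_periods=1: average whatever is available, so no leading NaN values
partial = s.rolling(window=3, min_periods=1).mean()

# center=True: each result is labeled at the middle of its window,
# so the NaN values appear at both ends instead of only at the start
centered = s.rolling(window=3, center=True).mean()

print(pd.DataFrame({'default': default, 'partial': partial, 'centered': centered}))
```

With min_periods=1 the early results are averages over fewer than 3 values, so they are not directly comparable to the full-window results.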

Here’s an example of how to calculate a moving average with a window of 5 using the rolling() method:

import pandas as pd

# create a sample data frame with a datetime index and a column of sample values
data = {'date': pd.date_range('2021-01-01', periods=10), 'value': [4, 3, 5, 6, 4, 7, 8, 6, 9, 10]}
df = pd.DataFrame(data).set_index('date')

# calculate the 5-day moving average
ma = df.rolling(window=5).mean()

# print the original data frame and the moving average data frame
print(df)
print(ma)

The output of the above code will look like this:

            value
date             
2021-01-01      4
2021-01-02      3
2021-01-03      5
2021-01-04      6
2021-01-05      4
2021-01-06      7
2021-01-07      8
2021-01-08      6
2021-01-09      9
2021-01-10     10

            value
date             
2021-01-01    NaN
2021-01-02    NaN
2021-01-03    NaN
2021-01-04    NaN
2021-01-05    4.4
2021-01-06    5.0
2021-01-07    6.0
2021-01-08    6.2
2021-01-09    6.8
2021-01-10    8.0

As you can see, each moving average value is the mean of the current row and the 4 rows before it. The first 4 rows have NaN values because their windows contain fewer than 5 values, and min_periods defaults to the window size. Each result is labeled with the last date in its window, that is, aligned with the right edge, because the center parameter is False by default.
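One way to check the alignment is to recompute a single window by hand. This sketch rebuilds the same data frame and compares the value reported for 2021-01-05 with the plain mean of the first five rows; the names window_values, manual, and rolled are illustrative.

```python
import pandas as pd

data = {'date': pd.date_range('2021-01-01', periods=10), 'value': [4, 3, 5, 6, 4, 7, 8, 6, 9, 10]}
df = pd.DataFrame(data).set_index('date')
ma = df.rolling(window=5).mean()

# the window labeled 2021-01-05 covers that row and the four rows before it
window_values = df.loc['2021-01-01':'2021-01-05', 'value']
manual = window_values.mean()

# the value that rolling() reports for the same date
rolled = ma.loc[pd.Timestamp('2021-01-05'), 'value']

print(manual, rolled)
```

Both numbers should agree, which confirms that the result sits at the right edge of its window.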