How to Use the Python statistics.mean() Function

09/10/2021

In this article, you will learn how to use the Python statistics.mean() function.

Python statistics.mean() Function

The mean() function in Python's statistics module calculates the arithmetic mean (also known as the average) of a numeric data set: the sum of the values divided by the number of values. Here is an example of how to use it:

import statistics

data = [1, 2, 3, 4, 5]
mean = statistics.mean(data)
print(mean)

# Output
# 3

Note that mean() only works with numeric data. If the data set contains non-numeric values such as strings, you will receive a TypeError. If the data set is empty, mean() raises a StatisticsError.
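The two failure modes can be seen side by side in a small sketch. The helper name mean_or_error_name() is hypothetical and exists only for illustration: it returns the mean when the call succeeds and the name of the exception otherwise.

```python
import statistics


def mean_or_error_name(data):
    """Return the mean of data, or the name of the exception mean() raised.

    Hypothetical helper for illustration; it is not part of the statistics module.
    """
    try:
        return statistics.mean(data)
    except TypeError:
        # Raised when a value cannot be treated as a number, e.g. a string
        return "TypeError"
    except statistics.StatisticsError:
        # Raised when the data set is empty
        return "StatisticsError"


print(mean_or_error_name([1, 2, 3, 4, 5]))  # valid numeric data
print(mean_or_error_name(["a", "b"]))       # non-numeric data
print(mean_or_error_name([]))               # empty data set
```

In real code you would usually handle each exception where it occurs rather than turning it into a string; the string return value here just makes the behavior easy to print.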

You can also use mean() with a data set stored in a file. For example, if you have a file named data.txt containing one number per line, you can calculate the mean as follows:

import statistics

# Read one number per line, skipping blank lines (such as a trailing newline)
with open('data.txt', 'r') as file:
    data = [float(line) for line in file if line.strip()]

mean = statistics.mean(data)
print(mean)

This will output the mean of the numbers in the data.txt file.
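If the file is too large to load into a list, a running mean can be computed one line at a time instead. The following is a minimal sketch, not part of the statistics module: the names running_mean() and numbers_from_lines() are hypothetical, and the file is assumed to hold one number per line.

```python
def running_mean(values):
    """Compute the mean of an iterable in a single pass without storing it."""
    count = 0
    mean = 0.0
    for value in values:
        count += 1
        # Incremental update: move the current mean toward the new value
        mean += (value - mean) / count
    if count == 0:
        raise ValueError("running_mean requires at least one value")
    return mean


def numbers_from_lines(lines):
    """Yield one float per non-blank line."""
    for line in lines:
        text = line.strip()
        if text:
            yield float(text)


# With a real file (assumed format: one number per line):
# with open('data.txt') as file:
#     print(running_mean(numbers_from_lines(file)))

# Demonstration with in-memory lines instead of a file
print(running_mean(numbers_from_lines(["1\n", "2\n", "\n", "3\n", "4\n", "5\n"])))
```

Because only the count and the current mean are kept, memory use stays constant no matter how many lines the file has. The result is a float and can differ from statistics.mean() in the last digits due to floating-point rounding.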

Here’s some additional information to help you use the mean() method effectively:

  • Handling missing values: The mean() function assumes that every value in the data set is a valid number. It raises a StatisticsError only when the data set is empty; a missing value represented as None causes a TypeError instead. To handle missing values, you can either remove them from the data set before calling mean() or replace them with a suitable substitute, such as the mean of the remaining values.
  • Handling data sets with non-finite values: The mean() function does not raise an error for non-finite floats (NaN, Inf, or -Inf). A NaN in the data set makes the result NaN, and an infinite value makes the result infinite (or NaN if both Inf and -Inf are present). To get a meaningful result, you can either remove these values from the data set or replace them with a suitable substitute, such as the mean of the remaining finite values.
  • Computing the mean of large data sets: The mean() function adds up all the values in the data set and divides by the number of values. It performs the sum with exact arithmetic to avoid rounding error, which makes it slower than a plain floating-point sum. If speed matters more than exactness, statistics.fmean() (Python 3.8 and later) converts the values to float and runs faster. If the data does not fit in memory, avoid building a full list and compute a running mean that is updated one value at a time.
  • Using a weighted mean: If you want to compute a weighted mean, where different values in the data set are given different weights, you can pass a weights argument to statistics.fmean() (available in Python 3.11 and later). It multiplies each value by its corresponding weight, sums the products, and divides by the sum of the weights.
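The clean-up step for missing and non-finite values can be sketched as follows. The helper name clean() is hypothetical, and the sketch assumes missing values are represented as None; adapt the filter if your data uses a different marker.

```python
import math
import statistics


def clean(values):
    """Keep only finite numbers; drop None, NaN, and infinities.

    Hypothetical helper, assuming missing values are stored as None.
    """
    return [v for v in values if v is not None and math.isfinite(v)]


raw = [1.0, None, 2.5, float("nan"), float("inf"), 4.0]
cleaned = clean(raw)

print(cleaned)                   # [1.0, 2.5, 4.0]
print(statistics.mean(cleaned))  # 2.5
```

Dropping values changes the number of data points, so consider whether removal or substitution is the better fit for your analysis.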

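Here’s an example of computing a weighted mean with fmean() and its weights argument (requires Python 3.11 or later):

import statistics

data = [1, 2, 3, 4, 5]
weights = [1, 2, 3, 4, 5]
# Each value is weighted by the entry at the same position in weights
weighted_mean = statistics.fmean(data, weights=weights)
print(weighted_mean)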

# Output
# 3.6666666666666665
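The weighted-mean formula can also be written by hand, which avoids depending on a particular Python version. This is a minimal sketch under that assumption; the weighted_mean() function below is defined here for illustration and is not provided by the statistics module.

```python
import math


def weighted_mean(data, weights):
    """Weighted arithmetic mean: sum(x * w) / sum(w).

    Illustrative helper defined in this example, not a statistics-module function.
    """
    if len(data) != len(weights):
        raise ValueError("data and weights must have the same length")
    total_weight = math.fsum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    # fsum keeps floating-point rounding error low when adding many products
    return math.fsum(x * w for x, w in zip(data, weights)) / total_weight


print(weighted_mean([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))  # 55 / 15
```

With equal weights the result matches the ordinary mean, which is a quick way to sanity-check the function.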