How to Find the Correlation Coefficient in Python

09/16/2021

In this article, you will learn how to find the correlation coefficient in Python.

Find the correlation coefficient

There are several ways to compute the Pearson correlation coefficient in Python. Two common ones use the numpy and pandas libraries:

Using the numpy library

import numpy as np

# create two arrays
x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 7, 9, 11, 13])

# calculate the correlation coefficient
corr_coef = np.corrcoef(x, y)[0, 1]

print("Correlation coefficient:", corr_coef)

# Output:
# Correlation coefficient: 1.0

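The corrcoef function from the numpy library returns the Pearson correlation coefficient matrix for the given arrays. For two one-dimensional arrays the result is a 2×2 matrix: the diagonal entries are each array's correlation with itself (always 1), and the off-diagonal entries are the correlation between x and y. We use [0, 1] to extract that off-diagonal value. The result here is 1.0 because y is an exact linear function of x (y = 2x + 3).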
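To make the matrix layout concrete, here is a minimal sketch that prints the full matrix and compares the [0, 1] entry against the Pearson formula computed by hand. The y values are illustrative data chosen for this example (not taken from the article) so that the correlation is not perfect, and the names x_dev, y_dev, and manual_r are hypothetical helpers.

```python
import numpy as np

# Illustrative data (an assumption for this sketch, not from the article)
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 1, 4, 3, 5])

# Full correlation matrix: rows and columns are ordered (x, y)
matrix = np.corrcoef(x, y)
print(matrix)

# Pearson's r by hand: sum of products of the deviations from the mean,
# divided by the square root of the product of the summed squared deviations
x_dev = x - x.mean()
y_dev = y - y.mean()
manual_r = (x_dev * y_dev).sum() / np.sqrt((x_dev ** 2).sum() * (y_dev ** 2).sum())

print("From corrcoef:", matrix[0, 1])
print("By hand:      ", manual_r)
```

Both values should agree up to floating-point rounding, and matrix[0, 1] equals matrix[1, 0] because the matrix is symmetric.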

Using the pandas library

import pandas as pd

# create a dataframe
df = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [5, 7, 9, 11, 13]})

# calculate the correlation coefficient
corr_coef = df['x'].corr(df['y'])

print("Correlation coefficient:", corr_coef)

# Output:
# Correlation coefficient: 1.0

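The corr method of a pandas Series calculates the correlation between that Series and another one, using the Pearson method by default. Here, df['x'].corr(df['y']) selects columns 'x' and 'y' from the dataframe and returns their correlation coefficient as a single number. Calling corr on the whole dataframe, df.corr(), instead returns the pairwise correlation matrix of all columns.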
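When a dataframe has more than two columns, the matrix form can be more convenient. The following is a minimal sketch using illustrative data (the 'z' column and the values in 'y' are assumptions made for this example, not part of the article) that builds the matrix with df.corr() and reads a single pair from it with .loc.

```python
import pandas as pd

# Illustrative data (an assumption for this sketch, not from the article)
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 1, 4, 3, 5],
    "z": [5, 4, 3, 2, 1],  # decreases as x increases
})

# Pairwise Pearson correlation of all numeric columns
corr_matrix = df.corr()
print(corr_matrix)

# Read one pair from the matrix by row and column label
r_xy = corr_matrix.loc["x", "y"]

# The same value computed directly from the two columns
series_r = df["x"].corr(df["y"])

print("x vs y from the matrix:", r_xy)
print("x vs y from Series.corr:", series_r)
```

The two approaches should give the same number for any pair of columns. The matrix is mainly useful when you want every pair at once.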