KNOWLEDGE BASE

Finding the Pearson Correlation

Published: 15 Feb 2017

Question

How to find the Pearson correlation of two measures in Tableau Desktop.

Environment

Tableau Desktop


Step 1: Create a scatterplot

1. Drag Profit to Columns and Sales to Rows.
2. In the Analysis menu, uncheck Aggregate Measures.
3. Right-click the view and choose Trend Lines > Show Trend Lines.
4. Right-click the view again and select Trend Lines > Describe Trend Model.
5. Locate the R-Squared value in the Describe Trend Model dialog box. In this example, the R-Squared value is 0.229503.
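
For readers who want to see where an R-Squared value like this comes from, here is a minimal Python sketch. It is not Tableau's implementation. It assumes the trend line is an ordinary least-squares line, and the function name `r_squared` and the data values are hypothetical stand-ins for row-level Profit and Sales.

```python
def r_squared(xs, ys):
    """R-squared of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: variation the line does not explain.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    # Total sum of squares: variation of y around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: Profit on the x-axis (Columns), Sales on the y-axis (Rows).
profit = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [120.0, 150.0, 135.0, 210.0, 190.0]
print(round(r_squared(profit, sales), 4))
```

With real data, the printed value should be comparable to the R-Squared shown in the Describe Trend Model dialog box for a linear trend line.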

Step 2: Calculate the Pearson correlation

You can use different options to find the Pearson correlation. For example:

1. Use a calculator or other program.
   1. Calculate the square root of the R-Squared value from Step 1. The result is the magnitude of the correlation (r): √0.229503 ≈ 0.4791. A square root is never negative, so r takes a negative sign when the trend line slopes downward.
   2. Rounded to two digits, the value in this example is 0.48.
2. Create a calculated field using the CORR function.
   • Enter a formula similar to the following and click OK:
     `CORR([Profit], [Sales])`
   • This formula returns the Pearson correlation coefficient of two expressions. The Pearson correlation measures the linear relationship between two variables. Results range from -1 to +1 inclusive, where 1 denotes an exact positive linear relationship (a positive change in one variable implies a positive change of corresponding magnitude in the other), 0 denotes no linear relationship between the variables, and -1 denotes an exact negative linear relationship.
3. Create a calculated field using the WINDOW_CORR function.
   • Enter a formula similar to the following and click OK:
     `WINDOW_CORR(SUM([Profit]), SUM([Sales]))`
   • This formula returns the Pearson correlation coefficient of two expressions within the window. The window is defined by optional start and end arguments, given as offsets from the current row. Use FIRST()+n and LAST()-n for offsets from the first or last row in the partition. If start and end are omitted, the entire partition is used.
The correlation and covariance functions were added in Tableau Desktop 10.2. For more information, see What's new in Tableau Desktop.
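
As a cross-check outside Tableau, the calculator method can be reproduced in a few lines of Python. This is a sketch under stated assumptions: `pearson_r` is a hypothetical helper written from the textbook definition (sample covariance divided by the product of the sample standard deviations), and the data values are made up for illustration.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation: sample covariance / (stdev(x) * stdev(y))."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))


# Calculator method: square root of the R-Squared value from Step 1.
# This yields only the magnitude of r; the sign comes from the trend line slope.
r_from_r_squared = math.sqrt(0.229503)
print(round(r_from_r_squared, 4))  # 0.4791

# Direct method on hypothetical row-level data (not the article's data set).
profit = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [120.0, 150.0, 135.0, 210.0, 190.0]
r = pearson_r(profit, sales)
print(round(r, 4), round(r ** 2, 4))  # r, and r squared for comparison with R-Squared
```

For a linear trend line with a single predictor, squaring the directly computed r gives the same number as the trend model's R-Squared, which is why the two methods agree.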

Option 2 - Using earlier versions of Tableau Desktop
The CORR and WINDOW_CORR calculations from Option 1 can also be reproduced using the following formulas.

1. Select Analysis > Create Calculated Field.
2. Name the calculated field.
3. Enter the following formula and click OK:
`COVAR([Profit], [Sales]) / (STDEV([Profit])*STDEV([Sales]))`
WINDOW_CORR equivalent with no offset specified:
1. Select Analysis > Create Calculated Field.
2. Name the calculated field.
3. Enter the following formula and click OK:
```
WINDOW_COVAR(SUM([Profit]), SUM([Sales])) /
(WINDOW_STDEV(SUM([Profit])) * WINDOW_STDEV(SUM([Sales])))
```
WINDOW_CORR equivalent with offsets:
1. Select Analysis > Create Calculated Field.
2. Name the calculated field.
3. Enter the following formula and click OK:
```
WINDOW_COVAR(SUM([Profit]), SUM([Sales]), -5, 0) /
(WINDOW_STDEV(SUM([Profit]), -5, 0) * WINDOW_STDEV(SUM([Sales]), -5, 0))
```
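
To illustrate what the windowed formulas above compute, here is a rough Python sketch of covariance divided by the product of standard deviations over a window of offsets. The function `window_corr` and the monthly totals are hypothetical. Clipping the window at the partition edges, and returning `None` when a window has fewer than two rows or no variation, are assumptions of this sketch, not documented Tableau behavior.

```python
import statistics


def window_corr(xs, ys, start=None, end=None):
    """Correlation per row over rows [i + start, i + end] of one partition.

    With start and end omitted, every row uses the entire partition.
    """
    n = len(xs)
    results = []
    for i in range(n):
        lo = 0 if start is None else max(0, i + start)
        hi = n - 1 if end is None else min(n - 1, i + end)
        wx, wy = xs[lo:hi + 1], ys[lo:hi + 1]
        if len(wx) < 2:
            results.append(None)  # not enough rows for a sample statistic
            continue
        sx, sy = statistics.stdev(wx), statistics.stdev(wy)
        if sx == 0 or sy == 0:
            results.append(None)  # correlation is undefined without variation
            continue
        mx, my = statistics.mean(wx), statistics.mean(wy)
        cov = sum((a - mx) * (b - my) for a, b in zip(wx, wy)) / (len(wx) - 1)
        results.append(cov / (sx * sy))
    return results


# Hypothetical monthly totals standing in for SUM([Profit]) and SUM([Sales]).
monthly_profit = [5.0, 7.0, 6.0, 9.0, 12.0, 11.0, 15.0, 14.0]
monthly_sales = [50.0, 64.0, 61.0, 80.0, 95.0, 99.0, 120.0, 118.0]

print(window_corr(monthly_profit, monthly_sales))         # entire partition
print(window_corr(monthly_profit, monthly_sales, -5, 0))  # previous 5 rows to current row
```

The offsets `-5, 0` mirror the third formula: each row is correlated over itself and up to five preceding rows.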

For more examples and advanced explanations, see Covariance, Trend Lines, Correlation Coefficient R and R-Squared in the Community.

• A correlation, r, is a single number that represents the degree of relationship between two measures. The correlation coefficient is a value such that -1 <= r <= 1.
• A positive correlation indicates a relationship between x and y measures such that as values of x increase, values of y also increase.
• A negative correlation indicates the opposite—as values of x increase, values of y decrease.
• The closer the correlation, r, is to -1 or 1, the stronger the relationship between x and y.
• If r is close to or equal to 0, there is a weak relationship or no relationship between the measures.
• As a general rule, you can interpret r values this way:
   • +.70 or higher indicates a very strong positive relationship
   • +.40 to +.69 indicates a strong positive relationship
   • +.20 to +.39 indicates a moderate positive relationship
   • -.19 to +.19 indicates no or a weak relationship
   • -.20 to -.39 indicates a moderate negative relationship
   • -.40 to -.69 indicates a strong negative relationship
   • -.70 or lower indicates a very strong negative relationship
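
The rule of thumb above can be written as a small lookup. This is an illustrative Python sketch. The function name `describe_correlation` is hypothetical, and values that fall between the two-decimal bands (for example 0.195) are assigned using the thresholds 0.20, 0.40, and 0.70, which is an assumption of this sketch.

```python
def describe_correlation(r):
    """Map a correlation coefficient to the rule-of-thumb label listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    magnitude = abs(r)
    if magnitude < 0.20:
        return "no or a weak relationship"
    if magnitude < 0.40:
        strength = "moderate"
    elif magnitude < 0.70:
        strength = "strong"
    else:
        strength = "very strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} relationship"


print(describe_correlation(0.48))  # strong positive relationship
```

Applied to the value from this article, r = 0.48 falls in the "strong positive" band.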

When using the CORR() aggregate function, see the list of supported data sources in Tableau Help.