KNOWLEDGE BASE

Finding the Pearson Correlation


Published: 15 Feb 2017
Last Modified Date: 05 Jun 2023

Question

How to find the Pearson correlation of two measures in Tableau Desktop.

Environment

Tableau Desktop
 

Answer

Step 1: Create a scatterplot

This example uses the Superstore sample data. The workbook Pearson Correlation.twbx is attached to this article; open it for more information.
  1. Drag Profit to Columns and Sales to Rows.
  2. In the Analysis menu, uncheck Aggregate Measures.
  3. Right-click the view and choose Trend Lines > Show Trend Lines.
  4. Right-click the view again and select Trend Lines > Describe Trend Model.
  5. Locate the R-Squared value in the Describe Trend Model dialog box. In this example, the R-Squared value is 0.229503.

Step 2: Calculate the Pearson correlation

You can find the Pearson correlation in several ways. For example:

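  1. Use a calculator or other program.
    1. Calculate the square root of the R-Squared value to get the magnitude of the correlation (r): √0.229503 = 0.4791. A square root is never negative, so take the sign of r from the slope of the trend line. The trend line in this example slopes upward, so r is positive.
    2. Rounded to two digits, the value in this example is 0.48.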
  2. Create a calculated field using the CORR function. 
    • Enter a formula similar to the following and click OK.
      CORR([Profit], [Sales])
    • This formula returns the Pearson correlation coefficient of two expressions, which measures the linear relationship between two variables. Results range from -1 to +1 inclusive: 1 denotes an exact positive linear relationship, as when a positive change in one variable implies a positive change of corresponding magnitude in the other; 0 denotes no linear relationship between the variables; and -1 denotes an exact negative linear relationship.
  3. Create a calculated field using the WINDOW_CORR function. 
    • Enter a formula similar to the following and click OK.
      WINDOW_CORR(SUM([Profit]), SUM([Sales]))
    • This table calculation returns the Pearson correlation coefficient of two expressions within the window. The window is defined by the optional start and end arguments, given as offsets from the current row. Use FIRST()+n and LAST()-n for offsets from the first or last row in the partition. If start and end are omitted, the entire partition is used.
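
To make the window description concrete, the following Python sketch mimics a windowed correlation outside Tableau. It is a rough analogue under stated assumptions, not Tableau's implementation: the function names, the sample values, clipping the window at the partition edges, and returning None when the correlation is undefined are all choices made for this illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def window_corr(xs, ys, start=None, end=None):
    """For each row i, correlate rows i+start .. i+end.

    Offsets are relative to the current row. Omitting start or end extends
    the window to the first or last row of the partition (here, the list).
    Assumption: windows are clipped at the partition edges.
    """
    n = len(xs)
    results = []
    for i in range(n):
        lo = 0 if start is None else max(0, i + start)
        hi = n - 1 if end is None else min(n - 1, i + end)
        results.append(pearson_r(xs[lo:hi + 1], ys[lo:hi + 1]))
    return results

# Made-up aggregated values, one per row of a hypothetical partition.
monthly_profit = [12.0, 15.0, 9.0, 20.0, 18.0, 25.0]
monthly_sales = [100.0, 130.0, 95.0, 160.0, 170.0, 210.0]

# No offsets: every row gets the correlation over the whole partition.
print(window_corr(monthly_profit, monthly_sales))

# Offsets -2..0: each row uses itself and the two rows before it.
print(window_corr(monthly_profit, monthly_sales, -2, 0))
```

With no offsets, every row carries the same value, which is why the formula above behaves like a single correlation for the whole partition.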
 
For more examples and advanced explanations, see Covariance, Trend Lines, Correlation Coefficient R and R-Squared in the Community.
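
The first two options in Step 2 should agree, and that can be cross-checked outside Tableau. The Python sketch below computes r directly and also recovers it from the R-squared of a least-squares line, taking the sign from the slope. The function names and the sample values are made up for illustration and are not taken from the Superstore data; only the final line uses the R-Squared value reported in Step 1.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_from_trend_line(xs, ys):
    """Recover r from a linear fit: sqrt(R-squared) with the sign of the slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = max(0.0, 1 - ss_res / ss_tot)  # guard against tiny negative rounding
    return math.copysign(math.sqrt(r_squared), slope)

# Made-up illustration data.
profit = [10.0, 25.0, 5.0, 40.0, 30.0]
sales = [100.0, 180.0, 90.0, 260.0, 150.0]

direct = pearson_r(profit, sales)
via_r_squared = r_from_trend_line(profit, sales)
print(round(direct, 4), round(via_r_squared, 4))  # the two values match

# The article's arithmetic: square root of the reported R-Squared.
print(round(math.sqrt(0.229503), 4))  # 0.4791
```

The sign step matters only when the trend line slopes downward; in that case the square root alone would report a positive value for a negative correlation.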

Additional Information

  • A correlation, r, is a single number that represents the degree of relationship between two measures. The correlation coefficient is a value such that -1 <= r <= 1.
  • A positive correlation indicates a relationship between x and y measures such that as values of x increase, values of y also increase.
  • A negative correlation indicates the opposite—as values of x increase, values of y decrease.
  • The closer the correlation, r, is to -1 or 1, the stronger the relationship between x and y.
  • If r is close to or equal to 0, there is a weak relationship or no relationship between the measures.
  • As a general rule, you can interpret r values this way:
    • +.70 or higher indicates a very strong positive relationship
    • +.40 to +.69 indicates a strong positive relationship
    • +.20 to +.39 indicates a moderate positive relationship
    • -.19 to +.19 indicates no or a weak relationship
    • -.20 to -.39 indicates a moderate negative relationship
    • -.40 to -.69 indicates a strong negative relationship
    • -.70 or lower indicates a very strong negative relationship
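
The rule-of-thumb bands above can be written as a small lookup. This Python sketch is only an illustration: the function name interpret_r is hypothetical, and rounding r to two decimals before comparing is an assumption made because the bands are stated to two decimals.

```python
def interpret_r(r):
    """Map a correlation coefficient to the rule-of-thumb label listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1 inclusive")
    # Assumption: compare at two decimals, matching how the bands are written.
    rounded = round(r, 2)
    magnitude = abs(rounded)
    if magnitude < 0.20:
        return "no or a weak relationship"
    direction = "positive" if rounded > 0 else "negative"
    if magnitude >= 0.70:
        strength = "very strong"
    elif magnitude >= 0.40:
        strength = "strong"
    else:
        strength = "moderate"
    return f"{strength} {direction} relationship"

print(interpret_r(0.4791))  # strong positive relationship
print(interpret_r(-0.75))   # very strong negative relationship
print(interpret_r(0.10))    # no or a weak relationship
```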
View the supported data sources for the CORR() aggregate function in Tableau Help.
 