KNOWLEDGE BASE

Calculating a Running or Moving Count Distinct


Published: 07 Nov 2015
Last Modified Date: 07 May 2018

Question

How to calculate a running, or moving, count distinct of a dimension.

For example, the running total of unique customers who have made an order over all time, or a moving total of unique customers who have made an order within a window of time, such the last three weeks.

Environment

Tableau Desktop

Answer

The attached example workbook uses the sample data set Superstore to demonstrate the following directions.
CLICK TO EXPAND STEPS
Option 1: Use FIXED to find the running count distinct
Note: this method will not work to find a moving distinct count.
  1. Select Analysis > Create Calculated Field
  2. In the Calculated Field dialog box that opens, do the following, and then click OK:
    1. Name the calculated field. In this example, the calculated field is named "Customer's First Order"
    2. In the formula field, create a calculation similar to the following:
      { FIXED [Customer Name], [Order Date] : MIN(
      IF [Order Date] = { EXCLUDE [Order Date] : MIN([Order Date])}
      THEN 1
      ELSE 0
      END
      ) }
      
  3.  Create a calculated field with a name like "Accurate Running COUNTD of Customers" with a calculation similar to the following:
    RUNNING_SUM(SUM([Count Customer's 1st Order]))
  4. Right-click and drag [Order Date] into the view
  5. In the Drop Field dialog, select WEEK(Order Date) with the green calendar icon
  6. Drag [Accurate Running COUNTD of Customers] to the Rows shelf
CLICK TO EXPAND STEPS
Option 2: Use table functions to find the moving count distinct
Note: This solution will likely cause performance issues as the final view must contain all dimensions needed to compute the table calculation.
Step 1 - Create 3 calculated fields
  1. Select Analysis > Create Calculated Field
    1.  In the Calculated Field dialog box that opens, do the following, and then click OK:
    2. Name the calculated field. In this example, the calculated field is named "Every 3 Week Period a Customer is In"
    3. In the formula field, create a calculation similar to the following:
      	WINDOW_MAX( MAX(1), -2, 0 )
    • Tableau Calculations are heavily dependent on the view. So without context of the view, this calculation will first return the value 1 for every mark (e.g. row, point, bar, etc…) in the view, and then will return the maximum of those 1s from two marks back to this mark.
    • In this example, the final view has a mark for every week for every customer customer. So with that context, this calculation will return 1 if the customer had an order this week or the two weeks prior. If a customer had an order for every week, this calculation will still only return 1 because we are using WINDOW_MAX() rather than WINDOW_SUM().
    • This calculation alone will only return a 1 or 0 for each customer. Therefore we need to nest this calculation in a WINDOW_SUM() to add up all of the 1s for all of the customers. These calculations must be in separate calculated fields so that we can set how the table functions are computing differently.
  2. Create a calculated field with a name like "Moving Count of Customers for Every 3 Weeks" with a calculation similar to the following:
    WINDOW_SUM([Every 3 Week Period a Customer is In])
  3. Create a calculated field with a name like "First Filter" with a calculation similar to the following:
    FIRST() = 0
Step 2 - Build the View
  1. Right-click and drag [Order Date] to the Columns shelf
  2. In the Drop Field dialog, select WEEK(Order Date) with the green calendar icon
    1. Right-click WEEK(Order Date) on the Columns shelf and select Discrete
    2. The date field needs to be discrete otherwise the line graph will be broken by [Customer Name]
  3. Drag [Customer Name] to Detail on the Marks card.
    • The [Customer Name] field must be in the view so that the table calculations will compute correctly. We will later filter the view to look like there is only one line
  4. Drag [Moving Count of Customers for Every 3 Weeks] to the Rows shelf
Step 3 - Edit Table Calculations
  1. Right-click [Moving Count of Customers for Every 3 Weeks] on the Rows shelf and select Edit Table Calculation…
  2. In the Table Calculation dialog, do the following and close the dialog:
    1. Under Nested Calculations, select Every 3 Week Period a Customer is In
    2. Under Compute Using, select Specific Dimensions
    3. In the list of dimensions, check only Week of Order Date
    4. Under Nested Calculations, select Moving Count of Customers for Every 3 Weeks
    5. Under Compute Using, select Specific Dimensions
    6. In the list of dimensions, check only Customer Name
      • For [Every 3 Week Period a Customer is In] we want Tableau Desktop to count every 3 week period a customer is in per customer. In other words, we want Table Desktop to count weeks within each customer. When a dimension is checked, that means the table calculation will be run for every value of the dimension. When a dimension is unchecked, it means that Tableau Desktop will start the calculation ever for every value of that dimension.
      • Similarly, we want [Moving Count of Customers for Every 3 Weeks] to sum up all customers within each week.
  3. Drag [First Filter] to the Filters shelf, and click OK to close the Filter dialog
  4. Right-click [First Filter] on the Filters shelf and select Compute using > Customer Name.
  5. In the Filter dialog, check True and click OK.

Additional Information

Notes on Option 1:
  • This calculation is counting every customer once, the first time they appear in the data set. This means that this method will not work to find a moving distinct count, because the FIXED expression cannot find the first order date for a moving time period.
  • The calculation in step 2 does the following:
    • First the expression { EXCLUDE [Order Date] : MIN([Order Date])} finds the first order date per customer. Normally, we would use the expression { FIXED [Customer Name] : MIN([Order Date])} to find the first order. However, the scope of the EXCLUDE is set by the outer FIXED statement. In other words, the EXCLUDE statement starts with the dimensions  [Customer Name] and [Order Date] and then excludes [Order Date].
    • Then the IF statement returns 1 on the date that is the first order date for that customer.
    • The IF statement will return 1 on any row where the date matches the first order date. This means that if the underlying data has several records for that first order, each of those rows will be 1. As we only want to count each customer once, the minimum of the IF statement is fixed to [Customer Name].
    • We also need to add [Order Date] to the dimension declaration of the FIXED statement so that the final view will show the change over time.
Notes on Option 2:
  • Tableau Calculations are heavily dependent on the view. So without context of the view, the calculation in step 2 will first return the value 1 for every mark (e.g. row, point, bar, etc…) in the view, and then will return the maximum of those 1s from two marks back to this mark.
  • In this example, the final view has a mark for every week for every customer customer. So with that context, this calculation will return 1 if the customer had an order this week or the two weeks prior. If a customer had an order for every week, this calculation will still only return 1 because we are using WINDOW_MAX() rather than WINDOW_SUM().
  • This calculation alone will only return a 1 or 0 for each customer. Therefore we need to nest this calculation in a WINDOW_SUM() to add up all of the 1s for all of the customers.
  • These calculations must be in separate calculated fields so that we can set how the table functions are computing differently.
Did this article resolve the issue?