KNOWLEDGE BASE

Grouping consecutive values based on a condition


Published: 16 Feb 2018
Last Modified Date: 15 Aug 2023

Question

How to group consecutive values that meet a condition, or in other words all the values that belong to the same streak.

For example, creating groups of days based on the number of consecutive days that had a negative profit. If there are 3 days in a row with negative profit, those days would all belong to group "3"

Environment

  • Tableau Desktop
  • Optional: Tableau Prep

Answer

The following instructions, using the sample data set Superstore, are demonstrated in the workbook "running count groups_v2021.1.twbx" which is downloadable from the right-hand pane.

These directions create a highlight table that counts the number of streaks in each grouping. For example how many 3-day streaks of negative profit are in the data set? The workbook also includes instructions for creating a crosstab view and a Gantt chart view.

Option 1: Use PREVIOUS_VALUE() in Tableau Desktop

We can find the starting date for a group of consecutive days by checking if the current day meets the condition (has negative profit) and the previous day does not. Using the PREVIOUS_VALUE() function, we can repeat the starting date value for every day in the grouping. Then we can find end date with a similar process but reading the data from oldest to newest date.

Create the Calculations

  1. Optional: Create a calculated field with a name like "Profit no gaps" with a calculation similar to the following:
    ZN(LOOKUP(SUM([Profit]),0))
    
    
    Note: If the data set is missing days, then this step will allow us to treat missing days as having zero profit. If you skip this step use SUM([Profit]) rather than [Profit no gaps] in all following steps.
     
  2. Create a calculated field with a name like "Start Date" with a calculation similar to the following:
        IF [Profit no gaps] >= 0
    	THEN NULL
    	
    	ELSEIF LOOKUP([Profit no gaps],-1) >= 0
    	THEN MIN([Order Date])
    	
    	ELSE PREVIOUS_VALUE(#1/1/18#)
    	END
    
    
  3. Create a calculated field with a name like "Next Profit Value" with a calculation similar to the following:
    
        LOOKUP([Profit no gaps],1)
        
  4. Create a calculated field with a name like "End Date" with a calculation similar to the following:
    
    	IF [Profit no gaps] >= 0
    	THEN NULL
    	
    	ELSEIF [Next Profit Value] >= 0
    	THEN MIN([Order Date])
    	
    	ELSE PREVIOUS_VALUE(#1/1/18#)
    	END 
    
    
  5. Create a calculated field with a name like "Days in Streak" with a calculation similar to the following:
    	
        DATEDIFF('day', [Start Date], [End Date]) + 1
        
  6. Create a calculated field with a name like "# of Streaks" with a calculation similar to the following:
    
        IF [Days in Streak] = 1 THEN
    	    WINDOW_SUM(
    	    IF [Days in Streak] = 1
    	    THEN COUNTD([Order Date])
    	    END
    	    )
    	ELSEIF [Days in Streak] = 2 THEN
    	    WINDOW_SUM(
    	    IF [Days in Streak] = 2
    	    THEN COUNTD([Order Date])
    	    END
    	    )
    	ELSEIF [Days in Streak] = 3 THEN
    	    WINDOW_SUM(
    	    IF [Days in Streak] = 3
    	    THEN COUNTD([Order Date])
    	    END
    	    )
    	ELSEIF ...
    	END / [Days in Streak]    
     

    Note: The calculated fields [# of Streaks] and [Keep only one date per streak group] must include a condition for every possible number of days in a streak. For more explanation on why see the article Using an Aggregate Field to Define the Level of Detail in a Table Calculation

  7. Create a calculated field with a name like "Keep only one date per streak group" with a calculation similar to the following:
    
        MIN([Order Date]) = WINDOW_MIN(IF[Days in Streak] = 1 THEN [Start Date] END)
    	OR MIN([Order Date]) = WINDOW_MIN(IF[Days in Streak] = 2 THEN [Start Date] END)
    	OR MIN([Order Date]) = WINDOW_MIN(IF[Days in Streak] = 3 THEN [Start Date] END)
    	OR ...
    


Build the Highlight Table

  1. Drag [Category] to the Columns shelf
  2. Right-click and drag [Order Date] to Detail on the Marks card
  3. In the Drop Field dialog, select MDY(Order Date) and click OK
  4. Drag [Days in Streak] to Label on the Marks Card
  5. Right-click [Days in Streak] on Label and select Edit Table Calculation...
  6. In the Table Calculation dialog, do the following:
    1. Under Nested Calculations, select Start Date
    2. Under Compute Using, select Specific Dimensions
    3. Check only Month, Day, Year of Order Date
    4. Under Nested Calculations, select Profit no gaps
    5. Under Compute Using, select Specific Dimensions
    6. Uncheck all dimensions
    7. Under Nested Calculations, select Next Profit Value
    8. Under Compute Using, select Specific Dimensions
    9. Check only Month, Day, Year of Order Date
    10. Under Nested Calculations, select End Date
    11. Under Compute Using, select Specific Dimensions
    12. Check only Month, Day, Year of Order Date
    13. In the Sort Order dropdown, select Custom, Order Date, Minimum, Descending
  7. Ctrl drag [Days in Streak] from Label onto the Rows shelf
  8. Right-click [Days in Streak] on the Rows shelf and select Discrete
  9. Drag [Keep only one day per streak group] to the Filters shelf and click OK to close the Filter dialog
  10. Right-click [Keep only one day per streak group] on the Filters shelf and select Edit Table Calculation...
  11. In the Table Calculation dialog, do the following:
    1. Repeat steps 6-1 to 6-13
    2. Under Nested Calculations, select Keep only one date per streak group
    3. Under Compute Using, select Specific Dimensions
    4. Check only Month, Day, Year of Order Date
  12. Right-click [Keep only one day per streak group] on the Filters shelf and select Edit Filter. In the Filter dialog, check only True and click OK
  13. Ctrl drag [Days in Streak] from the Rows shelf to Color on the Marks card
  14. In the dropdown on the Marks card, select Square
  15. Click Size on the Marks card and adjust the slider all the way to the right
  16. Create a calculated field with a name like "Blank" with a calculation similar to the following:
    ""
    
    
  17. Drag one copy of [Blank] to the Rows shelf, and another one to the Columns shelf
  18. Right-click [Blank] on the Rows shelf and uncheck Show Header
 

Option 2: Use Tableau Prep

Creating the prep flow is more upfront work, but building the views from the prep output is much simpler compared to building the views with table calculations. The "Grouping Consequtive days.tflx" prep flow demonstrates the following steps and can be downloaded from the right-hand pane.

Create the Prep Flow

  1. Connect to Superstore and drag Orders into the canvas
  2. Optional: follow directions in Fill Gaps in Sequential Data or Fill Gaps in Sequential Data per Category to fill missing days in the data set
  3. Create a clean step named "Main Branch"
  4. Create an aggregate step named "Keep Only Fields"
  5. In the "Keep Only Fields" aggregate step, add Category and Order Date to Grouped Fields and SUM(Profit) to Aggregated Fields
  6. Create a clean step named "Row Number"
  7. In the "Row Number" clean step, do the following:
    1. Click Create Calculated Field... to create a new calculation
    2. Name the calculation "Condition"
    3. Add the formula for the condition being used to group days. For example,
      
      IF [Profit] < 0
      THEN 'negative'
      ELSE 'positive'
      END
      
      
    4. Create a new calculated field named "Row Number" with a formula similar to
      
      { PARTITION [Category], [Condition] : { ORDERBY [Order Date] ASC : RANK_DENSE()}}
      
  8. Create a clean step named "Row Number + 1"
  9. In the "Row Number + 1" clean step, do the following:
    1. Create a new calculated field named "Row Number + 1" with a formula similar to 
      
      [Row Number] + 1
      
      
    2. Remove [Row Number]
  10. Drag the "Row Number + 1" clean step on top of the "Row Number" clean step and drop on Join
  11. In the "Join 1" join step, do the following:
    1. Tableau Prep should automatically create a join clause on Condition = Condition, if not add this join clause
    2. Click the plus in the left-hand gray side panel in the Applied Join Clauses section
    3. In the "Row Number" menu select Category
    4. In the "Row Number + 1" menu select Category
    5. Add another join clause on Row Number = Row Number + 1
    6. On the venn diagram, select all of the "Row Number" circle to create a Left join
  12. Create a clean step named "Start Date"
  13. In the "Start Date" clean step, do the following:
    1. Click [Category] and then ctrl click [Category-1] to select both fields
    2. Click Merge Fields
    3. Merge [Condition] and [Condition-1]
    4. Create a new calculated field named "Start Date" with a formula similar to 
      
      IF ISNULL(DATEDIFF('day', [Order Date-1], [Order Date]))
      OR DATEDIFF('day', [Order Date-1], [Order Date]) > 1
      THEN [Order Date]
      END
      
      
    5. Click Filter Values...
    6. In the Add Filter dialog, add a formula similar to NOT ISNULL([Start Date]) and click Save
    7. Create a new calculated field named "Start Date Rank" with a formula similar to 
      { PARTITION [Category], [Condition] : { ORDERBY [Start Date] ASC: RANK_DENSE()}}
      
      
    8. Remove [Row Number], [Row Number + 1], [Order Date], [Order Date-1]
  14. Hover over the "Row Number" clean step and click the plus sign off the step (not the plus sign embedded in a line) to create a new clean step named "Row Number - 1" that will create a separate branch.
  15. Note: Steps 15-19 are very similar to steps 9-13, but with some differences to find the end date rather than the start date. Do not copy the calculated fields from the "Start Date" branch as there are differences. In the "Row Number - 1" clean step, do the following:
    1. Create a new calculated field named "Start Date Rank" with a formula similar to 
      
      [Row Number] - 1
      
      
    2. Remove [Row Number]
  16. Drag the "Row Number - 1" clean step on top of the "Row Number" clean step and drop on Join
  17. In the "Join 2" join step, do the following:
    1. Create join clauses on Row Number = Row Number - 1; Category = Category; Condition = Condition
    2. On the venn diagram, select all of the "Row Number" circle to create a Left join
  18. Create a clean step named "End Date"
  19. In the "End Date" clean step, do the following:
    1. Merge [Category] and [Category-1] 
    2. Merge [Condition] and [Condition-1]
    3. Create a new calculated field named "End Date" with a formula similar to
      
      IF ISNULL(DATEDIFF('day', [Order Date], [Order Date-1]))
      OR DATEDIFF('day', [Order Date], [Order Date-1]) > 1
      THEN [Order Date]
      END
      
      
    4. Add a filter with a formula similar to NOT ISNULL([End Date])
    5. Create a new calculated field named "End Date Rank" with a formula similar to 
      
      { PARTITION [Category], [Condition] : { ORDERBY [End Date] ASC: RANK_DENSE()}}
      
      
    6. Remove [Row Number], [Row Number + 1], [Order Date], [Order Date-1]
  20. Drag the "End Date" clean step over the "Start Date" clean step and drop onto Join
  21. In Join 3, create join clauses for Condition = Condition; Category = Category; End Date Rank = Start Date Rank
  22. Create a clean step named "Clean up"
  23. In the "Clean up" clean step, do the following:
    1. Merge [Category] and [Category-1]
    2. Merge [Condition] and [Condition-1]
    3. Create a new calculated field named "# of Continuous Days" with a formula similar to 
      
      DATEDIFF('day', [Start Date],[End Date])+1
      
      
    4. Remove [Start Date Rank] and [End Date Rank]
  24. Drag the "Clean up" clean step over the "Main Branch" clean step and drop on Join
  25. In Join 4, create join clauses for Category = Category; Order Date >= Start Date; Order Date <= End Date
  26. Create a clean step named "Final Clean up"
  27. In the "Final Clean up" clean step, do the following:
    1. Merge [Category] and [Category-1]
  28. Optional: Click drag to select all of the steps in the side branch, right-click any highlighted step, and select Group

Create the Highlight Chart

  1. Drag [Category] to the Columns shelf
  2. Drag [# of Continuous Days] to the Rows shelf
  3. Right-click SUM(# of Continuous Days) on the Rows shelf and select Dimension
  4. Right-click [# of Continuous Days] on the Rows shelf and select Discrete
  5. Right-click and drag [Start Date] to Text
  6. In the Drop Field menu, select CNTD(Start Date) and click OK
  7. Ctrl drag CNTD(Start Date) from Label to Color on the Marks card
  8. In the dropdown menu on the Marks card, select Square
  9. Drag [Condition] to the Filters shelf
  10. In the Filter dialog, check negative and click OK
  11. In the fit dropdown menu where it says 'Standard', select 'Fit Width'

Additional Information

Notes on Option 1 Using PREVIOUS_VALUE()
  • The [Start Date] calculation returns NULL if the condition (profit is negative) is not met. Then the calculation checks if the condition was not met on the previous day, if so it returns the order date value. Finally if the current day and the previous day both meet the condition, then the calculation uses PREVIOUS_VALUE() to return the date from this calculation on the previous row. In other words, as Tableau Desktop reads through each row (aka each day), this calculation will return either NULL, that value of order date, or will repeat the value of order date it returned previously.
  • All of the table calculations in this view must be computed correctly to show the expected values. See Transform Values with Table Calculations for more information on how compute using settings change the results of a table calculation
  • Unlike LOOKUP(), the function PREVIOUS_VALUE() will return the last value of this calculation
Notes on Option 2 Using Tableau Prep:
  • The prep flow output includes start dates for groupings that meet the condition (has negative profit) and groupings that do not meet the condition. A filter could be added in the "Row Number" clean step to keep only [Condition]='negative' to exclude groupings that do not meet the condition, otherwise the [Condition] filter needs to be added to the view.

     

Transform Values with Table Calculations
Table Calculation Functions
Did this article resolve the issue?