Published: 29 Mar 2022 Last Modified Date: 30 Mar 2022
Question
The "New Rows" step in Tableau Prep fills missing dates or numbers in a field across the entire data set, but what if you wanted to fill gaps in a date or number field per category?
For example, the field [Order Number] in the sample data set Superstore has the date #12/27/2021# and thus it would not be filled in by the "New Rows" step, but this date is missing for the "Furniture" category.
Environment
Tableau Prep
Answer
The following steps are demonstrated in the packaged prep flow "Fill missing dates per category.tflx," which can be downloaded from the right-hand pane of this article.
Connect to the sample superstore data set
From the input step, create an aggregation step and add Order Date to the Grouped Fields
From the "Aggregate 1" step, add a new rows step
In the "New Rows 1" step, select Values from one field and then select Order Date in the dropdown
From the "New Rows 1" step, add a clean step
In the "Clean 1" step, create a calculation called "Join Calc" with the formula "1"
From the input step, create a new aggregation step (this will create a new branch) and add Category to the Grouped Fields
From the "Aggregate 2" step, add a clean step
In the "Clean 2" step, create a calculation called "Join Calc" with the formula "1"
Drag "Clean 2" onto "Clean 1" and drop on Join
In "Join 1" join on Join Calc = Join Calc
Drag "Join 1" onto the input step and drop on Join
In "Join 2" do the following:
Set the join to a RIGHT join, which keeps all of the data from "Join 1"
Add join clauses on Order Date = Order Date and Category = Category
From the "Join 2" step, add a clean step
In "Clean 3" do the following:
Remove Join Calc and Join Calc-1
Merge Order Date and Order Date-1
Merge Category and Category-1
Additional Information
This prep flow first creates a complete list of all dates and joins that on 1=1 to a complete list of categories. The join on 1=1 forces duplication so that every category has every date. Joining back to the original data requires a RIGHT join because the complete list of all dates per all categories has more records than the original data.
Thank you for providing your feedback on the effectiveness of the article.