Published: 12 Apr 2017
Last Modified Date: 24 Aug 2022
Question
How to determine whether to join tables or use data blending.
Environment
Tableau Desktop
Answer
Assess your data and consider the following:
How many connections you have in your workbook
How the data in each source is related
The number of records you have in the data
The types of views you will need to create
In most scenarios, traditional table joins are the best choice: they tend to perform best because the work is done by the database rather than the local machine. Cross-database joins are generally recommended over data blending because they have fewer limitations.
Tableau Desktop cannot join published data sources, most extract-only data sources, or cube data sources. To use a cross-database join, consider moving the data to another data source or creating a local copy of the published data source. Otherwise, the data must be blended.
If the secondary table has a large amount of data, data blending may be faster, because data blending aggregates the data before combining it.
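The effect of aggregating first can be illustrated outside Tableau. The following is a minimal Python sketch, not Tableau's implementation: the region names, amounts, and the helper `aggregate_by_link` are all hypothetical. It shows how many detail rows collapse to one row per linking-field value before anything is combined.

```python
from collections import defaultdict

# Hypothetical secondary source: several detail rows per linking value (region).
secondary_rows = [
    ("East", 100), ("East", 250), ("East", 50),
    ("West", 300), ("West", 75),
]

def aggregate_by_link(rows):
    """Collapse detail rows to one summed total per linking-field value."""
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)

secondary_totals = aggregate_by_link(secondary_rows)
print(len(secondary_rows), "detail rows ->", len(secondary_totals), "aggregated rows")
print(secondary_totals)
```

Only the aggregated rows need to be matched against the primary source, which is why a large secondary table can be cheaper to blend than to join row by row.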
Table joins are better when tables have a 1:1 relationship (that is, there is only one record for each value of the linking fields in each table). If the tables have a 1:many or many:many relationship, the join produces duplicated data. See Over-counting values or duplicated data.
Note: Duplicated data can be resolved after the table join is created, but the increased amount of data may cause performance issues.
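The over-counting caused by a 1:many join can be reproduced with any SQL engine. The sketch below uses Python's built-in `sqlite3` module with two hypothetical tables, `orders` and `shipments`; the table names, columns, and values are assumptions made for illustration and do not come from Tableau.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, amount INTEGER);
    CREATE TABLE shipments (order_id INTEGER, box TEXT);
    INSERT INTO orders VALUES (1, 100), (2, 200);
    -- Order 1 ships in two boxes, so orders:shipments is 1:many.
    INSERT INTO shipments VALUES (1, 'A'), (1, 'B'), (2, 'C');
""")

# Total computed from the orders table alone.
true_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# The same total after joining: order 1 appears once per shipment row.
joined_rows, joined_total = conn.execute("""
    SELECT COUNT(*), SUM(o.amount)
    FROM orders AS o
    JOIN shipments AS s ON s.order_id = o.order_id
""").fetchone()

print("total before join:", true_total)
print("rows after join:", joined_rows, "total after join:", joined_total)
conn.close()
```

The join repeats the amount for order 1 on each of its shipment rows, so the summed amount after the join is larger than the total in the orders table.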
Data blending has several limitations due to design constraints. For more information about common issues with data blending, see Troubleshoot Data Blending and the Related Links on the right-hand side of this article.
Data blending always behaves like a LEFT join, which means the primary table should contain all possible values of the linking fields.
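The consequence of left-join behavior can be sketched in a few lines of Python. This is an illustration under assumed data, not Tableau's internal logic: the lists, values, and the helper `left_combine` are hypothetical.

```python
# Hypothetical linking values present in the primary source.
primary_keys = ["East", "West", "North"]

# Hypothetical aggregated values from the secondary source.
# "South" does not exist in the primary source.
secondary_totals = {"East": 400, "South": 90}

def left_combine(keys, secondary):
    """Keep every primary key; attach the secondary value or None if missing."""
    return [(key, secondary.get(key)) for key in keys]

combined = left_combine(primary_keys, secondary_totals)
for key, value in combined:
    print(key, value)
```

Every primary value is kept, with an empty result where the secondary source has no match, while "South" never appears because it is missing from the primary source.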
Data blending can be more flexible for ad-hoc analysis because the linking properties can be changed quickly. For example, data blending allows data sources to be linked on aliases.
Additional Information
If one of the data sources that you want to use for data blending is a cube, the cube must be the primary data source.