KNOWLEDGE BASE

Configuring Parallel Queries in Tableau Desktop


Published: 28 Oct 2016
Last Modified Date: 30 Dec 2016

Question

How to configure parallel queries in Tableau Desktop.

Environment

Tableau Desktop

Answer

Tableau Desktop can initiate multiple, parallel connections automatically when communicating with a data source. You can specify any and all of the following:
  • A global limit on the number of parallel queries for Tableau Desktop

  • Limits for a particular data source type, such as SQL Server

  • Limits for a particular data source type on a specific server

  • Limits for a particular data source type, on a specific server, when connecting to a specific database

These settings are managed by an xml file named connection-configs.xml which you create and save in the Tableau Desktop folder (by default, C:\Program Files\Tableau\Tableau <Version> on a Windows computer. On a Mac, the folder location is Applications/Tableau.app/Content. To access this folder from Finder, navigate to the Tableau application, press Control+Click, and choose Show Package Content.

A sample connection-configs.xml file is shown below.

<?xml version='1.0' encoding='utf-8' ?>
<connection-list>
   <connection class='sqlserver' dbname='SalesDB' server='SQLserver10'>
      <limit max='10'>
      </limit>
   </connection>
   <connection class='mysql' dbname='Inventory' server='MySQL_Public'>
      <limit max='10'>
      </limit>
   </connection>
   <connection class='teradata' schema='TravelData' server='TeradataProduction2'>
      <limit max='12'>
      </limit>
   </connection>
   <connection class='sqlserver'>
      <limit max='4'>
      </limit>
   </connection>
   <connection class='mysql'>
      <limit max='6'>
      </limit>
   </connection>
   <connection class='teradata'>
      <limit max='10'>
      </limit>
   </connection>
   <connection>
      <limit max='8'>
      </limit>
   </connection>
</connection-list>

Set a Global Limit

A <connection> tag with no information about data source type, server, or database defines the maximum number of queries for all connections not covered by any other, more specific, connection specification. For example:

   <connection>
      <limit max='8'>
      </limit>
   </connection>

This value overrides the default parallel query limit, which is 16 for all data sources types except Amazon Redshift (for which the default is 8).

Set Additional Limits

Connection settings in connection-configs.xml can work at different levels of specificity. When establishing a connection, Tableau parses the file from top to bottom, so if there is more than one entry that would apply to a given connection, the entry nearest the top of the file is used.

For this reason, it is also appropriate to order connection specifications from most specific to least specific.

Additional (non-global) limits are set in a <connection-list> section at the top of the connection-configs.xml file.

The parameters you can use in a <connection> tag are:

ParameterString
classThe data source type. See the “Class Strings” table, below, for the strings to use for all data source types.
serverThe server computer.
dbnameThe database name.
schemaThis setting can be useful with Teradata data sources.

You can specify parameters in any order and any combination—for example, you could specify a server and a dbname without specifying a class.

Class Strings

Use the following strings to identify data source types in connection-configs.xml.

Data SourceString
Actian Vectorwise"vectorwise"
Amazon EMR"awshadoophive"
Amazon Redshift“redshift”
Aster Database"asterncluster"
Cloudera Hadoop"hadoophive"
Extracts created from cube data sourcesNot configurable.
DataStax Enterprise"datastax"
EXASolution"exasolution"
Firebird"firebird"
Generic ODBC"genericodbc"
Google Analytics"google-analytics"
Google BigQuery"bigquery"
Hortonworks Hadooop Hive"hortonworkshadoophive"
HP Vertica“vertica”
IBM BigInsights“bigsql”
IBM DB2"db2"
JavaScript Connector"jsconnector"
MapR Hadoop Hive"maprhadoophive"
MarkLogic"marklogic"
Microsoft Access"msaccess”
Microsoft Analysis Services"msolap"
Microsoft ExcelNot configurable.
Microsoft PowerPivot"powerpivot"
Microsoft SQL Server"sqlserver"
MySQL"mysql"
IBM Netezza"netezza"
OData“odata”
Oracle"oracle"
Oracle Essbase“essbase”
ParAccel"paraccel"
Pivotal Greenplum"greenplum"
PostgreSQL"postgres"
Progress OpenEdge"progressopenedge"
SAP HANA"saphana"
SAP Netweaver Business Warehouse"sapbw"
SAP Sybase ASE"sybasease"
SAP Sybase IQ"sybaseiq"
Salesforce"salesforce"
Spark SQL"spark"
Splunk"splunk"
Statistical FileNot configurable.
Tableau Data Extract"dataengine"
Teradata“teradata”
Text file“csv”

Logging

When the connections are matched to entries in connection-configs.xml, Tableau logs the connection limit for debugging purposes.

Additional Information

Prior to version 9.0, Tableau Desktop maintained a single connection per data source, and would send all queries sequentially. With serial query processing, the elapsed time for executing all queries against a data source was equal to or greater than the sum of the elapsed time for each individual query execution.

But many database systems are designed to execute queries concurrently, creating the opportunity to reduce the total elapsed time spent running all the queries needed for a data source.

Did this article resolve the issue?