KNOWLEDGE BASE

Configuring Parallel Queries in Tableau Server


Product(s): Tableau Server
Version(s): 9.0, 9.1, 9.2
Last Modified Date: 18 Oct 2016

Prior to version 9.0, Tableau Server maintained a single connection per data source, and would send all queries sequentially. With serial query processing, the elapsed time for executing all queries against a data source was equal to or greater than the sum of the elapsed time for each individual query execution.

But many database systems are designed to execute queries concurrently, creating the opportunity to reduce the total elapsed time spent running all the queries needed for a data source.

You can set parallel query limits for Tableau Server either with tabadmin set commands or with a connection-configs.xml file. Once you configure Tableau Server for parallel queries, the feature is on automatically.

Using tabadmin set commands, you can set global limits as well as limits for a particular data source type. Using tabadmin set is the safest and simplest method. You cannot, however, set limits at any finer level of granularity (for example, per server or per schema), as you can with a connection-configs.xml file. The advantage of using tabadmin set commands is that it is not necessary to copy the configuration file to all the vizqlserver configuration directories on all worker machines.

Setting Limits with tabadmin set Commands

You can also set parallel query limits with tabadmin set commands. Changes are written to the tabsvc.yml file.

The format for setting parallel query limits using tabadmin set is:

tabadmin set native_api.connection.limit.<connection class> <n>

For example, to set a global limit:

tabadmin set native_api.connection.globallimit 8

Or to set a limit on a specific type of data source:

tabadmin set native_api.connection.limit.sqlserver 5

For a list of connection class strings, see “Class Strings,” below.

Setting Limits with a connection-configs.xml File

Tableau Server will initiate multiple, parallel connections automatically when communicating with a data source. Using a connection-configs.xml file, you can specify any and all of the following:

•A global limit on the number of parallel queries for Tableau Server

•Limits for a particular data source type, such as SQL Server

•Limits for a particular data source type on a specific server

•Limits for a particular data source type, on a specific server, when connecting to a specific database

These settings are managed by an xml file named connection-configs.xml which you create and save in in the config directory in the vizqlserver folder (for example: C:\ProgramData\Tableau\TableauServer\data\tabsvc\config\vizqlserver). You must copy this configuration file to all the vizqlserver configuration directories in all worker machines.

See the sample connection-configs.xml file at the end of this article.

Set a Global Limit

A <connection> tag with no information about data source type, server, or database defines the maximum number of queries for all connections not covered by any other, more specific, connection specification. For example:

   <connection>
      <limit max='8'>
      </limit>
   </connection>

This value overrides the default parallel query limit. For the vizqlserver process, the default limit is 16; for the dataserver process, the default limit is 1000000. However, for both proceses, the default is 2 for Amazon Redshift. See "Limits for Specific Server Processes," below.

Set Additional Limits

Connection settings in connection-configs.xml can work at different levels of specificity. When establishing a connection, Tableau parses the file from top to bottom, so if there is more than one entry that would apply to a given connection, the entry nearest the top of the file is used.

For this reason, it is also appropriate to order connection specifications from most specific to least specific.

Additional (non-global) limits are set in a <connection-list> section at the top of the connection-configs.xml file.

The parameters you can use in a <connection> tag are:

ParameterString
classThe data source type. See the “Class Strings” table, below, for the strings to use for all data source types.
serverThe server computer.
dbnameThe database name.
schemaThis setting can be useful with Teradata data sources.

You can specify parameters in any order and any combination—for example, you could specify a server and a dbname without specifying a class or schema.

Limits for Specific Server Processes

You can also set limits on a per-connection basis for the vizqlserver and dataserver processes in Tableau Server, to limit how many simultaneous connections these processes can create.

You specify process limits by adding a process attribute within the <limit> element.

To set a connection limit for one of the processes, create a connection configuration like this:

<connection class='sqlserver'>
   <limit max='6' process='vizqlserver'>
   </limit>
</connection>

The vizqlserver process is limited to six simultaneous connections. Because no limit is specified for the dataserver process, the connection limit is determined by the global limit.

Alternately, you could create a connection element like this:

<connection class='sqlserver'>
   <limit max='6' process='vizqlserver'>
   </limit>
   <limit max='12'>
   </limit>
</connection>

As before, the vizqlserver process is limited to six simultaneous connections. But the dataserver process now uses the connection-specific limit for this connection class (12), rather than the global limit. When no process attribute is specified for a <limit> element, that limit applies to any server processes not specified above it within the connection element.

Finally, you can set separate specific limits for each process:

<connection class='sqlserver'>
   <limit max='6' process='vizqlserver'>
   </limit>
   <limit max='12' process='dataserver'>
   </limit>
</connection>

This is equivalent to the previous example, where dataserver is not specified in the second <limit> element. The syntax is designed to eventually be able to support additional processes.

If you want to set a limit for both server processes in a connection, you can either explicitly list them both (with a comma delimiter) in a single <limit> element, or just use a <limit> element with no processes specified. Thus, the following two limit elements are equivalent:

   <limit max='10' process='dataserver, vizqlserver'>
   <limit max='10'>

Process limits are only relevant for configuring parallel queries on Tableau Server. If you use a connection-configs.xml file that contains process limits with Tableau Desktop, any <limit> element that specifies limits for either or both processes is ignored.

Example connection-configs.xml File

<?xml version='1.0' encoding='utf-8' ?>
<connection-list>
   <connection class='sqlserver' dbname='SalesDB' server='SQLserver10'>
      <limit max='10'>
      </limit>
   </connection>
   <connection class='mysql' dbname='Inventory' server='MySQL_Public'>
      <limit max='10'>
      </limit>
   </connection>
   <connection class='teradata' schema='TravelData' server='TeradataProduction2'>
      <limit max='12'>
      </limit>
   </connection>
   <connection class='sqlserver'>
      <limit max='4'>
      </limit>
   </connection>
   <connection class='mysql'>
      <limit max='6'process='vizqlserver'>
      </limit>
      <limit max='10' process='dataserver'>
      </limit>
   </connection>
   <connection class='teradata'>
      <limit max='10'>
      </limit>
   </connection>
   <connection>
      <limit max='8'>
      </limit>
   </connection>
</connection-list>

Class Strings

Use the following strings to identify data source types in connection-configs.xml.

 

Data SourceString
Actian Vectorwise"vectorwise"
Amazon EMR"awshadoophive"
Amazon Redshift“redshift”
Aster Database"asterncluster"
Cloudera Hadoop"hadoophive"
Extracts created from cube data sourcesNot configurable.
DataStax Enterprise"datastax"
EXASolution"exasolution"
Firebird"firebird"
Generic ODBC"genericodbc"
Google Analytics"google-analytics"
Google BigQuery"bigquery"
Hortonworks Hadooop Hive"hortonworkshadoophive"
HP Vertica“vertica”
IBM BigInsights“bigsql”
IBM DB2"db2"
JavaScript Connector"jsconnector"
MapR Hadoop Hive"maprhadoophive"
MarkLogic"marklogic"
Microsoft Access"msaccess”
Microsoft Analysis Services"msolap"
Microsoft ExcelNot configurable.
Microsoft PowerPivot"powerpivot"
Microsoft SQL Server"sqlserver"
MySQL"mysql"
IBM Netezza"netezza"
OData“odata”
Oracle"oracle"
Oracle Essbase“essbase”
ParAccel"paraccel"
Pivotal Greenplum"greenplum"
PostgreSQL"postgres"
Progress OpenEdge"progressopenedge"
SAP HANA"saphana"
SAP Netweaver Business Warehouse"sapbw"
SAP Sybase ASE"sybasease"
SAP Sybase IQ"sybaseiq"
Salesforce"salesforce"
Spark SQL"spark"
Splunk"splunk"
Statistical FileNot configurable.
Tableau Data Extract"dataengine"
Teradata“teradata”
Text file“csv”

 

 

Did this article resolve the issue?