Tableau Server Backups Are Failing At Verifying Disk Space in 2022.3.6 and 2023.1+


Published: 16 May 2023
Last Modified Date: 04 Aug 2023

Issue

On a multi-node cluster with two or more Filestore processes, Tableau Server backups fail while the backup job is estimating the required disk space, as shown below:

tsm maintenance backup -f TableauServer_Backup

Job id is '54', timeout is 1440 minutes.
7% - Starting the Active Repository instance, File Store, and Cluster Controller.
14% - Waiting for the Active Repository, File Store, and Cluster Controller to start.
21% - Installing backup services.
28% - Estimating required disk space failed.
35% - Stopping the Active Repository if necessary.
42% - Waiting for the Active Repository to stop if necessary.
50% - Uninstalling backup services.

Tableau Server could not verify that there is sufficient disk space for backup. If you believe there is enough space, you may skip the disk space check and force creation of a backup by using the '--override-disk-space-check' option. The backup will fail if it runs out of space.

See 'C:\ProgramData\Tableau\Tableau Server\data\tabsvc\logs\tabadmincontroller\tabadmincontroller_*.log' on Tableau Server nodes running the Administration Controller process for server log information.


Note: Adding the --override-disk-space-check option does not work around this issue. The disk space check is skipped, but the backup job still fails at a later step (BackupDataEngine).
 

Environment

  • Tableau Server versions:
    • 2021.2.24 (or below)
    • 2021.3.23 (or below)
    • 2021.4.18 (or below)
    • 2022.1.14 (or below)
    • 2022.3.6 (or below)
    • 2023.1.2 (or below)
  • Multi-Node Cluster with two or more Filestore processes

Resolution

Option 1: Upgrade Tableau Server to one of the following versions:

  • 2021.3.25+
  • 2021.4.20+
  • 2022.1.16+
  • 2022.3.8+
  • 2023.1.4+
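
The affected versions under Environment and the fixed versions under Option 1 can be encoded as a small lookup when many servers need to be checked. The following is a minimal sketch, not an official tool: the function name classify_version is hypothetical, the "(or below)" wording is taken literally, and any release the article does not list (for example 2022.3.7, or any 2021.2 release above 2021.2.24) is reported as "unknown" rather than guessed. A result of "affected" only matters on a multi-node cluster with two or more Filestore processes.

```python
# Versions copied from the Environment and Resolution lists in this article.
# Keys are (year, release) branches; values are maintenance release numbers.
LAST_AFFECTED = {
    (2021, 2): 24,
    (2021, 3): 23,
    (2021, 4): 18,
    (2022, 1): 14,
    (2022, 3): 6,
    (2023, 1): 2,
}
FIRST_FIXED = {
    (2021, 3): 25,
    (2021, 4): 20,
    (2022, 1): 16,
    (2022, 3): 8,
    (2023, 1): 4,
}


def classify_version(version: str) -> str:
    """Return 'affected', 'fixed', or 'unknown' for a version like '2022.3.6'."""
    parts = version.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a version like 2022.3.6, got {version!r}")
    year, release, maintenance = (int(p) for p in parts)
    branch = (year, release)
    if branch in LAST_AFFECTED and maintenance <= LAST_AFFECTED[branch]:
        return "affected"
    if branch in FIRST_FIXED and maintenance >= FIRST_FIXED[branch]:
        return "fixed"
    # Branches or maintenance releases that the article does not list.
    return "unknown"
```

For example, classify_version("2023.1.2") falls in the affected range, while classify_version("2023.1.4") falls in the fixed range.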


Option 2: Configure the Filestore connection timeout value:

Within an administrative Command Prompt on the Controller node, apply the following configuration change:

tsm configuration set -k filestore.client.sockettimeoutms -v 0
tsm pending-changes apply

Applying the pending changes requires a restart of Tableau Server.

Setting the value to 0 removes the timeout entirely, which restores the previous default behavior for the filestore.client.sockettimeoutms key. After the configuration change has been applied, confirm that the issue is resolved by creating a Tableau Server backup.
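
Where the Option 2 change has to be scripted, the two commands can be wrapped as in the sketch below. This is an illustration under stated assumptions, not a supported utility: the function names build_commands and apply_timeout are hypothetical, the script is assumed to run in an administrative session where tsm is on the PATH, and it defaults to a dry run that only prints the commands. Only the tsm arguments shown in this article are used.

```python
import shutil
import subprocess
from typing import List

TIMEOUT_KEY = "filestore.client.sockettimeoutms"


def build_commands() -> List[List[str]]:
    """Return the two tsm invocations from Option 2 as argument lists."""
    return [
        # A value of 0 removes the Filestore client socket timeout.
        ["tsm", "configuration", "set", "-k", TIMEOUT_KEY, "-v", "0"],
        # Applying the pending change requires a Tableau Server restart.
        ["tsm", "pending-changes", "apply"],
    ]


def apply_timeout(dry_run: bool = True) -> List[List[str]]:
    """Print the commands (dry run) or execute them in order."""
    commands = build_commands()
    for command in commands:
        if dry_run:
            print(" ".join(command))
            continue
        # Resolve tsm on the PATH so that a wrapper script
        # (for example a .cmd file on Windows) is found as well.
        executable = shutil.which(command[0])
        if executable is None:
            raise FileNotFoundError("tsm was not found on PATH")
        # check=True stops the sequence if a command exits with an error.
        subprocess.run([executable] + command[1:], check=True)
    return commands
```

Calling apply_timeout() prints both commands without changing anything; apply_timeout(dry_run=False) runs them, and any restart prompt from tsm still appears in the console.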
 

Option 3: Reduce the Filestore instances down to one node:

If possible, keeping the remaining Filestore process on the initial node where Tabadmincontroller resides will ensure maximal performance for the backup job.

For details on how to remove a File Store instance, see the steps in the "Decommissioning and removing an instance of file store" section of the Product Guide.
 

Cause

This issue is now fixed. More information can be found on the Salesforce Known Issue website with ID W-13476908.
 

Additional Information

Identifying a match for this issue:

1. Confirm that the Tableau Server version is one of the affected versions listed under Environment (for example, 2022.3.6 or 2023.1.2)

2. Confirm that there is enough space for the backup, per this section of the Product Guide: Disk Requirements > Backup and Restore Processes

3. The Tabadmincontroller log for the backup job will display a "Read timed out" error, either while estimating disk space (shown here) or in the BackupDataEngine step (if --override-disk-space-check was used in the backup command):

2023-05-05 09:50:06.074 -0500 pool-22-thread-1 : ERROR com.tableausoftware.tabadmin.webapp.asyncjobs.JobStepRunner - Running step EstimateRequiredSpace failed
java.lang.RuntimeException: java.lang.RuntimeException: BackupRestoreException(message:org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out)

The Backuprestore log will also display a "Read timed out" error at about the same time, logged when the Method:forceSync call to the File Store service fails:

2023-05-05 09:50:03.048 -0500 TThreadPoolServer WorkerProcess-%d : ERROR com.tableausoftware.tdfs.client.FileStoreService - Method:forceSync to File Store on host:<HOSTNAME>:<FILESTORE PORT> failed
java.util.concurrent.ExecutionException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
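
The two log signatures above can be checked for programmatically when several log files need to be reviewed. The following is a minimal sketch, not an official diagnostic: the function name find_signatures is hypothetical, and it assumes the exception line directly follows the ERROR line, as in the excerpts shown here.

```python
import re

# Strings taken from the log excerpts in this article.
TIMEOUT_MARKER = "java.net.SocketTimeoutException: Read timed out"
STEP_FAILED = re.compile(r"Running step (\w+) failed")
FORCE_SYNC = "Method:forceSync to File Store on host:"


def find_signatures(log_text: str) -> dict:
    """Report job steps and forceSync calls that failed with a read timeout."""
    lines = log_text.splitlines()
    failed_steps = []
    force_sync_timeout = False
    for index, line in enumerate(lines):
        # Assumption: the exception is logged on the line after the ERROR line.
        following = lines[index + 1] if index + 1 < len(lines) else ""
        if TIMEOUT_MARKER not in following:
            continue
        match = STEP_FAILED.search(line)
        if match:
            failed_steps.append(match.group(1))
        if FORCE_SYNC in line:
            force_sync_timeout = True
    return {"failed_steps": failed_steps, "force_sync_timeout": force_sync_timeout}
```

Passing the contents of a Tabadmincontroller or Backuprestore log to find_signatures returns the failed step names (EstimateRequiredSpace or BackupDataEngine would point to this issue) and whether a forceSync timeout was logged.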