Tableau is a powerful collaboration platform. One can easily create, share and distribute the workbooks and dashboards set up in Tableau. Since people across the organization depend on Tableau, it’s essential to keep it performant and scalable.
We need to understand one important thing about the performance of Tableau that it is only as fast as your data source. In case if the data source is responding slower to queries, then Tableau, in turn, would have delayed response. Here are few points to consider on Tableau performance optimization:
- Database Optimization: There are few things which we need to ensure are properly tuned when it comes to the database.
- Making sure all the necessary indexes are in place for all the columns used in joins
- Similarly ensuring that indexes are created for all the columns used in filters
- Run the queries on DBMS environments and look for performance recommendations suggested by tool
- Ensuring tables have Primary keys if possible
- Implement table level partitioning in case If the data volume is huge
- Create pre-aggregated summary tables in database for large tables, besides, apply all the calculated columns at DB level to improve the performance
Tableau Data Extracts (TDE):
Extracts are much faster to work with than a live data source. The extract is columnar store database which compresses the data internally. There are ways to improve the extract performance by ensuring
- Database level queries are optimized
- Select the fields which are necessary for Tableau. Hide all the unused fields
- Reduce the number of records in extract by applying the filters
- Tableau is capable of creating an extract that can be aggregated to the level of details needed. This can be achieved by selecting an aggregate data for visible dimensions
Context filters can be faster than extract. In this case, we still leverage the speed, power, and optimization of the database instead of relying on the client machine. There is also no need to re-extract to get the current data. The thumb rule for a context filter to work effectively is to use it in cases where we are filtering more than 90% of the data.
- If we have only if-else condition in calculations, it is recommended to use boolean calculations
- Use, Case function if we are using groups in visualization. This would return the same output but would be much faster as compared to groups
- Use ELSEIF instead of ELSE-IF, as in case of else-if condition the if is treated as the second if statement whereas for ELSEIF treats it as part of the first statement
For Large Datasets:
- While handling large data, Table Partitioning is a better option. In most situations, no query result is meaningful unless constrained by time. Consider partitioning at the date/time level.
Visualization and layout:
- Selecting proper visualization is also one way of optimizing the dashboard performance.
- Excess worksheets on a dashboard can impact performance. Better to hide or delete unused sheets in the dashboard.
Lastly, Tableau Performance Recorder is an excellent utility which records key events as we interact with a workbook. This would help us view the performance metrics that Tableau creates to analyze and troubleshoot different events such as:
- Query execution
- Data Source connections
- Extract Generation
- Layout computation
- Blending data
Tableau is architected in such a way that it maps very well to relational algebra and can splendidly turn a graphical interface into a query language. While translating the graphical interface into query language, It does not need the user to extract/import the data from the database into an analytical silo. Tableau permits the users to explore data quickly.
Get more insights on YASH Data Science & Analytics services.
Gaurav Mittal is Principal Consultant Tableau, Business Intelligence @YASH Technologies