Checking for Tables that Need Routine Maintenance
The following views can help identify tables that need routine table maintenance (VACUUM and/or ANALYZE).
The VACUUM or VACUUM FULL command reclaims disk space occupied by deleted or obsolete rows. Because of the MVCC transaction concurrency model used in Greenplum Database, data rows that are deleted or updated still occupy physical space on disk even though they are not visible to any new transactions. Expired rows increase table size on disk and eventually slow down scans of the table.
The ANALYZE command collects column-level statistics needed by the query optimizer. Greenplum Database uses a cost-based query optimizer that relies on database statistics. Accurate statistics allow the query optimizer to better estimate selectivity and the number of rows retrieved by a query operation in order to choose the most efficient query plan.
gp_bloat_diag
This view shows tables that have bloat (the actual number of pages on disk exceeds the expected number of pages given the table statistics). Tables that are bloated require a VACUUM or a VACUUM FULL in order to reclaim disk space occupied by deleted or obsolete rows. This view is accessible to all users, however non-superusers will only be able to see the tables that they have permission to access.
| Column | Description |
|---|---|
| bdirelid | Table object id. |
| bdinspname | Schema name. |
| bdirelname | Table name. |
| bdirelpages | Actual number of pages on disk. |
| bdiexppages | Expected number of pages given the table data. |
| bdidiag | Bloat diagnostic message. |
gp_stats_missing
This view shows tables that do not have statistics and therefore may require an ANALYZE be run on the table.
| Column | Description |
|---|---|
| smischema | Schema name. |
| smitable | Table name. |
| smisize | Does this table have statistics? False if the table does not have row count and row sizing statistics recorded in the system catalog, which may indicate that the table needs to be analyzed. This will also be false if the table does not contain any rows. For example, the parent tables of partitioned tables are always empty and will always return a false result. |
| smicols | Number of columns in the table. |
| smirecs | Number of rows in the table. |