Applies to:
Databricks SQL
Databricks Runtime 11.3 LTS and above
Reorganize a Delta Lake table by rewriting files to purge soft-deleted data, such as the column data dropped by ALTER TABLE DROP COLUMN, or by performing Delta Lake checkpointing to improve metadata management.
Syntax
REORG [ TABLE ] table_name { [ WHERE predicate ] APPLY ( PURGE ) |
APPLY ( UPGRADE UNIFORM ( ICEBERG_COMPAT_VERSION = version ) |
CHECKPOINT ) }
In Databricks Runtime versions before 15.4, TABLE is a mandatory keyword.
Note
- APPLY (PURGE) only rewrites files that contain soft-deleted data.
- APPLY (UPGRADE) may rewrite all files.
- REORG TABLE is idempotent, meaning that if it is run twice on the same dataset, the second run has no effect.
- After running APPLY (PURGE), the soft-deleted data may still exist in the old files. You can run VACUUM to physically delete the old files.
- APPLY (CHECKPOINT) requires the table to have the V2 Checkpoint table feature enabled to prevent corruption caused by race conditions.
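The purge-then-vacuum sequence described in the note can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the column name user_agent is hypothetical, and the table properties shown are one assumed way to enable column mapping, which DROP COLUMN requires. Check the requirements for your runtime before applying them.

```sql
-- Assumption: column mapping is not yet enabled on the table.
-- DROP COLUMN requires it; these properties are one way to turn it on.
ALTER TABLE events SET TBLPROPERTIES (
  'delta.minReaderVersion' = '2',
  'delta.minWriterVersion' = '5',
  'delta.columnMapping.mode' = 'name'
);

-- Metadata-only drop: the column data is soft-deleted but still in the files.
-- user_agent is a hypothetical column used for illustration.
ALTER TABLE events DROP COLUMN user_agent;

-- Rewrite only the files that still contain the soft-deleted data.
REORG TABLE events APPLY (PURGE);

-- Physically delete the old files. VACUUM only removes files older than the
-- retention threshold (7 days by default), so they may persist until then.
VACUUM events;
```

Because REORG TABLE is idempotent, rerunning the APPLY (PURGE) step after a partial failure should be safe: files that were already rewritten are not rewritten again.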
Parameters
- table_name
  Identifies an existing Delta table. The name must not include a temporal specification or options specification.
- WHERE predicate
  For APPLY (PURGE), reorganizes the files that match the given partition predicate. Only filters involving partition key attributes are supported.
- APPLY (PURGE)
  Specifies that the purpose of file rewriting is to purge soft-deleted data. See Purge metadata-only deletes to force data rewrite.
- APPLY (UPGRADE UNIFORM ( ICEBERG_COMPAT_VERSION = version ))
  Applies to: Databricks SQL, Databricks Runtime 14.3 and above
  Specifies that the purpose of file rewriting is to upgrade the table to the given Apache Iceberg version. version must be either 1 or 2.
- APPLY (CHECKPOINT)
  Applies to: Databricks Runtime 16.3 and above
  Performs Delta checkpointing on the table's latest Delta version.
Examples
> REORG TABLE events APPLY (PURGE);
> REORG TABLE events WHERE date >= '2022-01-01' APPLY (PURGE);
> REORG TABLE events
  WHERE date >= current_timestamp() - INTERVAL '1' DAY
  APPLY (PURGE);
> REORG TABLE events APPLY (UPGRADE UNIFORM(ICEBERG_COMPAT_VERSION=2));
> REORG TABLE events APPLY (CHECKPOINT);
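Since APPLY (CHECKPOINT) requires the V2 Checkpoint table feature, a table that does not yet have it needs the feature enabled first. The sketch below assumes the feature is turned on through the delta.checkpointPolicy table property; treat that property name and value as an assumption to verify against your runtime's documentation.

```sql
-- Assumption: setting this property enables the V2 Checkpoint table feature.
ALTER TABLE events SET TBLPROPERTIES ('delta.checkpointPolicy' = 'v2');

-- With the feature enabled, checkpoint the table's latest Delta version.
REORG TABLE events APPLY (CHECKPOINT);
```

Enabling a table feature upgrades the table protocol, so readers and writers that do not support V2 checkpoints may no longer be able to access the table.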