Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
APPLIES TO:
Azure Cosmos DB for PostgreSQL (powered by the Citus database
extension to PostgreSQL)
Azure Cosmos DB for PostgreSQL includes features beyond standard PostgreSQL. Below is a categorized reference of functions and configuration options for:
- Parallelizing query execution across shards
- Managing sharded data between multiple servers
- Compressing data with columnar storage
- Automating timeseries partitioning
SQL functions
Sharding
| Name | Description | 
|---|---|
| alter_distributed_table | Change the distribution column, shard count or colocation properties of a distributed table | 
| citus_copy_shard_placement | Repair an inactive shard placement using data from a healthy placement | 
| citus_schema_distribute | Turn a PostgreSQL schema into a distributed schema | 
| citus_schema_undistribute | Undo the action of citus_schema_distribute | 
| create_distributed_table | Turn a PostgreSQL table into a distributed (sharded) table | 
| create_reference_table | Maintain full copies of a table in sync across all nodes | 
| citus_add_local_table_to_metadata | Add a local table to metadata to enable querying it from any node | 
| isolate_tenant_to_new_shard | Create a new shard to hold rows with a specific single value in the distribution column | 
| truncate_local_data_after_distributing_table | Truncate all local rows after distributing a table | 
| undistribute_table | Undo the action of create_distributed_table or create_reference_table | 
Shard rebalancing
| Name | Description | 
|---|---|
| citus_add_rebalance_strategy | Append a row to pg_dist_rebalance_strategy | 
| citus_move_shard_placement | Typically used indirectly during shard rebalancing rather than being called directly by a database administrator | 
| citus_set_default_rebalance_strategy | Change the strategy named by its argument to be the default chosen when rebalancing shards | 
| get_rebalance_progress | Monitor the moves planned and executed by rebalance_table_shards | 
| get_rebalance_table_shards_plan | Output the planned shard movements of rebalance_table_shards without performing them | 
| rebalance_table_shards | Move shards of the given table to distribute them evenly among the workers | 
Colocation
| Name | Description | 
|---|---|
| create_distributed_function | Make function run on workers near colocated shards | 
| update_distributed_table_colocation | Update or break colocation of a distributed table | 
Columnar storage
| Name | Description | 
|---|---|
| alter_columnar_table_set | Change settings on a columnar table | 
| alter_table_set_access_method | Convert a table between heap or columnar storage | 
Timeseries partitioning
| Name | Description | 
|---|---|
| alter_old_partitions_set_access_method | Change storage method of partitions | 
| create_time_partitions | Create partitions of a given interval to cover a given range of time | 
| drop_old_time_partitions | Remove all partitions whose intervals fall before a given timestamp | 
Informational
| Name | Description | 
|---|---|
| citus_get_active_worker_nodes | Get active worker host names and port numbers | 
| citus_relation_size | Get disk space used by all the shards of the specified distributed table | 
| citus_remote_connection_stats | Show the number of active connections to each remote node | 
| citus_stat_statements_reset | Remove all rows from citus_stat_statements | 
| citus_table_size | Get disk space used by all the shards of the specified distributed table, excluding indexes | 
| citus_total_relation_size | Get total disk space used by the all the shards of the specified distributed table, including all indexes and TOAST data | 
| column_to_column_name | Translate the partkeycolumn ofpg_dist_partitioninto a textual column name | 
| get_shard_id_for_distribution_column | Find the shard ID associated with a value of the distribution column | 
Server parameters
Query execution
| Name | Description | 
|---|---|
| citus.all_modifications_commutative | Allow all commands to claim a shared lock | 
| citus.count_distinct_error_rate | Tune error rate of postgresql-hll approximate counting | 
| citus.enable_repartition_joins | Allow JOINs made on non-distribution columns | 
| citus.enable_repartitioned_insert_select | Allow repartitioning rows from the SELECT statement and transferring them between workers for insertion | 
| citus.limit_clause_row_fetch_count | The number of rows to fetch per task for limit clause optimization | 
| citus.local_table_join_policy | Where data moves when doing a join between local and distributed tables | 
| citus.multi_shard_commit_protocol | The commit protocol to use when performing COPY on a hash distributed table | 
| citus.propagate_set_commands | Which SET commands are propagated from the coordinator to workers | 
| citus.create_object_propagation | Behavior of CREATE statements in transactions for supported objects | 
| citus.use_citus_managed_tables | Allow local tables to be accessed in worker node queries | 
Informational
| Name | Description | 
|---|---|
| citus.explain_all_tasks | Make EXPLAIN output show all tasks | 
| citus.explain_analyze_sort_method | Sort method of the tasks in the output of EXPLAIN ANALYZE | 
| citus.log_remote_commands | Log queries the coordinator sends to worker nodes | 
| citus.multi_task_query_log_level | Log-level for any query that generates more than one task | 
| citus.stat_statements_max | Max number of rows to store in citus_stat_statements | 
| citus.stat_statements_purge_interval | Frequency at which the maintenance daemon removes records from citus_stat_statementsthat are unmatched inpg_stat_statements | 
| citus.stat_statements_track | Enable/disable statement tracking | 
| citus.show_shards_for_app_name_prefixes | Allows shards to be displayed for selected clients that want to see them | 
| citus.override_table_visibility | Enable/disable shard hiding | 
Inter-node connection management
| Name | Description | 
|---|---|
| citus.executor_slow_start_interval | Time to wait in milliseconds between opening connections to the same worker node | 
| citus.force_max_query_parallelization | Open as many connections as possible | 
| citus.max_adaptive_executor_pool_size | Max worker connections per session | 
| citus.max_cached_conns_per_worker | Number of connections kept open to speed up subsequent commands | 
| citus.node_connection_timeout | Max duration (in milliseconds) to wait for connection establishment | 
Data transfer
| Name | Description | 
|---|---|
| citus.enable_binary_protocol | Use PostgreSQL’s binary serialization format (when applicable) to transfer data with workers | 
| citus.max_intermediate_result_size | Size in KB of intermediate results for CTEs and subqueries that are unable to be pushed down | 
Deadlock
| Name | Description | 
|---|---|
| citus.distributed_deadlock_detection_factor | Time to wait before checking for distributed deadlocks | 
| citus.log_distributed_deadlock_detection | Whether to log distributed deadlock detection-related processing in the server log | 
System tables
The coordinator node contains metadata tables and views to help you see data properties and query activity across the cluster.
| Name | Description | 
|---|---|
| citus_dist_stat_activity | Distributed queries that are executing on all nodes | 
| citus_lock_waits | Queries blocked throughout the cluster | 
| citus_shards | The location of each shard, the type of table it belongs to, and its size | 
| citus_stat_statements | Stats about how queries are being executed, and for whom | 
| citus_tables | A summary of all distributed and reference tables | 
| citus_worker_stat_activity | Queries on workers, including tasks on individual shards | 
| pg_dist_colocation | Which tables' shards should be placed together | 
| pg_dist_node | Information about worker nodes in the cluster | 
| pg_dist_object | Objects such as types and functions that have been created on the coordinator node and propagated to worker nodes | 
| pg_dist_placement | The location of shard replicas on worker nodes | 
| pg_dist_rebalance_strategy | Strategies that rebalance_table_shardscan use to determine where to move shards | 
| pg_dist_shard | The table, distribution column, and value ranges for every shard | 
| time_partitions | Information about each partition managed by such functions as create_time_partitionsanddrop_old_time_partitions | 
Next steps
- Learn some useful diagnostic queries
- Review the list of configuration parameters in the underlying PostgreSQL database.