Share via


Monitor materialized views in Databricks SQL

This article describes how to monitor and query refresh data about a materialized view in Databricks SQL.

View the details of a single materialized view

You can view the details of a single materialized view by using the Catalog Explorer, or programmatically, with the DESCRIBE EXTENDED operation. You can also query the event log for details of the refresh history of a materialized view.

View details in Catalog Explorer

You can access information about your materialized view by viewing it in Catalog Explorer. In Catalog Explorer, you can see the last refresh status and any refresh schedules created in SQL. For details on schedules created in SQL, see CREATE MATERIALIZED VIEW.

The Catalog Explorer shows details for each materialized view in a right-side panel, including:

  1. Current refresh status and last run time. To see more details about the update, including run duration, any detailed errors, as well as the refresh type (including whether the materialized view was fully or incrementally refreshed and why), click See refresh details.
  2. Refresh Schedule. If the materialized view has a schedule set via SQL, the schedule is shown. To update the schedule, use ALTER MATERIALIZED VIEW.
  3. Tags. To add custom tags, click Add tags. Tags are not automatically used to attribute costs, but you can manually create a query to connect custom tags to your costs. For details, see Attribute costs to the SQL warehouse with custom tags.

There are properties of the materialized view that are not available in Catalog Explorer. For those properties, or to get the information programmatically, you can use the DESCRIBE EXTENDED command.

View details with DESCRIBE EXTENDED

You can view details about a materialized view programmatically by using the DESCRIBE EXTENDED command. This includes details beyond what you get from the Catalog Explorer. They include:

  • The status of the latest completed refresh.

  • The refresh schedule.

  • The columns of the materialized view.

  • The refresh type of the materialized view (not available in Catalog Explorer).

  • The data size for the materialized view, in total bytes (not available in Catalog Explorer).

  • The storage location of the materialized view (not available in Catalog Explorer).

  • Some information is only included in the result when enabled:

    • Clustering columns, if enabled.
    • Whether deletion vectors are enabled (only shown when true).
    • Whether row tracking is enabled (only shown when true).
-- As table:
DESCRIBE TABLE EXTENDED sales;

-- As a single JSON object:
DESCRIBE TABLE EXTENDED sales AS JSON;

Querying the event log programmatically

To get details about the refresh history for a materialized view, or the details as a refresh is happening, you can query the event log programmatically.

As the pipeline owner, you can create a view to allow others to query the event log for your pipeline. The following query creates a view that others can use to query the event log. This query uses the TABLE value function to query the correct event log table.

CREATE VIEW my_event_log_view AS
  SELECT *
  FROM event_log(TABLE(<catalog_name>.<schema_name>.<mv_name>));

To query the event log view, use a query like the following.

SELECT *
  FROM my_event_log_view
  WHERE event_type = "update_progress"
  ORDER BY timestamp desc;

To query the event log directly as the pipeline owner, you do not need to create a view. You can use the TABLE value function and query the data directly, such as the following sample query.

SELECT *
  FROM event_log(TABLE(<catalog_name>.<schema_name>.<mv_name>))
  WHERE event_type = "update_progress"
  ORDER BY timestamp desc;

For a full list of sample queries using the event log, see Basic query examples.

Monitor materialized view runs

You can monitor pipeline runs across your workspace, using the Jobs & Pipelines page, the Query History page, or programmatically by querying the event log.

View all materialized view runs in the UI

If you use the Jobs & Pipelines page to monitor the status of different orchestration jobs across your workspace, you can also track all of the materialized views and streaming tables created there. Each materialized view created in Databricks SQL has a pipeline backing it. To see all of the materialized views and streaming tables that you have access to:

  1. Click the Workflows icon. Jobs & Pipelines button on the left side of your workspace.
  2. Click on the Pipelines toggle to filter your view to only pipelines.
  3. Click on Pipeline type button and select MV/ST to filter to only pipelines created by Databricks SQL.
  4. Click on the Accessible by me filter, to show all pipelines you have access to.

You will be able to see all materialized views and streaming tables created across your org, including a summary view of the recent runs statuses. Clicking the name of a pipeline opens the pipeline monitoring details page to get more information. To learn more about the pipeline monitoring details page, see Troubleshoot a failed refresh.

View runs using query history

If you’re more familiar with the Query History tab, you can also use it to view all previous runs of all queries where you have at least CAN VIEW access to the executing SQL warehouse. You can use the query history page to access query details and query profiles that can help you identify poorly performing queries and bottlenecks in the Lakeflow Declarative Pipelines used to run your streaming table updates. For an overview of the kind of information available for query histories and query profiles, see Query history and Query profile.

Important

This feature is in Public Preview. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.

All statements related to materialized views appear in the query history. You can use the Statement drop-down filter to select any command and inspect the related queries. All CREATE statements are followed by a REFRESH statement that executes asynchronously on a pipeline. The REFRESH statements typically include detailed query plans that provide insights into optimizing performance.

To access REFRESH statements in the query history UI, use the following steps:

  1. Click History icon. Query History in the left sidebar.
  2. Select the REFRESH checkbox from the Statement drop-down filter.
  3. Click the name of the query statement to view summary details like the duration of the query and aggregated metrics.
  4. Click See query profile to open the query profile. For details about navigating the query profile, see Query profile.
  5. Optionally, use the links in the Query Source section to open the related query or pipeline.

See CREATE MATERIALIZED VIEW.

Troubleshoot a failed refresh

You can find failing updates for materialized views (or streaming tables) by looking through the Jobs & Pipelines list. To troubleshoot a failed update, use the pipeline monitoring details page or the event log.

To troubleshoot a refresh that is full when you believe it should be incremental, first check that you have row tracking enabled for any source Delta tables. For other details about incremental refresh, see Support for materialized view incremental refresh.

You can get additional details from the pipeline monitoring page, or by querying the event log programmatically.

Using the pipeline monitoring page

To get more details about a materialized view (or streaming table) that has failed, you can use the pipeline monitoring page to debug issues. Each materialized view has a pipeline backing it. The pipeline monitoring page includes information, such as:

  • The status of the last run and the run history.
  • The last run duration.
  • Whether the materialized view fully or incrementally refreshed. For more details on getting your materialized view to incrementally refresh, see Support for materialized view incremental refresh.
  • The event log for more detailed debugging. In the event that your materialized view failed to update or had other issues, click the issues panel to view the logs in more detail.

To troubleshoot your materialized view:

  1. In your workspace, click the Workflows icon. Jobs & Pipelines button on the left-navigation bar.
  2. Click the name of your pipeline in the list.
  3. If the update failed to run, the UI shows an error (or a list of errors) in the bottom panel.
  4. Either click the button to View Logs or click the issues panel to view the errors in more detail.
  5. This opens the event log UI. Each error has a high-level message and summary, as well as a JSON tab with more details. To fix an issue with the Databricks Assistant, click Diagnose Error.

For more details on the pipeline monitoring page, see here.

Query the refresh history for a materialized view

The event log can be useful for setting up dashboards to monitor update status or duration across the workspace, or if you prefer programmatic monitoring over using the UI. Materialized views created with Databricks SQL do not support saving the event log to a metastore, so only the owner of the materialized view can query the event log directly.

To view the status of REFRESH operations on a materialized view, including current and past refreshes, query the Lakeflow Declarative Pipelines event log:

SELECT *
FROM event_log(TABLE(<fully-qualified-table-name>))
WHERE event_type = "update_progress"
ORDER BY timestamp desc;

Replace <fully-qualified-table-name> with the fully qualified name of the materialized view, including the catalog and schema. If you are not the pipeline owner, you may need to first have a view created to query the event log. See Querying the event log programmatically.

Query the type of refresh that occurred for a materialized view

Some queries can be incrementally refreshed. If an incremental refresh cannot be performed, a full refresh is performed instead.

To see the refresh type for a materialized view, query the event log:

SELECT timestamp, message
FROM event_log(TABLE(my_catalog.my_schema.sales))
WHERE event_type = 'planning_information'
ORDER BY timestamp desc;

Sample output for this command:

    • timestamp
    • message
    • 2025-03-21T22:23:16.497+00:00
    • Flow 'sales' has been planned in :re[LDP] to be executed as ROW_BASED.

Attribute costs to the SQL warehouse with custom tags

Tags are not automatically appended to billing records, but you can manually join them to your billing information in a query.

After you have added tags to each materialized view (or streaming table) with Catalog Explorer, you can monitor the costs of materialized views by joining the table_tags and billing system tables. This is a sample query to retrieve billing records for all materialized views and streaming tables created with Databricks SQL and append table-level tags:

SELECT
  u.*,
  tag_info.tags
FROM
  system.billing.usage u
LEFT JOIN (
  SELECT
    t.catalog_name,
    t.schema_name,
    t.table_name,
    collect_list(named_struct('tag_name', t.tag_name, 'tag_value', t.tag_value)) AS tags
  FROM
    main.information_schema.table_tags t
  GROUP BY
    t.catalog_name,
    t.schema_name,
    t.table_name
) tag_info
  ON tag_info.catalog_name = u.usage_metadata.uc_table_catalog
  AND tag_info.schema_name = u.usage_metadata.uc_table_schema
  AND tag_info.table_name = u.usage_metadata.uc_table_name
  WHERE usage_metadata.uc_table_name is not null;

You can use this same idea to join column tags from the column_tags table instead.