Complex views and its creation

azure_learner 900 Reputation points
2025-07-26T10:47:32.3933333+00:00

Hello experts, this is followed up with the question below which @ Marcin Policht was kind enough to answer.

https://free.blessedness.top/en-us/answers/questions/4378500/synapse-and-use-of-databricks

We have done initial brainstorming on the great suggestion, we are doing pilot POC of 10-15 views in

In both Databricks SQL using %SQL as well as in Synapse, afterwards we would compare the

Benchmark metrics on TPC-S.

We have three types of views 1. Base view (not complex) 2. Derived views (which has hierarchical tree structure and depth of 5 levels) 3. Picas views which are quite complex and depth of more than 5 levels)

Our implementation plan is to have base view developed in silver layer and derived and Picas views in gold layer.

Now, we need a detailed technical plan for the solution particularly for the derived views and Picas views which have multiple depth levels that are 4-5 levels of dependent views in the lineage hierarchy.

Such as V6 to V5to V4 to V3to V2 to V1 ( it is here where we are starting up and goes to depth of V6)

I would be grateful for experts to guide me on detailed implementation of derived views and Picas in both Databricks SQL and Synapse. I would appreciate and be thankful for your help.

Azure Databricks
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
{count} votes

Answer accepted by question author
  1. Vinodh247 39,376 Reputation points MVP Volunteer Moderator
    2025-07-27T01:42:12.45+00:00

    Hi ,

    Thanks for reaching out to Microsoft Q&A.

    Detailed technical implementation plan for managing multi level derived views in databricks SQL and synapse analytics with special emphasis on lineage, performance, and modularity as you have requested.

    User's image Databricks SQL:

    -- vw_base_level_1 (Silver)
    CREATE OR REPLACE VIEW vw_base_level_1 AS
    SELECT * FROM cleaned_orders;
    -- vw_derived_level_2 (Gold)
    CREATE OR REPLACE VIEW vw_derived_level_2 AS
    SELECT customer_id, SUM(total_amount) AS total_spend
    FROM vw_base_level_1
    GROUP BY customer_id;
    -- vw_derived_level_3 (Gold)
    CREATE OR REPLACE VIEW vw_derived_level_3 AS
    SELECT d2.customer_id, d2.total_spend, c.customer_segment
    FROM vw_derived_level_2 d2
    JOIN dim_customers c ON d2.customer_id = c.customer_id;
    
    
    
    • Delta Caching: Use OPTIMIZE, ZORDER on intermediate tables if performance issues arise.
    • Materialized Views: Consider persisting deep levels as Delta tables with scheduled refresh via Jobs or Workflows.
    • Use Lakehouse Federation to access upstream sources efficiently.
    • Leverage Unity Catalog for view lineage.
    • Databricks also supports view lineage via the Catalog Explorer UI.

    Testing

    • Create unit test notebooks that validate each view level with LIMIT and assert expected results.

    Synapse SQL (Dedicated/Serverless Pools)

    -- Base View (Silver)
    CREATE VIEW dbo.vw_orders_base AS
    SELECT * FROM dbo.orders_cleaned;
    -- Derived View Level 2 (Gold)
    CREATE VIEW dbo.vw_customer_spend AS
    SELECT customer_id, SUM(amount) AS spend
    FROM dbo.vw_orders_base
    GROUP BY customer_id;
    -- Level 3
    CREATE VIEW dbo.vw_customer_summary AS
    SELECT c.customer_id, c.spend, s.segment
    FROM dbo.vw_customer_spend c
    JOIN dbo.customer_segment s ON c.customer_id = s.customer_id;
    
    
    
    • Use materialized views for levels 3+ if query performance is degrading
    • Enable result set caching and partitioned external tables when using serverless pools.
    • Avoid nested view chains in Serverless flatten if necessary.
    1. Dependency/Lineage
    • Maintain a custom lineage table or use azure purview for full lineage.
    • Use synapse studio's built-in view dependency tracker.

    For benchmarking:

    Run equivalent TPC-S queries in both systems across:

    • Each view level.
    • Full chain (V1 → V6).
    • Different concurrency scenarios.

    Capture:

    • Query latency.
    • CPU/memory utilization.
    • Caching effect.
    • View refresh cost (if materialized).

    Please 'Upvote'(Thumbs-up) and 'Accept' as answer if the reply was helpful. This will be benefitting other community members who face the same issue.


0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.