Complex views and its creation

Question

Complex views and its creation

azure_learner 900

Hello experts, this is followed up with the question below which @ Marcin Policht was kind enough to answer.

https://free.blessedness.top/en-us/answers/questions/4378500/synapse-and-use-of-databricks

We have done initial brainstorming on the great suggestion, we are doing pilot POC of 10-15 views in

In both Databricks SQL using %SQL as well as in Synapse, afterwards we would compare the

Benchmark metrics on TPC-S.

We have three types of views 1. Base view (not complex) 2. Derived views (which has hierarchical tree structure and depth of 5 levels) 3. Picas views which are quite complex and depth of more than 5 levels)

Our implementation plan is to have base view developed in silver layer and derived and Picas views in gold layer.

Now, we need a detailed technical plan for the solution particularly for the derived views and Picas views which have multiple depth levels that are 4-5 levels of dependent views in the lineage hierarchy.

Such as V6 to V5to V4 to V3to V2 to V1 ( it is here where we are starting up and goes to depth of V6)

I would be grateful for experts to guide me on detailed implementation of derived views and Picas in both Databricks SQL and Synapse. I would appreciate and be thankful for your help.

Smaran Thoomu 32,155 Reputation points Microsoft External Staff Moderator

2025-07-29T10:51:23.1466667+00:00

azure_learner Following up to see if the below answer was helpful. If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.

Answer accepted by question author

0 additional answers

Your answer

Smaran Thoomu 32,155 Reputation points Microsoft External Staff Moderator

2025-07-29T10:51:23.1466667+00:00

azure_learner Following up to see if the below answer was helpful. If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.

Answer 1

Hi ,

Thanks for reaching out to Microsoft Q&A.

Detailed technical implementation plan for managing multi level derived views in databricks SQL and synapse analytics with special emphasis on lineage, performance, and modularity as you have requested.

Databricks SQL:

-- vw_base_level_1 (Silver)
CREATE OR REPLACE VIEW vw_base_level_1 AS
SELECT * FROM cleaned_orders;
-- vw_derived_level_2 (Gold)
CREATE OR REPLACE VIEW vw_derived_level_2 AS
SELECT customer_id, SUM(total_amount) AS total_spend
FROM vw_base_level_1
GROUP BY customer_id;
-- vw_derived_level_3 (Gold)
CREATE OR REPLACE VIEW vw_derived_level_3 AS
SELECT d2.customer_id, d2.total_spend, c.customer_segment
FROM vw_derived_level_2 d2
JOIN dim_customers c ON d2.customer_id = c.customer_id;

Delta Caching: Use OPTIMIZE, ZORDER on intermediate tables if performance issues arise.
Materialized Views: Consider persisting deep levels as Delta tables with scheduled refresh via Jobs or Workflows.
Use Lakehouse Federation to access upstream sources efficiently.
Leverage Unity Catalog for view lineage.
Databricks also supports view lineage via the Catalog Explorer UI.

Testing

Create unit test notebooks that validate each view level with LIMIT and assert expected results.

Synapse SQL (Dedicated/Serverless Pools)

-- Base View (Silver)
CREATE VIEW dbo.vw_orders_base AS
SELECT * FROM dbo.orders_cleaned;
-- Derived View Level 2 (Gold)
CREATE VIEW dbo.vw_customer_spend AS
SELECT customer_id, SUM(amount) AS spend
FROM dbo.vw_orders_base
GROUP BY customer_id;
-- Level 3
CREATE VIEW dbo.vw_customer_summary AS
SELECT c.customer_id, c.spend, s.segment
FROM dbo.vw_customer_spend c
JOIN dbo.customer_segment s ON c.customer_id = s.customer_id;

Use materialized views for levels 3+ if query performance is degrading
Enable result set caching and partitioned external tables when using serverless pools.
Avoid nested view chains in Serverless flatten if necessary.

Dependency/Lineage

Maintain a custom lineage table or use azure purview for full lineage.
Use synapse studio's built-in view dependency tracker.

For benchmarking:

Run equivalent TPC-S queries in both systems across:

Each view level.
Full chain (V1 → V6).
Different concurrency scenarios.

Capture:

Query latency.
CPU/memory utilization.
Caching effect.
View refresh cost (if materialized).

Please 'Upvote'(Thumbs-up) and 'Accept' as answer if the reply was helpful. This will be benefitting other community members who face the same issue.

azure_learner 900 Reputation points

2025-07-27T09:01:19.61+00:00

Thank you @Vinodh247 The derived and Picas views as I have mentioned has 6 levels of depth

View1 to View2 to View3 to View4 to View5 to View6

But, When you have a derived view with a lineage of multiple depth levels (e.g., View 3 is based on View 2, which is based on View 1), this chain of views is invoked and does not work independently.

So, in this context V1 is the base view that is to be developed in the silver layer and its chain of views from V2-toV6 in the gold layer. Please correct me if my understanding is wrong.

Lastly, how these views would be invoked in base view so that consistent data that is there in origin system is consistent and reconciled. Thank you and I appreciate your help.
azure_learner 900 Reputation points

2025-07-28T08:57:57.9633333+00:00

Could someone please tag this in the Synapse analytics also. Thank you
Smaran Thoomu 32,155 Reputation points Microsoft External Staff Moderator

2025-07-28T10:59:06.95+00:00

azure_learner You're absolutely right in your understanding - View1 (V1) in the silver layer acts as the foundational dataset, and each subsequent derived view (V2 to V6) in the gold layer should build upon it in a modular, layered fashion.

Key clarifications:

Views don’t run independently: When you query View6, it internally triggers the full chain (V6 → V5 → ... → V1). This is expected behavior, and it's why performance optimization (e.g., materializing intermediate views) becomes important.

Data consistency and reconciliation: To ensure consistency with the origin system:

V1 should be sourced from a reliable, reconciled silver table updated by your ingestion pipelines.

You can schedule unit tests or reconciliation checks between V1 and the source system (row counts, hash checks).

If consistency checks fail at V1, V2–V6 will inherently carry the same issue - so start validation at the base.

If you're concerned about repeated triggering of heavy logic, consider materializing intermediate views (e.g., V3 or V4) into Delta tables with job-triggered refresh logic.

I hope this information helps. Please do let us know if you have any further queries.

Share via

Complex views and its creation

0 additional answers

Your answer