Hi,
Thanks for reaching out to Microsoft Q&A.
TL;DR:
- Short term: if the KPI requirements are small and not latency-sensitive, Synapse Serverless with Power BI will do.
- Medium to long term: especially for scalable KPI computation or AI/ML integration, Databricks with Unity Catalog is your best option.
- Synapse (Serverless SQL / Dedicated SQL)
- Query ADLS via Synapse Serverless SQL or Spark, and join with the legacy data in SQL MI via Linked Server/PolyBase or external tables (a Python sketch follows this list).
- Pros:
- Good for ad hoc joins between ADLS and SQL MI.
- Tight integration with Power BI.
- Cons:
- Performance is not optimal for high-volume joins.
- Best fit: Small, ad hoc KPI queries that feed Power BI directly.
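To make this concrete, here is a minimal Python sketch using pyodbc against the serverless SQL endpoint. The endpoint, storage path, table, and column names are placeholder assumptions, not details from your question; how the legacy data is surfaced in the database (linked server, external table) depends on your setup.

```python
import pyodbc

# Connect to the Synapse serverless SQL endpoint (placeholder names).
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace-ondemand.sql.azuresynapse.net;"
    "Database=kpi_db;"
    "Authentication=ActiveDirectoryInteractive;"
)

# OPENROWSET reads the ERP Parquet files from ADLS on the fly;
# dbo.legacy_orders is assumed to already be queryable in this database
# (e.g., surfaced via the linked-server/external-table route above).
query = """
SELECT erp.product_id,
       SUM(erp.revenue)  AS erp_revenue,
       SUM(leg.quantity) AS legacy_quantity
FROM OPENROWSET(
         BULK 'https://mydatalake.dfs.core.windows.net/erp/sales/*.parquet',
         FORMAT = 'PARQUET'
     ) AS erp
JOIN dbo.legacy_orders AS leg
  ON leg.product_id = erp.product_id
GROUP BY erp.product_id;
"""

for row in conn.execute(query):
    print(row.product_id, row.erp_revenue, row.legacy_quantity)
```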
- Synapse Spark
- Use Synapse Spark pools to read ERP data from ADLS and connect to SQL MI via JDBC/ODBC for the legacy data (see the PySpark sketch after this list).
- Pros:
- Scales well for large datasets.
- Spark transformations can enrich or pre-aggregate data before KPI calculation.
- Easy integration with Fabric or Power BI.
- Cons:
- Complex to manage pipelines.
- Real-time KPIs require more orchestration.
- Cost can rise quickly if Spark pools are not optimized.
- Best fit: Heavy transformations and large-scale KPI computation done in batches.
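As a rough PySpark sketch of this pattern (paths, connection details, and column names are illustrative assumptions; `spark` is the session a Synapse notebook provides):

```python
from pyspark.sql import functions as F

# Read ERP data directly from ADLS (placeholder container/path).
erp = spark.read.parquet("abfss://erp@mydatalake.dfs.core.windows.net/sales/")

# Pull the legacy table from Azure SQL MI over JDBC (placeholder connection;
# keep the password in Key Vault rather than inline).
legacy = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://my-sqlmi.database.windows.net:1433;database=legacy")
    .option("dbtable", "dbo.orders")
    .option("user", "kpi_reader")
    .option("password", "<secret-from-key-vault>")
    .load()
)

# Pre-aggregate both sides before the join to cut shuffle volume,
# then derive the KPI per product.
kpi = (
    erp.groupBy("product_id")
       .agg(F.sum("revenue").alias("erp_revenue"))
       .join(
           legacy.groupBy("product_id")
                 .agg(F.sum("quantity").alias("legacy_quantity")),
           on="product_id",
       )
       .withColumn("revenue_per_unit", F.col("erp_revenue") / F.col("legacy_quantity"))
)

# Write results where Fabric / Power BI can pick them up.
kpi.write.mode("overwrite").parquet(
    "abfss://gold@mydatalake.dfs.core.windows.net/kpi/revenue_per_unit/"
)
```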
- Databricks (with Unity Catalog)
- Read ADLS data natively, connect to Azure SQL MI using JDBC, register both in Unity Catalog, and create views for the KPIs (a notebook sketch follows this list).
- Pros:
- Best scalability and performance for both batch and near-real-time workloads.
- Strong governance with UC.
- Easy to automate and orchestrate pipelines.
- Cons:
- Slightly higher skill requirement for setup and optimization.
- Cost management must be done carefully to avoid overruns.
- Best fit: If you want future-proofing, high performance, and an open architecture for expansion.
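A minimal sketch of that flow in a Databricks notebook (catalog, schema, secret scope, and connection details are illustrative assumptions; it presumes a UC-enabled workspace with an external location already configured for the ADLS path):

```python
# 1. Register the ERP files in ADLS as a Unity Catalog external table.
spark.sql("""
    CREATE TABLE IF NOT EXISTS kpi_cat.bronze.erp_sales
    USING PARQUET
    LOCATION 'abfss://erp@mydatalake.dfs.core.windows.net/sales/'
""")

# 2. Snapshot the legacy SQL MI table into UC via JDBC.
legacy = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://my-sqlmi.database.windows.net:1433;database=legacy")
    .option("dbtable", "dbo.orders")
    .option("user", "kpi_reader")
    .option("password", dbutils.secrets.get("kv-scope", "sqlmi-password"))
    .load()
)
legacy.write.mode("overwrite").saveAsTable("kpi_cat.bronze.legacy_orders")

# 3. Create a governed KPI view joining both sources.
spark.sql("""
    CREATE OR REPLACE VIEW kpi_cat.gold.revenue_per_unit AS
    SELECT e.product_id,
           SUM(e.revenue) / SUM(l.quantity) AS revenue_per_unit
    FROM kpi_cat.bronze.erp_sales AS e
    JOIN kpi_cat.bronze.legacy_orders AS l
      ON l.product_id = e.product_id
    GROUP BY e.product_id
""")
```

Power BI can then read `kpi_cat.gold.revenue_per_unit` through the Databricks connector, with UC enforcing access control on the view.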
- Power BI (DirectQuery to SQL MI + Import from ADLS)
- Use DirectQuery mode for the SQL MI data and Import mode for ADLS, or link via Synapse Serverless.
- Pros:
- Fast to implement.
- Good for exploratory or low-volume KPI reports.
- Cons:
- Performance degrades heavily with complex joins or large datasets.
- High query latency with DirectQuery.
- Governance and data transformations are limited.
- Best fit: Small-scale, lightweight dashboards; not heavy KPI processing.
- ADF with Dataflows
- Use ADF to move the legacy data periodically into ADLS, then process KPIs downstream (a trigger sketch follows this list).
- Pros:
- Stable ETL orchestration.
- Simplifies joins once both data sources are in the lake.
- Cons:
- You mentioned the legacy data is not moving to ADLS, which conflicts with this approach.
- Limited to batch; no real-time capability.
- Additional cost and latency from data movement.
- Best fit: Only viable if you change the policy and ingest the legacy data into the lake.
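For completeness, and only relevant if that policy changes: once the copy pipeline (a Copy activity from SQL MI to ADLS) is authored in ADF, triggering it from Python is straightforward with the azure-mgmt-datafactory SDK. All resource and pipeline names below are hypothetical:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Placeholder subscription/resource names; the pipeline itself is defined in ADF.
client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Kick off one run of the (hypothetical) nightly copy pipeline.
run = client.pipelines.create_run(
    resource_group_name="kpi-rg",
    factory_name="kpi-adf",
    pipeline_name="CopyLegacyToLake",
)
print("Started pipeline run:", run.run_id)
```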
Please 'Upvote' (Thumbs-up) and 'Accept' the answer if the reply was helpful. This will benefit other community members who face the same issue.