ADLS static data sources

azure_learner 900 Reputation points
2025-09-02T15:39:10.81+00:00

Hi friends, thank you to the experts @Marcin Policht and @Vinodh247 for the direction and kind help on my earlier questions:

https://free.blessedness.top/en-us/answers/questions/5532358/adls-structure-approach                

https://free.blessedness.top/en-us/answers/questions/5535161/adls-data-incoporation-with-other-data-source    

We are converting our data to Delta format for scalability, performance optimization, and better structural design.

However, we have a few data sources that are static and accessed only infrequently, as needed. I have read that it is best practice to convert all data sources, including static ones, to Delta format.

Please advise on the best approach here. Would it be an architectural flaw to have mixed data formats within ADLS?

Also, what are the pros and cons of not converting every data source to Delta? Or is it indeed best practice to keep all data sources in ADLS in a uniform format? Thank you in advance for your help.

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.

Answer accepted by question author
  1. VRISHABHANATH PATIL 1,380 Reputation points Microsoft External Staff Moderator
    2025-09-03T09:55:11.8433333+00:00

    Hi,

    Thanks for contacting Microsoft Q&A.

    You're currently exploring how to structure data within Azure Data Lake Storage (ADLS), with a particular focus on whether static and rarely accessed data sources should be converted to Delta format. Here's a framework to help guide your decision-making:

    1. Delta format benefits: Delta Lake adds ACID transactions, scalable metadata handling, and time travel on top of ADLS, which does not provide these capabilities for plain files. Even for static data, a Delta table's transaction log records which files make up the table, so readers can avoid expensive file listing operations, which can improve query performance.

    2. Conversion trade-off: Converting static data to Delta format may seem unnecessary given the low access frequency, but the architectural benefits (schema enforcement, unified processing pipelines, and simplified governance) often outweigh the conversion cost.
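
To make the transaction-log point in item 1 concrete, here is a minimal, simplified sketch in plain Python of how a reader can work out a table's current files by replaying the JSON commits under `_delta_log` instead of listing the data folder. It is an illustration only, not how Delta Lake or any engine is actually implemented: it ignores checkpoints and every action other than `add` and `remove`, and the table name, file names, and the `write_commit` helper are hypothetical.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def live_files(table_dir: Path) -> set[str]:
    """Replay the JSON commits in _delta_log and return the live data files.

    Simplified: ignores checkpoints and all actions except add/remove.
    """
    files: set[str] = set()
    log_dir = table_dir / "_delta_log"
    # Commit file names are zero-padded, so sorting by name gives commit order.
    for commit in sorted(log_dir.glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return files


def write_commit(table_dir: Path, version: int, actions: list[dict]) -> None:
    """Write one toy commit file (hypothetical helper, for this demo only)."""
    log_dir = table_dir / "_delta_log"
    log_dir.mkdir(parents=True, exist_ok=True)
    body = "\n".join(json.dumps(a) for a in actions)
    (log_dir / f"{version:020d}.json").write_text(body)


def demo() -> set[str]:
    with tempfile.TemporaryDirectory() as tmp:
        table = Path(tmp) / "static_reference"  # hypothetical table name
        # Commit 0: the initial load adds two data files.
        write_commit(table, 0, [
            {"add": {"path": "part-0000.parquet"}},
            {"add": {"path": "part-0001.parquet"}},
        ])
        # Commit 1: a compaction replaces both files with a single one.
        write_commit(table, 1, [
            {"remove": {"path": "part-0000.parquet"}},
            {"remove": {"path": "part-0001.parquet"}},
            {"add": {"path": "part-0002.parquet"}},
        ])
        return live_files(table)


if __name__ == "__main__":
    print(demo())
```

The reader never lists the data folder itself; it only reads the small log files, which is the property the answer refers to.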

    | Aspect | Convert to Delta | Keep as Mixed Format |
    |---|---|---|
    | Performance | Faster queries due to transaction logs | Slower queries, especially for large datasets |
    | Governance | Easier schema enforcement and lineage tracking | Requires custom logic for format-specific governance |
    | Tool Compatibility | Seamless integration with Databricks, Synapse, Power BI | May need format-specific connectors or logic |
    | Maintenance | Uniform pipelines and monitoring | Increased complexity in ETL and debugging |
    | Cost | Initial conversion cost | Potential long-term inefficiencies |

    Recommendation:

    For long-term architectural integrity and operational simplicity, it is recommended to convert static data sources to Delta format, even if they are infrequently accessed. This ensures:

    • Uniformity across ADLS layers
    • Simplified governance and security via tools like Microsoft Purview
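
Before committing to a conversion, it may help to check how uniform the lake currently is. The sketch below is one possible way to audit that, under stated assumptions: each dataset is a top-level folder, a folder containing `_delta_log` is treated as a Delta table, and everything else is classified by its most common file extension. It walks a local or mounted path using only the Python standard library; against ADLS itself you would list paths through whatever storage client you already use. The folder and file names in the demo are hypothetical.

```python
from __future__ import annotations

import tempfile
from collections import Counter
from pathlib import Path


def dataset_format(dataset_dir: Path) -> str:
    """Return 'delta' if the folder has a _delta_log, else its dominant extension."""
    if (dataset_dir / "_delta_log").is_dir():
        return "delta"
    extensions = Counter(
        p.suffix.lstrip(".").lower()
        for p in dataset_dir.rglob("*")
        if p.is_file() and p.suffix
    )
    if not extensions:
        return "unknown"
    return extensions.most_common(1)[0][0]


def audit(root: Path) -> dict[str, str]:
    """Map each top-level dataset folder under root to its detected format."""
    return {d.name: dataset_format(d) for d in sorted(root.iterdir()) if d.is_dir()}


def demo_audit() -> dict[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical layout: one Delta table and two static, non-Delta sources.
        (root / "sales" / "_delta_log").mkdir(parents=True)
        (root / "sales" / "_delta_log" / f"{0:020d}.json").write_text("{}")
        (root / "sales" / "part-0000.parquet").write_text("")
        (root / "country_codes").mkdir()
        (root / "country_codes" / "codes.csv").write_text("code,name\n")
        (root / "holidays").mkdir()
        (root / "holidays" / "2024.parquet").write_text("")
        return audit(root)


if __name__ == "__main__":
    for name, fmt in demo_audit().items():
        print(f"{name}: {fmt}")
```

Any dataset not reported as `delta` is a candidate for conversion under the recommendation above.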

    Regards,
    Vrishabh


0 additional answers
