Request for Guidance on Enabling Managed Identity for Blob Storage Access with HNS Requirement

Gumber, GK (Gaurav) 0 Reputation points
2025-10-08T10:59:39.45+00:00

As part of our IT Security compliance and audit requirements, we are aligning with Microsoft’s recommendation to use Azure Active Directory (Azure AD) for authorizing access to Azure Storage accounts, instead of using Shared Key authorization. Azure AD provides enhanced security and simplifies access management.

To meet this requirement, we plan to disable the “Allow storage account key access” setting on our storage accounts.

We currently use two storage accounts:

  • mscstammdfacc (Blob Storage)
  • mscstammdfacc03 (ADLS Gen2)

These are used for logging Databricks pipeline activity and storing data for Hive metastore tables.

Following internal discussions, we agreed to use Managed Identity for accessing blob storage. However, during implementation, we encountered an issue: Managed Identity access to blob storage requires Hierarchical Namespace (HNS) to be enabled, which is currently not the case for the mscstammdfacc account.

We would like guidance on:

  1. The correct steps to enable HNS on an existing blob storage account (mscstammdfacc), or if this requires creating a new storage account.
  2. How to configure Managed Identity access for Databricks to interact with blob storage using Azure AD authentication.
  3. Any best practices or limitations we should be aware of when disabling Shared Key access and switching to Azure AD-based authentication.

Please advise on the recommended approach to proceed with this transition securely and efficiently.

I’ve attached the code snippet used to configure and access blob storage using Managed Identity from Databricks. The implementation fails due to HNS not being enabled on the storage account. Please review and advise on the correct approach.

Attachment: mount_sources_test_managedidentity.txt

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.

2 answers

  1. PRADEEPCHEEKATLA 91,321 Reputation points Moderator
    2025-10-08T13:41:28.6466667+00:00

    Gumber, GK (Gaurav) - Thanks for the question and for using the Microsoft Q&A platform.

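    You're trying to mount blob storage (mscstammdfacc) using Managed Identity in Azure Databricks. However, the mount fails because Hierarchical Namespace (HNS) is not enabled on the storage account. The abfss:// scheme and the dfs.core.windows.net endpoint are designed for ADLS Gen2, meaning storage accounts with HNS enabled. Note that Azure AD authorization itself, including managed identities, is also supported on Blob Storage accounts without HNS; the HNS dependency here comes from mounting through abfss://.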

    ✅ Recommended Approach

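    Step 1: Can You Enable HNS on an Existing Storage Account?

    Yes, with caveats. Azure supports a one-way upgrade that enables HNS on an existing storage account of a supported type (Microsoft documents this as upgrading Azure Blob Storage with Azure Data Lake Storage capabilities). The upgrade cannot be reversed, and some Blob Storage features are not supported on HNS-enabled accounts, so run the validation step before upgrading. If validation fails, or you prefer to leave mscstammdfacc untouched, the alternative is to:

    • Create a new storage account with HNS enabled.
    • Migrate data from the old blob storage (mscstammdfacc) to the new ADLS Gen2 account.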

    Step 2: Create a New ADLS Gen2 Storage Account with HNS

    • Go to Azure Portal → Storage Accounts → Create.
    • Under Advanced, enable Hierarchical namespace.
    • Set up RBAC roles for the Databricks Managed Identity: assign Storage Blob Data Contributor or Storage Blob Data Owner to the Databricks workspace's Managed Identity on the new storage account.
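
    The role assignment can also be scripted. The following is a minimal sketch, not an official procedure: it only builds the Azure CLI command for review and does not run it. The helper names and all placeholder values (subscription, resource group, account name, identity object ID) are assumptions for illustration.

```python
import shlex


def storage_account_scope(subscription_id: str, resource_group: str, account: str) -> str:
    """Return the ARM resource ID used as the role-assignment scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
    )


def role_assignment_command(
    principal_object_id: str,
    scope: str,
    role: str = "Storage Blob Data Contributor",
) -> list[str]:
    """Build (but do not run) the az CLI call that grants the role."""
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_object_id,
        "--assignee-principal-type", "ServicePrincipal",
        "--role", role,
        "--scope", scope,
    ]


# Hypothetical placeholder values; replace them with your own before use.
scope = storage_account_scope("<subscription-id>", "<resource-group>", "<new-adls-account>")
cmd = role_assignment_command("<managed-identity-object-id>", scope)
print(shlex.join(cmd))  # shell-quoted command, ready to review and paste
```

    Scoping the assignment to the storage account (or to a single container) rather than to the subscription keeps the identity's access as narrow as the workload allows.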

    Step 3: Configure Managed Identity Access in Databricks

    Your code is mostly correct. Here's a simplified version of the mount logic:

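    # Placeholders: replace with your own values.
    storage_account = "<storage-account-name>"  # account with HNS enabled
    container = "<container-name>"
    folder = "<folder-path>"

    # Hadoop ABFS expresses managed identity as auth type "OAuth" plus the
    # MsiTokenProvider class; "ManagedIdentity" is not a valid auth type.
    configs = {
      "fs.azure.account.auth.type": "OAuth",
      "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider",
      # Assumption: the cluster can reach this instance metadata endpoint.
      "fs.azure.account.oauth2.msi.endpoint": "http://169.254.169.254/metadata/identity/oauth2/token"
    }

    # dbutils is provided by the Databricks notebook runtime.
    dbutils.fs.mount(
      source = f"abfss://{container}@{storage_account}.dfs.core.windows.net/{folder}",
      mount_point = f"/mnt/{storage_account}/{container}/{folder}_1",
      extra_configs = configs
    )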
    

    Make sure:

    • You're using dfs.core.windows.net (not blob.core.windows.net) for ADLS Gen2.
    • The storage account has HNS enabled.
    • The Databricks workspace has a Managed Identity assigned and granted appropriate RBAC roles.
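
    To make the first check above harder to get wrong, the mount source can be built by a small helper instead of by hand. This is only a sketch: the function name is hypothetical, and the container name "logs" in the example is an assumption, not something taken from your environment.

```python
def abfss_uri(container: str, storage_account: str, folder: str = "") -> str:
    """Build an abfss:// URI that targets the ADLS Gen2 (dfs) endpoint."""
    if "." in storage_account:
        # Guards against passing a full host such as "acct.blob.core.windows.net".
        raise ValueError("pass the bare storage account name, not a host name")
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
    folder = folder.strip("/")
    return f"{base}/{folder}" if folder else f"{base}/"


# Example with the ADLS Gen2 account from the question and a hypothetical container.
print(abfss_uri("logs", "mscstammdfacc03", "pipeline/"))
```

    Because the helper always appends dfs.core.windows.net, a blob.core.windows.net host cannot slip into the mount source by accident.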

    ✅ Next Steps

    1. Create a new ADLS Gen2 storage account with HNS enabled.
    2. Migrate data from mscstammdfacc to the new account.
    3. Assign RBAC roles to Databricks Managed Identity.
    4. Update your mount logic to use the new storage account.
    5. Disable Shared Key access only after verifying all dependencies are updated.
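
    For the verification in step 5, one option is a small read test that uses Azure AD credentials only. The sketch below assumes the azure-identity and azure-storage-blob packages are installed and that DefaultAzureCredential can resolve an identity in the environment where it runs; the function names are hypothetical.

```python
from itertools import islice


def blob_account_url(storage_account: str) -> str:
    """Return the Blob service endpoint for a storage account."""
    return f"https://{storage_account}.blob.core.windows.net"


def sample_blob_names(storage_account: str, container: str, limit: int = 5) -> list[str]:
    """List a few blob names using Azure AD credentials only (no account key)."""
    # Imported here so the URL helper above works without the Azure SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    client = BlobServiceClient(
        account_url=blob_account_url(storage_account),
        credential=DefaultAzureCredential(),
    )
    blobs = client.get_container_client(container).list_blobs()
    return [blob.name for blob in islice(blobs, limit)]
```

    If sample_blob_names("mscstammdfacc", "<container>") succeeds for an identity that holds a Storage Blob Data role, that identity does not depend on the account key. An authorization error usually points to a missing role assignment rather than to the Shared Key setting.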

    Hope this helps. Let me know if you have any further questions or need additional assistance. Also, if this answers your query, please click "Upvote" and "Accept the answer", which may help other community members reading this thread.




  2. Luis Arias 9,011 Reputation points Volunteer Moderator
    2025-10-08T15:48:12.8933333+00:00

    Hello Gumber, GK (Gaurav),

    Welcome to Microsoft Q&A. I'm familiar with the security improvements recommended for Azure Storage accounts. Databricks, however, has its own set of requirements to meet. I'll answer your questions one by one:

    1. The correct steps to enable HNS on an existing blob storage account (mscstammdfacc), or if this requires creating a new storage account.

    You can enable Hierarchical Namespace (HNS) on an existing Blob Storage account like mscstammdfacc, but only through a one-way upgrade: it cannot be reversed, and the account must first pass a validation check for features that HNS does not support. If the upgrade is not an option, the alternative is to create a new storage account with HNS enabled and migrate your data.

    2. How to configure Managed Identity access for Databricks to interact with blob storage using Azure AD authentication.

    You can follow the step-by-step procedure in the Databricks documentation: https://docs.databricks.com/aws/en/connect/storage/azure-storage

    3. Any best practices or limitations we should be aware of when disabling Shared Key access and switching to Azure AD-based authentication.
    • Disabling Shared Key access ensures that only Azure AD-based authentication is used, which improves security posture by eliminating the use of account keys and of SAS tokens signed with them. Before doing so, validate that all apps and services accessing the storage support Azure AD, replace any Shared Key or SAS usage with User Delegation SAS, and monitor access via Azure Monitor.
    • Using dbutils.fs.mount and DBFS mounts is not recommended in Unity Catalog-enabled Azure Databricks workspaces. Unity Catalog introduces a more secure and governed approach to data access using external locations and storage credentials, which fully integrate with Unity Catalog's fine-grained access controls. Mounts bypass Unity Catalog’s governance model and can expose data to all users in the workspace if not carefully managed.
    • In conclusion, a good next step is to start using Unity Catalog in Azure Databricks. It is recommended to create dedicated storage accounts within the Unity Catalog scope using external locations, manage access via account-level identities and groups, and avoid DBFS mounts to ensure consistent governance and access control.
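
    As a rough illustration of the Unity Catalog approach, an external location can be declared with a SQL statement instead of a mount. The sketch below only builds the statement; the location name and the placeholders are hypothetical, and the storage credential is assumed to exist already in Unity Catalog.

```python
def create_external_location_sql(name: str, url: str, credential: str) -> str:
    """Return a CREATE EXTERNAL LOCATION statement for Unity Catalog."""
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS `{name}` "
        f"URL '{url}' "
        f"WITH (STORAGE CREDENTIAL `{credential}`)"
    )


# Hypothetical names for illustration; replace the placeholders before use.
stmt = create_external_location_sql(
    "pipeline_logs",
    "abfss://<container>@mscstammdfacc03.dfs.core.windows.net/<folder>",
    "<storage-credential-name>",
)
print(stmt)
# In a Databricks notebook, where `spark` is predefined: spark.sql(stmt)
```

    Access to the location is then granted to users and groups through Unity Catalog, rather than being exposed to every user in the workspace as a mount would be.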


    If this resolves your question, please accept the answer.

    Luis

