Configure Cloud Mirror subvolumes (preview)

This article describes how to configure Cloud Mirror subvolumes (syncing data from the cloud to the edge) in Azure Container Storage enabled by Azure Arc. Conceptually, a Cloud Mirror subvolume is a location that mirrors data from a cloud destination to the edge as a read-only copy. The schedule frequency controls when a mirror sync occurs without user intervention, for instance, every hour, or once a day at a certain time. The OneShot functionality lets you trigger a single sync at a time of your choosing, after which the subvolume returns to its normal schedule according to the frequency.

Important

Cloud Mirror subvolumes are currently in preview. This functionality isn't recommended for production workloads. If you have any issues or need help with configuration, or to give feedback, contact our team.

Prerequisites

If your mirror root location is blob storage or ADLS Gen2, continue following the prerequisites and instructions in this article. If your mirror root location is OneLake, follow the instructions in Configure OneLake Identity for Cloud subvolumes first.

  1. Create a storage account following the instructions in Create an Azure storage account.

    Note

    When you create your storage account, it's recommended that you create it under the same resource group and region/location as your Kubernetes cluster.

  2. Create a container in the storage account that you created previously, following the instructions in Quickstart: Upload, download, and list blobs with the Azure portal > Create a container.

Configure Extension Identity

Edge Volumes allows the use of a system-assigned extension identity for access to blob storage. This section describes how to use the system-assigned extension identity to grant access to your storage account, allowing data to be mirrored from these locations to your edge location.

If you want to use Workload Identity with Azure Container Storage enabled by Azure Arc, follow the instructions in Configure Workload Identity for Cloud subvolumes.

Azure portal

  1. Navigate to your Arc-enabled cluster.
  2. Select Extensions.
  3. Select your Azure Container Storage enabled by Azure Arc extension.
  4. Note the Principal ID under Cluster Extension Details.

Configure blob storage account for Extension Identity

Add Extension Identity permissions to a storage account

  1. Navigate to your storage account in the Azure portal.
  2. Select Access Control (IAM).
  3. Select + Add > Add role assignment.
  4. Select Storage Blob Data Owner, then select Next.
  5. Select + Select members.
  6. To add your principal ID to the Selected members list, paste the ID and select + next to the identity.
  7. Click Select.
  8. To review and assign permissions, select Next, then select Review + assign.
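
The same role assignment can also be scripted with the Azure CLI. The following is a minimal sketch rather than the documented procedure. The function names are hypothetical, the subscription, resource group, and storage account values are placeholders, and it assumes the Azure CLI is installed and signed in and that the principal ID is the one you noted under Cluster Extension Details.

```shell
# Build the resource ID (scope) of a storage account.
# Arguments: subscription ID, resource group, storage account name.
storage_account_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' "$1" "$2" "$3"
}

# Assign Storage Blob Data Owner on the storage account to the extension identity.
# Arguments: principal ID, then the same three arguments as storage_account_scope.
# Defined here but not called; requires the Azure CLI (az) when invoked.
assign_blob_data_owner() {
  principal_id=$1
  scope=$(storage_account_scope "$2" "$3" "$4")
  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Storage Blob Data Owner" \
    --scope "$scope"
}
```

To use it, call assign_blob_data_owner "<principal-id>" "<subscription-id>" "<resource-group>" "<storage-account-name>" with your own values.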

Create a Cloud Mirror Persistent Volume Claim (PVC)

To create a PVC for your Mirror subvolume, use the following process:

  1. Create a file named cloudMirrorPVC.yaml with the following content:

    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      ### Create a name for your PVC ###
      name: <create-persistent-volume-claim-name-here>
      ### Use a namespace that matches your intended consuming pod, or "default" ###
      namespace: <intended-consuming-pod-or-default-here>
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 2Gi
      storageClassName: cloud-backed-sc
    

    Note

    Use only lowercase letters and dashes. For more information, see the Kubernetes object naming documentation.

    • Edit the metadata.name value and create a name for your PVC. This name is referenced on the last line of deploymentExample.yaml in a later step.
    • Edit the metadata.namespace value to the namespace of your intended consuming pod. If you don't have an intended consuming pod, set its value to default.
    • The spec.resources.requests.storage parameter determines the size of the persistent volume. It's 2 GiB (2Gi) in this example, but can be modified to fit your needs.
  2. To apply the cloudMirrorPVC.yaml, run:

    kubectl apply -f "cloudMirrorPVC.yaml"
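
Before applying the file, it can help to check a candidate name against the naming note above. The following is a minimal sketch with a hypothetical helper, is_valid_name, that follows this article's guidance of lowercase letters and dashes only. Kubernetes itself also accepts digits in object names, so this check is intentionally stricter than the platform.

```shell
# Return 0 if the name uses only lowercase letters and dashes,
# and starts and ends with a letter; return 1 otherwise.
is_valid_name() {
  printf '%s\n' "$1" | grep -Eq '^[a-z]([a-z-]*[a-z])?$'
}

is_valid_name "my-mirror-pvc" && echo "my-mirror-pvc: ok"
is_valid_name "My_Mirror_PVC" || echo "My_Mirror_PVC: rejected"
```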
    

Attach Mirror subvolume to the Edge Volume

To create a subvolume for Mirror, using extension identity to connect to your storage account container, use the following process:

  1. Get the name of the Edge Volume you created by running the following command:

    kubectl get edgevolumes

  2. Create a file named mirrorSubvolume.yaml with the following content:

    apiVersion: "arccontainerstorage.azure.net/v1"
    kind: MirrorSubvolume
    metadata:
      name: <create-a-subvolume-name-here>
    spec:
      edgevolume: <your-edge-volume-name-here>
      path: mirrorSubDir # Don't use a leading slash
      authentication:
        authType: MANAGED_IDENTITY
      blobAccount:
        accountEndpoint: "https://<STORAGE ACCOUNT NAME>.blob.core.windows.net/"
        containerName: <your-blob-storage-account-container-name>
        indexTagsMode: NoIndexTags
      blobFiltering:
        blobNamePrefix:
      schedule:
        frequency: "@hourly"
        oneshot:
    

    Note

    Use only lowercase letters and dashes. For more information, see the Kubernetes object naming documentation.

    • metadata.name: Create a name for your subvolume.
    • spec.edgevolume: The Edge Volume name that you retrieved in the previous step.
    • spec.path: Create your own subdirectory name under the mount path. The default name is mirrorSubDir.
    • spec.authentication.authType: Set to MANAGED_IDENTITY or WORKLOAD_IDENTITY, depending on the authentication mechanism you chose.
    • spec.blobAccount.accountEndpoint: Navigate to your storage account in the Azure portal. On the Overview page, near the top right of the screen, select JSON View. You can find the link under properties.primaryEndpoints.blob. Copy the entire link.
    • spec.blobAccount.containerName: The container name in your storage account.
    • spec.blobAccount.indexTagsMode: NoIndexTags or MirrorIndexTags. MirrorIndexTags requires "Storage Blob Data Owner" permissions, and when set, index tags are translated to corresponding azindex.<name> xattrs. NoIndexTags only requires "Storage Blob Data Reader" permissions, and when set, it keeps index tag xattrs unset.
    • spec.blobFiltering.blobNamePrefix: Optional prefix to filter blobs. For example, with blobNamePrefix: a, only blobs with names starting with a are mirrored.
    • spec.schedule.frequency: Schedule for when mirroring should run. Options are: @annually, @yearly, @monthly, @weekly, @daily, @hourly, "never", or cron syntax (five fields: first is minutes (0-59), second is hours (0-23), third is day of the month (1-31), fourth is month (1-12), fifth is day of the week (0-6)).
    • spec.schedule.oneshot: Blank by default. Specifying a UUID here at any time triggers an immediate sync. If a UUID is specified on creation, the subvolume syncs on initial creation, and thereafter according to the frequency. If this parameter is empty on creation, the subvolume is created without an initial sync, and then syncs according to the frequency. You can generate a UUID at uuidgenerator.net or with sed -i "s/oneshot:.*/oneshot: $(uuidgen)/" mirrorSubvolume.yaml
  3. To apply the mirrorSubvolume.yaml, run:

    kubectl apply -f "mirrorSubvolume.yaml"
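
The spec.schedule.frequency formats described above can be sanity-checked before you apply the file. The following is a minimal sketch with a hypothetical helper, is_valid_frequency. It checks only the overall shape of the value (a named macro, never, or five whitespace-separated fields) and doesn't validate the range of each cron field, so treat it as a quick typo check rather than a definition of what the service accepts.

```shell
# Return 0 if the value looks like a supported schedule.frequency string.
# Only the shape is checked: a named macro, "never", or five
# whitespace-separated fields. Field ranges are not validated.
is_valid_frequency() {
  case "$1" in
    @annually|@yearly|@monthly|@weekly|@daily|@hourly|never)
      return 0 ;;
  esac
  # awk splits on whitespace, so NF is the number of cron fields.
  printf '%s\n' "$1" | awk '{ n = NF } END { exit (n == 5 ? 0 : 1) }'
}

is_valid_frequency "@hourly"   && echo "@hourly: ok"
is_valid_frequency "0 2 * * *" && echo "0 2 * * *: ok (minute 0, hour 2, every day)"
is_valid_frequency "0 2 * *"   || echo "0 2 * *: rejected (only four fields)"
```

In the five-field form, 0 2 * * * reads as minute 0 of hour 2 on every day of every month, which corresponds to the "once a day at a certain time" case.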
    

Attach your app (Kubernetes native application)

To configure a generic single pod (Kubernetes native application) against the PVC to use the Mirror capabilities, use the following process:

  1. Create a file named deploymentExample.yaml with the following content:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: cloudmirrorsubvol-deployment ### This must be unique for each deployment you choose to create.
    spec:
      replicas: 2
      selector:
        matchLabels:
          name: acsa-testclientdeployment
      template:
        metadata:
          name: acsa-testclientdeployment
          labels:
            name: acsa-testclientdeployment
        spec:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                  - key: app
                    operator: In
                    values:
                    - acsa-testclientdeployment
                topologyKey: kubernetes.io/hostname
          containers:
            ### Specify the container to run. This example uses the Azure CLI image. ###
            - name: mirror-deployment-container
              image: mcr.microsoft.com/azure-cli:2.57.0@sha256:c7c8a97f2dec87539983f9ded34cd40397986dcbed23ddbb5964a18edae9cd09
              command:
                - "/bin/sh"
                - "-c"
                - "while true; do ls /data/mirrorSubDir &>/dev/null || break; sleep 1; done"
              volumeMounts:
                ### This name must match the volumes.name attribute below ###
                - name: acsa-volume
                  ### This mountPath is where the PVC is attached to the pod's filesystem ###
                  mountPath: "/data"
          volumes:
             ### User-defined 'name' that's used to link the volumeMounts. This name must match volumeMounts.name as previously specified. ###
            - name: acsa-volume
              persistentVolumeClaim:
                ### This claimName must refer to your PVC metadata.name (Line 5)
                claimName: <your-pvc-metadata-name-from-line-5-of-pvc-yaml>
    

    Note

    Use only lowercase letters and dashes. For more information, see the Kubernetes object naming documentation.

    • Edit the containers.name and volumes.persistentVolumeClaim.claimName values.
    • If you edited the spec.path value in mirrorSubvolume.yaml, the value mirrorSubDir in this file must be updated with your new path name.
    • The spec.replicas parameter determines the number of replica pods to create. It's 2 in this example, but can be modified to fit your needs.
  2. To apply the deploymentExample.yaml and create the pods, run:

    kubectl apply -f "deploymentExample.yaml"
    
  3. Find the name of your pod to use in the next step:

    kubectl get pods
    

    Note

    Because spec.replicas from deploymentExample.yaml was specified with 2, two pods are created. You can use either pod name for the next step.

  4. Run the following command to open a shell in the pod. Replace <name-of-pod> with your pod name from the previous step:

    kubectl exec -it <name-of-pod> -- sh
    
  5. Change to the /data mount path specified in your deploymentExample.yaml file:

    cd /data
    
  6. You should see a directory that matches the value you set for spec.path in mirrorSubvolume.yaml. If you used the default value, its name is mirrorSubDir. Change to that subdirectory, and check for any contents mirrored there:

    cd mirrorSubDir
    ls
    

If you specified a UUID in the spec.schedule.oneshot field on creation, or the schedule in spec.schedule.frequency has since triggered a sync, you should see data here mirrored from your specified storage account container. If neither condition is met, this directory should be empty.

Check the status of the Cloud Mirror subvolume synchronization

First, check the status of the Mirror subvolume. Next, ensure that you have data in your storage account container in Azure. Finally, check the contents of the Mirror subvolume again to confirm that the data was mirrored.

Check Mirror subvolume Status

Check the status of your Mirror subvolume. In particular, check that the BACKENDCONNECTION field is Connected:

    kubectl get mirrorsubvolumes
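
If you want to script this check, the following is a minimal sketch with a hypothetical helper, check_backend_connection, that reads the table output on standard input. It assumes the default table layout: a header row that contains a BACKENDCONNECTION column, the resource name in the first column, and whitespace-separated cells with no empty values. Those layout details are assumptions, so verify them against the output on your cluster.

```shell
# Read `kubectl get mirrorsubvolumes` output on stdin and print
# "<first column> <BACKENDCONNECTION value>" for each row.
# Exit 2 if the column is missing, 1 if any row isn't Connected, 0 otherwise.
check_backend_connection() {
  awk '
    NR == 1 {
      # Locate the BACKENDCONNECTION column by its header name.
      for (i = 1; i <= NF; i++) if ($i == "BACKENDCONNECTION") col = i
      if (!col) exit 2
      next
    }
    { print $1, $col; if ($col != "Connected") bad = 1 }
    END { if (!col) exit 2; exit (bad ? 1 : 0) }
  '
}
```

For example, run kubectl get mirrorsubvolumes | check_backend_connection and inspect the exit status.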

Add data to your Storage Account Container

If you don't already have data in your specified container, add or move files to your blob storage account container so they can be mirrored to your edge cluster.

Trigger the Mirror sync

  1. To make sure the new data is mirrored from the cloud to the subvolume, update the spec.schedule.oneshot field in the file mirrorSubvolume.yaml to a new UUID. You can generate a UUID at uuidgenerator.net or with the following command, which also works when the oneshot field is currently empty:

    sed -i "s/oneshot:.*/oneshot: $(uuidgen)/" mirrorSubvolume.yaml
    
  2. To apply the change and thus trigger the oneshot sync, run:

    kubectl apply -f "mirrorSubvolume.yaml"
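
The UUID update can also be wrapped in a small function. The following is a minimal sketch with a hypothetical helper, set_oneshot, that is a variant of the sed command above: it avoids sed -i (whose syntax differs between platforms), matches the oneshot field whether or not it already has a value, and falls back to a Linux kernel interface when uuidgen isn't installed.

```shell
# Replace the oneshot value in a subvolume file with a freshly generated UUID.
# Argument: path to the YAML file (for example, mirrorSubvolume.yaml).
set_oneshot() {
  file=$1
  if command -v uuidgen >/dev/null 2>&1; then
    id=$(uuidgen)
  else
    # Linux-only fallback when uuidgen isn't installed.
    id=$(cat /proc/sys/kernel/random/uuid)
  fi
  tmp=$(mktemp) || return 1
  # Match "oneshot:" with or without an existing value; indentation is kept
  # because only the text from "oneshot:" onward is replaced.
  sed "s/oneshot:.*/oneshot: $id/" "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

For example, set_oneshot mirrorSubvolume.yaml followed by the kubectl apply command above triggers a one-off sync.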
    

Check Mirror subvolume again

  1. Check the status of your Mirror subvolume again to ensure the BACKENDCONNECTION field is still Connected:

    kubectl get mirrorsubvolumes
    
  2. After the sync completes, check the contents of your Mirror subvolume. Connect to the example pod you created by running:

    kubectl exec -it <name-of-pod> -- sh
    

    In the pod, list the contents of /data/mirrorSubDir (or the path you set in spec.path). You should now see the files from your storage account container mirrored into this subvolume.

Next Steps