Quickstart: Install Edge RAG Preview enabled by Azure Arc

In this quickstart, you deploy Edge RAG on Azure Kubernetes Service (AKS) without the need for local hardware like Azure Local. This quickstart is intended to get you started with Edge RAG for evaluation or development purposes. To deploy Edge RAG for a production environment, see Deployment overview.

Important

Edge RAG Preview, enabled by Azure Arc, is currently in PREVIEW. See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.

Prerequisites

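Before you begin, make sure you have the tools and resources that the commands in this article use: an Azure subscription, Azure CLI (or Azure Cloud Shell), kubectl, Helm, and a Microsoft Entra app registration for Edge RAG.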

Open Azure Cloud Shell or Azure CLI

Open Azure Cloud Shell or your local Azure CLI to run the commands in this article. In Azure Cloud Shell, you might need to select Switch to PowerShell.

  1. Sign in to Azure to get started:

    az login
    
  2. If you have multiple subscriptions, run the following command to get a list of your subscriptions and then set the context of your session to the appropriate subscription name:

    az account list --output table
    

    Replace the placeholder "subscription name" with your subscription and run the following command:

    $sub = "<subscription name>"
    az account set --subscription $sub
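
To confirm that the session now points at the intended subscription, a quick check such as the following can help. It is a minimal sketch that assumes $sub holds the subscription name (not the ID) set above; the comparison and the $activeSub variable are illustrative and not part of the documented procedure.

```powershell
# Name of the subscription that is active in this session.
$activeSub = az account show --query name --output tsv

# Warn if it differs from the subscription you intended to use.
if ($activeSub -ne $sub) {
    Write-Warning "Active subscription is '$activeSub', expected '$sub'."
}
```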
    

Create resource group

Create a resource group to contain the AKS cluster, node pool, and Edge RAG resources.

$rg = "edge-rag-aks-rg" 
$location = "eastus2"
az group create `
     --name $rg `
     --location $location
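
Before creating the cluster, it can be useful to confirm that the resource group exists. The following sketch only reads state and assumes $rg is still set in the current session.

```powershell
# Prints "true" when the resource group exists in the current subscription.
az group exists --name $rg

# Shows the provisioning state of the resource group (for example, Succeeded).
az group show --name $rg --query properties.provisioningState --output tsv
```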

Create and configure an AKS cluster

In this section, you create an AKS cluster and configure it for Edge RAG deployment. The steps include setting up the cluster, connecting it to Azure Arc, and preparing it with the necessary extensions and GPU support.

  1. Create an AKS cluster:

    $k8scluster = "edge-rag-aks"  
    az aks create `
       --resource-group $rg `
       --name $k8scluster `
       --node-count 2 `
       --generate-ssh-keys
    
  2. Set the following values as needed, and then run the commands. If you created the application registration for Edge RAG in a different tenant from the AKS cluster, set the values for $entraAppId and $entraTenantId by using the Application (client) ID and Directory (tenant) ID on the EdgeRAG app registration page in the Azure portal.

    
    # Set Edge RAG extension values
    $modelName = "microsoft/Phi-3.5"    
    $gpu_enabled = "true" # set to false if no GPU nodes 
    $localextname = "edgeragdemo"  
    $autoUpgrade = "false" 
    $extension = "microsoft.arc.rag" # do not change    
    $n = "arc-rag" # do not change
    
    # Set Entra ID app registration values
    $domainName = "arcrag.contoso.com" # Edit to match the domain used in your registration  
    $entraAppId = $(az ad app list --display-name "EdgeRAG" --query "[].appId" --output tsv)  # Display name is the application name in your registration   
    $entraTenantId = $(az account show --query tenantId --output tsv) # Directory or tenant ID     
    
    

    If you get a warning when setting $entraAppId or $entraTenantId by using the queries, set the values by using the Application (client) ID and Directory (tenant) ID on the EdgeRAG app registration page in the Azure portal.

  3. Connect to Azure and AKS:

    az login `
       --scope https://management.core.windows.net//.default `
       --tenant $entraTenantId    
    az aks get-credentials `
       --resource-group $rg `
       --name $k8scluster `
       --overwrite-existing 
    

    Follow the prompts in the command line to sign in and select the subscription.

  4. Install the NVIDIA GPU operator on the cluster:

    helm repo add nvidia https://helm.ngc.nvidia.com/nvidia 
    helm repo update   
    helm install --wait --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator --version=v24.9.2 
    
  5. Register the Microsoft.Kubernetes provider by running the following command:

    az provider register -n Microsoft.Kubernetes
    
  6. Connect the AKS cluster to Azure Arc:

    az connectedk8s connect `
       --resource-group $rg `
       --location $location `
       --name $k8scluster
    

    If prompted, select y to install the extension "connectedk8s".

  7. Install the required certificate and trust manager:

    
    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.15.3/cert-manager.yaml --wait
    helm repo add jetstack https://charts.jetstack.io --force-update
    # Give cert-manager a moment to start before installing trust-manager
    Start-Sleep -Seconds 20
    helm upgrade trust-manager jetstack/trust-manager --install --namespace cert-manager --wait
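
After these steps, a few read-only checks can confirm that the cluster is ready for the node pools and the extension. This is a sketch under the assumption that the variables from the earlier steps are still set in the session; the $appIds variable and the warning text are illustrative.

```powershell
# The app registration query can return nothing (no match) or several IDs
# (multiple apps share the display name); either case needs a manual fix.
$appIds = @($entraAppId -split "\s+" | Where-Object { $_ })
if ($appIds.Count -ne 1) {
    Write-Warning "Expected exactly one app ID in `$entraAppId, found $($appIds.Count)."
}

# Arc connection status of the cluster; "Connected" indicates the agents have checked in.
az connectedk8s show --resource-group $rg --name $k8scluster --query connectivityStatus --output tsv

# Pods for the Arc agents, the GPU operator, and cert-manager/trust-manager.
kubectl get pods -n azure-arc
kubectl get pods -n gpu-operator
kubectl get pods -n cert-manager
```

If the warning appears, set $entraAppId manually from the app registration page, as described in step 2.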
    

Create node pools

Add dedicated GPU and CPU node pools to your AKS cluster to support Edge RAG.

If you get an error message when you try to create the node pools, you might need to request a quota increase for your Azure subscription, try a different virtual machine size, or create the AKS cluster and node pools in a different Azure region. For more information, see Limits for resources, SKUs, and regions in Azure Kubernetes Service (AKS).

  1. Run the following command to create a GPU node pool with four nodes:

    az aks nodepool add `
        --resource-group $rg `
        --cluster-name $k8scluster `
        --name "gpunodepool" `
        --node-count 4 `
        --node-vm-size "Standard_NC24ads_A100_v4" `
        --enable-cluster-autoscaler `
        --min-count 4 `
        --max-count 4 `
        --mode User
    
  2. Run the following command to create a CPU node pool with four nodes:

    az aks nodepool add `
        --resource-group $rg `
        --cluster-name $k8scluster `
        --name "cpunodepool" `
        --node-count 4 `
        --node-vm-size "Standard_D8s_v3" `
        --enable-cluster-autoscaler `
        --min-count 4 `
        --max-count 4 `
        --mode User
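
To check that both node pools were created and that their nodes joined the cluster, commands along these lines can be used. This is a sketch that assumes kubectl is still pointed at the cluster from the earlier az aks get-credentials call.

```powershell
# List the node pools in the cluster.
az aks nodepool list --resource-group $rg --cluster-name $k8scluster --output table

# AKS labels each node with its pool name; -L adds that label as a column.
kubectl get nodes -L agentpool

# GPU nodes advertise an nvidia.com/gpu resource once the GPU operator is ready.
kubectl describe nodes | Select-String "nvidia.com/gpu"
```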
    

Deploy Edge RAG on AKS

Complete the following steps to deploy the Edge RAG extension onto your AKS cluster.

  1. Deploy the Edge RAG extension by running the following command:

    az k8s-extension create `
        --cluster-type connectedClusters `
        --cluster-name $k8scluster `
        --resource-group $rg `
        --name $localextname `
        --extension-type $extension `
        --debug --release-train preview `
        --auto-upgrade $autoUpgrade `
        --configuration-settings isManagedIdentityRequired=true `
        --configuration-settings gpu_enabled=$gpu_enabled `
        --configuration-settings AgentOperationTimeoutInMinutes=60 `
        --configuration-settings model=$modelName `
        --configuration-settings auth.tenantId=$entraTenantId `
        --configuration-settings auth.clientId=$entraAppId `
        --configuration-settings ingress.domainname=$domainName `
        --configuration-settings ingress-nginx.controller.service.annotations.service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path=/healthz
    

    Wait several minutes for the deployment to complete.

  2. Get the load balancer VIP by running the following command:

    kubectl get service ingress-nginx-controller -n arc-rag -o yaml 
    

    Look for:

    status:
      loadBalancer:
        ingress:
        - ip: <load_balancer_ip>
          ipMode: VIP
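
As an alternative to reading the full YAML, the extension state and the load balancer IP can be queried directly. The following is a minimal sketch; $lbIp is a hypothetical variable name introduced here for convenience, and the commands assume the variables from the earlier steps are still set.

```powershell
# Provisioning state of the Edge RAG extension (for example, Succeeded).
az k8s-extension show `
    --cluster-type connectedClusters `
    --cluster-name $k8scluster `
    --resource-group $rg `
    --name $localextname `
    --query provisioningState `
    --output tsv

# Extract only the load balancer IP from the ingress controller service.
$lbIp = kubectl get service ingress-nginx-controller -n arc-rag -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
Write-Output "Load balancer IP: $lbIp"
```

An empty $lbIp usually means the service has not been assigned an address yet; waiting and rerunning the command is a reasonable first step.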
    

Connect to the developer portal

Update the hosts file on your local machine so that the Edge RAG domain resolves to the load balancer IP, and then connect to the developer portal.

  1. On your local machine, open Notepad in Administrator mode.

  2. Go to File > Open, browse to C:\Windows\System32\drivers\etc, and open the hosts file. If you can't see the "hosts" file, set the file type to All files.

  3. Add the following line at the end of the file. Replace <load_balancer_ip> with the load balancer IP, and edit the domain to match the app registration:

    <load_balancer_ip> arcrag.contoso.com

    For example:

    # Edge RAG developer portal
    172.16.0.0 arcrag.contoso.com
    
  4. Save the file.

  5. Go to the developer portal for Edge RAG by using the domain URL you added to the local "hosts" file. For example: https://arcrag.contoso.com.

  6. Select Get started. Then, follow the next steps at the end of this article to add a data source and set up the data query.
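
The hosts file edit described above can also be scripted instead of using Notepad. The following is a sketch, not part of the documented procedure: it assumes an elevated PowerShell session on Windows, and $lbIp and $hostsPath are hypothetical variable names introduced for this example.

```powershell
# Run in an elevated (Administrator) PowerShell session on Windows.
$lbIp = "<load_balancer_ip>"          # replace with the load balancer IP
$domainName = "arcrag.contoso.com"    # edit to match the app registration
$hostsPath = "$env:SystemRoot\System32\drivers\etc\hosts"

# Append the entry only if the domain is not already listed.
# The leading newline guards against a hosts file that lacks a final line break.
if (-not (Select-String -Path $hostsPath -Pattern ([regex]::Escape($domainName)) -Quiet)) {
    Add-Content -Path $hostsPath -Value "`n$lbIp $domainName"
}

# Check that the name resolves and that port 443 answers.
Test-NetConnection -ComputerName $domainName -Port 443
```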

(Optional) Clean up resources

If you're done trying out Edge RAG, remove the resources created in this quickstart by running the following command:

az group delete `
   --name $rg `
   --yes `
   --no-wait
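
Because --no-wait returns before the deletion finishes, a follow-up check can confirm when the resource group is gone. This sketch assumes $rg is still set in the session.

```powershell
# Prints "true" while the resource group still exists and "false" once it is deleted.
az group exists --name $rg
```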

Next step