In this quickstart, you deploy Edge RAG on Azure Kubernetes Service (AKS) without the need for local hardware like Azure Local. This quickstart is intended to get you started with Edge RAG for evaluation or development purposes. To deploy Edge RAG for a production environment, see Deployment overview.
Important
Edge RAG Preview, enabled by Azure Arc, is currently in PREVIEW. See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.
Prerequisites
Before you begin, make sure you have:
- An active Azure subscription. If you don't have an Azure subscription, create a free account before you begin.
- Azure CLI, Helm, and kubectl, plus the Azure CLI extensions aksarc and k8s-extension, installed locally unless you plan to use Azure Cloud Shell. If you're not using Azure Cloud Shell, see Script to configure machine to manage Azure Arc-enabled Kubernetes cluster.
- Edge RAG registered as an application, and app roles and an assigned user created in Microsoft Entra ID. See Configure authentication for Edge RAG.
- Application (client) ID and the directory (tenant) ID. To get these values after registering Edge RAG, see Get app and tenant IDs.
Open Azure Cloud Shell or Azure CLI
Open Azure Cloud Shell or your local Azure CLI to run the commands in this article. In Azure Cloud Shell, you might need to select Switch to PowerShell.
Sign in to Azure to get started:
az login
If you have multiple subscriptions, run the following command to get a list of your subscriptions and then set the context of your session to the appropriate subscription name:
az account list --output table
Replace the placeholder "subscription name" with your subscription and run the following command:
$sub = "<subscription name>"
az account set --subscription $sub
Create resource group
Create a resource group to contain the AKS cluster, node pool, and Edge RAG resources.
$rg = "edge-rag-aks-rg"
$location = "eastus2"
az group create `
--name $rg `
--location $location
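Before moving on, it can help to confirm that the resource group was actually created. The following is a minimal sketch, not part of the original steps, that reuses the $rg variable set above. It assumes the usual behavior of az group show, where the provisioning state reads Succeeded once creation finishes.

```powershell
# Optional check (not part of the original steps): confirm the resource group exists.
# Prints the provisioning state, typically "Succeeded" once creation finishes.
az group show `
--name $rg `
--query properties.provisioningState `
--output tsv
```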
Create and configure an AKS cluster
In this section, you create an AKS cluster and configure it for Edge RAG deployment. The steps include setting up the cluster, connecting it to Azure Arc, and preparing it with the necessary extensions and GPU support.
Create an AKS cluster:
$k8scluster = "edge-rag-aks"
az aks create `
--resource-group $rg `
--name $k8scluster `
--node-count 2 `
--generate-ssh-keys
Set the rest of the following values as needed and then run the command. If you created the application registration for Edge RAG in a different tenant from the AKS cluster, set the values for $entraAppId and $entraTenantId by using the Application (client) ID and Directory (tenant) ID on the EdgeRAG app registration page in the Azure portal.
# Set Edge RAG extension values
$modelName = "microsoft/Phi-3.5"
$gpu_enabled = "true" # set to false if no GPU nodes
$localextname = "edgeragdemo"
$autoUpgrade = "false"
$extension = "microsoft.arc.rag" # do not change
$n = "arc-rag" # do not change
# Set Entra ID app registration values
$domainName = "arcrag.contoso.com" # Edit to match the domain used in your registration
$entraAppId = $(az ad app list --display-name "EdgeRAG" --query "[].appId" --output tsv) # Display name is the application name in your registration
$entraTenantId = $(az account show --query tenantId --output tsv) # Directory or tenant ID
If you get a warning when setting $entraAppId or $entraTenantId by using the queries, set the values by using the Application (client) ID and Directory (tenant) ID on the EdgeRAG app registration page in the Azure portal.
Connect to Azure and AKS:
az login `
--scope https://management.core.windows.net//.default `
--tenant $entraTenantId
az aks get-credentials `
--resource-group $rg `
--name $k8scluster `
--overwrite-existing
Follow the prompts in the command line to sign in and select the subscription.
Install the NVIDIA GPU operator on the cluster:
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install --wait --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator --version=v24.9.2
Register the Microsoft.Kubernetes provider by running the following command:
az provider register -n Microsoft.Kubernetes
Connect the AKS cluster to Azure Arc:
az connectedk8s connect `
--resource-group $rg `
--location $location `
--name $k8scluster
If prompted, select y to install the extension "connectedk8s".
Install the required certificate and trust manager:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.15.3/cert-manager.yaml --wait
helm repo add jetstack https://charts.jetstack.io --force-update
Start-Sleep -Seconds 20
helm upgrade trust-manager jetstack/trust-manager --install --namespace cert-manager --wait
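The cluster setup above spans several tools, so a few optional checks can catch problems before you create node pools. This is a sketch rather than part of the original steps. It reuses $rg and $k8scluster from earlier, and it assumes the namespaces gpu-operator and cert-manager used by the install commands above. The connectivityStatus field is what az connectedk8s show normally reports for an Arc-connected cluster; treat the exact values as an assumption.

```powershell
# Optional checks (not part of the original steps).

# Confirm kubectl is pointed at the new cluster and the nodes are Ready.
kubectl get nodes

# Confirm the Microsoft.Kubernetes provider finished registering (typically "Registered").
az provider show -n Microsoft.Kubernetes --query registrationState --output tsv

# Confirm the cluster is connected to Azure Arc (typically "Connected").
az connectedk8s show `
--resource-group $rg `
--name $k8scluster `
--query connectivityStatus `
--output tsv

# Confirm the GPU operator and cert-manager pods are running.
kubectl get pods -n gpu-operator
kubectl get pods -n cert-manager
```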
Create node pools
Add dedicated GPU and CPU node pools to your AKS cluster to support Edge RAG.
If you get an error message when you try to create the node pools, you might need to request a quota increase for your Azure subscription, try a different virtual machine size, or create the Azure Kubernetes cluster and node pools in a different Azure region. For more information, see Limits for resources, SKUs, and regions in Azure Kubernetes Service (AKS).
Run the following command to create a GPU node pool with four nodes:
az aks nodepool add `
--resource-group $rg `
--cluster-name $k8scluster `
--name "gpunodepool" `
--node-count 4 `
--node-vm-size "Standard_NC24ads_A100_v4" `
--enable-cluster-autoscaler `
--min-count 4 `
--max-count 4 `
--mode User
Run the following command to create a CPU node pool with four nodes:
az aks nodepool add `
--resource-group $rg `
--cluster-name $k8scluster `
--name "cpunodepool" `
--node-count 4 `
--node-vm-size "Standard_D8s_v3" `
--enable-cluster-autoscaler `
--min-count 4 `
--max-count 4 `
--mode User
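After both node pools are added, you can confirm that they exist and that their nodes joined the cluster. The following is a minimal sketch, not part of the original steps. It assumes that AKS labels each node with agentpool, which is the usual AKS behavior but is not stated in this quickstart.

```powershell
# Optional check: list the node pools with their VM sizes and node counts.
az aks nodepool list `
--resource-group $rg `
--cluster-name $k8scluster `
--output table

# Show each node with its agentpool label (assumed AKS label) to see which pool it belongs to.
kubectl get nodes -L agentpool
```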
Deploy Edge RAG on AKS
Complete the following steps to deploy the Edge RAG extension onto your AKS cluster.
Deploy the Edge RAG extension by running the following command:
az k8s-extension create `
--cluster-type connectedClusters `
--cluster-name $k8scluster `
--resource-group $rg `
--name $localextname `
--extension-type $extension `
--debug `
--release-train preview `
--auto-upgrade $autoUpgrade `
--configuration-settings isManagedIdentityRequired=true `
--configuration-settings gpu_enabled=$gpu_enabled `
--configuration-settings AgentOperationTimeoutInMinutes=60 `
--configuration-settings model=$modelName `
--configuration-settings auth.tenantId=$entraTenantId `
--configuration-settings auth.clientId=$entraAppId `
--configuration-settings ingress.domainname=$domainName `
--configuration-settings ingress-nginx.controller.service.annotations.service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path=/healthz
Wait several minutes for the deployment to complete.
Get the load balancer VIP by running the following command:
kubectl get service ingress-nginx-controller -n arc-rag -o yaml
Look for:
status:
  loadBalancer:
    ingress:
    - ip: <load_balancer_ip>
      ipMode: VIP
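If you prefer not to scan the full YAML, the same IP can be read directly with a JSONPath query, and the extension state can be checked from the Azure CLI. This is a sketch under stated assumptions: $lbIp is an illustrative variable name that doesn't appear elsewhere in this quickstart, and the provisioningState value is expected to read Succeeded when the deployment finishes.

```powershell
# Optional: read only the load balancer IP from the ingress controller service.
# $lbIp is an illustrative variable name, not one defined by the quickstart.
$lbIp = kubectl get service ingress-nginx-controller -n arc-rag -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
$lbIp

# Optional: check the provisioning state of the Edge RAG extension.
az k8s-extension show `
--cluster-type connectedClusters `
--cluster-name $k8scluster `
--resource-group $rg `
--name $localextname `
--query provisioningState `
--output tsv
```

An empty $lbIp usually means the load balancer hasn't been assigned an address yet; waiting a minute and rerunning the command is a reasonable first step.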
Connect to the developer portal
Update the hosts file on your local machine to connect to the developer portal for Edge RAG.
On your local machine, open Notepad in Administrator mode.
Go to File > Open > C:\Windows\System32\drivers\etc > hosts. If you can't see the "hosts" file, set the file type filter to All files.
Add the following line at the end of the file, replacing load_balancer_ip with the load balancer IP and editing the domain to match the app registration:
<load_balancer_ip> arcrag.contoso.com
For example:
# Edge RAG developer portal
172.16.0.0 arcrag.contoso.com
Save the file.
Go to the developer portal for Edge RAG by using the domain URL you added to the local "hosts" file. For example: https://arcrag.contoso.com.
Select Get started. Then, follow the next steps at the end of this article to add a data source and set up the data query.
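As an alternative to editing the hosts file in Notepad, the same entry can be appended from PowerShell. The following is a minimal sketch, assuming a Windows machine and an elevated (Administrator) PowerShell session. The variable names $lbIp, $portalDomain, and $hostsPath are illustrative and the values are placeholders to replace with your own.

```powershell
# Run in an elevated (Administrator) PowerShell session on Windows.
# $lbIp, $portalDomain, and $hostsPath are illustrative names; replace the placeholder values.
$lbIp = "<load_balancer_ip>"
$portalDomain = "arcrag.contoso.com"
$hostsPath = Join-Path $env:SystemRoot "System32\drivers\etc\hosts"

# Append the entry only if the domain is not already present in the file.
if (-not (Select-String -Path $hostsPath -Pattern $portalDomain -SimpleMatch -Quiet)) {
    Add-Content -Path $hostsPath -Value "$lbIp $portalDomain"
}
```

The existence check keeps the script safe to rerun, but it doesn't update an existing entry that points to an old IP; in that case, edit the line manually.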
(Optional) Clean up resources
If you're done trying out Edge RAG, remove the resources created in this quickstart by running the following command:
az group delete `
--name $rg `
--yes `
--no-wait
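Because the delete command uses --no-wait, it returns before the resources are actually removed. A quick way to check progress, sketched here as an optional step, is to ask whether the resource group still exists.

```powershell
# Optional check: prints "true" while the resource group still exists
# and "false" after the deletion completes.
az group exists --name $rg
```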