Why can't nvidia-smi communicate with the driver on my Ubuntu 22.04 Azure VM, even after a clean install, disabling Secure Boot, and confirming the DKMS driver is built for the active kernel?

Deeksha Dixit 20 Reputation points
2025-10-16T18:42:29.6433333+00:00

nvidia-smi Fails on Ubuntu 22.04 Azure VM Despite Correct Setup

Goal: Get the NVIDIA driver for an A10 GPU working on an Azure VM.

Problem: nvidia-smi consistently fails with the error "couldn't communicate with the NVIDIA driver", even after extensive troubleshooting and multiple reboots.


System Details

OS: Ubuntu 22.04.5 LTS

GPU: NVIDIA A10

Kernel: 6.8.0-1036-azure

Driver: nvidia-driver-580


Troubleshooting Performed

The installation appears correct, but the driver fails to load at runtime. Key steps taken include:

Clean Installation: Overcame initial package conflicts by completely purging all nvidia-* packages and reinstalling the recommended driver.

Secure Boot Disabled: The VM was fully recreated from its disk with Standard Security to ensure Secure Boot was disabled.

Verified: mokutil --sb-state confirms SecureBoot disabled.

Correct Kernel Module Built: The NVIDIA driver module has been successfully built for the active kernel via DKMS.

Verified: dkms status shows nvidia/580.95.05, 6.8.0-1036-azure, x86_64: installed.
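
The DKMS check above can be scripted so it always compares against the running kernel rather than a version typed by hand. This is a minimal sketch, not part of DKMS itself: the function name `dkms_has_nvidia_for` is hypothetical, and it assumes the `module/version, kernel, arch: state` line format quoted in this post (older DKMS releases print the fields differently).

```shell
# Hypothetical helper (not part of DKMS): reads `dkms status` text on stdin
# and succeeds only if an nvidia module is listed as installed for kernel $1.
# Assumes lines shaped like: nvidia/580.95.05, 6.8.0-1036-azure, x86_64: installed
dkms_has_nvidia_for() {
    awk -v k="$1" -F'[,:] *' '
        $1 ~ /^nvidia\// && $2 == k && $NF ~ /installed/ { found = 1 }
        END { exit !found }
    '
}

# Typical use on the VM (left commented because it needs DKMS installed):
#   dkms status | dkms_has_nvidia_for "$(uname -r)" && echo "built for running kernel"
```

A successful check only shows that the module was built for this kernel. It does not show that the kernel actually loaded it.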
  

Current Status

Despite all indicators showing a correct installation, the driver still fails to load. This suggests a persistent, unrecoverable issue with the OS state on the disk, making a fresh VM from a pre-built image the most logical next step.
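
Before rebuilding the VM, it may be worth confirming whether the kernel module is loaded at all and what the kernel logged when it tried. The following is a rough sketch under that assumption: the helper name `nvidia_module_loaded` is hypothetical, and it simply looks for a module named exactly `nvidia` in `/proc/modules`.

```shell
# Hypothetical helper: succeeds if a module named exactly "nvidia" appears in
# the module list file ($1, defaulting to /proc/modules).
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }' "${1:-/proc/modules}"
}

# Suggested manual checks on the VM (left commented because they need root and a GPU):
#   nvidia_module_loaded || sudo modprobe nvidia     # try loading the module by hand
#   sudo dmesg | grep -i -E 'nvidia|nvrm'            # kernel messages from the driver
```

If `modprobe` fails or the kernel log shows driver errors, those messages usually say more about the cause than the generic nvidia-smi error does.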

Azure Virtual Machines
An Azure service that is used to provision Windows and Linux virtual machines.

Answer accepted by question author
  1. Jilakara Hemalatha 3,115 Reputation points Microsoft External Staff Moderator
    2025-10-16T19:00:48.59+00:00

    Hi Deeksha Dixit,

    Thank you for providing detailed information on your NVIDIA driver issue on the Azure Ubuntu 22.04 VM with the A10 GPU.

    Could you please follow the steps below:

    1. If any NVIDIA drivers are already installed, uninstall all of them to start from a clean state. Alternatively, create a new VM to ensure a clean environment.
    2. Once the VM is in a clean state, enable the NVIDIA GPU Driver Extension for Linux using this CLI command:
    az vm extension set --publisher Microsoft.HpcCompute --name NvidiaGpuDriverLinux --vm-name <your-vm-name> --resource-group <your-rg>
    

    This installs and configures the validated NVIDIA driver automatically, ensuring compatibility with the Azure kernel and persistence across reboots.

    3. Log in to the VM and verify the driver details:
    nvidia-smi
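
If nvidia-smi still fails after these steps, one way to see whether the extension finished installing is to query its provisioning state from the Azure CLI. This is only a sketch: the wrapper name `gpu_extension_state` is hypothetical, and the resource group and VM name are placeholders you supply yourself.

```shell
# Hypothetical wrapper: prints the provisioning state of the
# NvidiaGpuDriverLinux extension for resource group $1 and VM $2.
gpu_extension_state() {
    az vm extension show \
        --resource-group "$1" \
        --vm-name "$2" \
        --name NvidiaGpuDriverLinux \
        --query provisioningState \
        --output tsv
}

# Example call (placeholders, as in the command above):
#   gpu_extension_state <your-rg> <your-vm-name>
```

A state other than Succeeded suggests the driver installation did not complete, in which case the extension's status message is the next thing to look at.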
    

    Reference: https://free.blessedness.top/en-us/azure/virtual-machines/extensions/hpccompute-gpu-linux

    https://free.blessedness.top/en-us/azure/virtual-machines/linux/n-series-driver-setup#supported-distributions-and-drivers

    Hope this helps! You should now be able to view the NVIDIA driver details successfully. Please let me know if you have any further questions or if the issue persists after these steps.
