Important
- Foundry Local is available in preview. Public preview releases provide early access to features that are in active deployment.
- Features, approaches, and processes can change or have limited capabilities before general availability (GA).
Foundry Local is an on-device AI inference solution that provides performance, privacy, customization, and cost benefits. It integrates with your workflows and applications through a CLI, SDK, and REST API.
Key features
- On-device inference: Run models locally to reduce costs and keep data on your device.
- Model customization: Select a preset model or use your own to meet specific needs.
- Cost efficiency: Use existing hardware to eliminate recurring cloud costs and make AI more accessible.
- Seamless integration: Integrate with your apps through the SDK, API endpoints, or CLI, and scale to Azure AI Foundry as your needs grow.
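The REST surface follows the familiar OpenAI-style chat-completions shape, so existing clients can point at the local endpoint. The sketch below only builds the request to show that shape; the base URL, port, and model alias are placeholder assumptions, not values guaranteed by Foundry Local — check your running service for the actual endpoint and installed model names.

```python
import json
from urllib.request import Request

# Assumed local endpoint -- the host, port, and path are placeholders;
# query your Foundry Local service for the real base URL.
BASE_URL = "http://localhost:5273/v1"

payload = {
    "model": "phi-3.5-mini",  # placeholder alias; use a model you have installed
    "messages": [
        {"role": "user", "content": "Summarize the benefits of on-device inference."}
    ],
}

# Build (but do not send) an OpenAI-style chat-completions request.
req = Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.full_url)
```

Because the endpoint is local, swapping a cloud inference client for Foundry Local is typically just a base-URL change.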
Use cases
Foundry Local is ideal when you need to:
- Keep sensitive data on your device
- Operate in limited or offline environments
- Reduce cloud inference costs
- Get low-latency AI responses for real-time applications
- Experiment with AI models before you deploy to the cloud
Do I need an Azure subscription?
No. Foundry Local runs on your hardware, letting you use your existing infrastructure without cloud services.
Frequently asked questions
Do I need special drivers for NPU acceleration?
Install the driver for your NPU hardware:
Intel NPU: Install the Intel NPU driver to enable NPU acceleration on Windows.
Qualcomm NPU: Install the Qualcomm NPU driver to enable NPU acceleration. The error "Qnn error code 5005: Failed to load from EpContext model. qnn_backend_manager." usually indicates an outdated driver or an NPU resource conflict. Reboot to clear the conflict, especially after using Windows Copilot+ features.
After you install the drivers, Foundry Local automatically detects and uses the NPU.
Get started
Follow the Get started with Foundry Local guide to set up Foundry Local, discover models, and run your first local AI model.
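As a rough sketch of what those first steps look like from the CLI — the exact command syntax and the model alias below are assumptions; the get-started guide is authoritative:

```shell
# Browse models available in the catalog (syntax assumed; see the guide).
foundry model list

# Download a model and start an interactive session with it.
# "phi-3.5-mini" is a placeholder alias; pick one from the list output.
foundry model run phi-3.5-mini

# Inspect the local service, including the endpoint your apps should target.
foundry service status
```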