Important
- Foundry Local is available in preview. Public preview releases provide early access to features that are in active deployment.
- Features, approaches, and processes can change or have limited capabilities before general availability (GA).
Foundry Local is an on-device AI inference solution that provides performance, privacy, customization, and cost benefits. It integrates with your workflows and applications through a CLI, SDK, and REST API.
Key features
- On-device inference: Run models locally to reduce costs and keep data on your device.
- Model customization: Select a preset model or use your own to meet specific needs.
- Cost efficiency: Use existing hardware to eliminate recurring cloud costs and make AI more accessible.
- Seamless integration: Integrate with your apps through the SDK, API endpoints, or CLI, and scale to Azure AI Foundry as your needs grow.
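The REST surface follows the familiar OpenAI-style chat-completions shape, so existing clients can point at the local endpoint. The sketch below only builds the request to show that shape; the base URL, port, and model alias are placeholder assumptions, not values guaranteed by Foundry Local — check your running service for the actual endpoint and installed model names.

```python
import json
from urllib.request import Request

# Assumed local endpoint -- the host, port, and path are placeholders;
# query your Foundry Local service for the real base URL.
BASE_URL = "http://localhost:5273/v1"

payload = {
    "model": "phi-3.5-mini",  # placeholder alias; use a model you have installed
    "messages": [
        {"role": "user", "content": "Summarize the benefits of on-device inference."}
    ],
}

# Build (but do not send) an OpenAI-style chat-completions request.
req = Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.full_url)
```

Because the endpoint is local, swapping a cloud inference client for Foundry Local is typically just a base-URL change.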
Use cases
Foundry Local is ideal when you need to:
- Keep sensitive data on your device
- Operate in limited or offline environments
- Reduce cloud inference costs
- Get low-latency AI responses for real-time applications
- Experiment with AI models before you deploy to the cloud
Do I need an Azure subscription?
No. Foundry Local runs on your hardware, letting you use your existing infrastructure without cloud services.
Frequently asked questions
Do I need special drivers for NPU acceleration?
Install the driver for your NPU hardware:
Intel NPU: Install the Intel NPU driver to enable NPU acceleration on Windows.
Qualcomm NPU: Install the Qualcomm NPU driver to enable NPU acceleration. The error "Qnn error code 5005: Failed to load from EpContext model. qnn_backend_manager." usually indicates an outdated driver or an NPU resource conflict. Reboot to clear the conflict, especially after using Windows Copilot+ features.
After you install the drivers, Foundry Local automatically detects and uses the NPU.
Get started
Follow the Get started with Foundry Local guide to set up Foundry Local, discover models, and run your first local AI model.
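As a rough sketch of what those first steps look like from the CLI — the exact command syntax and the model alias below are assumptions; the get-started guide is authoritative:

```shell
# Browse models available in the catalog (syntax assumed; see the guide).
foundry model list

# Download a model and start an interactive session with it.
# "phi-3.5-mini" is a placeholder alias; pick one from the list output.
foundry model run phi-3.5-mini

# Inspect the local service, including the endpoint your apps should target.
foundry service status
```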