This article lists a selection of Azure AI Foundry Models from partners and community, along with their capabilities, deployment types, and regions of availability. Deprecated and legacy models are excluded. Most Foundry Models come from partners and community: trusted third-party organizations, partners, research labs, and community contributors provide them.
Depending on the kind of project you use in Azure AI Foundry, you see a different selection of models. Specifically, if you use a Foundry project built on an Azure AI Foundry resource, you see the models that are available for standard deployment to a Foundry resource. Alternatively, if you use a hub-based project hosted by an Azure AI Foundry hub, you see models that are available for deployment to managed compute and serverless APIs. These model selections often overlap because many models support multiple deployment options.
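As a rough illustration of how the project type narrows the selection, the following sketch encodes a few rows from the tables in this article as plain Python data and filters them by project type. The `CATALOG` mapping and the `models_for` helper are hypothetical names used only for this example; they aren't part of any Azure SDK or API.

```python
# Hypothetical, hand-copied subset of the tables in this article.
# Illustrative data only; this is not an Azure API response.
CATALOG = {
    "Cohere-command-a": {"Foundry", "Hub-based"},
    "Cohere-rerank-v3.5": {"Hub-based"},
    "Mistral-Large-2411": {"Foundry", "Hub-based"},
    "Mistral-OCR-2503": {"Hub-based"},
    "TimeGEN-1": {"Hub-based"},
}


def models_for(project_type: str) -> list[str]:
    """Return the names of models listed for the given project type, sorted."""
    return sorted(
        name for name, project_types in CATALOG.items()
        if project_type in project_types
    )


if __name__ == "__main__":
    print(models_for("Foundry"))
    print(models_for("Hub-based"))
```

In this subset, every model listed for Foundry projects is also listed for hub-based projects, which reflects the overlap described above.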
To learn more about attributes of Foundry Models from partners and community, see Explore Azure AI Foundry Models.
Note
For a list of models sold directly by Azure, see Foundry Models sold directly by Azure.
Cohere
The Cohere family includes models for chat completions and embeddings, optimized for use cases such as reasoning, summarization, and question answering.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| Cohere-command-a | chat-completion | - Input: text (131,072 tokens) - Output: text (8,182 tokens) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Cohere-command-r-plus-08-2024 | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Cohere-command-r-08-2024 | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| embed-v-4-0 | embeddings | - Input: text (512 tokens) and images (2MM pixels) - Output: Vector (256, 512, 1024, 1536 dim.) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar | Foundry, Hub-based |
| Cohere-embed-v3-english | embeddings | - Input: text and images (512 tokens) - Output: Vector (1024 dim.) - Languages: en | Foundry, Hub-based |
| Cohere-embed-v3-multilingual | embeddings | - Input: text (512 tokens) - Output: Vector (1024 dim.) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar | Foundry, Hub-based |
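The input and output limits in the table can be used to check a request before you send it. The following is a minimal sketch, assuming you already have token counts from a tokenizer of your choice. The `LIMITS` mapping and the `fits` helper are hypothetical names, and the sketch treats the input and output limits as independent caps, which is an assumption about how the service enforces them.

```python
# Limits copied from the Cohere chat-completion rows in the table:
# model name -> (max input tokens, max output tokens).
LIMITS = {
    "Cohere-command-a": (131_072, 8_182),
    "Cohere-command-r-plus-08-2024": (131_072, 4_096),
    "Cohere-command-r-08-2024": (131_072, 4_096),
}


def fits(model: str, prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if both token counts are within the listed limits.

    Assumption: input and output limits are checked independently.
    """
    max_in, max_out = LIMITS[model]
    return prompt_tokens <= max_in and max_output_tokens <= max_out


if __name__ == "__main__":
    # A 100,000-token prompt asking for up to 8,000 output tokens.
    for name in LIMITS:
        print(name, fits(name, 100_000, 8_000))
```

With these numbers, only the model whose output limit is at least 8,000 tokens passes the check.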
Cohere rerank
| Model | Type | Capabilities | API Reference | Project type |
|---|---|---|---|---|
| Cohere-rerank-v3.5 | rerank text classification | - Input: text - Output: text - Languages: English, Chinese, French, German, Indonesian, Italian, Portuguese, Russian, Spanish, Arabic, Dutch, Hindi, Japanese, Vietnamese | Cohere's v2/rerank API | Hub-based |
For more details on pricing for Cohere rerank models, see Pricing for Cohere rerank models.
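Because the rerank model is listed with Cohere's v2/rerank API rather than a chat-completion API, the request body looks different from a chat request. The following sketch only builds a JSON body and doesn't send it. The field names (`model`, `query`, `documents`, `top_n`) are assumed from Cohere's public v2/rerank reference, and the use of the catalog name as the `model` value is also an assumption; confirm both against the API reference for your deployment. The `build_rerank_request` helper is a hypothetical name.

```python
import json


def build_rerank_request(query: str, documents: list[str], top_n: int) -> str:
    """Build a JSON request body for a rerank call (not sent anywhere)."""
    if not documents:
        raise ValueError("documents must not be empty")
    payload = {
        # Assumption: the deployment accepts the catalog name as the model ID.
        "model": "Cohere-rerank-v3.5",
        "query": query,
        "documents": documents,
        # Never ask for more results than there are documents.
        "top_n": min(top_n, len(documents)),
    }
    return json.dumps(payload)


if __name__ == "__main__":
    body = build_rerank_request(
        "Which model supports tool calling?",
        ["Cohere-command-a supports tool calling.", "TimeGEN-1 forecasts time series."],
        top_n=5,
    )
    print(body)
```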
See the Cohere model collection in Azure AI Foundry portal.
Core42
Core42 includes autoregressive bilingual LLMs for Arabic and English with state-of-the-art capabilities in Arabic.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| jais-30b-chat | chat-completion | - Input: text (8,192 tokens) - Output: text (4,096 tokens) - Languages: en and ar - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
See this model collection in Azure AI Foundry portal.
Meta
Meta Llama models and tools are a collection of pretrained and fine-tuned generative AI text and image reasoning models. Meta models range in scale to include:
- Small language models (SLMs) like 1B and 3B Base and Instruct models for on-device and edge inferencing
- Mid-size large language models (LLMs) like 7B, 8B, and 70B Base and Instruct models
- High-performance models like Meta Llama 3.1-405B Instruct for synthetic data generation and distillation use cases
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| Llama-3.2-11B-Vision-Instruct | chat-completion | - Input: text and image (128,000 tokens) - Output: text (8,192 tokens) - Languages: en - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Llama-3.2-90B-Vision-Instruct | chat-completion | - Input: text and image (128,000 tokens) - Output: text (8,192 tokens) - Languages: en - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Meta-Llama-3.1-405B-Instruct | chat-completion | - Input: text (131,072 tokens) - Output: text (8,192 tokens) - Languages: en, de, fr, it, pt, hi, es, and th - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Meta-Llama-3.1-8B-Instruct | chat-completion | - Input: text (131,072 tokens) - Output: text (8,192 tokens) - Languages: en, de, fr, it, pt, hi, es, and th - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Llama-4-Scout-17B-16E-Instruct | chat-completion | - Input: text and image (128,000 tokens) - Output: text (8,192 tokens) - Tool calling: No - Response formats: Text | Foundry, Hub-based |
See this model collection in Azure AI Foundry portal. You can also find several Meta models available as models sold directly by Azure.
Microsoft
Microsoft models include various model groups such as MAI models, Phi models, healthcare AI models, and more.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| Phi-4-mini-instruct | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: ar, zh, cs, da, nl, en, fi, fr, de, he, hu, it, ja, ko, no, pl, pt, ru, es, sv, th, tr, and uk - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Phi-4-multimodal-instruct | chat-completion | - Input: text, images, and audio (131,072 tokens) - Output: text (4,096 tokens) - Languages: ar, zh, cs, da, nl, en, fi, fr, de, he, hu, it, ja, ko, no, pl, pt, ru, es, sv, th, tr, and uk - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Phi-4 | chat-completion | - Input: text (16,384 tokens) - Output: text (16,384 tokens) - Languages: en, ar, bn, cs, da, de, el, es, fa, fi, fr, gu, ha, he, hi, hu, id, it, ja, jv, kn, ko, ml, mr, nl, no, or, pa, pl, ps, pt, ro, ru, sv, sw, ta, te, th, tl, tr, uk, ur, vi, yo, and zh - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Phi-4-reasoning | chat-completion with reasoning content | - Input: text (32,768 tokens) - Output: text (32,768 tokens) - Languages: en - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Phi-4-mini-reasoning | chat-completion with reasoning content | - Input: text (128,000 tokens) - Output: text (128,000 tokens) - Languages: en - Tool calling: No - Response formats: Text | Foundry, Hub-based |
See the Microsoft model collection in Azure AI Foundry portal. Microsoft models are also available as models sold directly by Azure.
Mistral AI
Mistral AI offers two categories of models: premium models such as Mistral Large 2411 and Ministral 3B, and open models such as Mistral Nemo.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| Codestral-2501 | chat-completion | - Input: text (262,144 tokens) - Output: text (4,096 tokens) - Languages: en - Tool calling: No - Response formats: Text | Foundry, Hub-based |
| Ministral-3B | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: fr, de, es, it, and en - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Mistral-Nemo | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: en, fr, de, es, it, zh, ja, ko, pt, nl, and pl - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Mistral-small-2503 | chat-completion | - Input: text (32,768 tokens) - Output: text (4,096 tokens) - Languages: fr, de, es, it, and en - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Mistral-medium-2505 | chat-completion | - Input: text (128,000 tokens), image - Output: text (128,000 tokens) - Tool calling: No - Response formats: Text, JSON | Foundry, Hub-based |
| Mistral-Large-2411 | chat-completion | - Input: text (128,000 tokens) - Output: text (4,096 tokens) - Languages: en, fr, de, es, it, zh, ja, ko, pt, nl, and pl - Tool calling: Yes - Response formats: Text, JSON | Foundry, Hub-based |
| Mistral-OCR-2503 | image to text | - Input: image or PDF pages (1,000 pages, max 50MB PDF file) - Output: text - Tool calling: No - Response formats: Text, JSON, Markdown | Hub-based |
| mistralai-Mistral-7B-Instruct-v01 | chat-completion | - Input: text - Output: text - Languages: en - Response formats: Text | Hub-based |
| mistralai-Mistral-7B-Instruct-v0-2 | chat-completion | - Input: text - Output: text - Languages: en - Response formats: Text | Hub-based |
| mistralai-Mixtral-8x7B-Instruct-v01 | chat-completion | - Input: text - Output: text - Languages: en - Response formats: Text | Hub-based |
| mistralai-Mixtral-8x22B-Instruct-v0-1 | chat-completion | - Input: text (64,000 tokens) - Output: text (4,096 tokens) - Languages: fr, it, de, es, en - Response formats: Text | Hub-based |
See this model collection in Azure AI Foundry portal. Mistral models are also available as models sold directly by Azure.
Nixtla
Nixtla's TimeGEN-1 is a generative pretrained forecasting and anomaly detection model for time series data. TimeGEN-1 produces accurate forecasts for new time series without training, using only historical values and exogenous covariates as inputs.
To perform inferencing, TimeGEN-1 requires you to use Nixtla's custom inference API.
| Model | Type | Capabilities | Inference API | Project type |
|---|---|---|---|---|
| TimeGEN-1 | Forecasting | - Input: Time series data as JSON or dataframes (with support for multivariate input) - Output: Time series data as JSON - Tool calling: No - Response formats: JSON | Forecast client to interact with Nixtla's API | Hub-based |
For more details on pricing for Nixtla models, see Nixtla.
NTT Data
tsuzumi is an autoregressive language-optimized transformer. The tuned versions use supervised fine-tuning (SFT). tsuzumi handles both Japanese and English with high efficiency.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| tsuzumi-7b | chat-completion | - Input: text (8,192 tokens) - Output: text (8,192 tokens) - Languages: en and ja - Tool calling: No - Response formats: Text | Hub-based |
See this model collection in Azure AI Foundry portal.
Stability AI
The Stability AI collection of image generation models includes Stable Image Core, Stable Image Ultra, and Stable Diffusion 3.5 Large. Stable Diffusion 3.5 Large accepts both image and text input.
| Model | Type | Capabilities | Project type |
|---|---|---|---|
| Stable Diffusion 3.5 Large | Image generation | - Input: text and image (1,000 tokens and 1 image) - Output: One Image - Tool calling: No - Response formats: Image (PNG and JPG) | Foundry, Hub-based |
| Stable Image Core | Image generation | - Input: text (1,000 tokens) - Output: One Image - Tool calling: No - Response formats: Image (PNG and JPG) | Foundry, Hub-based |
| Stable Image Ultra | Image generation | - Input: text (1,000 tokens) - Output: One Image - Tool calling: No - Response formats: Image (PNG and JPG) | Foundry, Hub-based |
See this model collection in Azure AI Foundry portal.
Open and custom models
The model catalog offers a larger selection of models from a wider range of providers. For these models, you can't use the option for standard deployment in Azure AI Foundry resources, where models are provided as APIs. Instead, to deploy these models, you might need to host them on your infrastructure, create an AI hub, and provide the underlying compute quota to host the models.
Furthermore, these models can be open-access or IP protected. In both cases, you have to deploy them to a managed compute offering in Azure AI Foundry. To get started, see How-to: Deploy to Managed compute.