AIFoundryModel.OpenAI Class

Definition

Models published by OpenAI.

C#: public static class AIFoundryModel.OpenAI
F#: type AIFoundryModel.OpenAI = class
VB: Public Class AIFoundryModel.OpenAI
Inheritance
AIFoundryModel.OpenAI
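
Each field below identifies an OpenAI-published model that can be referenced when configuring a deployment. As a quick orientation, here is a minimal C# sketch from an Aspire app host; the AddAzureAIFoundry and AddDeployment calls are assumptions about the Aspire.Hosting.Azure.AIFoundry hosting integration, and the resource names are placeholders.

```csharp
// Minimal app host sketch. AddAzureAIFoundry and an AddDeployment overload that
// accepts an AIFoundryModel value are assumptions about the
// Aspire.Hosting.Azure.AIFoundry hosting integration; resource names are placeholders.
var builder = DistributedApplication.CreateBuilder(args);

var foundry = builder.AddAzureAIFoundry("foundry");

// Reference models through the typed fields instead of hand-typed identifier strings.
foundry.AddDeployment("chat", AIFoundryModel.OpenAI.Gpt4oMini);
foundry.AddDeployment("embeddings", AIFoundryModel.OpenAI.TextEmbedding3Small);

builder.Build().Run();
```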

Fields

CodexMini

codex-mini is a fine-tuned variant of the o4-mini model, designed to deliver rapid, instruction-following performance for developers working in CLI workflows such as automating shell commands, editing scripts, and refactoring repositories.

DallE3

DALL-E 3 generates images from text prompts provided by the user and is generally available for use on Azure OpenAI. The image generation API creates an image from a text prompt; it does not edit existing images or create variations. Learn more at: <https://free.blessedness.top/azure/ai-services/openai/concepts/models#dall-e>
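
For illustration, the sketch below calls the image generation API from C#; it assumes the Azure.AI.OpenAI 2.x client surface, and the endpoint, key, and deployment name are placeholders.

```csharp
using System;
using System.ClientModel;
using Azure.AI.OpenAI;
using OpenAI.Images;

// Sketch only: endpoint, key, and the "dall-e-3" deployment name are placeholders.
var client = new AzureOpenAIClient(
    new Uri("https://<your-resource>.openai.azure.com/"),
    new ApiKeyCredential("<your-api-key>"));

ImageClient images = client.GetImageClient("dall-e-3");

// One text prompt in, one generated image out; the API does not edit or vary existing images.
GeneratedImage image = images.GenerateImage("A watercolor painting of a lighthouse at dawn");
Console.WriteLine(image.ImageUri);
```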

Davinci002

Davinci-002 is the latest version of Davinci, a gpt-3 base model. Davinci-002 replaces the deprecated Curie and Davinci models. It is a smaller, faster model that is primarily used for fine-tuning tasks. The model supports a maximum of 16,384 input tokens, and its training data runs through Sep 2021. Davinci-002 supports fine-tuning, allowing developers and businesses to customize the model for specific applications. Your training and validation data sets consist of input and output examples for how you would like the model to perform, and they must be formatted as a JSON Lines (JSONL) document in which each line represents a single prompt-completion pair.

Model variation

Davinci-002 is the latest version of Davinci, a gpt-3 based model. Learn more at <https://free.blessedness.top/azure/cognitive-services/openai/concepts/models>
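
As a small illustration of the JSONL training-data format described above, the C# sketch below writes hypothetical prompt-completion pairs, one JSON object per line; the example data and file name are made up.

```csharp
using System.IO;
using System.Text.Json;

// Hypothetical prompt-completion pairs shaped like the JSONL format described above:
// one {"prompt": "...", "completion": "..."} object per line.
var examples = new[]
{
    new { prompt = "Classify the sentiment: 'Great service!' ->", completion = " positive" },
    new { prompt = "Classify the sentiment: 'Never again.' ->", completion = " negative" },
};

using var writer = new StreamWriter("training_data.jsonl");
foreach (var example in examples)
{
    // JsonSerializer emits a single-line JSON object, so each example occupies exactly one line.
    writer.WriteLine(JsonSerializer.Serialize(example));
}
```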

Gpt35Turbo

The gpt-35-turbo model (also known as ChatGPT) is the most capable and cost-effective model in the gpt-3.5 family, optimized for chat using the Chat Completions API. It is a language model designed for conversational interfaces, and it behaves differently from previous gpt-3 models. Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. The ChatGPT model, by contrast, is conversation-in and message-out: it expects a prompt formatted as a chat-like transcript and returns a completion that represents a model-written message in the chat. Learn more at <https://free.blessedness.top/azure/cognitive-services/openai/concepts/models>
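
To make the conversation-in, message-out shape concrete, here is a hedged C# sketch; it assumes the Azure.AI.OpenAI 2.x client surface, and the endpoint, key, and deployment name are placeholders.

```csharp
using System;
using System.ClientModel;
using Azure.AI.OpenAI;
using OpenAI.Chat;

// Sketch only: endpoint, key, and the "gpt-35-turbo" deployment name are placeholders.
var client = new AzureOpenAIClient(
    new Uri("https://<your-resource>.openai.azure.com/"),
    new ApiKeyCredential("<your-api-key>"));

ChatClient chat = client.GetChatClient("gpt-35-turbo");

// The prompt is a transcript of chat messages rather than a single string to complete...
ChatCompletion completion = chat.CompleteChat(
    new SystemChatMessage("You are a helpful assistant."),
    new UserChatMessage("Explain what the Chat Completions API expects as input."));

// ...and the response is a model-written chat message, not text appended to the prompt.
Console.WriteLine(completion.Content[0].Text);
```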

Gpt35Turbo16k

gpt-3.5 models can understand and generate natural language or code. The most capable and cost-effective model in the gpt-3.5 family is gpt-3.5-turbo, which has been optimized for chat and also works well for traditional completions tasks. gpt-3.5-turbo is available for use with the Chat Completions API, while gpt-3.5-turbo-instruct has similar capabilities to text-davinci-003 and uses the Completions API instead of the Chat Completions API. We recommend using gpt-3.5-turbo and gpt-3.5-turbo-instruct over [legacy gpt-3.5 and gpt-3 models](https://free.blessedness.top/azure/ai-services/openai/concepts/legacy-models).

- gpt-35-turbo
- gpt-35-turbo-16k
- gpt-35-turbo-instruct

You can see the token context length supported by each model in the model summary table. To learn more about how to interact with gpt-3.5-turbo and the Chat Completions API, check out our [in-depth how-to](https://free.blessedness.top/azure/ai-services/openai/how-to/chatgpt?tabs=python&pivots=programming-language-chat-completions).

| Model ID | Model Availability | Max Request (tokens) | Training Data (up to) |
| --- | --- | --- | --- |
| gpt-35-turbo<sup>1</sup> (0301) | East US, France Central, South Central US, UK South, West Europe | 4,096 | Sep 2021 |
| gpt-35-turbo (0613) | Australia East, Canada East, East US, East US 2, France Central, Japan East, North Central US, Sweden Central, Switzerland North, UK South | 4,096 | Sep 2021 |
| gpt-35-turbo-16k (0613) | Australia East, Canada East, East US, East US 2, France Central, Japan East, North Central US, Sweden Central, Switzerland North, UK South | 16,384 | Sep 2021 |
| gpt-35-turbo-instruct (0914) | East US, Sweden Central | 4,097 | Sep 2021 |
| gpt-35-turbo (1106) | Australia East, Canada East, France Central, South India, Sweden Central, UK South, West US | Input: 16,385; Output: 4,096 | Sep 2021 |

<sup>1</sup> This model will accept requests larger than 4,096 tokens. Exceeding the 4,096 input token limit is not recommended, because newer versions of the model are capped at 4,096 tokens. If you encounter issues when exceeding 4,096 input tokens with this model, be aware that this configuration is not officially supported.

Gpt35TurboInstruct

gpt-3.5 models can understand and generate natural language or code. The most capable and cost-effective model in the gpt-3.5 family is gpt-3.5-turbo, which has been optimized for chat and also works well for traditional completions tasks. gpt-3.5-turbo is available for use with the Chat Completions API, while gpt-3.5-turbo-instruct has similar capabilities to text-davinci-003 and uses the Completions API instead of the Chat Completions API. We recommend using gpt-3.5-turbo and gpt-3.5-turbo-instruct over [legacy gpt-3.5 and gpt-3 models](https://free.blessedness.top/azure/ai-services/openai/concepts/legacy-models).

- gpt-35-turbo
- gpt-35-turbo-16k
- gpt-35-turbo-instruct

You can see the token context length supported by each model in the model summary table. To learn more about how to interact with gpt-3.5-turbo and the Chat Completions API, check out our [in-depth how-to](https://free.blessedness.top/azure/ai-services/openai/how-to/chatgpt?tabs=python&pivots=programming-language-chat-completions).

| Model ID | Model Availability | Max Request (tokens) | Training Data (up to) |
| --- | --- | --- | --- |
| gpt-35-turbo<sup>1</sup> (0301) | East US, France Central, South Central US, UK South, West Europe | 4,096 | Sep 2021 |
| gpt-35-turbo (0613) | Australia East, Canada East, East US, East US 2, France Central, Japan East, North Central US, Sweden Central, Switzerland North, UK South | 4,096 | Sep 2021 |
| gpt-35-turbo-16k (0613) | Australia East, Canada East, East US, East US 2, France Central, Japan East, North Central US, Sweden Central, Switzerland North, UK South | 16,384 | Sep 2021 |
| gpt-35-turbo-instruct (0914) | East US, Sweden Central | 4,097 | Sep 2021 |
| gpt-35-turbo (1106) | Australia East, Canada East, France Central, South India, Sweden Central, UK South, West US | Input: 16,385; Output: 4,096 | Sep 2021 |

<sup>1</sup> This model will accept requests larger than 4,096 tokens. Exceeding the 4,096 input token limit is not recommended, because newer versions of the model are capped at 4,096 tokens. If you encounter issues when exceeding 4,096 input tokens with this model, be aware that this configuration is not officially supported.

Gpt4

gpt-4 is a large multimodal model that accepts text or image inputs and outputs text. It can solve complex problems with greater accuracy than any of our previous models, thanks to its extensive general knowledge and advanced reasoning capabilities. gpt-4 provides a wide range of model versions to fit your business needs. Please note that AzureML Studio only supports the deployment of the gpt-4-0314 model version, while AI Studio supports the deployment of all the model versions listed below.

- gpt-4-turbo-2024-04-09: This is the GPT-4 Turbo with Vision GA model. The context window is 128,000 tokens, and it can return up to 4,096 output tokens. The training data is current up to December 2023.
- gpt-4-1106-preview (GPT-4 Turbo): The latest gpt-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. It returns a maximum of 4,096 output tokens. This preview model is not yet suited for production traffic. Context window: 128,000 tokens. Training data: up to April 2023.
- gpt-4-vision-preview (GPT-4 Turbo with Vision): This multimodal AI model enables users to direct the model to analyze image inputs they provide, along with all the other capabilities of GPT-4 Turbo. It can return up to 4,096 output tokens. As a preview model version, it is not yet suitable for production traffic. The context window is 128,000 tokens. Training data is current up to April 2023.
- gpt-4-0613: gpt-4 model with a context window of 8,192 tokens. Training data up to September 2021.
- gpt-4-0314: gpt-4 legacy model with a context window of 8,192 tokens. Training data up to September 2021. This model version will be retired no earlier than July 5, 2024.

Learn more at <https://free.blessedness.top/azure/cognitive-services/openai/concepts/models>

Gpt41

gpt-4.1 outperforms gpt-4o across the board, with major gains in coding, instruction following, and long-context understanding.

Gpt41Mini

gpt-4.1-mini outperforms gpt-4o-mini across the board, with major gains in coding, instruction following, and long-context handling.

Gpt41Nano

gpt-4.1-nano provides gains in coding, instruction following, and long-context handling, along with lower latency and cost.

Gpt432k

gpt-4 can solve difficult problems with greater accuracy than any of the previous OpenAI models. Like gpt-35-turbo, gpt-4 is optimized for chat but works well for traditional completions tasks. gpt-4 supports a maximum of 8,192 input tokens, and gpt-4-32k supports up to 32,768 tokens. Note: this model can be deployed for inference but cannot be fine-tuned. Learn more at <https://free.blessedness.top/azure/cognitive-services/openai/concepts/models>

Gpt4o

OpenAI's most advanced multimodal model in the gpt-4o family. It can handle both text and image inputs.

Gpt4oAudioPreview

Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.

Gpt4oMini

An affordable, efficient AI solution for diverse text and image tasks.

Gpt4oMiniAudioPreview

Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.

Gpt4oMiniRealtimePreview

Best suited for rich, asynchronous audio input/output interactions, such as creating spoken summaries from text.

Gpt4oMiniTranscribe

A highly efficient and cost-effective speech-to-text solution that delivers reliable and accurate transcripts.

Gpt4oMiniTts

An advanced text-to-speech solution designed to convert written text into natural-sounding speech.

Gpt4oRealtimePreview

The gpt-4o-realtime-preview model introduces a new era in AI interaction by incorporating the new audio modality powered by gpt-4o. This new modality allows for seamless speech-to-speech and text-to-speech applications, providing a richer and more engaging user experience. Engineered for speed and efficiency, gpt-4o-realtime-preview handles complex audio queries with minimal resources, translating into improved audio performance.

The introduction of gpt-4o-realtime-preview opens numerous possibilities for businesses in various sectors:

- Enhanced customer service: By integrating audio inputs, gpt-4o-realtime-preview enables more dynamic and comprehensive customer support interactions.
- Content innovation: Use gpt-4o-realtime-preview's generative capabilities to create engaging and diverse audio content, catering to a broad range of consumer preferences.
- Real-time translation: Leverage gpt-4o-realtime-preview's capability to provide accurate and immediate translations, facilitating seamless communication across different languages.

Model versions:

- 2024-12-17: Updates the gpt-4o-realtime-preview model with improvements in voice quality and input reliability. As a preview version, it is designed for testing and feedback purposes and is not yet optimized for production traffic.
- 2024-10-01: Introduces the new multimodal model, which supports both text and audio modalities. As a preview version, it is designed for testing and feedback purposes and is not yet optimized for production traffic.

Limitations

IMPORTANT: The system stores your prompts and completions as described in the "Data Use and Access for Abuse Monitoring" section of the service-specific Product Terms for Azure OpenAI Service, except that the Limited Exception does not apply. Abuse monitoring is turned on for use of the gpt-4o-realtime-preview API even for customers who are otherwise approved for modified abuse monitoring.

Currently, the gpt-4o-realtime-preview model focuses on text and audio and does not support existing gpt-4o features such as image modality and structured outputs. For many tasks, the generally available gpt-4o models may still be more suitable.

IMPORTANT: At this time, gpt-4o-realtime-preview usage limits are suitable for test and development. To prevent abuse and preserve service integrity, rate limits will be adjusted as needed.

Gpt4oTranscribe

A cutting-edge speech-to-text solution that delivers reliable and accurate transcripts.

Gpt5Chat

gpt-5-chat (preview) is designed for advanced, natural, multimodal, and context-aware conversations in enterprise applications.

Gpt5Mini

gpt-5-mini is a lightweight version for cost-sensitive applications.

Gpt5Nano

gpt-5-nano is optimized for speed, ideal for applications requiring low latency.

GptOss120b

Push the open model frontier with GPT-OSS models, released under the permissive Apache 2.0 license, allowing anyone to use, modify, and deploy them freely.

O1

Focused on advanced reasoning and solving complex problems, including math and science tasks. Ideal for applications that require deep contextual understanding and agentic workflows.

O1Mini

Smaller, faster, and 80% cheaper than o1-preview, performs well at code generation and small context operations.

O3Mini

o3-mini includes the o1 features with significant cost-efficiencies for scenarios requiring high performance.

O4Mini

o4-mini includes significant improvements on quality and safety while supporting the existing features of o3-mini and delivering comparable or better performance.

Sora

An efficient AI solution for generating videos.

TextEmbedding3Large

The text-embedding-3 series models are the latest and most capable embedding models from OpenAI.

TextEmbedding3Small

The text-embedding-3 series models are the latest and most capable embedding models from OpenAI.

TextEmbeddingAda002

text-embedding-ada-002 outperforms all the earlier embedding models on text search, code search, and sentence similarity tasks, and achieves comparable performance on text classification. Embeddings are numerical representations of concepts converted to number sequences, which make it easy for computers to understand the relationships between those concepts. Note: this model can be deployed for inference, specifically for embeddings, but cannot be fine-tuned.

Model variation

text-embedding-ada-002 is part of the gpt-3 model family. Learn more at <https://free.blessedness.top/azure/cognitive-services/openai/concepts/models#embeddings-models>
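
As an illustration of how the numeric vectors are used, the hedged C# sketch below embeds two sentences and compares them with cosine similarity; it assumes the Azure.AI.OpenAI 2.x client surface, and the endpoint, key, deployment name, and sample sentences are placeholders.

```csharp
using System;
using System.ClientModel;
using Azure.AI.OpenAI;
using OpenAI.Embeddings;

// Sketch only: endpoint, key, and the "text-embedding-ada-002" deployment name are placeholders.
var client = new AzureOpenAIClient(
    new Uri("https://<your-resource>.openai.azure.com/"),
    new ApiKeyCredential("<your-api-key>"));

EmbeddingClient embeddings = client.GetEmbeddingClient("text-embedding-ada-002");

// Each embedding is a fixed-length vector of floats; related texts yield vectors
// that point in similar directions.
float[] a = embeddings.GenerateEmbedding("How do I reset my password?").Value.ToFloats().ToArray();
float[] b = embeddings.GenerateEmbedding("I forgot my login credentials.").Value.ToFloats().ToArray();

// Cosine similarity: values closer to 1 mean the two texts are semantically closer.
double dot = 0, normA = 0, normB = 0;
for (int i = 0; i < a.Length; i++)
{
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
}
Console.WriteLine($"Cosine similarity: {dot / (Math.Sqrt(normA) * Math.Sqrt(normB)):F3}");
```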

Tts

TTS is a model that converts text to natural-sounding speech. TTS is optimized for real-time or interactive scenarios; for offline scenarios, TTS-HD provides higher quality. The API supports six different voices. Max request data size: 4,096 characters can be converted from text to speech per API request.

Model variants

- TTS: optimized for speed.
- TTS-HD: optimized for quality.
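
As a rough illustration only, the C# sketch below synthesizes a short clip; the GenerateSpeech method and GeneratedSpeechVoice value are assumptions about the OpenAI .NET audio client, and the endpoint, key, deployment name, and text are placeholders.

```csharp
using System;
using System.ClientModel;
using System.IO;
using Azure.AI.OpenAI;
using OpenAI.Audio;

// Sketch only: the GenerateSpeech method and GeneratedSpeechVoice.Alloy value are
// assumptions about the OpenAI .NET audio client; endpoint, key, and the "tts"
// deployment name are placeholders.
var client = new AzureOpenAIClient(
    new Uri("https://<your-resource>.openai.azure.com/"),
    new ApiKeyCredential("<your-api-key>"));

AudioClient tts = client.GetAudioClient("tts");

// Keep the input under the 4,096-character per-request limit noted above.
BinaryData speech = tts.GenerateSpeech(
    "Thanks for calling. A technician will reach out within the hour.",
    GeneratedSpeechVoice.Alloy);

using FileStream output = File.OpenWrite("greeting.mp3");
speech.ToStream().CopyTo(output);
```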

TtsHd

TTS-HD is a model that converts text to natural-sounding speech. TTS is optimized for real-time or interactive scenarios; for offline scenarios, TTS-HD provides higher quality. The API supports six different voices. Max request data size: 4,096 characters can be converted from text to speech per API request.

Model variants

- TTS: optimized for speed.
- TTS-HD: optimized for quality.

Whisper

The Whisper models are trained for speech recognition and translation tasks. They can transcribe speech audio into text in the language in which it is spoken (automatic speech recognition) and translate it into English (speech translation). Researchers at OpenAI developed the models to study the robustness of speech processing systems trained under large-scale weak supervision. Model version 001 corresponds to Whisper large v2. Max request data size: 25 MB of audio can be converted from speech to text per API request.
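
As a hedged C# sketch of transcription (assuming the Azure.AI.OpenAI 2.x client surface; the endpoint, key, deployment name, and audio file path are placeholders):

```csharp
using System;
using System.ClientModel;
using Azure.AI.OpenAI;
using OpenAI.Audio;

// Sketch only: endpoint, key, the "whisper" deployment name, and the audio file
// path are placeholders. Audio must stay under the 25 MB per-request limit noted above.
var client = new AzureOpenAIClient(
    new Uri("https://<your-resource>.openai.azure.com/"),
    new ApiKeyCredential("<your-api-key>"));

AudioClient whisper = client.GetAudioClient("whisper");

// Transcribes the speech into text in the language that is spoken (automatic speech recognition).
AudioTranscription transcription = whisper.TranscribeAudio("meeting.mp3");
Console.WriteLine(transcription.Text);
```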

Applies to