How do you use RAG in the REST API with GPT-5?

Ronald Wyman 76 Reputation points
2025-09-28T15:55:50.0566667+00:00

I have added an Azure Search Service index with the options for semantic configuration and a Vector Profile. When I execute the following REST request, the following error is raised. Does anyone have some insight into this result?

[Error Message] An error occurred when calling Azure OpenAI: Server responded with status 400. Error message:
{
  "error": {
    "message": "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.",
    "type": "invalid_request_error",
    "param": "max_tokens",
    "code": "unsupported_parameter"
  }
}

As you can see, I make no reference to max_tokens. When I add max_completion_tokens, reasoning_effort, or verbosity, an error is raised indicating that these options are not supported.

My JSON Request

{
  "messages": [
    {
      "role": "system",
      "content": "My prompt here"
    },
    {
      "role": "user",
      "content": "show a bar chart"
    }
  ],
  "stop": null,
  "stream": false,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "temperature": 1,
  "data_sources": [
    {
      "type": "azure_search",
      "parameters": {
        "endpoint": "https://....search.windows.net",
        "index_name": "aisearch-indexset-42984f3678dd446fa51ee376e792c125",
        "query_type": "semantic",
        "semantic_configuration": "semantic-config",
        "strictness": 3,
        "top_n_documents": 10,
        "in_scope": true,
        "filter": "search.ismatch('LIB00000000000000*', 'dbdefid')",
        "fields_mapping": {
          "content_fields_separator": "\n",
          "content_fields": ["content"],
          "filepath_field": "dxlink",
          "title_field": "exportfilename",
          "url_field": "dxlink",
          "vector_fields": ["vector_content"]
        },
        "authentication": {
          "type": "api_key",
          "key": "...."
        }
      }
    }
  ]
}

Thank you in advance.
Ron

Developer technologies | C++

7 answers

  1. Adiba Khan 895 Reputation points Microsoft External Staff
    2025-09-29T11:03:09.7666667+00:00

    Thanks for reaching out. This appears to be a multi-faceted issue related to using Retrieval-Augmented Generation (RAG) with Azure OpenAI's GPT-5 via the REST API.

    1. Solution for the "Unsupported parameter: 'max_tokens'" error

    This is the most explicit error message and points to a change in API parameter naming for newer models like GPT-5, o1, and gpt-4o/mini in Azure OpenAI.

    The fix:

    You should remove the unsupported parameter and replace it with the correct one in your JSON request body.

    ·         Remove: "max_tokens": [value]

    ·         Use instead: "max_completion_tokens": [value]

    Note on the other unsupported parameters: You also mentioned issues with reasoning_effort and verbosity.

    ·         reasoning_effort: This parameter is specific to certain advanced reasoning models (like the full GPT-5) and may not be supported in the standard Chat Completions API and RAG integration, or it might be model-specific. If you are getting an error, it's best to remove it unless the specific model documentation confirms its support in this context.

    ·         verbosity: This parameter is also model-specific and is generally used to control the verbosity of certain advanced model outputs (like tool-calling or reasoning steps). Remove it if you receive an error.
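    As a sanity check on the renaming, here is a minimal sketch of a corrected Chat Completions request body. The endpoint, deployment name, and API version are placeholders, not values from the original question; verify the api-version your resource supports against the reference below.

```python
import json

# Placeholder values -- substitute your own resource details.
ENDPOINT = "https://YOUR-RESOURCE.openai.azure.com"
DEPLOYMENT = "YOUR-GPT5-DEPLOYMENT"
API_VERSION = "YOUR-API-VERSION"  # e.g. a current preview/GA version

# Azure OpenAI Chat Completions URL shape.
url = (f"{ENDPOINT}/openai/deployments/{DEPLOYMENT}"
       f"/chat/completions?api-version={API_VERSION}")

body = {
    "messages": [
        {"role": "system", "content": "My prompt here"},
        {"role": "user", "content": "show a bar chart"},
    ],
    # Newer models reject 'max_tokens'; the renamed parameter is
    # 'max_completion_tokens'.
    "max_completion_tokens": 800,
}

payload = json.dumps(body)
```

    The key point is simply that the serialized body contains max_completion_tokens and no max_tokens anywhere.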

    Reference link: Azure OpenAI in Azure AI Foundry Models v1 REST API reference - Azure OpenAI | Microsoft Learn

    2. Solution for the status 400 error (Bad Request)

    A status 400 with a generic error message, after fixing the parameter name, often indicates a fundamental issue with the request structure or the RAG configuration itself.

    Common causes for a 400 error in Azure OpenAI RAG:

    ·         Missing or incorrect authentication in data_sources: Your request sets authentication to api_key, but the key is masked as "....". Double-check that the API key provided for your Azure AI Search resource is correct and has the necessary permissions.

    ·         Required parameter missing for the search type: Your configuration uses query_type: "semantic", which requires a semantic configuration name (semantic_configuration: "semantic-config"), which you have. However, for certain models or search configurations, other parameters such as an embedding endpoint are required, especially if your data source has vector fields and the model is not handling the embedding internally.

    o   Check for an embedding endpoint requirement: If your data source index relies on a vector store, ensure the REST API call correctly specifies how the embeddings are handled, or whether a separate embedding endpoint/dependency is now required for your model/API-version combination.

    Action for status 400:

    1. Verify the API key: Ensure the key under authentication in data_sources is the correct admin key for your Azure AI Search service.

    2. Verify the index name: Double-check that the index_name exists and is correct in your Azure AI Search instance.

    3. Check for an embedding endpoint (if using vector search): If the RAG service requires an embedding model deployment URI for vector search/reranking, you might need to add an embedding dependency parameter within the data_sources section, or ensure you are using a model that handles this implicitly.
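    To illustrate point 3, here is a hedged sketch of where such a dependency would sit inside the data_sources block. The embedding_dependency shape shown here exists in recent On Your Data API versions, but the field name, the supported query_type values, and the deployment name below are assumptions to verify against the reference links for your api-version; all endpoint/key values are placeholders.

```python
# Hypothetical data_sources entry for an index with vector fields,
# where the service vectorizes the user query via an embedding
# deployment instead of relying on an index-side vectorizer.
data_source = {
    "type": "azure_search",
    "parameters": {
        "endpoint": "https://YOUR-SEARCH.search.windows.net",
        "index_name": "your-index",
        # A vector-capable query type (assumption -- check the docs
        # for the values supported by your api-version).
        "query_type": "vector_semantic_hybrid",
        "semantic_configuration": "semantic-config",
        # Points the service at the embedding deployment used to
        # embed the query at search time.
        "embedding_dependency": {
            "type": "deployment_name",
            "deployment_name": "YOUR-EMBEDDING-DEPLOYMENT",
        },
        "authentication": {"type": "api_key", "key": "YOUR-KEY"},
    },
}
```

    If the index instead has its own vectorizer configured, the embedding dependency may be unnecessary; the 400 typically appears when a vector query type is requested and neither mechanism is in place.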

    Reference links:

    ·         Azure OpenAI On Your Data Python & REST API reference - Azure OpenAI | Microsoft Learn

    ·         Quickstart: Generative Search (RAG) - Azure AI Search | Microsoft Learn

    ·         Connect using API keys - Azure AI Search | Microsoft Learn

    Let me know if you need any further help with this. We'll be happy to assist.

    If you find this helpful, please mark this as answered.


  2. Ronald Wyman 76 Reputation points
    2025-09-29T11:39:55.2466667+00:00

    Hi Adiba,

    max_tokens is not referenced in the request.

    Thx

    Ron


  3. Varsha Dundigalla(INFOSYS LIMITED) 2,700 Reputation points Microsoft External Staff
    2025-09-30T10:55:37.87+00:00

    Thank you for reaching out.

    Why This Happens

    You’re right — even though your request JSON doesn’t explicitly include max_tokens, the error still appears. This is because the SDK or client library you are using is automatically adding max_tokens behind the scenes when building the request.

    For newer models such as GPT-5 and the GPT-4o family, the Azure OpenAI API no longer accepts the max_tokens parameter. Instead, these models require the newer parameter max_completion_tokens. If the older one is present in the request (even if injected automatically), the API rejects it. 

    This explains why you’re seeing the error even though you didn’t type max_tokens in your JSON.

     How to Resolve

    1. Use the correct parameter

    • Replace max_tokens with max_completion_tokens.
    • Remove unsupported parameters like reasoning_effort or verbosity unless your specific model documentation confirms support.

    2. Check your SDK or library version

    • Many older SDKs and client wrappers still inject max_tokens.
    • Update to the latest Azure OpenAI SDK or try sending the request manually through REST/Postman to confirm the behavior.

     3. Inspect the raw request

    • Enable request logging or use a tool like Postman/Fiddler to capture the exact JSON being sent.
    • This will confirm whether max_tokens is being added automatically.

     4. Test step by step

    • First, send a simple chat completion request without RAG enabled to verify the parameter issue is fixed.
    • Once that succeeds, add your data_sources block back in to confirm the Retrieval-Augmented Generation configuration works as expected.

     5. Check RAG configuration if 400 errors persist

    • Verify your Azure AI Search index name, semantic configuration, and vector settings are correct.
    • Make sure your search service has a valid API key with the right permissions.
    • If you are using vector fields, confirm that your index has a vectorizer configured, otherwise an embedding endpoint may be needed.
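    Step 4's isolation approach can be sketched as two payloads that differ only by the data_sources block (every identifier below is a placeholder): send the base payload first, and only after it succeeds add the RAG block back.

```python
# Step 4a: a minimal payload with no RAG -- if this succeeds, the
# parameter naming issue is resolved.
base_body = {
    "messages": [{"role": "user", "content": "ping"}],
    "max_completion_tokens": 50,
}

# Step 4b: the same payload plus the data_sources block, so any new
# 400 can only come from the RAG configuration.
rag_body = dict(base_body)
rag_body["data_sources"] = [{
    "type": "azure_search",
    "parameters": {
        "endpoint": "https://YOUR-SEARCH.search.windows.net",
        "index_name": "your-index",
        "query_type": "semantic",
        "semantic_configuration": "semantic-config",
        "authentication": {"type": "api_key", "key": "YOUR-KEY"},
    },
}]
```

    Because the two bodies are otherwise identical, whichever request first fails pinpoints whether the problem is in the core parameters or in the RAG block.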


    In short: the error isn’t from your JSON, but from the SDK inserting max_tokens. Updating your client, switching to max_completion_tokens, and inspecting the raw request should solve it.

    Let me know if you need any further help with this. We'll be happy to assist.

    If you find this helpful, please mark this as answered.


  4. Ronald Wyman 76 Reputation points
    2025-09-30T11:36:44.9866667+00:00

    Hi Varsha,

    I'm not using any library, and the JSON I provided is what is placed in the body of the request.

    Thx

    Ron


  5. Ronald Wyman 76 Reputation points
    2025-09-30T12:32:11.0666667+00:00

    Hi Varsha,

    I provided this as part of the initial posting.

    Ron

