Azure AI Foundry Agent Hallucinating When AI Search Index Connected

Siddhant Kumta 60 Reputation points
2025-10-20T13:51:06.21+00:00

My current pipeline uses the Azure Document Intelligence Read model to process large PDF documents. The extracted data is then moved into an Azure AI Search index, which I want to connect directly to an Azure AI Foundry agent so it can be queried. The search index works perfectly and contains clean, well-structured data, but the Foundry agent consistently hallucinates document names and information instead of using the actual search results from the index.

Index Setup:

Fields:

  • chunk_id (String, key)
  • source_document (String, searchable, filterable)
  • blob_path (String, searchable, filterable)
  • document_id (String, filterable)
  • page_numbers (String, searchable, filterable)
  • start_page (Int32, filterable, sortable)
  • end_page (Int32, filterable, sortable)
  • chunk_text (String, searchable; main content field)
  • total_pages_in_document (Int32)
  • processed_date (DateTimeOffset)

Semantic Configuration: Enabled

  • Title field: source_document
  • Content fields: chunk_text, page_numbers

Through direct testing, I can see that the index functions correctly and finds the right documents and content; it is only when connected directly to the Foundry agent that things stop working.
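
For reference, my direct test looks roughly like this (a minimal sketch using the azure-search-documents SDK; the endpoint, key, index name, and semantic configuration name are placeholders):

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Query the index the same way the agent should: semantic search over
# the mapped fields.
search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="<your-index>",
    credential=AzureKeyCredential("<query-key>"),
)
results = search_client.search(
    search_text="dosage for Drug X",
    query_type="semantic",
    semantic_configuration_name="<your-semantic-config>",
    select=["source_document", "page_numbers", "chunk_text"],
    top=5,
)
for result in results:
    print(result["source_document"], result["page_numbers"])

This consistently returns the correct documents and chunks.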

Field Mappings in Azure AI Foundry:

  • Content field: chunk_text
  • Title field: source_document
  • URL field: blob_path

Would also appreciate guidance on:

  1. The correct settings/configuration to enforce strict grounding
  2. How to prevent the agent from using its training data when a search index is connected
  3. Best practices for pharmaceutical use cases requiring perfect accuracy
  4. Whether we should be using a different Azure service/API for this use case

If more context is needed, please let me know.


Answer accepted by question author
  Nikhil Jha (Accenture International Limited) 2,220 Reputation points Microsoft External Staff Moderator
    2025-10-22T08:12:23.5566667+00:00

    Hi Siddhant Kumta,

    I understand that you have a high-quality, well-structured Azure AI Search index that works perfectly in isolation. However, when connected as a tool to an AI Foundry agent, the agent ignores the retrieved search results (the "context") and hallucinates answers from its pre-trained (parametric) memory.

    The likely cause is that the agent's underlying large language model (LLM) has a choice with every query:

    1. Answer from its general-purpose, pre-trained knowledge.

    2. Call the Azure AI Search tool, receive the results, and formulate an answer based only on those results.

    Your agent is defaulting to option #1. This is usually caused by one of two things:

    • Competing Tools: As the volunteer noted, if another tool like Bing Grounding is also enabled, the agent may get confused about which knowledge source to use or may blend them.
    • Weak Instructions: The agent's system instructions (meta-prompt) are not strong or specific enough to force it to use the search tool and forbid it from using its general knowledge.

    Steps to enforce grounding and restrict the agent:

    1: Force Tool Usage and Set Strict Agent Instructions

    import os

    from azure.identity import DefaultAzureCredential
    from azure.ai.projects import AIProjectClient
    from azure.ai.projects.models import AzureAISearchTool
    
    # Define your project client and search connection
    project_client = AIProjectClient(
        endpoint=os.environ["PROJECT_ENDPOINT"],  # Your AI Foundry project endpoint
        credential=DefaultAzureCredential(),
    )
    search_connection_id = "your-search-connection"
    search_index_name = "your-pharmaceutical-index"
    
    # 1. Create powerfully strict system instructions
    pharmaceutical_instructions = """You are a pharmaceutical document retrieval assistant with ZERO TOLERANCE for hallucination.
    
    STRICT OPERATIONAL RULES:
    1.  SOURCE VERIFICATION: You MUST ONLY use information from the connected Azure AI Search results.
    2.  NO FALLBACK: If the search results do not contain the answer, you MUST respond with: 'I cannot find that information in the indexed documents.'
    3.  NEVER GUESS: You are FORBIDDEN from using your pre-trained general knowledge, making assumptions, or providing information not in the search results.
    4.  CITATIONS REQUIRED: Every factual statement you make MUST include a citation, formatted as: [source_document, Page X]
    5.  UNCERTAINTY: If search results are ambiguous, you must state that, not invent a consolidated answer.
    """
    
    # 2. Configure the Azure AI Search tool (binds the connection and index)
    ai_search = AzureAISearchTool(
        index_connection_id=search_connection_id,
        index_name=search_index_name,
    )
    
    # 3. Create the agent with enforced grounding and zero creativity
    agent = project_client.agents.create_agent(
        name="pharmaceutical-document-agent",
        model="gpt-4o",  # Or your preferred model deployment
        instructions=pharmaceutical_instructions,
        tools=ai_search.definitions,         # Tool definition from the helper
        tool_resources=ai_search.resources,  # Connection and index binding
    
        # CRITICAL PARAMETERS:
        temperature=0.0,  # Eliminates sampling creativity
        top_p=0.1,        # Further restricts token sampling
        # Note: tool_choice is set per run, not on the agent;
        # see the run example below.
    )
    

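    When you run the agent, force the search tool at the run level. A minimal sketch follows; note that exact method and parameter names (e.g. agent_id vs. assistant_id) vary slightly across azure-ai-projects versions:

    # Create a thread, add the user question, and process the run with
    # tool_choice="required" so the model must call a tool before answering.
    thread = project_client.agents.create_thread()
    project_client.agents.create_message(
        thread_id=thread.id,
        role="user",
        content="What is the dosage for Drug X?",
    )
    run = project_client.agents.create_and_process_run(
        thread_id=thread.id,
        agent_id=agent.id,
        tool_choice="required",  # FORCES a tool call on this turn
    )
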
    2: Optimize Your AI Search Index Schema for Grounding

    {
      "fields": [
        { "name": "chunk_id", "type": "Edm.String", "key": true },
        {
          "name": "source_document",
          "type": "Edm.String",
          "searchable": true,
          "filterable": true,
          "facetable": true,  // ADD: Enable document name faceting
          "sortable": true   // ADD: Enable sorting by document name
        },
        {
          "name": "chunk_text",
          "type": "Edm.String",
          "searchable": true,
          "analyzer": "en.microsoft" // Use a robust language analyzer
        },
        { "name": "page_numbers", "type": "Edm.String", "searchable": true, "filterable": true },
        { "name": "document_id", "type": "Edm.String", "filterable": true },
        { "name": "blob_path", "type": "Edm.String", "filterable": true },
        {
          "name": "document_metadata",  // ADD: For traceability/compliance
          "type": "Edm.ComplexType",
          "fields": [
            {"name": "approval_date", "type": "Edm.DateTimeOffset"},
            {"name": "document_version", "type": "Edm.String", "filterable": true}
          ]
        }
      ],
      "semantic": {
        "configurations": [
          {
            "name": "pharma-semantic-config",  // Use this name in your agent/API calls
            "prioritizedFields": {
              "titleField": {
                "fieldName": "source_document"
              },
              "prioritizedContentFields": [
                {"fieldName": "chunk_text"},
                {"fieldName": "page_numbers"}
              ],
              "prioritizedKeywordsFields": [
                {"fieldName": "document_id"},
                {"fieldName": "source_document"} // Boost exact document matches
              ]
            }
          }
        ]
      }
    }
    
    
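    If you manage the index from code rather than the portal, the same schema changes can be applied with the azure-search-documents SDK. This is a sketch (class names are from azure-search-documents 11.4+; the field list is abbreviated to the changed fields):

    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents.indexes import SearchIndexClient
    from azure.search.documents.indexes.models import (
        SearchIndex, SimpleField, SearchableField, SearchFieldDataType,
        SemanticConfiguration, SemanticPrioritizedFields, SemanticField,
        SemanticSearch,
    )
    
    index_client = SearchIndexClient(
        endpoint="https://<your-service>.search.windows.net",
        credential=AzureKeyCredential("<admin-key>"),
    )
    
    fields = [
        SimpleField(name="chunk_id", type=SearchFieldDataType.String, key=True),
        SearchableField(name="source_document", filterable=True,
                        facetable=True, sortable=True),
        SearchableField(name="chunk_text", analyzer_name="en.microsoft"),
        # ... remaining fields as in the JSON schema above ...
    ]
    
    semantic_search = SemanticSearch(configurations=[
        SemanticConfiguration(
            name="pharma-semantic-config",
            prioritized_fields=SemanticPrioritizedFields(
                title_field=SemanticField(field_name="source_document"),
                content_fields=[SemanticField(field_name="chunk_text"),
                                SemanticField(field_name="page_numbers")],
                keywords_fields=[SemanticField(field_name="document_id"),
                                 SemanticField(field_name="source_document")],
            ),
        )
    ])
    
    # Create or update the index in place with the new schema.
    index_client.create_or_update_index(SearchIndex(
        name="your-pharmaceutical-index",
        fields=fields,
        semantic_search=semantic_search,
    ))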

    3: Alternative Architecture

    For a pharmaceutical use case, you should not leave grounding to chance. The AI Foundry Agent service is designed for flexibility. A more direct and controllable approach is to use the Azure OpenAI "On Your Data" (Grounded Chat) API directly.

    This API is built specifically for this purpose and gives you a strictness parameter to force the model to only answer from the data.

    import os

    from openai import AzureOpenAI
    
    client = AzureOpenAI(
        api_key=os.getenv("AZURE_OPENAI_KEY"),
        api_version="2024-02-01",
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT")
    )
    
    # Use the same strict instructions from Step 1
    pharmaceutical_instructions = "You are a pharmaceutical document assistant..."
    
    response = client.chat.completions.create(
        model="gpt-4o",  # Your model deployment name
        messages=[
            {"role": "system", "content": pharmaceutical_instructions},
            {"role": "user", "content": "What is the dosage for Drug X?"}
        ],
        temperature=0.0,
        
        # This extra_body is the "On Your Data" configuration
        extra_body={
            "data_sources": [{
                "type": "azure_search",
                "parameters": {
                    "endpoint": "YOUR_SEARCH_ENDPOINT",
                    "index_name": "your-pharmaceutical-index",
                    "semantic_configuration": "pharma-semantic-config",  # From Step 2
                    # "semantic" matches the index above; "vector_semantic_hybrid"
                    # additionally requires vector fields plus an embedding_dependency.
                    "query_type": "semantic",
                    "authentication": {
                        "type": "api_key",
                        "key": "YOUR_SEARCH_QUERY_KEY"  # A query key is sufficient here
                    },
                    
                    # CRITICAL: Enforce strict grounding
                    "in_scope": True,  # Forbids the model from using pre-trained knowledge
                    "strictness": 5    # Max strictness (1-5 scale); the model refuses
                                       # to answer if the search results lack the answer
                }
            }]
        }
    )
    
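    The "On Your Data" response also carries the citations that grounded the answer, which you can surface for auditability. A sketch (with the openai Python SDK the extra context field is exposed through pydantic's model_extra; citation keys like "title" and "filepath" mirror the mapped index fields):

    # Each assistant message includes a "context" object listing the retrieved
    # citations; print them alongside the answer for traceability.
    message = response.choices[0].message
    context = (message.model_extra or {}).get("context", {})
    for i, citation in enumerate(context.get("citations", []), start=1):
        print(f"[doc{i}] {citation.get('title')} ({citation.get('filepath')})")
    print(message.content)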

    Answering your other question "Should you use a different service?":

    • No, this is the correct set of services. I believe the problem is not with the services themselves but with the configuration and prompt engineering of the agent, which the steps above address.

    Reference documentation:

    1. Azure AI Search tool in AI Foundry Agents
    2. Azure AI Foundry Agent Service
    3. Groundedness detection (preview)
    4. Semantic search in Azure AI Search

    Please accept the answer and upvote for visibility to other community members.

    1 person found this answer helpful.

Answer accepted by question author
  Amira Bedhiafi 39,106 Reputation points Volunteer Moderator
    2025-10-20T19:10:30.8866667+00:00

    Hello Siddhant!

    Thank you for posting on Microsoft Learn Q&A.

    The problem in your case is how the agent decides whether to use or ignore your index.

    Give the agent only one way to retrieve knowledge: remove or disable every other knowledge source for this agent and keep only the Azure AI Search tool attached. If Bing grounding is connected, the agent will sometimes prefer it (a sketch of trimming the tool list follows the link below).

    https://free.blessedness.top/en-us/azure/ai-foundry/agents/how-to/tools/bing-grounding

    When you create the agent via the SDK, include only the Azure AI Search tool in tools, and when you run it, set tool_choice="required".

    https://free.blessedness.top/en-us/azure/ai-foundry/agents/how-to/tools/azure-ai-search-samples

    Then set the agent or model temperature to 0–0.2; this reduces creative paraphrasing and off-index guesses.

    Finally, instruct the agent to deflect when the index returns no data. This simple "deflect when no data" pattern makes a big difference (and aligns with Microsoft guidance to control tool invocation via instructions); an example follows the link below.

    https://free.blessedness.top/en-us/azure/ai-foundry/agents/how-to/tools/fabric

    1 person found this answer helpful.
