Hi Siddhant Kumta,
I understand that you have a high-quality, well-structured Azure AI Search index that works correctly in isolation. However, when it is connected as a tool to an AI Foundry agent, the agent ignores the retrieved search results (the "context") and hallucinates answers from its pre-trained (parametric) memory.
The likely cause is that the agent's underlying Large Language Model (LLM) has a choice with every query:
1. Answer from its general-purpose, pre-trained knowledge.
2. Call the Azure AI Search tool, receive the results, and formulate an answer based only on those results.
Your agent is defaulting to option #1. This is usually caused by one of two things:
- Competing Tools: As the volunteer noted, if another tool like Bing Grounding is also enabled, the agent may get confused about which knowledge source to use or may blend them.
- Weak Instructions: The agent's system instructions (meta-prompt) are not strong or specific enough to force it to use the search tool and forbid it from using its general knowledge.
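To rule out the first cause quickly, you can list the tool types attached to the agent and flag any knowledge source other than Azure AI Search. The sketch below is only an illustration on plain strings: the helper name find_competing_tools and the set of types treated as knowledge sources are my assumptions, not part of any Azure SDK.

```python
# Hypothetical helper for illustration; not part of any Azure SDK.
# Tool type strings treated as knowledge sources (an assumption; adjust to your setup).
KNOWLEDGE_TOOL_TYPES = {"azure_ai_search", "bing_grounding", "file_search"}


def find_competing_tools(tool_types, primary="azure_ai_search"):
    """Return knowledge-source tool types that compete with the primary one."""
    return sorted(
        t for t in set(tool_types)
        if t in KNOWLEDGE_TOOL_TYPES and t != primary
    )


# Example: an agent with Bing Grounding enabled next to the search tool.
competing = find_competing_tools(
    ["azure_ai_search", "bing_grounding", "code_interpreter"]
)
print(competing)
```

If the returned list is not empty, try disabling those tools first and re-test before changing anything else.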
Steps to enforce/restrict the agent:
1: Force Tool Usage and Set Strict Agent Instructions
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import AzureAISearchTool

# Define your project client and search connection
project_client = AIProjectClient(...)
search_connection_id = "your-search-connection"
search_index_name = "your-pharmaceutical-index"

# 1. Create strict system instructions
pharmaceutical_instructions = """You are a pharmaceutical document retrieval assistant with ZERO TOLERANCE for hallucination.
STRICT OPERATIONAL RULES:
1. SOURCE VERIFICATION: You MUST ONLY use information from the connected Azure AI Search results.
2. NO FALLBACK: If the search results do not contain the answer, you MUST respond with: 'I cannot find that information in the indexed documents.'
3. NEVER GUESS: You are FORBIDDEN from using your pre-trained general knowledge, making assumptions, or providing information not in the search results.
4. CITATIONS REQUIRED: Every factual statement you make MUST include a citation, formatted as: [source_document, Page X]
5. UNCERTAINTY: If search results are ambiguous, you must state that, not invent a consolidated answer.
"""

# 2. Bind the search tool to the connection and index.
#    Parameter names follow the azure-ai-projects preview SDK and may
#    differ in your SDK version.
ai_search = AzureAISearchTool(
    index_connection_id=search_connection_id,
    index_name=search_index_name,
)

# 3. Create the agent with low-randomness sampling
agent = project_client.agents.create_agent(
    name="pharmaceutical-document-agent",
    model="gpt-4o",  # Or your preferred model deployment
    instructions=pharmaceutical_instructions,
    tools=ai_search.definitions,
    tool_resources=ai_search.resources,
    temperature=0.0,  # Minimizes sampling randomness (does not by itself prevent hallucination)
    top_p=0.1,        # Further restricts token sampling
)

# 4. Forcing tool use: tool_choice is a run-level setting, not a create_agent
#    parameter. Set it when you create the run on a thread, and check your SDK
#    version for the accepted values (for example, a named tool choice that
#    targets the azure_ai_search tool).
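Because instructions alone cannot guarantee compliance, it can help to check each reply in your application before showing it to users. The sketch below classifies a reply against rules 2 and 4 of the instructions above (the exact refusal sentence and the [source_document, Page X] citation format). The function name classify_reply and the sample reply text are made up for illustration, and the check is deliberately simple: it only detects whether a citation is present, not whether the cited page really supports the claim.

```python
import re

# Exact refusal sentence required by rule 2 of the instructions.
REFUSAL = "I cannot find that information in the indexed documents."

# Matches citations such as [storage_sop.pdf, Page 4] (rule 4 format).
CITATION_RE = re.compile(r"\[[^\[\],]+,\s*Page\s+\d+\]")


def classify_reply(reply: str) -> str:
    """Return 'refusal', 'cited', or 'uncited' for an agent reply."""
    text = reply.strip()
    if text == REFUSAL:
        return "refusal"
    if CITATION_RE.search(text):
        return "cited"
    return "uncited"


# Invented sample replies, for illustration only.
print(classify_reply("Drug X is stored at 2-8 C [storage_sop.pdf, Page 4]."))
print(classify_reply("Drug X is stored at room temperature."))
```

A reply classified as "uncited" is a good candidate to block or to log for review, since it is neither the agreed refusal nor backed by a source.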
2: Optimize Your AI Search Index Schema for Grounding
The // comments below are annotations only. Remove them before submitting the definition, since standard JSON does not allow comments.
{
  "fields": [
    { "name": "chunk_id", "type": "Edm.String", "key": true },
    {
      "name": "source_document",
      "type": "Edm.String",
      "searchable": true,
      "filterable": true,
      "facetable": true, // ADD: Enable document name faceting
      "sortable": true // ADD: Enable sorting by document name
    },
    {
      "name": "chunk_text",
      "type": "Edm.String",
      "searchable": true,
      "analyzer": "en.microsoft" // Use a robust language analyzer
    },
    { "name": "page_numbers", "type": "Edm.String", "searchable": true, "filterable": true },
    { "name": "document_id", "type": "Edm.String", "filterable": true },
    { "name": "blob_path", "type": "Edm.String", "filterable": true },
    {
      "name": "document_metadata", // ADD: For traceability/compliance
      "type": "Edm.ComplexType",
      "fields": [
        { "name": "approval_date", "type": "Edm.DateTimeOffset" },
        { "name": "document_version", "type": "Edm.String", "filterable": true }
      ]
    }
  ],
  "semantic": {
    "configurations": [
      {
        "name": "pharma-semantic-config", // Use this name in your agent/API calls
        "prioritizedFields": {
          "titleField": {
            "fieldName": "source_document"
          },
          "prioritizedContentFields": [
            { "fieldName": "chunk_text" },
            { "fieldName": "page_numbers" }
          ],
          "prioritizedKeywordsFields": [
            { "fieldName": "document_id" },
            { "fieldName": "source_document" } // Boost exact document matches
          ]
        }
      }
    ]
  }
}
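A common mistake when editing a schema like this is to reference a field in the semantic configuration that does not exist in the fields list. The sketch below checks for that on an index definition loaded as a Python dict. The function name missing_semantic_fields is hypothetical, the sample definition is a trimmed-down stand-in for the schema above, and the check only covers top-level field names.

```python
def missing_semantic_fields(index_def: dict) -> list:
    """Return field names used by semantic configurations but not defined in 'fields'."""
    defined = {field["name"] for field in index_def.get("fields", [])}
    missing = []
    for config in index_def.get("semantic", {}).get("configurations", []):
        prioritized = config.get("prioritizedFields", {})
        referenced = []
        title = prioritized.get("titleField")
        if title:
            referenced.append(title["fieldName"])
        for key in ("prioritizedContentFields", "prioritizedKeywordsFields"):
            referenced.extend(item["fieldName"] for item in prioritized.get(key, []))
        for name in referenced:
            if name not in defined and name not in missing:
                missing.append(name)
    return missing


# Trimmed-down sample: page_numbers is referenced but not defined.
sample_index = {
    "fields": [
        {"name": "chunk_id", "type": "Edm.String", "key": True},
        {"name": "source_document", "type": "Edm.String"},
        {"name": "chunk_text", "type": "Edm.String"},
    ],
    "semantic": {
        "configurations": [{
            "name": "pharma-semantic-config",
            "prioritizedFields": {
                "titleField": {"fieldName": "source_document"},
                "prioritizedContentFields": [
                    {"fieldName": "chunk_text"},
                    {"fieldName": "page_numbers"},
                ],
            },
        }]
    },
}
print(missing_semantic_fields(sample_index))
```

Running the same check on your real definition before you update the index catches naming drift between the fields list and the semantic configuration.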
3: Alternative Architecture
For a pharmaceutical use case, you should not leave grounding to chance. The AI Foundry Agent Service is designed for flexibility, so the model decides when to call a tool. A more direct and controllable approach is to call the Azure OpenAI "On Your Data" (Grounded Chat) API directly.
This API is built specifically for this purpose: retrieval runs on every request, the in_scope parameter limits answers to the indexed data, and the strictness parameter controls how aggressively weakly relevant search results are filtered out.
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2024-02-01",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)

# Use the same strict instructions from Step 1
pharmaceutical_instructions = "You are a pharmaceutical document assistant..."

response = client.chat.completions.create(
    model="gpt-4o",  # Your model deployment name
    messages=[
        {"role": "system", "content": pharmaceutical_instructions},
        {"role": "user", "content": "What is the dosage for Drug X?"},
    ],
    temperature=0.0,
    # This extra_body is the "On Your Data" configuration
    extra_body={
        "data_sources": [{
            "type": "azure_search",
            "parameters": {
                "endpoint": "YOUR_SEARCH_ENDPOINT",
                "index_name": "your-pharmaceutical-index",
                "semantic_configuration": "pharma-semantic-config",  # From Step 2
                # The Step 2 index has no vector field, so semantic ranking is used here.
                # "vector_semantic_hybrid" additionally needs a vector field in the
                # index and an "embedding_dependency" entry in these parameters.
                "query_type": "semantic",
                "authentication": {
                    "type": "api_key",
                    "key": "YOUR_SEARCH_ADMIN_KEY",
                },
                # CRITICAL: Enforce strict grounding
                "in_scope": True,  # Limits answers to the retrieved documents
                "strictness": 5,   # 1-5 scale; higher values drop weakly relevant
                                   # chunks, so more questions go unanswered
            },
        }]
    },
)
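After the grounded call returns, it is worth confirming which indexed chunks the answer actually relied on. As far as I know, On Your Data answers mark their sources inline as [doc1], [doc2], and so on, and return the matching entries in a citations list under the message's context field; please treat that response shape as an assumption and confirm it against your API version. The helper cited_sources below is hypothetical and works on plain Python data, so it can be tried without a live endpoint.

```python
import re


def cited_sources(answer: str, citations: list) -> list:
    """Resolve inline [docN] markers to citation titles, in order of first use."""
    titles = []
    for match in re.finditer(r"\[doc(\d+)\]", answer):
        index = int(match.group(1)) - 1  # markers are 1-based
        if 0 <= index < len(citations):
            entry = citations[index]
            title = entry.get("title") or entry.get("filepath") or "unknown"
            if title not in titles:
                titles.append(title)
    return titles


# Invented sample data standing in for a real response.
sample_citations = [{"title": "label_v3.pdf"}, {"title": "storage_sop.pdf"}]
sample_answer = "See the storage section [doc2]. Dosing is covered separately [doc1][doc2]."
print(cited_sources(sample_answer, sample_citations))
```

An empty result for a non-refusal answer suggests the model answered without using any retrieved chunk, which is exactly the behavior you want to catch.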
Answering your other question "Should you use a different service?":
- No. This is the correct set of services. I believe your problem is not with the services themselves but with the configuration and prompt engineering of the agent, which is solved with the steps above.
Reference documentation:
- Azure AI Search tool in AI Foundry Agents
- AI Foundry Agent Service
- Groundedness detection (preview)
- Semantic search in Azure AI Search
Please accept the answer and upvote for visibility to other community members.