Edit

Share via


Design an index for agentic retrieval in Azure AI Search

Note

This feature is currently in public preview. This preview is provided without a service-level agreement and isn't recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

In Azure AI Search, agentic retrieval is a new parallel query architecture that uses a chat completion model for query planning, generating subqueries that broaden the scope of what's searchable and relevant.

Subqueries are created internally. Certain aspects of the subqueries are determined by your search index. This article explains which index elements have an effect on the query logic. None of the required elements are new or specific to agentic retrieval, which means you can use an existing index if it meets the criteria identified in this article, even if it was created using earlier API versions.

A search index that's used in agentic retrieval is specified as knowledge source on a knowledge agent, and is either:

  • An existing indexing containing searchable content. This index is made available to agentic retrieval through a search index knowledge source definition.

  • A generated index created from a blob indexer pipeline. This index is generated and populated using information from a blob knowledge source. It's based on a template that meets all of the criteria for knowledge agents and agentic retrieval.

Criteria for agentic retrieval

An index that's used in agentic retrieval must have these elements:

  • String fields attributed as searchable and retrievable
  • A semantic configuration, with a defaultSemanticConfiguration
  • Vector fields and a vectorizer if you want to include vector queries in the pipeline

It should also have fields that can be used for citations, such as document or file name, page or chapter name, or at least a chunk ID.

Optionally, the following index elements increase your opportunities for optimization:

  • scoringProfile with a defaultScoringProfile, for boosting relevance
  • synonymMaps for terminology or jargon
  • analyzers for linguistics rules or patterns (like whitespace preservation, or special characters)

Example index definition

Here's an example index that works for agentic retrieval. It meets the criteria for required elements.

{
  "name": "earth_at_night",
  "description": "Contains images an descriptions of our planet in darkness as captured from space by Earth-observing satellites and astronauts on the International Space Station over the past 25 years.",
  "fields": [
    {
      "name": "id", "type": "Edm.String",
      "searchable": true, "retrievable": true, "filterable": true, "sortable": true, "facetable": true,
      "key": true,
      "stored": true,
      "synonymMaps": []
    },
    {
      "name": "page_chunk", "type": "Edm.String",
      "searchable": true, "retrievable": true, "filterable": false, "sortable": false, "facetable": false,
      "analyzer": "en.microsoft",
      "stored": true,
      "synonymMaps": []
    },
    {
      "name": "page_chunk_vector_text_3_large", "type": "Collection(Edm.Single)",
      "searchable": true, "retrievable": false, "filterable": false, "sortable": false, "facetable": false,
      "dimensions": 3072,
      "vectorSearchProfile": "hnsw_text_3_large",
      "stored": false,
      "synonymMaps": []
    },
    {
      "name": "page_number", "type": "Edm.Int32",
      "searchable": false, "retrievable": true, "filterable": true, "sortable": true, "facetable": true,
      "stored": true,
      "synonymMaps": []
    },
    {
      "name": "chapter_number", "type": "Edm.Int32",
      "searchable": false, "retrievable": true, "filterable": true, "sortable": true, "facetable": true,
      "stored": true,
      "synonymMaps": []
    }
  ],
  "scoringProfiles": [],
  "suggesters": [],
  "analyzers": [],
  "normalizers": [],
  "tokenizers": [],
  "tokenFilters": [],
  "charFilters": [],
  "semantic": {
    "defaultConfiguration": "semantic_config",
    "configurations": [
      {
        "name": "semantic_config",
        "flightingOptIn": false,
        "prioritizedFields": {
          "prioritizedContentFields": [
            {
              "fieldName": "page_chunk"
            }
          ],
          "prioritizedKeywordsFields": []
        }
      }
    ]
  },
  "vectorSearch": {
    "algorithms": [
      {
        "name": "alg",
        "kind": "hnsw",
        "hnswParameters": {
          "metric": "cosine",
          "m": 4,
          "efConstruction": 400,
          "efSearch": 500
        }
      }
    ],
    "profiles": [
      {
        "name": "hnsw_text_3_large",
        "algorithm": "alg",
        "vectorizer": "azure_openai_text_3_large"
      }
    ],
    "vectorizers": [
      {
        "name": "azure_openai_text_3_large",
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
          "resourceUri": "https://YOUR-AOAI-RESOURCE.openai.azure.com",
          "deploymentId": "text-embedding-3-large",
          "apiKey": "<redacted>",
          "modelName": "text-embedding-3-large"
        }
      }
    ],
    "compressions": []
  }
}

Key points:

In agentic retrieval, a large language model (LLM) is used twice. First, it's used to create a query plan. After the query plan is executed and search results are generated, those results are passed to the LLM again, this time as grounding data that's used to formulate an answer.

LLMs consume and emit tokenized strings of human readable plain text content. For this reason, you must have searchable fields that provide plain text strings, and are retrievable in the response. Vector fields and vector search are also important because they add similarity search to information retrieval. Vectors enhance and improve the quality of search that produces grounding data, but aren't otherwise strictly required. Azure AI Search has built-in capabilities that simplify and automate vectorization.

The previous example index includes a vector field that's used at query time. You don't need the vector in results because it isn't human or LLM readable, but notice that its searchable for vector search. Since you don't need vectors in the response, both retrievable and stored are false.

The vectorizer defined in the vector search configuration is critical. It determines whether your vector field is used during query execution. The vectorizer encodes string subqueries into vectors at query time for similarity search over the vectors. The vectorizer must be the same embedding model used to create the vectors in the index.

All searchable fields are included in query execution. There's no support for a select statement that explicitly states which fields to query.

Add a description

An index description field is a user-defined string that you can use to provide guidance to LLMs and Model Context Protocol (MCP) servers when deciding to use a specific index for a query. This human-readable text is invaluable when a system must access several indexes and make a decision based on the description.

An index description is a schema update, and you can add it without having to rebuild the entire index.

  • String length is 4,000 characters maximum.
  • Content must be human-readable, in Unicode. Your use case should determine which language to use (for example, English or another language).

Add a semantic configuration

The index must have at least one semantic configuration. The semantic configuration must have:

  • A defaultSemanticConfiguration set to a named configuration.
  • A prioritizedContentFields set to at least one string field that is both searchable and retrievable.

Within the configuration, prioritizedContentFields is required. Title and keywords are optional. For chunked content, you might not have either. However, if you add entity recognition or key phrase extraction, you might have some keywords associated with each chunk that can be useful in search scenarios, perhaps in a scoring profile.

Here's an example of a semantic configuration that works for agentic retrieval:

"semantic":{
   "defaultConfiguration":"semantic_config",
   "configurations":[
      {
         "name":"semantic_config",
         "flightingOptIn":false,
         "prioritizedFields":{
            "prioritizedFields":{
               "titleField":{
                  "fieldName":""
               },
               "prioritizedContentFields":[
                  {
                     "fieldName":"page_chunk"
                  }
               ],
               "prioritizedKeywordsFields":[
                  {
                     "fieldName":"Category"
                  },
                  {
                     "fieldName":"Tags"
                  },
                  {
                     "fieldName":"Location"
                  }
               ]
            }
         }
      }
   ]
}

Note

The response provides title, terms, and content, which map to the prioritized fields in this configuration.

Add a vectorizer

If you have vector fields in the index, the query plan includes them if they're searchable and have a vectorizer assignment.

A vectorizer specifies an embedding model that provides text-to-vector conversions at query time. It must point to the same embedding model used to encode the vector content in your index. You can use any embedding model supported by Azure AI Search. Vectorizers are specified on vector fields by way of a vector profile.

Recall the vector field definition in the index example. Attributes on a vector field include dimensions or the number of embeddings generated by the model, and the profile.

  {
    "name": "page_chunk_text_3_large", "type": "Collection(Edm.Single)",
    "searchable": true, "retrievable": false, "filterable": false, "sortable": false, "facetable": false,
    "dimensions": 3072,
    "vectorSearchProfile": "hnsw_text_3_large",
    "stored": false,
    "synonymMaps": []
  }

Vector profiles are configurations of vectorizers, algorithms, and compression techniques. Each vector field can only use one profile, but your index can have many in case you want unique profiles for every vector field.

Querying vectors and calling a vectorizer adds latency to the overall request, but if you want similarity search it might be worth the trade-off.

Here's an example of a vectorizer that works for agentic retrieval, as it appears in a vectorSearch configuration. There's nothing in the vectorizer definition that needs to be changed to work with agentic retrieval.

"vectorSearch": {
  "algorithms": [
    {
      "name": "alg",
      "kind": "hnsw",
      "hnswParameters": {
        "metric": "cosine",
        "m": 4,
        "efConstruction": 400,
        "efSearch": 500
      }
    }
  ],
  "profiles": [
    {
      "name": "hnsw_text_3_large",
      "algorithm": "alg",
      "vectorizer": "azure_openai_text_3_large"
    }
  ],
  "vectorizers": [
    {
      "name": "azure_openai_text_3_large",
      "kind": "azureOpenAI",
      "azureOpenAIParameters": {
        "resourceUri": "https://YOUR-AOAI-RESOURCE.openai.azure.com",
        "deploymentId": "text-embedding-3-large",
        "apiKey": "<redacted>",
        "modelName": "text-embedding-3-large"
      }
    }
  ],
  "compressions": []
}  

Add a scoring profile

Scoring profiles are criteria for relevance boosting. They're applied to non-vector fields (text and numbers) and are evaluated during query execution, although the precise behavior depends on the API version used to create the index.

If you create the index using 2025-05-01-preview or later, the scoring profile executes last. If the index is created using an earlier API version, scoring profiles are evaluated before semantic reranking.

You can use any scoring profile that makes sense for your index. Here's an example of one that boosts the search score of a match if the match is found in a specific field. Fields are weighted by boosting multipliers. For example if a match was found in the "Category" field, the boosted score is multiplied by 5.

"scoringProfiles": [  
    {  
      "name": "boostSearchTerms",  
      "text": {  
        "weights": {  
          "Location": 2,  
          "Category": 5 
        }  
      }  
    }
]

Add analyzers

Analyzers apply to text fields and can be language analyzers or custom analyzers that control tokenization in the index, such as preserving special characters or whitespace.

Analyzers are defined within a search index and assigned to fields. The fields collection example includes an analyzer reference on the text chunks. In this example, the default analyzer (standard Lucene) is replaced with a Microsoft language analyzer.

{
  "name": "page_chunk", "type": "Edm.String",
  "searchable": true, "retrievable": true, "filterable": false, "sortable": false, "facetable": false,
  "analyzer": "en.microsoft",
  "stored": true,
  "synonymMaps": []
}

Add a synonym map

Synonym maps expand queries by adding synonyms for named terms. For example, you might have scientific or medical terms for common terms.

Synonym maps are defined as a top-level resource on a search index and assigned to fields. The fields collection example doesn't include a synonym map, but if you had variant spellings of country names in synonym map, here's what an example might look like if it was assigned to a hypothetical "locations" field.

{
    "name":"locations",
    "type":"Edm.String",
    "searchable":true,
    "synonymMaps":[ "country-synonyms" ]
}