RetrievalEvaluator Class

Evaluates retrieval score for a given query and context or a multi-turn conversation, including reasoning.

The retrieval measure assesses the AI system's performance in retrieving information for additional context (e.g. a RAG scenario).

Retrieval scores range from 1 to 5, with 1 being the worst and 5 being the best.

High retrieval scores indicate that the AI system has successfully extracted and ranked the most relevant information at the top, without introducing bias from external knowledge and ignoring factual correctness. Conversely, low retrieval scores suggest that the AI system has failed to surface the most relevant context chunks at the top of the list and/or introduced bias and ignored factual correctness.

Note

To align with our support of a diverse set of models, an output key without the gpt_ prefix has been added.

To maintain backwards compatibility, the old key with the gpt_ prefix is still be present in the output;

however, it is recommended to use the new key moving forward as the old key will be deprecated in the future.

Constructor

RetrievalEvaluator(model_config, *, threshold: float = 3, credential=None)

Parameters

Name	Description
model_config Required	Union[AzureOpenAIModelConfiguration, OpenAIModelConfiguration] Configuration for the Azure OpenAI model.
threshold Required	float The threshold for the evaluation. Default is 3.

Keyword-Only Parameters

Name	Description
threshold	Default value: 3
credential	Default value: None

Examples

Initialize with threshold and call a RetrievalEvaluator.


   import os
   from azure.ai.evaluation import RetrievalEvaluator

   model_config = {
       "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
       "api_key": os.environ.get("AZURE_OPENAI_KEY"),
       "azure_deployment": os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
   }

   retrieval_eval = RetrievalEvaluator(model_config=model_config, threshold=2)
   conversation = {
       "messages": [
           {
               "content": "What is the capital of France?`''\"</>{}{{]",
               "role": "user",
               "context": "Customer wants to know the capital of France",
           },
           {"content": "Paris", "role": "assistant", "context": "Paris is the capital of France"},
           {
               "content": "What is the capital of Hawaii?",
               "role": "user",
               "context": "Customer wants to know the capital of Hawaii",
           },
           {"content": "Honolulu", "role": "assistant", "context": "Honolulu is the capital of Hawaii"},
       ],
       "context": "Global context",
   }
   retrieval_eval(conversation=conversation)

Attributes

id

Evaluator identifier, experimental and to be used only with evaluation in cloud.

id = 'azureai://built-in/evaluators/retrieval'

Feedback

Was this page helpful?