RetrievalEvaluator Class
Evaluates retrieval score for a given query and context or a multi-turn conversation, including reasoning.
The retrieval measure assesses the AI system's performance in retrieving information for additional context (e.g. a RAG scenario).
Retrieval scores range from 1 to 5, with 1 being the worst and 5 being the best.
High retrieval scores indicate that the AI system has successfully extracted and ranked the most relevant information at the top, without introducing bias from external knowledge and ignoring factual correctness. Conversely, low retrieval scores suggest that the AI system has failed to surface the most relevant context chunks at the top of the list and/or introduced bias and ignored factual correctness.
Note
To align with our support of a diverse set of models, an output key without the gpt_ prefix has been added.
To maintain backwards compatibility, the old key with the gpt_ prefix is still be present in the output;
however, it is recommended to use the new key moving forward as the old key will be deprecated in the future.
Constructor
RetrievalEvaluator(model_config, *, threshold: float = 3, credential=None)
Parameters
| Name | Description |
|---|---|
|
model_config
Required
|
Configuration for the Azure OpenAI model. |
|
threshold
Required
|
The threshold for the evaluation. Default is 3. |
Keyword-Only Parameters
| Name | Description |
|---|---|
|
threshold
|
Default value: 3
|
|
credential
|
Default value: None
|
Examples
Initialize with threshold and call a RetrievalEvaluator.
import os
from azure.ai.evaluation import RetrievalEvaluator
model_config = {
"azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
"api_key": os.environ.get("AZURE_OPENAI_KEY"),
"azure_deployment": os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
}
retrieval_eval = RetrievalEvaluator(model_config=model_config, threshold=2)
conversation = {
"messages": [
{
"content": "What is the capital of France?`''\"</>{}{{]",
"role": "user",
"context": "Customer wants to know the capital of France",
},
{"content": "Paris", "role": "assistant", "context": "Paris is the capital of France"},
{
"content": "What is the capital of Hawaii?",
"role": "user",
"context": "Customer wants to know the capital of Hawaii",
},
{"content": "Honolulu", "role": "assistant", "context": "Honolulu is the capital of Hawaii"},
],
"context": "Global context",
}
retrieval_eval(conversation=conversation)
Attributes
id
Evaluator identifier, experimental and to be used only with evaluation in cloud.
id = 'azureai://built-in/evaluators/retrieval'