QAEvaluator Class
Initialize a question-answer evaluator configured for a specific Azure OpenAI model.
Note
To align with our support for a diverse set of models, output keys without the gpt_ prefix have been added.
To maintain backwards compatibility, the old keys with the gpt_ prefix are still present in the output.
However, we recommend using the new keys going forward, because the old keys will be deprecated in the future.
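The key migration described in the note can be handled with a small lookup helper. The following is a minimal sketch, not part of the library: `get_metric` and `sample_result` are hypothetical names, and the sample dictionary only illustrates the shape of an evaluator result rather than real output.

```python
from typing import Any, Mapping, Optional

# Prefix carried by the older output keys, per the note above.
LEGACY_PREFIX = "gpt_"


def get_metric(result: Mapping[str, Any], name: str) -> Optional[Any]:
    """Return result[name], falling back to the legacy gpt_-prefixed key."""
    if name in result:
        return result[name]
    # Older outputs may only carry the prefixed key.
    return result.get(LEGACY_PREFIX + name)


# Illustrative values only; real values come from calling the evaluator.
sample_result = {"coherence": 4.0, "gpt_coherence": 4.0, "gpt_fluency": 5.0}

print(get_metric(sample_result, "coherence"))  # read from the new key
print(get_metric(sample_result, "fluency"))    # read from the legacy key
```

Reading the new key first means this code keeps working once the prefixed keys are removed.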
Constructor
QAEvaluator(model_config, *, groundedness_threshold: int = 3, relevance_threshold: int = 3, coherence_threshold: int = 3, fluency_threshold: int = 3, similarity_threshold: int = 3, f1_score_threshold: float = 0.5, **kwargs)
Parameters
| Name | Description |
|---|---|
| model_config (required) | Configuration for the Azure OpenAI model. |
| groundedness_threshold | The threshold for groundedness evaluation. Default is 3. |
| relevance_threshold | The threshold for relevance evaluation. Default is 3. |
| coherence_threshold | The threshold for coherence evaluation. Default is 3. |
| fluency_threshold | The threshold for fluency evaluation. Default is 3. |
| similarity_threshold | The threshold for similarity evaluation. Default is 3. |
| f1_score_threshold | The threshold for F1 score evaluation. Default is 0.5. |
| kwargs | Additional arguments to pass to the evaluator. |
Keyword-Only Parameters
| Name | Description |
|---|---|
| groundedness_threshold | Default value: 3 |
| relevance_threshold | Default value: 3 |
| coherence_threshold | Default value: 3 |
| fluency_threshold | Default value: 3 |
| similarity_threshold | Default value: 3 |
| f1_score_threshold | Default value: 0.5 |
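To illustrate how per-metric thresholds could be applied to scores, here is a minimal sketch. It assumes a score passes when it meets or exceeds its threshold; this page does not state the comparison rule, so treat that as an assumption. `DEFAULT_THRESHOLDS` and `label_scores` are hypothetical names, not library APIs.

```python
from typing import Dict, Mapping

# Default thresholds copied from the constructor signature on this page.
DEFAULT_THRESHOLDS: Dict[str, float] = {
    "groundedness": 3,
    "relevance": 3,
    "coherence": 3,
    "fluency": 3,
    "similarity": 3,
    "f1_score": 0.5,
}


def label_scores(
    scores: Mapping[str, float],
    thresholds: Mapping[str, float] = DEFAULT_THRESHOLDS,
) -> Dict[str, str]:
    """Label each known metric 'pass' or 'fail'.

    Assumption: a score passes when score >= threshold.
    Metrics without a configured threshold are skipped.
    """
    return {
        name: "pass" if score >= thresholds[name] else "fail"
        for name, score in scores.items()
        if name in thresholds
    }


# Illustrative scores only, not real evaluator output.
print(label_scores({"coherence": 4, "fluency": 2, "f1_score": 0.5}))
```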
Examples
Initialize a QAEvaluator with custom thresholds and call it.
import os

from azure.ai.evaluation import QAEvaluator

model_config = {
    "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
    "api_key": os.environ.get("AZURE_OPENAI_KEY"),
    "azure_deployment": os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
}

qa_eval = QAEvaluator(
    model_config=model_config,
    groundedness_threshold=2,
    relevance_threshold=2,
    coherence_threshold=2,
    fluency_threshold=2,
    similarity_threshold=2,
    f1_score_threshold=0.5,
)
qa_eval(query="What's the color?", response="Black", ground_truth="gray", context="gray")
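Because `os.environ.get` returns `None` for an unset variable, a missing setting in the example above would only surface later, when the evaluator is constructed or called. The sketch below validates the environment first. `REQUIRED_ENV` and `load_model_config` are hypothetical helpers, not part of `azure.ai.evaluation`; the variable names are the ones used in the example.

```python
import os
from typing import Dict, List, Mapping

# Maps each model_config key to the environment variable used in the example.
REQUIRED_ENV: Dict[str, str] = {
    "azure_endpoint": "AZURE_OPENAI_ENDPOINT",
    "api_key": "AZURE_OPENAI_KEY",
    "azure_deployment": "AZURE_OPENAI_DEPLOYMENT",
}


def load_model_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build a model_config dict, failing fast if any variable is unset or empty."""
    missing: List[str] = [var for var in REQUIRED_ENV.values() if not env.get(var)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in REQUIRED_ENV.items()}
```

The returned dictionary can be passed as `model_config` in place of the hand-built one.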
Attributes
id
Evaluator identifier. This attribute is experimental and is intended for use only with evaluation in the cloud.
id = 'azureai://built-in/evaluators/qa'