AzureOpenAIPythonGrader Class
Note
This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Wrapper class for OpenAI's Python code graders.
Enables custom Python-based evaluation logic with flexible scoring and pass/fail thresholds. The grader executes user-provided Python code to evaluate outputs against custom criteria.
Supplying a PythonGrader to the evaluate method will cause an asynchronous request to evaluate the grader via the OpenAI API. The results of the evaluation will then be merged into the standard evaluation results.
name (str): The name of the grader.
image_tag (str): The image tag for the Python execution environment.
pass_threshold (float): Score threshold for pass/fail classification. Scores >= threshold are considered passing.
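The threshold rule above can be illustrated locally. This is a minimal sketch, assuming a plain `>=` comparison as stated; `classify` is a hypothetical helper and not part of the SDK, and the pass rate computed here is only an illustration of how per-row results might be aggregated.

```python
def classify(score: float, pass_threshold: float) -> bool:
    """Return True when the score meets or exceeds the threshold."""
    return score >= pass_threshold


# Illustrative scores, as a grade function might return them.
scores = [1.0, 0.8, 0.5, 0.0]
results = [classify(s, pass_threshold=0.8) for s in scores]

# Fraction of rows that passed (hypothetical aggregation, for illustration only).
pass_rate = sum(results) / len(results)
print(results, pass_rate)
```

Note that a score exactly equal to the threshold counts as passing.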
Constructor
AzureOpenAIPythonGrader(*, model_config: AzureOpenAIModelConfiguration | OpenAIModelConfiguration, name: str, image_tag: str, pass_threshold: float, source: str, **kwargs: Any)
Parameters
| Name | Description |
|---|---|
| model_config (Required) | The model configuration to use for the grader. |
| source (Required) | Python source code containing the grade function. Must define: def grade(sample: dict, item: dict) -> float |
| kwargs (Required) | Additional keyword arguments to pass to the grader. |
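Because `source` is passed as a string, it can help to sanity-check it locally before submitting it. The sketch below is one possible approach, not the SDK's own mechanism: `load_grade` is a hypothetical helper that runs the string with `exec` in the local interpreter, which may behave differently from the hosted execution environment. The `response` and `ground_truth` keys are assumed column names.

```python
# Hypothetical grader source; must define grade(sample, item) -> float.
SOURCE = '''
def grade(sample: dict, item: dict) -> float:
    output = str(item.get("response", "")).strip().lower()
    label = str(item.get("ground_truth", "")).strip().lower()
    if not output or not label:
        return 0.0
    return 1.0 if output == label else 0.0
'''


def load_grade(source: str):
    """Execute the source locally and return its grade function."""
    namespace: dict = {}
    exec(source, namespace)  # Only run source you wrote or trust.
    grade = namespace.get("grade")
    if not callable(grade):
        raise ValueError("source must define a callable named 'grade'")
    return grade


grade = load_grade(SOURCE)
exact = grade({}, {"response": "Paris", "ground_truth": "paris"})
missing = grade({}, {"response": "", "ground_truth": "paris"})
print(exact, missing)
```

A check like this catches syntax errors and a missing `grade` definition early, before any request is sent.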
Keyword-Only Parameters
| Name | Description |
|---|---|
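| model_config (Required) | The model configuration to use for the grader. |
| name (Required) | The name of the grader. |
| image_tag (Required) | The image tag for the Python execution environment. |
| pass_threshold (Required) | Score threshold for pass/fail classification. Scores >= threshold are considered passing. |
| source (Required) | Python source code containing the grade function. |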
Examples
Using AzureOpenAIPythonGrader for custom evaluation logic.
import os

from azure.ai.evaluation import AzureOpenAIModelConfiguration, AzureOpenAIPythonGrader, evaluate

# Configure your Azure OpenAI connection
model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
    azure_deployment=os.environ["MODEL_DEPLOYMENT_NAME"],
)
# Create a Python grader with custom evaluation logic
python_grader = AzureOpenAIPythonGrader(
    model_config=model_config,
    name="custom_accuracy",
    image_tag="2025-05-08",
    pass_threshold=0.8,  # 80% threshold for passing
    source="""
def grade(sample: dict, item: dict) -> float:
    \"\"\"
    Custom grading logic that compares model output to expected label.

    Args:
        sample: Dictionary that is typically empty in Azure AI Evaluation
        item: Dictionary containing ALL the data including model output and ground truth

    Returns:
        Float score between 0.0 and 1.0
    \"\"\"
    # Important: In Azure AI Evaluation, all data is in 'item', not 'sample'
    # The 'sample' parameter is typically an empty dictionary

    # Get the model's response/output from item
    output = item.get("response", "") or item.get("output", "") or item.get("output_text", "")
    output = output.lower()

    # Get the expected label/ground truth from item
    label = item.get("ground_truth", "") or item.get("label", "") or item.get("expected", "")
    label = label.lower()

    # Handle empty cases
    if not output or not label:
        return 0.0

    # Exact match gets full score
    if output == label:
        return 1.0

    # Partial match logic (customize as needed)
    if output in label or label in output:
        return 0.5

    return 0.0
""",
)
# Run evaluation
evaluation_result = evaluate(
    data="evaluation_data.jsonl",  # JSONL file with columns: query, response, ground_truth, etc.
    evaluators={"custom_accuracy": python_grader},
)
# Access results
print(f"Pass rate: {evaluation_result['metrics']['custom_accuracy.pass_rate']}")
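The example above reads its input from a JSONL file. As a rough sketch of what such a file could look like, the snippet below writes two rows using the column names mentioned in the example's comment (`query`, `response`, `ground_truth`). The row values and the temporary file location are illustrative assumptions, not data from the SDK.

```python
import json
import os
import tempfile

# Illustrative rows; each JSONL line is one JSON object.
rows = [
    {"query": "Capital of France?", "response": "Paris", "ground_truth": "Paris"},
    {"query": "2 + 2?", "response": "5", "ground_truth": "4"},
]

# Write to a temporary directory so the sketch does not overwrite real data.
path = os.path.join(tempfile.mkdtemp(), "evaluation_data.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Read the file back to confirm every line parses as a JSON object.
with open(path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f if line.strip()]

print(len(loaded), "rows written to", path)
```

Inside the grade function, each row's fields would then be read from `item`, for example `item.get("response")`.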
Methods
| Name | Description |
|---|---|
| get_client | Construct an appropriate OpenAI client using this grader's model configuration. Returns a slightly different client depending on whether this grader's model configuration is for Azure OpenAI or OpenAI. |
get_client
Construct an appropriate OpenAI client using this grader's model configuration. Returns a slightly different client depending on whether this grader's model configuration is for Azure OpenAI or OpenAI.
get_client() -> Any
Returns
| Type | Description |
|---|---|
| openai.OpenAI or openai.AzureOpenAI | The OpenAI client. |
Attributes
id
id = 'azureai://built-in/evaluators/azure-openai/python_grader'