
How to configure content filters for models in Azure AI Foundry

The content filtering system integrated into Azure AI Foundry runs alongside Azure AI Foundry Models. It uses an ensemble of multi-class classification models to detect four categories of harmful content (violence, hate, sexual, and self-harm) at four severity levels (safe, low, medium, and high). It also offers optional binary classifiers for detecting jailbreak risk, existing text, and code in public repositories. For more information about content categories, severity levels, and the behavior of the content filtering system, see the content filtering overview.

The default content filtering configuration filters content at the medium severity threshold for all four harmful categories for both prompts and completions. Content detected at medium or high severity level is filtered out, while content detected at low or safe severity level isn't filtered.
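
To make this behavior concrete, the following minimal Python sketch (an illustration only, not part of any Azure SDK) expresses the thresholding rule that the default configuration applies:

# Illustration only: the default policy filters content detected at or above
# the "medium" severity threshold, for both prompts and completions.
SEVERITY_ORDER = ["safe", "low", "medium", "high"]

def is_filtered(detected_severity: str, threshold: str = "medium") -> bool:
    return SEVERITY_ORDER.index(detected_severity) >= SEVERITY_ORDER.index(threshold)

assert is_filtered("high") and is_filtered("medium")        # filtered out
assert not is_filtered("low") and not is_filtered("safe")   # allowed through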

You can configure content filters at the resource level and associate them with one or more deployments.

Prerequisites

To complete this article, you need:

  • Install the Azure CLI. You use it later in this article to deploy the Bicep templates.

  • Identify the following information:

    • Your Azure subscription ID.

    • Your Azure AI Services resource name.

    • The resource group where you deployed the Azure AI Services resource.

    • The model name, provider, version, and SKU you want to deploy. You can use the Azure AI Foundry portal or the Azure CLI to find this information. In this example, you deploy the following model:

      • Model name: Phi-4-mini-instruct
      • Provider: Microsoft
      • Version: 1
      • Deployment type: Global standard

Create a custom content filter

Follow these steps to create a custom content filter:

  1. Go to the Azure AI Foundry portal.

  2. Select Guardrails & controls from the left pane.

  3. Select the Content filters tab, then select Create content filter.

  4. On the Basic information page, enter a name for the content filter.

  5. For Connection, select the connection to the Azure AI Services resource that is connected to your project.

  6. Select Next to go to the Input filter page.

  7. Configure the input filter depending on your requirements. This configuration is applied before the request reaches the model itself.

  8. Select Next to go to the Output filter page.

  9. Configure the output filter depending on your requirements. This configuration is applied after the model is executed and content is generated.

  10. Select Next to go to the Connection page.

  11. On this page, you can optionally associate model deployments with the new content filter. You can change the associated model deployments at any time.

  12. Select Next to review the filter settings. Then, select Create filter.

  13. When the deployment completes, the new content filter is applied to the model deployment.

Add a model deployment with custom content filtering

We recommend creating content filters either in the Azure AI Foundry portal or in code by using Bicep. Creating custom content filters, or applying them to deployments, isn't supported through the Azure CLI.

  1. Use the template ai-services-content-filter-template.bicep to describe the content filter policy:

    ai-services-content-filter-template.bicep

    @description('Name of the Azure AI Services account where the policy will be created')
    param accountName string
    
    @description('Name of the policy to be created')
    param policyName string
    
    @allowed(['Asynchronous_filter', 'Blocking', 'Default', 'Deferred'])
    param mode string = 'Default'
    
    @description('Base policy to be used for the new policy')
    param basePolicyName string = 'Microsoft.DefaultV2'
    
    param contentFilters array = [
      {
          name: 'Violence'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Hate'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Sexual'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Selfharm'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Jailbreak'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Indirect Attack'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Profanity'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Violence'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Hate'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Sexual'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Selfharm'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Protected Material Text'
          blocking: true
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Protected Material Code'
          blocking: false
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Profanity'
          blocking: true
          enabled: true
          source: 'Completion'
      }
    ]
    
    resource raiPolicy 'Microsoft.CognitiveServices/accounts/raiPolicies@2024-06-01-preview' = {
        name: '${accountName}/${policyName}'
        properties: {
            mode: mode
            basePolicyName: basePolicyName
            contentFilters: contentFilters
        }
    }
    
  2. Use the template ai-services-deployment-template.bicep to describe model deployments:

    ai-services-deployment-template.bicep

    @description('Name of the Azure AI services account')
    param accountName string
    
    @description('Name of the model to deploy')
    param modelName string
    
    @description('Version of the model to deploy')
    param modelVersion string
    
    @allowed([
      'AI21 Labs'
      'Cohere'
      'Core42'
      'DeepSeek'
      'xAI'
      'Meta'
      'Microsoft'
      'Mistral AI'
      'OpenAI'
    ])
    @description('Model provider')
    param modelPublisherFormat string
    
    @allowed([
        'GlobalStandard'
        'DataZoneStandard'
        'Standard'
        'GlobalProvisioned'
        'Provisioned'
    ])
    @description('Model deployment SKU name')
    param skuName string = 'GlobalStandard'
    
    @description('Content filter policy name')
    param contentFilterPolicyName string = 'Microsoft.DefaultV2'
    
    @description('Model deployment capacity')
    param capacity int = 1
    
    resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2024-04-01-preview' = {
      name: '${accountName}/${modelName}'
      sku: {
        name: skuName
        capacity: capacity
      }
      properties: {
        model: {
          format: modelPublisherFormat
          name: modelName
          version: modelVersion
        }
        raiPolicyName: contentFilterPolicyName == null ? 'Microsoft.Nill' : contentFilterPolicyName
      }
    }
    
  3. Create the main deployment definition:

    main.bicep

    param accountName string
    param modelName string
    param modelVersion string
    param modelPublisherFormat string
    param contentFilterPolicyName string
    
    module raiPolicy 'ai-services-content-filter-template.bicep' = {
      name: 'raiPolicy'
      params: {
        accountName: accountName
        policyName: contentFilterPolicyName
      }
    }
    
    module modelDeployment 'ai-services-deployment-template.bicep' = {
      name: 'modelDeployment'
      params: {
        accountName: accountName
        modelName: modelName
        modelVersion: modelVersion
        modelPublisherFormat: modelPublisherFormat
        contentFilterPolicyName: contentFilterPolicyName
      }
      dependsOn: [
        raiPolicy
      ]
    }
    
  4. Run the deployment:

    RESOURCE_GROUP="<resource-group-name>"
    ACCOUNT_NAME="<azure-ai-model-inference-name>" 
    MODEL_NAME="Phi-4-mini-instruct"
    PROVIDER="Microsoft"
    VERSION=1
    RAI_POLICY_NAME="custom-policy"
    
    az deployment group create \
        --resource-group $RESOURCE_GROUP \
        --template-file main.bicep \
        --parameters accountName=$ACCOUNT_NAME contentFilterPolicyName=$RAI_POLICY_NAME modelName=$MODEL_NAME modelVersion=$VERSION modelPublisherFormat=$PROVIDER
    
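After the deployment finishes, you can verify which content filter policy a deployment uses. The following Python sketch is a hypothetical check, assuming the azure-identity and azure-mgmt-cognitiveservices packages and an SDK version that exposes rai_policy_name on the deployment properties:

import os

from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient

# Assumes you signed in with a credential that can read the resource group.
client = CognitiveServicesManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
)

deployment = client.deployments.get(
    resource_group_name="<resource-group-name>",
    account_name="<azure-ai-model-inference-name>",
    deployment_name="Phi-4-mini-instruct",
)

# With the templates above, this prints the custom policy name (for example, custom-policy).
print(deployment.properties.rai_policy_name)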

Account for content filtering in your code

When you apply content filtering to your model deployment, the service can intercept requests based on the inputs and outputs. If a content filter triggers, the service returns a 400 error code with a description of the rule that triggered the error.

Install the package azure-ai-inference using your package manager, like pip:

pip install azure-ai-inference

Then, you can use the package to consume the model. The following example shows how to create a client to consume chat completions:

import os
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<resource>.services.ai.azure.com/models",
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

Explore our samples and read the API reference documentation to get started.

The following example shows how to handle a chat completion request that triggers Guardrails & controls.

import json

from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.exceptions import HttpResponseError

try:
    response = client.complete(
        messages=[
            SystemMessage(content="You are an AI assistant that helps people find information."),
            UserMessage(content="Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills."),
        ]
    )

    print(response.choices[0].message.content)

except HttpResponseError as ex:
    if ex.status_code == 400:
        # The error body describes the content filter rule that was triggered.
        error_payload = json.loads(ex.response.text())
        if isinstance(error_payload, dict) and "error" in error_payload:
            print(f"Your request triggered an {error_payload['error']['code']} error:\n\t {error_payload['error']['message']}")
        else:
            raise
    else:
        raise
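
Beyond the top-level code and message, the error body often includes per-category filter results. The following sketch is a hypothetical helper; it assumes the Azure OpenAI-style payload, where per-category results appear under error.innererror.content_filter_result, and the exact shape can vary by model and API version:

import json

from azure.core.exceptions import HttpResponseError

def describe_content_filter_error(ex: HttpResponseError) -> None:
    # Assumption: the 400 error body follows the Azure OpenAI-style shape, with
    # per-category results under error.innererror.content_filter_result.
    payload = json.loads(ex.response.text())
    details = payload.get("error", {}).get("innererror", {}).get("content_filter_result", {})
    for category, result in details.items():
        # Each entry typically looks like: {"filtered": true, "severity": "medium"}
        print(f"{category}: filtered={result.get('filtered')}, severity={result.get('severity')}")

You could call this helper from the except branch in the previous example, in place of the print statement.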

Follow best practices

To address potential harms that are relevant for a specific model, application, and deployment scenario, use an iterative identification process (such as red team testing, stress-testing, and analysis) and a measurement process to inform your content filtering configuration decisions. After you implement mitigations like content filtering, repeat measurement to test effectiveness.

For recommendations and best practices on Responsible AI for Azure OpenAI, grounded in the Microsoft Responsible AI Standard, see the Responsible AI Overview for Azure OpenAI.