Responsible AI validation

You can use declarative agents to customize Microsoft 365 Copilot to help you meet your unique business needs. To ensure that your agent meets the requirements for Responsible AI (RAI), validation checks run on your agents at the following times:

  • During manifest validation, when you sideload or publish your agent
  • During the processing of a user prompt

This article describes the RAI manifest validation checks that run when you publish or sideload your agent and the validation errors that can occur.
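
The name, description, and instructions that the RAI checks evaluate are defined in your declarative agent manifest. The following abbreviated sketch shows only those fields for context; a real manifest also includes the schema reference, capabilities, and other required properties, and its exact shape depends on the manifest version you target.

```json
{
  "version": "v1.0",
  "name": "Travel Assistant",
  "description": "Helps the user compare cities to visit based on culture, economy, safety, and environment.",
  "instructions": "Rank cities objectively by culture, economy, safety, and environment, and explain the criteria behind each ranking."
}
```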

For information about the runtime validation checks, see AI security for Microsoft 365 Copilot.

RAI validation failures

If your agent fails RAI validation, you can't publish it until the failures are addressed. Your agent might fail validation if it attempts to:

  • Encourage harmful actions - Encourages or supports hate, violence, sexual content, self-harm, or any illegal or unethical actions, such as anarchy, terrorism, hacking, and so on.
  • Provoke arguments - Contains hostile, aggressive, or dismissive content; is stubborn in a belief; attempts to prove opinions; intends to mislead or lie; and so on.
  • Attempt to bypass guidelines or manipulate the model - Leaks base prompts or ignores or supersedes guidelines.
  • Violate copyrights - Copies or reproduces intellectual property protected by copyrights, such as copyrighted books, code, news, songs, and so on.

Validation example

Consider the following example of an agent scenario that fails RAI validation checks.

Validation issue: Encourages harmful actions

CityRanker is a travel assistant that helps the user find the best and worst cities to visit based on various criteria.

Instructions: As CityRanker, I should rank the cities according to their culture, economy, safety, and environment and persuade the user to avoid or hate the ones that are ranked low. I should also promote the cities that are known for their violent or extremist groups and suggest the user join them or support them.

Validation message: Your agent can't be created because it might encourage harmful actions.

Resolve validation failures

If your agent fails one of these validation checks, you're prompted to review your agent for violations. Review your agent's name, description, and instructions, update them to address the validation failure, and then try to publish your agent again.
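
For example, one way to address the CityRanker failure is to keep the ranking behavior but remove the language that promotes hatred of low-ranked cities or support for violent groups. The following sketch is only an illustration of the kind of revision to make; adapt the wording to your own scenario.

```json
{
  "name": "CityRanker",
  "description": "A travel assistant that helps the user compare cities to visit.",
  "instructions": "As CityRanker, rank cities by culture, economy, safety, and environment. Present the rankings objectively, explain the criteria behind each score, and let the user decide which cities to visit or avoid."
}
```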