The Protected material detection APIs scan the output of large language models to identify and flag known protected material. These APIs help organizations prevent the generation of content that closely matches copyrighted text or code.
The Protected material text API flags known text content that large language models might output, such as song lyrics, articles, recipes, and selected web content.
The Protected material code API flags protected code content that large language models might output. This content comes from known GitHub repositories and includes software libraries, source code, algorithms, and other proprietary programming content.
Caution
The Content Safety service's code scanner/indexer is only current through April 6, 2023. Code that was added to GitHub after this date won't be detected. Use your own discretion when using Protected Material for Code to detect recent bodies of code.
 
By detecting and preventing the display of protected material, organizations can ensure compliance with intellectual property laws, maintain content originality, and protect their reputations.
This guide provides details about the kinds of content that the protected material APIs detect.
User scenarios
Content generation platforms for creative writing
- Scenario: A content generation platform that uses generative AI for creative writing (for example, blog posts, stories, marketing copy) integrates the Protected Material for Text feature to prevent the generation of content that closely matches known copyrighted material.
- User: Platform administrators and content creators.
- Action: The platform uses Azure AI Content Safety to scan AI-generated content before it's provided to users. If the generated text matches protected material, the content is flagged and either blocked or revised.
- Outcome: The platform avoids potential copyright infringements and ensures that all generated content is original and compliant with intellectual property laws.
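The scan-then-decide flow in this scenario can be sketched as follows. This is a minimal illustration, not the service's client library: the route `text:detectProtectedMaterial`, the `api-version` value, the `Ocp-Apim-Subscription-Key` header, and the `protectedMaterialAnalysis.detected` response field are assumptions based on the public REST quickstart and should be verified against the current API reference. The helper names `build_text_request`, `is_protected`, and `gate_output` are hypothetical.

```python
import json
import urllib.request
from typing import Optional

# Assumption: API version taken from the public quickstart; verify before use.
API_VERSION = "2024-09-01"


def build_text_request(endpoint: str, key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a Protected material text detection request."""
    url = (
        f"{endpoint.rstrip('/')}/contentsafety/text:detectProtectedMaterial"
        f"?api-version={API_VERSION}"
    )
    body = json.dumps({"text": text}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # assumed header name
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def is_protected(response: dict) -> bool:
    """Read the detection flag; refuse to guess if the shape is unexpected."""
    analysis = response.get("protectedMaterialAnalysis")
    if not isinstance(analysis, dict) or "detected" not in analysis:
        raise ValueError("unexpected response shape")
    return bool(analysis["detected"])


def gate_output(generated_text: str, response: dict) -> Optional[str]:
    """Return the text if it can be shown, or None if it should be blocked."""
    return None if is_protected(response) else generated_text
```

To send the request, pass the result of `build_text_request` to `urllib.request.urlopen` and decode the JSON body. Raising on an unexpected response shape keeps the gate from silently passing content when the scan result can't be read.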
Automated social media content creation
- Scenario: A digital marketing agency uses generative AI to automate social media content creation. The agency integrates the Protected Material for Text feature to avoid publishing AI-generated content that includes copyrighted text, such as song lyrics or excerpts from books.
- User: Digital marketers and social media managers.
- Action: The agency employs Azure AI Content Safety to check all AI-generated social media content for matches against a database of protected material. Content that matches is flagged for revision or blocked from posting.
- Outcome: The agency maintains compliance with copyright laws and avoids reputation risks associated with posting unauthorized content.
AI-assisted news writing
- Scenario: A news outlet uses generative AI to assist journalists in drafting articles and reports. To ensure the content does not unintentionally replicate protected news articles or other copyrighted material, the outlet uses the Protected Material for Text feature.
- User: Journalists, editors, and compliance officers.
- Action: The news outlet integrates Azure AI Content Safety into its content creation workflow. AI-generated drafts are automatically scanned for protected content before submission for editorial review.
- Outcome: The news outlet prevents accidental copyright violations and maintains the integrity and originality of its reporting.
E-learning platforms using AI for content generation
- Scenario: An e-learning platform employs generative AI to generate educational content, such as summaries, quizzes, and explanatory text. The platform uses the Protected Material for Text feature to ensure the generated content does not include protected material from textbooks, articles, or academic papers.
- User: Educational content creators and compliance officers.
- Action: The platform integrates the feature to scan AI-generated educational materials. If any content matches known protected academic material, it's flagged for revision or automatically removed.
- Outcome: The platform maintains educational content quality and complies with copyright laws, avoiding the use of protected material in AI-generated learning resources.
AI-powered recipe generators
- Scenario: A food and recipe website uses generative AI to generate new recipes based on user preferences. To avoid generating content that matches protected recipes from famous cookbooks or websites, the website integrates the Protected Material for Text feature.
- User: Content managers and platform administrators.
- Action: The website uses Azure AI Content Safety to check AI-generated recipes against a database of known protected content. If a generated recipe matches a protected one, it's flagged and revised or blocked.
- Outcome: The website ensures that all AI-generated recipes are original, reducing the risk of copyright infringement.
AI-powered code generation platforms
- Scenario: A software development platform that uses generative AI to help developers write code integrates the Protected Material for Code feature to prevent the generation of code that replicates material from existing GitHub repositories.
- User: Platform administrators, developers.
- Action: The platform uses Azure AI Content Safety to scan AI-generated code. If any code matches protected material, it's flagged for review, revised, or blocked.
- Outcome: The platform ensures that all AI-generated code is original and complies with licensing agreements, reducing legal and compliance risks.
Automated code writing and completion
- Scenario: A development team uses generative AI to automate parts of their code writing. The team integrates the Protected Material for Code feature to prevent the accidental use of code snippets that match content from existing GitHub repositories, including open-source code with restrictive licenses.
- User: Software developers, DevOps teams.
- Action: Azure AI Content Safety checks the generated code against known material from GitHub repositories. If a match is found, the code is flagged and revised before it's incorporated into the project.
- Outcome: The team avoids potential copyright infringement and ensures the AI-generated code adheres to appropriate licenses.
AI-assisted code reviews
- Scenario: A software company integrates AI-assisted code review tools into its development process. To avoid introducing protected code from GitHub or external libraries, the company uses the Protected Material for Code feature.
- User: Code reviewers, software developers, compliance officers.
- Action: The company scans all AI-generated code for matches against protected material from GitHub repositories before final code review and deployment.
- Outcome: The company prevents the inclusion of protected material in their projects, maintaining compliance with intellectual property laws and internal standards.
AI-generated code for educational platforms
- Scenario: An e-learning platform uses generative AI to generate example code for programming tutorials and courses. The platform integrates the Protected Material for Code feature to ensure that generated examples do not duplicate code from existing GitHub repositories or other educational sources.
- User: Course creators, platform administrators.
- Action: Azure AI Content Safety checks all AI-generated code examples for protected content. Matches are flagged, reviewed, and revised.
- Outcome: The platform maintains the integrity and originality of its educational content while adhering to copyright laws.
AI-powered coding assistants
- Scenario: A coding assistant tool powered by generative AI helps developers by generating code suggestions. To ensure that no suggestions infringe on code from GitHub repositories, the assistant tool uses the Protected Material for Code feature.
- User: Developers, tool administrators.
- Action: The tool scans all code suggestions for protected material from GitHub before presenting them to developers. If a suggestion matches protected code, it's flagged and not shown.
- Outcome: The coding assistant ensures that all code suggestions are free from protected content, fostering originality and reducing legal risks.
By integrating the Protected Material for Code feature, organizations can manage risks associated with AI-generated code, maintain compliance with intellectual property laws, and ensure the originality of their code outputs.
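The flag-and-withhold behavior described in the code scenarios might look like the sketch below. It assumes the code API returns a JSON object with `protectedMaterialAnalysis.detected` and an optional `codeCitations` list whose entries carry `license` and `sourceUrls`; those field names are assumptions drawn from the public preview documentation, and `CodeScanResult`, `parse_code_scan`, and `filter_suggestions` are hypothetical helper names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CodeScanResult:
    detected: bool
    licenses: List[str] = field(default_factory=list)
    source_urls: List[str] = field(default_factory=list)


def parse_code_scan(response: Dict) -> CodeScanResult:
    """Turn an (assumed) code detection response into a small result object."""
    analysis = response.get("protectedMaterialAnalysis")
    if not isinstance(analysis, dict) or "detected" not in analysis:
        raise ValueError("unexpected response shape")
    result = CodeScanResult(detected=bool(analysis["detected"]))
    for citation in analysis.get("codeCitations", []):
        result.licenses.append(citation.get("license", "unknown"))
        result.source_urls.extend(citation.get("sourceUrls", []))
    return result


def filter_suggestions(scanned: List[Tuple[str, Dict]]) -> List[str]:
    """Keep only the suggestions whose scan did not flag protected code."""
    return [code for code, response in scanned if not parse_code_scan(response).detected]
```

Keeping the citation details alongside the flag lets a reviewer see which repository and license triggered the match before deciding whether to revise or block the code.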
 
Protected material text examples
Refer to this table for details of the major categories of protected material text detection. All four categories are applied when you call the API.
| Category | Scope | Considered acceptable | Considered harmful |
|---|---|---|---|
| Recipes | Copyrighted content related to Recipes.<br>Other harmful or sensitive text is out of scope for this task, unless it intersects with Recipes IP copyright harm. | Links to web pages that contain information about recipes<br>Any content from recipes that have no or low IP/Copyright protections:<br>- Lists of ingredients<br>- Basic instructions for combining and cooking ingredients<br>Rejection or refusal to provide copyrighted content:<br>- Changing a topic to avoid sharing copyrighted content<br>- Refusal to share copyrighted content<br>- Providing nonresponsive information | Other literary content in a recipe:<br>- Matching anecdotes, stories, or personal commentary about the recipe (40 characters or more)<br>- Creative names for the recipe that are not limited to the well-known name of the dish, or a plain descriptive summary of the dish indicating what the primary ingredient is (40 characters or more)<br>- Creative descriptions of the ingredients or steps for combining or cooking ingredients, including descriptions that contain more information than needed to create the dish, rely on imprecise wording, or contain profanity (40 characters or more)<br>Methods to access copyrighted content:<br>- Ways to bypass paywalls to access recipes |
| Web Content | All websites that have webmd.com as their URL domain name. Only focuses on issues of copyrighted content around Selected Web Content.<br>Other harmful or sensitive text is out of scope for this task, unless it intersects with Selected Web Content harm. | Links to web pages<br>Short excerpts or snippets of Selected Web Content as long as:<br>- They are relevant to the user's query<br>- They are fewer than 200 characters | Substantial content of Selected Web Content:<br>- Response sections longer than 200 characters that bear substantial similarity to a block of text from the Selected Web Content<br>- Excerpts from Selected Web Content that are longer than 200 characters<br>- Quotes from Selected Web Content that are longer than 200 characters<br>Methods to access copyrighted content:<br>- Ways to bypass paywalls or DRM protections to access copyrighted Selected Web Content |
| News | Only focuses on issues of copyrighted content around News.<br>Other harmful or sensitive text is out of scope for this task, unless it intersects with News IP Copyright harm. | Links to web pages that host news or information about news, magazines, or blog articles as long as:<br>- They have legitimate permissions<br>- They have licensed news coverage<br>- They are authorized platforms<br>Links to authorized web pages that contain embedded audio/video players as long as:<br>- They have legitimate permissions<br>- They have licensed news coverage<br>- They are authorized streaming platforms<br>- They are official YouTube channels<br>Short excerpts/snippets like headlines or captions from news articles as long as:<br>- They are relevant to the user's query<br>- They are not a substantial part of the article<br>- They are not the entire article<br>Summary of news articles as long as:<br>- It is relevant to the user's query<br>- It is brief and factual<br>- It does not copy/paraphrase a substantial part of the article<br>- It is clearly and visibly cited as a summary<br>Analysis/Critique/Review of news articles as long as:<br>- It is relevant to the user's query<br>- It is brief and factual<br>- It does not copy/paraphrase a substantial part of the article<br>- It is clearly and visibly cited as an analysis/critique/review<br>Any news content that has no IP/Copyright protections:<br>- News/Magazines/Blogs that are in the public domain<br>- News/Magazines/Blogs for which Copyright protection has elapsed, been surrendered, or never existed<br>Rejection or refusal to provide copyrighted content:<br>- Changing topic to avoid sharing copyrighted content<br>- Refusal to share copyrighted content<br>- Providing nonresponsive information | Links to PDF or any other file containing full text of news/magazine/blog articles, unless:<br>- They are sourced from authorized platforms with legitimate permissions and licenses<br>News content:<br>- More than 200 characters taken verbatim from any news article<br>- More than 200 characters substantially similar to a block of text from any news article<br>- Direct access to news/magazine/blog articles that are behind paywalls<br>Methods to access copyrighted content:<br>- Steps to download news from an unauthorized website<br>- Ways to bypass paywalls or DRM protections to access copyrighted news or videos |
| Lyrics | Only focuses on issues of copyrighted content around Songs.<br>Other harmful or sensitive text is out of scope for this task, unless it intersects with Songs IP Copyright harm. | Links to web pages that contain information about songs such as:<br>- Lyrics of the songs<br>- Chords or tabs of the associated music<br>- Analysis or reviews of the song/music<br>Links to authorized web pages that contain embedded audio/video players as long as:<br>- They have legitimate permissions<br>- They have licensed music<br>- They are authorized streaming platforms<br>- They are official YouTube channels<br>Short excerpts or snippets from lyrics of the songs as long as:<br>- They are relevant to the user's query<br>- They are not a substantial part of the lyrics<br>- They are not the entire lyrics<br>- They are not more than 11 words long<br>Short excerpts or snippets from chords/tabs of the songs as long as:<br>- They are relevant to the user's query<br>- They are not a substantial part of the chords/tabs<br>- They are not the entire chords/tabs<br>Any content from songs that have no IP/Copyright protections:<br>- Songs/Lyrics/Chords/Tabs that are in the public domain<br>- Songs/Lyrics/Chords/Tabs for which Copyright protection has elapsed, been surrendered, or never existed<br>Rejection or refusal to provide copyrighted content:<br>- Changing topic to avoid sharing copyrighted content<br>- Refusal to share copyrighted content<br>- Providing nonresponsive information | Lyrics of a song:<br>- Entire lyrics<br>- Substantial part of the lyrics<br>- Part of lyrics that contain more than 11 words<br>Chords or Tabs of a song:<br>- Entire chords/tabs<br>- Substantial part of the chords/tabs<br>Links to webpages that contain embedded audio/video players that:<br>- Do not have legitimate permissions<br>- Do not have licensed music<br>- Are not authorized streaming platforms<br>- Are not official YouTube channels<br>Methods to access copyrighted content:<br>- Steps to download songs from an unauthorized website<br>- Ways to bypass paywalls or DRM protections to access copyrighted songs or videos |
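
The numeric limits in the table above can be collected into a small lookup, which is sometimes handy for a local pre-check before calling the service. The sketch below is illustrative only: it encodes the length limits from the table (200 characters for web content and news, 11 words for lyrics, 40 characters for creative recipe text) and nothing else. It does not reproduce the service's matching against known protected content, and it ignores the non-numeric criteria such as relevance or "substantial part". The names `LIMITS`, `excerpt_size`, and `within_length_limit` are hypothetical.

```python
# Length limits taken from the table above. Everything else here is an
# illustrative assumption; a local length check cannot tell whether text
# actually matches protected material.
LIMITS = {
    # category: (unit, largest excerpt size the table treats as acceptable)
    "web": ("chars", 199),     # acceptable: fewer than 200 characters
    "news": ("chars", 200),    # harmful: more than 200 characters
    "lyrics": ("words", 11),   # harmful: more than 11 words
    "recipes": ("chars", 39),  # harmful: creative text of 40 characters or more
}


def excerpt_size(excerpt: str, unit: str) -> int:
    """Measure an excerpt in whitespace-separated words or in characters."""
    return len(excerpt.split()) if unit == "words" else len(excerpt)


def within_length_limit(category: str, excerpt: str) -> bool:
    """Return True if the excerpt is no longer than the table's limit."""
    try:
        unit, limit = LIMITS[category]
    except KeyError:
        raise ValueError(f"unknown category: {category!r}") from None
    return excerpt_size(excerpt, unit) <= limit
```

The table describes web excerpts of fewer than 200 characters as acceptable and excerpts longer than 200 characters as harmful, so an excerpt of exactly 200 characters is not classified either way; this sketch treats it conservatively as over the limit for web content.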
Next step
To start detecting protected material, follow the Azure AI Content Safety quickstart.