## How the Moderation Endpoint Works
When you submit a prompt to the Moderation API, it returns a JSON payload with three primary fields:

| Field | Type | Description |
|---|---|---|
| `flagged` | boolean | `true` if any policy violation is detected; `false` otherwise |
| `categories` | object | A map of violation categories (e.g., `hate`, `self_harm`) to booleans |
| `category_scores` | object | Confidence scores (0.0–1.0) for each category |
## Example Request
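A minimal sketch of a request in Python. The endpoint URL, environment variable name, and request body shape are assumptions for illustration; substitute your provider's actual values:

```python
import os

import requests

# Hypothetical endpoint and credential handling; replace with your provider's.
MODERATION_URL = "https://api.example.com/v1/moderations"
API_KEY = os.environ["MODERATION_API_KEY"]

response = requests.post(
    MODERATION_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": "Sample user prompt to screen."},
    timeout=10,
)
response.raise_for_status()
result = response.json()

# The fields described in the table above.
print(result["flagged"], result["category_scores"])
```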
## Example Response
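An illustrative payload matching the fields in the table above; the category names shown are examples rather than the full set, and the scores are invented for demonstration:

```json
{
  "flagged": true,
  "categories": {
    "hate": false,
    "self_harm": true
  },
  "category_scores": {
    "hate": 0.0012,
    "self_harm": 0.9714
  }
}
```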
Use the confidence values in `category_scores` to prioritize human review of borderline cases.

## Integrating Moderation into Your Application Workflow
Adopt a secure, five-step flow to vet user inputs before content generation (a runnable sketch follows the list):

1. Receive the user prompt.
2. Call the Moderation API.
3. If `flagged` is `true`, return an error: "Your request violates our content policy and cannot be processed." If `flagged` is `false`, continue.
4. Invoke the Generation API.
5. Return the generated response to the end user.
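A sketch of this flow in Python. The base URL, environment variable, and the Generation API's request/response shape are assumptions; `moderate` and `generate` are illustrative wrappers, not official client functions:

```python
import os

import requests

API_BASE = "https://api.example.com/v1"  # hypothetical base URL
HEADERS = {"Authorization": f"Bearer {os.environ['MODERATION_API_KEY']}"}


def moderate(prompt: str) -> dict:
    """Step 2: call the Moderation API on the raw user input."""
    resp = requests.post(f"{API_BASE}/moderations", headers=HEADERS,
                         json={"input": prompt}, timeout=10)
    resp.raise_for_status()
    return resp.json()


def generate(prompt: str) -> str:
    """Step 4: call the Generation API (request/response shape assumed)."""
    resp = requests.post(f"{API_BASE}/generations", headers=HEADERS,
                         json={"prompt": prompt}, timeout=30)
    resp.raise_for_status()
    return resp.json()["text"]


def handle_user_prompt(prompt: str) -> str:
    """Steps 1-5: moderate first; generate only if the input passes."""
    moderation = moderate(prompt)  # Step 2
    if moderation["flagged"]:      # Step 3: refuse flagged inputs
        return "Your request violates our content policy and cannot be processed."
    return generate(prompt)        # Steps 4 and 5
```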
Always enforce the moderation step. Skipping it may expose your system to disallowed or harmful content.
## Best Practices
- Batch multiple inputs into a single moderation request to cut round trips (see the sketch after this list).
- Monitor and log flagged inputs for auditing and continuous policy tuning.
- Adjust internal thresholds based on `category_scores` trends to minimize false positives.
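A sketch of the batching and threshold practices. It assumes the endpoint accepts a list of inputs and returns one result per input under a `results` key; that shape, the URL, and the threshold numbers are illustrative, not documented behavior:

```python
import os

import requests

API_BASE = "https://api.example.com/v1"  # hypothetical base URL
HEADERS = {"Authorization": f"Bearer {os.environ['MODERATION_API_KEY']}"}


def moderate_batch(prompts: list[str]) -> list[dict]:
    """Screen several inputs in one request (assumed batch response shape)."""
    resp = requests.post(f"{API_BASE}/moderations", headers=HEADERS,
                         json={"input": prompts}, timeout=30)
    resp.raise_for_status()
    return resp.json()["results"]


# Internal review thresholds, tuned from logged category_scores trends.
# These numbers are placeholders, not recommendations.
REVIEW_THRESHOLDS = {"hate": 0.40, "self_harm": 0.25}


def needs_human_review(result: dict) -> bool:
    """Route borderline cases to reviewers even when `flagged` is false."""
    scores = result["category_scores"]
    return any(scores.get(cat, 0.0) >= threshold
               for cat, threshold in REVIEW_THRESHOLDS.items())
```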