Multi-lingual support for FM-Moderation Guardrails

Our Moderation Guardrails currently support only English, which limits their ability to detect and block jailbreak attacks written in other languages. To address this, we propose extending support to multiple languages using a model-based translation approach.

**Why This Is Needed:**
- Existing translation options such as Google Translate and Azure Translate rely on external APIs, so user prompts are sent to a third-party service, which may pose security risks.
- Our model-based guardrails perform better on English prompts, so translating inputs to English with a locally hosted model gives more secure and consistent results.
- Multilingual support is essential for robust moderation across diverse user inputs.

**Proposed Solution:**
- Integrate the facebook/m2m100_418M model, which supports 100 languages.
- Detect the language of each incoming prompt and translate the prompt to English before it reaches the guardrails.
- Focus first on the priority languages: Dutch, French, German, Italian, and Spanish.
- The integration is currently available in the development environment.
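
The translation step could look roughly like the sketch below. It assumes the Hugging Face `transformers` library (with `sentencepiece` and PyTorch installed) and that the source language code is already known; M2M100 expects the source language to be set on the tokenizer, so language detection would have to come from a separate component. The function names and the `PRIORITY_LANGS` mapping are illustrative, not part of the existing codebase.

```python
from functools import lru_cache

MODEL_NAME = "facebook/m2m100_418M"

# Priority languages from this proposal, keyed by the ISO 639-1 codes M2M100 uses.
PRIORITY_LANGS = {
    "nl": "Dutch",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "es": "Spanish",
}


@lru_cache(maxsize=1)
def load_m2m100():
    """Load the tokenizer and model once and reuse them across calls.

    The import is deferred so this module can be imported (and the
    English pass-through path used) without transformers installed.
    """
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(MODEL_NAME)
    model = M2M100ForConditionalGeneration.from_pretrained(MODEL_NAME)
    return tokenizer, model


def translate_to_english(text: str, src_lang: str) -> str:
    """Translate `text` from `src_lang` to English; English input is returned unchanged."""
    if src_lang == "en":
        return text
    tokenizer, model = load_m2m100()
    tokenizer.src_lang = src_lang  # tell the tokenizer which language the prompt is in
    encoded = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id("en"),  # force English output
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

A call such as `translate_to_english("Bonjour tout le monde", "fr")` would download the model on first use and run it locally, so no prompt text leaves the environment.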

**Feature Highlights:**
- Secure, model-based translation pipeline with no calls to external translation APIs.
- Language detection and prompt translation to English.
- Lets the existing moderation guardrails handle multilingual inputs effectively.
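
One way to wire these pieces together is sketched below, purely as an illustration. The detector, translator, and guardrail check are passed in as plain callables because their real interfaces are not described in this issue; `moderate_multilingual`, `ModerationResult`, and the parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModerationResult:
    original_prompt: str
    detected_lang: str   # language code reported by the detector
    english_prompt: str  # text actually checked by the guardrails
    blocked: bool


def moderate_multilingual(
    prompt: str,
    detect_language: Callable[[str], str],            # hypothetical: returns e.g. "fr"
    translate_to_english: Callable[[str, str], str],  # hypothetical: (text, src_lang) -> English
    is_blocked: Callable[[str], bool],                # hypothetical: existing English guardrail
) -> ModerationResult:
    """Detect the language, translate non-English prompts, then run the guardrail."""
    lang = detect_language(prompt)
    # English prompts skip translation so they are checked exactly as before.
    english = prompt if lang == "en" else translate_to_english(prompt, lang)
    return ModerationResult(
        original_prompt=prompt,
        detected_lang=lang,
        english_prompt=english,
        blocked=is_blocked(english),
    )
```

Keeping the original prompt and detected language in the result makes it possible to log which language a blocked prompt arrived in, which may help when tuning the priority language list.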


Multi-lingual support for FM-Moderation Guardrails #21
