HybridFilter
Combine rule-based and ML-based detection for comprehensive content moderation
The HybridFilter class combines fast rule-based detection with ML-powered toxicity analysis. This gives you the best of both worlds: speed for common profanity and contextual understanding for subtle toxic content.
HybridFilter includes all rule-based Filter functionality plus optional ML integration. Use it as a drop-in replacement when you need ML capabilities.
Import
```ts
import { HybridFilter } from 'glin-profanity/ml';
```
Constructor
```ts
constructor(config?: HybridFilterConfig)
```
Creates a new HybridFilter with combined rule-based and ML configuration.
Configuration Options
The HybridFilterConfig extends all options from FilterConfig plus:
| Prop | Type | Default |
|---|---|---|
| `enableML?` | `boolean` | `false` |
| `mlThreshold?` | `number` | `0.85` |
| `mlLabels?` | `ToxicityLabel[]` | All 7 labels |
| `preloadML?` | `boolean` | `false` |
| `combinationMode?` | `'or' \| 'and' \| 'ml-override' \| 'rules-first'` | `'or'` |
| `borderlineThreshold?` | `number` | `0.5` |
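For reference, a sketch that sets every option above. Values are illustrative: the label list is the subset used in the Complete Example below, and the comments on `preloadML` and `borderlineThreshold` reflect the option names rather than documented behavior.
```ts
import { HybridFilter } from 'glin-profanity/ml';

const filter = new HybridFilter({
  enableML: true,                 // default: false
  mlThreshold: 0.85,              // minimum ML confidence to flag (default: 0.85)
  mlLabels: ['insult', 'threat', 'toxicity', 'identity_attack'],
  preloadML: true,                // presumably loads the model during initialize() (default: false)
  combinationMode: 'rules-first', // default: 'or'
  borderlineThreshold: 0.5,       // presumably the band where 'rules-first' consults ML (default: 0.5)
});
```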
Combination Modes Explained
| Mode | Behavior | Best For |
|---|---|---|
| `'or'` | Flag if either method detects | Maximum sensitivity |
| `'and'` | Flag only if both methods agree | Maximum precision |
| `'ml-override'` | Use the ML result when available | ML-first approach |
| `'rules-first'` | Use rules for speed, ML for borderline cases | Balanced performance |
Basic Usage
```ts
import { HybridFilter } from 'glin-profanity/ml';

// Enable ML with default settings
const filter = new HybridFilter({
  enableML: true,
  languages: ['english'],
  detectLeetspeak: true,
});

await filter.initialize();

// Async check with both methods
const result = await filter.checkProfanityAsync('some text');
console.log(result.isToxic);
console.log(result.reason);
```
Public Methods
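The signatures below are only a sketch inferred from the examples and the performance table on this page; field and method names come from those snippets, and the exact types (especially return types and the batch signature) are assumptions:
```ts
// Inferred from the examples on this page; exact types may differ.
interface HybridCheckResult {
  isToxic: boolean;
  confidence: number;
  reason: string;
  ruleBasedResult: { profaneWords: string[] };
  mlResult?: { matchedCategories: string[] };
}

declare class HybridFilter {
  initialize(): Promise<void>;                                   // must be awaited before async checks
  checkProfanityAsync(text: string): Promise<HybridCheckResult>; // full hybrid analysis
  isToxicAsync(text: string): Promise<boolean>;                  // boolean convenience check
  isProfane(text: string): boolean;                              // sync, rules only (from Filter)
  checkProfanity(text: string): unknown;                         // sync, detailed rule-based result
  checkProfanityBatchAsync(texts: string[]): Promise<HybridCheckResult[]>; // bulk moderation
}
```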
Combination Mode Examples
```ts
// Maximum sensitivity - flag if either method detects
const filter = new HybridFilter({
  enableML: true,
  combinationMode: 'or', // Default
});
await filter.initialize();

// Catches explicit profanity via rules
await filter.isToxicAsync('what the fuck');
// true - rules detected

// Catches subtle toxicity via ML
await filter.isToxicAsync('you are worthless and nobody likes you');
// true - ML detected (no explicit words)

// Use when: You want maximum coverage and can tolerate some false positives
```

```ts
// Maximum precision - both must agree
const filter = new HybridFilter({
  enableML: true,
  combinationMode: 'and',
});
await filter.initialize();

// Both agree - flagged
await filter.isToxicAsync('you fucking idiot');
// true - both rules AND ML detected

// Only rules detect - NOT flagged
await filter.isToxicAsync('damn this is good');
// false - ML doesn't see this as toxic

// Use when: You need high confidence and want to avoid false positives
```

```ts
// ML takes precedence when available
const filter = new HybridFilter({
  enableML: true,
  combinationMode: 'ml-override',
});
await filter.initialize();

// ML says toxic - flagged regardless of rules
await filter.isToxicAsync('you should just disappear');
// true - ML detected threat/toxicity

// ML says clean - NOT flagged even if rules would match
await filter.isToxicAsync('damn good job!');
// false - ML understands positive context

// Use when: You trust ML judgment over word lists
```

```ts
// Fast rules, ML for edge cases
const filter = new HybridFilter({
  enableML: true,
  combinationMode: 'rules-first',
  borderlineThreshold: 0.5,
});
await filter.initialize();

// Rules detect - flagged immediately (fast)
await filter.isToxicAsync('what the fuck');
// true - rules detected, no ML needed

// Rules miss, but ML catches
await filter.isToxicAsync('go kys loser');
// true - rules missed, ML detected

// Use when: You want speed but need ML as a safety net
```
Complete Example
```ts
import { HybridFilter } from 'glin-profanity/ml';

class ContentModerator {
  private filter: HybridFilter;

  constructor() {
    this.filter = new HybridFilter({
      // Rule-based config
      languages: ['english', 'spanish'],
      detectLeetspeak: true,
      allowObfuscatedMatch: true,
      // ML config
      enableML: true,
      mlThreshold: 0.85,
      mlLabels: ['insult', 'threat', 'toxicity', 'identity_attack'],
      // Combination strategy
      combinationMode: 'rules-first',
      borderlineThreshold: 0.5,
    });
  }

  async initialize() {
    await this.filter.initialize();
    console.log('Content moderator ready');
  }

  async moderate(text: string) {
    const result = await this.filter.checkProfanityAsync(text);
    return {
      allowed: !result.isToxic,
      confidence: result.confidence,
      reason: result.reason,
      ruleMatches: result.ruleBasedResult.profaneWords,
      mlCategories: result.mlResult?.matchedCategories ?? [],
    };
  }

  // Fast sync check for simple cases
  quickCheck(text: string) {
    return !this.filter.isProfane(text);
  }
}

// Usage
const moderator = new ContentModerator();
await moderator.initialize();

const result = await moderator.moderate('you are such a loser');
// {
//   allowed: false,
//   confidence: 0.89,
//   reason: 'ML detected (rules missed): insult, toxicity',
//   ruleMatches: [],
//   mlCategories: ['insult', 'toxicity']
// }
```
Performance Comparison
| Method | Latency | Use Case |
|---|---|---|
| `isProfane()` (sync) | ~0.1ms | Quick rule-based checks |
| `checkProfanity()` (sync) | ~0.2ms | Detailed rule-based analysis |
| `checkProfanityAsync()` | ~30-100ms | Full hybrid analysis |
| `checkProfanityBatchAsync()` | ~10-30ms/item | Bulk moderation |
For high-throughput systems, run the sync methods as a cheap first pass and reserve the async ML methods for borderline cases or for text the sync checks let through, as in the sketch below.
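A minimal sketch of that two-pass pattern, using only methods already shown on this page (`moderateFast` is a hypothetical helper name, not library API):
```ts
import { HybridFilter } from 'glin-profanity/ml';

const filter = new HybridFilter({ enableML: true, combinationMode: 'rules-first' });
await filter.initialize();

// Hypothetical helper: sync rules first (~0.1ms), async ML only when rules find nothing.
async function moderateFast(text: string): Promise<boolean> {
  // Explicit profanity is caught immediately without touching the ML model
  if (filter.isProfane(text)) return false; // blocked

  // Rules passed - run the slower hybrid check for subtle toxicity
  const result = await filter.checkProfanityAsync(text);
  return !result.isToxic; // true = allowed
}
```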
Cross-References
- ToxicityDetector — Standalone ML detection
- Filter Class — Rule-based detection API
- Combination Strategies — Detailed guide on choosing modes
- ML Integration Guide — Setup and best practices