HybridFilter

Combine rule-based and ML-based detection for comprehensive content moderation

The HybridFilter class combines fast rule-based detection with ML-powered toxicity analysis. This gives you the best of both worlds: speed for common profanity and contextual understanding for subtle toxic content.

HybridFilter includes all rule-based Filter functionality plus optional ML integration. Use it as a drop-in replacement when you need ML capabilities.

Import

import { HybridFilter } from 'glin-profanity/ml';

Constructor

constructor(config?: HybridFilterConfig)

Creates a new HybridFilter with combined rule-based and ML configuration.

Configuration Options

HybridFilterConfig extends all options from FilterConfig and adds the following:

Prop                  Type                                          Default
enableML?             boolean                                       false
mlThreshold?          number                                        0.85
mlLabels?             ToxicityLabel[]                               All 7 labels
preloadML?            boolean                                       false
combinationMode?      'or' | 'and' | 'ml-override' | 'rules-first'  'or'
borderlineThreshold?  number                                        0.5
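Putting the options together, a fully specified configuration might look like the sketch below. The values shown are the documented defaults except where the comments say otherwise; the label names are a subset of the seven ToxicityLabel values used for illustration.

```typescript
import { HybridFilter } from 'glin-profanity/ml';

const filter = new HybridFilter({
  enableML: true,                 // ML is opt-in; default is false
  mlThreshold: 0.85,              // minimum ML score required to flag
  mlLabels: ['insult', 'threat'], // restrict to a subset; default is all 7 labels
  preloadML: true,                // load the model eagerly rather than on first use
  combinationMode: 'rules-first', // default is 'or'
  borderlineThreshold: 0.5,       // rules-first only: when to consult ML
});
```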

Combination Modes Explained

Mode           Behavior                                 Best For
'or'           Flag if either method detects            Maximum sensitivity
'and'          Flag only if both methods agree          Maximum precision
'ml-override'  Use ML result when available             ML-first approach
'rules-first'  Use rules for speed, ML for borderline   Balanced performance
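The decision logic behind the four modes can be sketched as a pure function. This is a simplified model inferred from the table above: the real HybridFilter also tracks confidence scores and reasons, and in 'rules-first' mode the final verdict matches 'or' but ML is only invoked when the rules miss (and, per borderlineThreshold, only for borderline inputs).

```typescript
type CombinationMode = 'or' | 'and' | 'ml-override' | 'rules-first';

function combineVerdicts(
  mode: CombinationMode,
  rulesFlagged: boolean,   // did the rule-based pass match?
  mlScore: number | null,  // ML toxicity score, or null if ML is unavailable
  mlThreshold = 0.85,
): boolean {
  const mlFlagged = mlScore !== null && mlScore >= mlThreshold;
  switch (mode) {
    case 'or':
      return rulesFlagged || mlFlagged;                    // either method flags
    case 'and':
      return rulesFlagged && mlFlagged;                    // both must agree
    case 'ml-override':
      return mlScore !== null ? mlFlagged : rulesFlagged;  // ML wins when present
    case 'rules-first':
      return rulesFlagged || mlFlagged;                    // rules short-circuit; ML is the safety net
  }
}
```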

Basic Usage

import { HybridFilter } from 'glin-profanity/ml';

// Enable ML with default settings
const filter = new HybridFilter({
  enableML: true,
  languages: ['english'],
  detectLeetspeak: true,
});

await filter.initialize();

// Async check with both methods
const result = await filter.checkProfanityAsync('some text');
console.log(result.isToxic);
console.log(result.reason);
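The result object's shape, inferred from the fields used in this page's examples, can be sketched as the interface below. The real exported type may carry additional fields.

```typescript
// Hypothetical sketch of the async check result, based only on the
// properties read in the surrounding examples.
interface HybridCheckResult {
  isToxic: boolean;
  confidence: number;
  reason: string;
  ruleBasedResult: { profaneWords: string[] };
  mlResult?: { matchedCategories: string[] };  // absent when ML did not run
}
```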

Combination Mode Examples

// Maximum sensitivity - flag if either method detects
const filter = new HybridFilter({
  enableML: true,
  combinationMode: 'or',  // Default
});
await filter.initialize();

// Catches explicit profanity via rules
await filter.isToxicAsync('what the fuck');
// true - rules detected

// Catches subtle toxicity via ML
await filter.isToxicAsync('you are worthless and nobody likes you');
// true - ML detected (no explicit words)

// Use when: You want maximum coverage and can tolerate some false positives

// Maximum precision - both must agree
const filter = new HybridFilter({
  enableML: true,
  combinationMode: 'and',
});
await filter.initialize();

// Both agree - flagged
await filter.isToxicAsync('you fucking idiot');
// true - both rules AND ML detected

// Only rules detect - NOT flagged
await filter.isToxicAsync('damn this is good');
// false - ML doesn't see this as toxic

// Use when: You need high confidence and want to avoid false positives

// ML takes precedence when available
const filter = new HybridFilter({
  enableML: true,
  combinationMode: 'ml-override',
});
await filter.initialize();

// ML says toxic - flagged regardless of rules
await filter.isToxicAsync('you should just disappear');
// true - ML detected threat/toxicity

// ML says clean - NOT flagged even if rules would match
await filter.isToxicAsync('damn good job!');
// false - ML understands positive context

// Use when: You trust ML judgment over word lists

// Fast rules, ML for edge cases
const filter = new HybridFilter({
  enableML: true,
  combinationMode: 'rules-first',
  borderlineThreshold: 0.5,
});
await filter.initialize();

// Rules detect - flagged immediately (fast)
await filter.isToxicAsync('what the fuck');
// true - rules detected, no ML needed

// Rules miss, but ML catches
await filter.isToxicAsync('go kys loser');
// true - rules missed, ML detected

// Use when: You want speed but need ML as a safety net

Complete Example

import { HybridFilter } from 'glin-profanity/ml';

class ContentModerator {
  private filter: HybridFilter;

  constructor() {
    this.filter = new HybridFilter({
      // Rule-based config
      languages: ['english', 'spanish'],
      detectLeetspeak: true,
      allowObfuscatedMatch: true,

      // ML config
      enableML: true,
      mlThreshold: 0.85,
      mlLabels: ['insult', 'threat', 'toxicity', 'identity_attack'],

      // Combination strategy
      combinationMode: 'rules-first',
      borderlineThreshold: 0.5,
    });
  }

  async initialize() {
    await this.filter.initialize();
    console.log('Content moderator ready');
  }

  async moderate(text: string) {
    const result = await this.filter.checkProfanityAsync(text);

    return {
      allowed: !result.isToxic,
      confidence: result.confidence,
      reason: result.reason,
      ruleMatches: result.ruleBasedResult.profaneWords,
      mlCategories: result.mlResult?.matchedCategories ?? [],
    };
  }

  // Fast sync check for simple cases
  quickCheck(text: string) {
    return !this.filter.isProfane(text);
  }
}

// Usage
const moderator = new ContentModerator();
await moderator.initialize();

const result = await moderator.moderate('you are such a loser');
// {
//   allowed: false,
//   confidence: 0.89,
//   reason: 'ML detected (rules missed): insult, toxicity',
//   ruleMatches: [],
//   mlCategories: ['insult', 'toxicity']
// }

Performance Comparison

Method                      Latency        Use Case
isProfane() (sync)          ~0.1ms         Quick rule-based checks
checkProfanity() (sync)     ~0.2ms         Detailed rule-based analysis
checkProfanityAsync()       ~30-100ms      Full hybrid analysis
checkProfanityBatchAsync()  ~10-30ms/item  Bulk moderation

For high-throughput systems, run the sync methods as a cheap first pass and reserve the async ML methods for borderline cases or for text that the sync checks let through.
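That two-pass pattern can be sketched as follows. FilterLike is a hypothetical narrow interface covering just the two methods used, so the sketch stays decoupled from the real class; in practice a HybridFilter instance satisfies it.

```typescript
// Hypothetical minimal interface; a real HybridFilter provides both methods.
interface FilterLike {
  isProfane(text: string): boolean;
  isToxicAsync(text: string): Promise<boolean>;
}

// First pass: cheap sync rule checks flag obvious profanity immediately.
// Second pass: only texts the rules let through pay the ML latency cost.
async function moderateBatch(filter: FilterLike, texts: string[]): Promise<boolean[]> {
  return Promise.all(
    texts.map((text) =>
      filter.isProfane(text) ? Promise.resolve(true) : filter.isToxicAsync(text),
    ),
  );
}
```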

Cross-References