Designing Real-Time Text Moderation Without Freezing the Browser

I wanted real-time moderation in Flick.

Users type something. The system checks the content instantly. Harmful content gets blocked before publishing.

The idea looked simple.

The implementation was not.

My first approach used regex.

Every time the user typed, the code had to:

evaluate the full paragraph
run multiple regex patterns
check censored variations
handle edge cases
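A minimal sketch of what this kind of per-keystroke regex check tends to look like. This is not the actual Flick code: the word list, `escapeRegex`, and `findViolationsNaive` are hypothetical names chosen for illustration.

```typescript
// Hypothetical banned list; a real one would be much larger.
const bannedWords: string[] = ["darn", "heck", "jerk"];

// Escape regex metacharacters so each word is matched literally.
function escapeRegex(word: string): string {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// One case-insensitive, word-bounded pattern per banned word.
const patterns = bannedWords.map((word) => ({
  word,
  regex: new RegExp(`\\b${escapeRegex(word)}\\b`, "i"),
}));

// Called on every keystroke. Each pattern rescans the full text,
// so the cost grows with (text length x number of patterns).
function findViolationsNaive(text: string): string[] {
  return patterns
    .filter(({ regex }) => regex.test(text))
    .map(({ word }) => word);
}
```

With three words this is cheap. With thousands of words and a long paragraph, the same function does thousands of full scans per keystroke.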

At first, performance looked fine.

Then the banned word list grew.

The browser started lagging during typing.

Typing delays destroy user experience fast. Users notice input lag immediately.

The moderation system worked. The product felt broken.

That forced me to rethink the approach.

Why regex became expensive

The issue was not regex itself.

The issue was frequency.

Moderation ran on every content change:

every keystroke
every paste
every edit

The workload kept increasing:

more banned words
more regex patterns
larger paragraphs
more edge cases

I also wanted multiple moderation modes.

Example:

Non-strict mode:

bitch → blocked
b*tch → allowed

Strict mode:

b!tch → blocked
b I T C H → blocked
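One way to support both modes is to normalize the text before matching, and to normalize harder in strict mode. The sketch below is an assumption about how that could work, not the pipeline Flick uses. `Mode`, `LOOKALIKES`, and `normalize` are hypothetical names, the lookalike table is deliberately tiny, and a neutral placeholder word stands in for the banned word.

```typescript
type Mode = "strict" | "non-strict";

// Hypothetical lookalike table; a real one would be far larger.
const LOOKALIKES: Record<string, string> = {
  "!": "i", "1": "i", "@": "a", "$": "s", "0": "o", "3": "e",
};

function normalize(text: string, mode: Mode): string {
  const lowered = text.toLowerCase();
  // Non-strict mode only folds case, so obfuscated spellings pass through.
  if (mode === "non-strict") return lowered;

  // Strict mode: swap lookalike characters for the letters they imitate.
  const swapped = lowered
    .split("")
    .map((ch) => LOOKALIKES[ch] ?? ch)
    .join("");

  // Join runs of single letters separated by spaces: "i d i o t" -> "idiot".
  return swapped.replace(/\b[a-z](?: [a-z]\b)+/g, (run) => run.replace(/ /g, ""));
}
```

The matcher then runs on the normalized text. The cost of strict mode shows up here too: mapping `!` to `i` also rewrites ordinary punctuation, so a stricter normalizer raises the risk of blocking valid content.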

Then came substring issues.

Example:

ass should not block assistant
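A common way to handle this is to accept a match only when it is not embedded in a longer word. The helper names below (`isWordChar`, `isWholeWord`) are hypothetical, and the check assumes ASCII letters and digits for simplicity.

```typescript
// True if the character at `index` is an ASCII letter or digit.
// Out-of-range indexes count as "not a word character".
function isWordChar(text: string, index: number): boolean {
  return index >= 0 && index < text.length && /[a-z0-9]/i.test(text.charAt(index));
}

// A match covering [start, end) counts only if the characters
// on both sides are not part of a word.
function isWholeWord(text: string, start: number, end: number): boolean {
  return !isWordChar(text, start - 1) && !isWordChar(text, end);
}
```

The check runs once per candidate match, so it works the same way no matter which matcher produced the match positions.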

Regex patterns became larger and harder to maintain.

Performance dropped further.

At some point, the browser froze while evaluating content continuously.

The problem stopped being moderation logic.

The problem became text processing performance.

Rethinking the system

I needed:

single-pass matching
predictable performance
scalable pattern matching
real-time execution

Regex was not the right tool anymore.

I switched to the Aho-Corasick algorithm.

Why Aho-Corasick solved the issue

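Aho-Corasick builds a trie from all banned words, then links each node to the longest suffix of its path that is also a prefix in the trie. On a mismatch, the scan follows that link instead of restarting from an earlier position in the text.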

Instead of checking patterns separately, the algorithm scans the input once and detects all matches together.
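To make the single-scan idea concrete, here is a compact Aho-Corasick sketch. It is an illustrative implementation, not the one running in Flick. The class and field names are my own, and the example uses harmless placeholder words.

```typescript
interface Match { word: string; start: number; end: number; }

class TrieNode {
  next = new Map<string, TrieNode>(); // child node per character
  fail: TrieNode | null = null;       // fallback node on mismatch
  outputs: string[] = [];             // words that end at this node
}

class AhoCorasick {
  private root = new TrieNode();

  constructor(words: string[]) {
    for (const word of words) this.insert(word);
    this.buildFailureLinks();
  }

  private insert(word: string): void {
    let node = this.root;
    for (let i = 0; i < word.length; i++) {
      const ch = word.charAt(i);
      let child = node.next.get(ch);
      if (!child) {
        child = new TrieNode();
        node.next.set(ch, child);
      }
      node = child;
    }
    node.outputs.push(word);
  }

  // Breadth-first pass: each node's failure link points to the longest
  // proper suffix of its path that is also a prefix in the trie.
  private buildFailureLinks(): void {
    const queue: TrieNode[] = [];
    this.root.next.forEach((child) => {
      child.fail = this.root;
      queue.push(child);
    });
    // The queue grows while we iterate, which gives breadth-first order.
    for (const node of queue) {
      node.next.forEach((child, ch) => {
        let fallback = node.fail;
        while (fallback && !fallback.next.has(ch)) fallback = fallback.fail;
        const target = fallback?.next.get(ch) ?? this.root;
        child.fail = target;
        // Inherit the words that end at the fallback node.
        child.outputs = child.outputs.concat(target.outputs);
        queue.push(child);
      });
    }
  }

  // Single left-to-right scan that reports every word found.
  search(text: string): Match[] {
    const matches: Match[] = [];
    let node = this.root;
    for (let i = 0; i < text.length; i++) {
      const ch = text.charAt(i);
      while (node !== this.root && !node.next.has(ch)) node = node.fail ?? this.root;
      node = node.next.get(ch) ?? this.root;
      for (const word of node.outputs) {
        matches.push({ word, start: i - word.length + 1, end: i + 1 });
      }
    }
    return matches;
  }
}

const matcher = new AhoCorasick(["he", "she", "his", "hers"]);
console.log(matcher.search("ushers").map((m) => m.word)); // ["she", "he", "hers"]
```

The automaton is built once when the word list loads. Each keystroke then costs one pass over the text, whatever the size of the list.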

This changed performance completely.

Before:

multiple regex executions
repeated scans
increasing slowdown

After:

single traversal
stable performance
responsive typing

The difference became obvious immediately.

Typing lag disappeared.

The browser stopped freezing.

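Adding more banned words no longer had a noticeable effect on input latency, because scan time depends on the length of the text and the number of matches, not on the size of the word list.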

This mattered because moderation runs continuously while the user types.

Client-side moderation vs server-side moderation

Performance improved.

Trust did not.

Client-side moderation alone is insecure.

Users can bypass frontend logic easily:

disable JavaScript
inject requests
modify payloads
bypass UI restrictions

So I split moderation into two layers.

Client-side moderation

The client handles:

instant feedback
live moderation
typing experience

The banned word dataset gets fetched from the server and evaluated locally.

This avoids:

API calls on every keystroke
backend overload
typing latency

Without local evaluation, the server would process moderation requests continuously during typing.

That approach would scale poorly.
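The shape of this can be sketched as "load once, check locally". Everything named here is an assumption: `WordListLoader` stands for whatever call fetches the dataset, and `buildChecker` is a simple Set-based stand-in for the real matcher.

```typescript
// Hypothetical loader. In the browser this could wrap a request to an
// endpoint that returns the banned word list as JSON.
type WordListLoader = () => Promise<string[]>;

// Stand-in checker: lowercases the text and tests whole tokens against a Set.
function buildChecker(words: string[]): (text: string) => boolean {
  const banned = new Set(words.map((w) => w.toLowerCase()));
  return (text) =>
    text
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .some((token) => banned.has(token));
}

// One network round trip at startup. Every later check is a local call.
async function initLocalModeration(
  load: WordListLoader,
): Promise<(text: string) => boolean> {
  const words = await load();
  return buildChecker(words);
}
```

The returned function can then be called from the editor's input handler on every change without touching the network.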

Server-side moderation

The server performs final validation before publishing.

This prevents moderation bypasses.

Even if someone manipulates the frontend, the content still goes through server-side checks before storage.
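A framework-free sketch of that final check, under stated assumptions: the request body carries a `content` field, `isBlocked` is whatever matcher the server runs, and the status codes are illustrative choices rather than Flick's actual API.

```typescript
type PublishResult =
  | { ok: true; content: string }
  | { ok: false; status: number; reason: string };

function validateBeforePublish(
  body: unknown,
  isBlocked: (text: string) => boolean,
): PublishResult {
  // Only the raw text counts. Any "already checked" flag sent by the
  // client is ignored, because the client cannot be trusted.
  const content = (body as { content?: unknown } | null)?.content;
  if (typeof content !== "string" || content.trim() === "") {
    return { ok: false, status: 400, reason: "content must be a non-empty string" };
  }
  if (isBlocked(content)) {
    return { ok: false, status: 422, reason: "content violates moderation rules" };
  }
  return { ok: true, content };
}
```

A forged payload that claims it already passed the client check still gets rejected here, because the server re-runs the check on the text itself.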

The duplication is intentional.

The client handles speed.

The server handles trust.

Handling repeated evaluations

I noticed another issue.

Users often:

edit the same sentence repeatedly
paste identical content
trigger moderation for unchanged text

Running the full moderation pipeline repeatedly wastes resources.

I added moderation caching.

Previously evaluated content reuses cached results instead of re-processing the entire pipeline.

This reduced repeated work significantly, especially for the expensive moderation stages later in the pipeline.
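A small bounded cache is enough to illustrate the idea. This sketch assumes results are keyed by the exact text and that evicting the least recently used entry is acceptable. `ModerationCache` and its size limit are hypothetical.

```typescript
class ModerationCache<T> {
  private entries = new Map<string, T>();
  private maxEntries: number;

  constructor(maxEntries = 500) {
    this.maxEntries = maxEntries;
  }

  // Returns the cached result for `text`, or computes and stores it.
  getOrCompute(text: string, compute: (text: string) => T): T {
    if (this.entries.has(text)) {
      const cached = this.entries.get(text) as T;
      // Re-insert so this entry counts as most recently used.
      this.entries.delete(text);
      this.entries.set(text, cached);
      return cached;
    }
    const result = compute(text);
    this.entries.set(text, result);
    if (this.entries.size > this.maxEntries) {
      // Maps iterate in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return result;
  }
}
```

For long documents, keying by a hash of the text instead of the text itself would keep memory use down. The cache also has to be cleared whenever the banned word list changes, or stale verdicts will be served.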

Moderation is also a UX problem

Most moderation discussions focus on detection accuracy.

Performance matters equally.

If moderation:

freezes typing
delays publishing
blocks valid content incorrectly

users lose trust in the platform.

A moderation system should stay fast enough to feel invisible during typing.

What I want to improve next

Improvements I have in mind:

incremental diff-based evaluation
Web Worker based processing
contextual moderation
better normalization pipeline
trust-score based moderation rules

The current system works well for real-time moderation.

User behavior keeps changing. Moderation systems need constant iteration.

Final takeaway

The hardest part was not detecting banned words.

The hardest part was running moderation continuously without damaging the typing experience.
