I wanted real-time moderation in Flick.
Users type something. The system checks the content instantly. Harmful content gets blocked before publishing.
The idea looked simple.
The implementation was not.
My first approach used regex.
Every time the user typed:
- evaluate the full paragraph
- run multiple regex patterns
- check censored variations
- handle edge cases
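A minimal sketch of what that first version roughly looked like. The word list, `escapeRegExp`, and `findViolationsNaive` are illustrative names, not the code that shipped in Flick.

```typescript
// Hypothetical word list; the real dataset is assumed to be much larger.
const bannedWords: string[] = ["bitch", "ass"];

// Escape regex metacharacters so each word is matched literally.
function escapeRegExp(word: string): string {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// One pattern per banned word, anchored on word boundaries.
const patterns: RegExp[] = bannedWords.map(
  (word) => new RegExp(`\\b${escapeRegExp(word)}\\b`, "i")
);

// Called on every input event. Each pattern rescans the full paragraph,
// so the cost grows with both the text length and the number of patterns.
function findViolationsNaive(paragraph: string): string[] {
  return bannedWords.filter((_, i) => patterns[i].test(paragraph));
}
```

With two words this is cheap. The cost comes from running one scan per pattern, on the whole paragraph, on every keystroke.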
At first, performance looked fine.
Then the banned word list grew.
The browser started lagging during typing.
Typing delays destroy user experience fast. Users notice input lag immediately.
The moderation system worked. The product felt broken.
That forced me to rethink the approach.
Why regex became expensive
The issue was not regex itself.
The issue was frequency.
Moderation ran on every content change:
- every keystroke
- every paste
- every edit
The workload kept increasing:
- more banned words
- more regex patterns
- larger paragraphs
- more edge cases
I also wanted multiple moderation modes.
Example:
Non-strict mode:
bitch → blocked
b*tch → allowed
Strict mode:
b!tch → blocked
b I T C H → blocked
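One way to express the two modes is a normalization step that only runs in strict mode. This is a sketch under assumptions: the `LOOKALIKES` table, `prepareText`, and `isBlocked` are hypothetical names, and the token lookup stands in for the real matcher.

```typescript
type ModerationMode = "strict" | "non-strict";

// Assumed look-alike substitutions; a real table would be longer.
const LOOKALIKES: Record<string, string> = { "!": "i", "@": "a", "$": "s" };

function prepareText(text: string, mode: ModerationMode): string {
  const lower = text.toLowerCase();
  if (mode === "non-strict") return lower;

  // Strict mode: map look-alike symbols back to letters ("b!tch" -> "bitch").
  const substituted = lower.replace(/[!@$]/g, (ch) => LOOKALIKES[ch]);

  // Join runs of single letters separated by one space ("b i t c h" -> "bitch").
  return substituted.replace(/\b(?:[a-z] )+[a-z]\b/g, (run) =>
    run.replace(/ /g, "")
  );
}

function isBlocked(
  text: string,
  mode: ModerationMode,
  banned: Set<string>
): boolean {
  return prepareText(text, mode)
    .split(/\s+/)
    .some((token) => banned.has(token));
}
```

The substitution is deliberately crude. It also rewrites a trailing "!" in ordinary text, so a real normalization pipeline needs more care than this.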
Then came substring issues.
Example:
ass should not block assistant
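Stated as a rule on match positions, the requirement is that a match only counts when it is not embedded in a longer word. A small sketch, assuming ASCII letters and digits are the only word characters; `isBoundary` and `isWholeWord` are illustrative names.

```typescript
// True when `index` is outside the text or points at a non-word character.
// Assumption: only ASCII letters and digits count as word characters.
function isBoundary(text: string, index: number): boolean {
  if (index < 0 || index >= text.length) return true;
  return !/[a-z0-9]/i.test(text[index]);
}

// A match at [start, end) counts only when it is not part of a longer word.
function isWholeWord(text: string, start: number, end: number): boolean {
  return isBoundary(text, start - 1) && isBoundary(text, end);
}
```

For "my assistant", the candidate match "ass" at positions 3 to 6 is followed by "i", so `isWholeWord` rejects it.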
Regex patterns became larger and harder to maintain.
Performance dropped further.
At some point, the browser froze while evaluating content continuously.
The problem stopped being moderation logic.
The problem became text processing performance.
Rethinking the system
I needed:
- single-pass matching
- predictable performance
- scalable pattern matching
- real-time execution
Regex was not the right tool anymore.
I switched to the Aho-Corasick algorithm.
Why Aho-Corasick solved the issue
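Aho-Corasick builds a trie from all banned words, then adds failure links that turn the trie into a matching automaton.
Instead of checking patterns separately, the algorithm scans the input once and detects all matches together.
Scan time grows with the input length and the number of matches, not with the size of the word list.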
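A compact sketch of the technique in TypeScript. It is an illustration, not Flick's production code; the names `AhoCorasick`, `Match`, and `search` are my own, and normalization and word-boundary checks are left out.

```typescript
interface Match {
  word: string;
  start: number; // inclusive index in the scanned text
  end: number; // exclusive index in the scanned text
}

class AhoCorasick {
  // Node 0 is the root; each node stores its outgoing edges by character.
  private edges: Map<string, number>[] = [new Map()];
  private fail: number[] = [0];
  private output: string[][] = [[]];

  constructor(words: string[]) {
    for (const word of words) {
      if (word.length > 0) this.insert(word);
    }
    this.buildFailureLinks();
  }

  private addNode(): number {
    this.edges.push(new Map());
    this.fail.push(0);
    this.output.push([]);
    return this.edges.length - 1;
  }

  private insert(word: string): void {
    let node = 0;
    for (let i = 0; i < word.length; i++) {
      let child = this.edges[node].get(word[i]);
      if (child === undefined) {
        child = this.addNode();
        this.edges[node].set(word[i], child);
      }
      node = child;
    }
    if (this.output[node].indexOf(word) === -1) this.output[node].push(word);
  }

  // Breadth-first pass: a node's failure link points to the longest proper
  // suffix of its path that is also a path from the root.
  private buildFailureLinks(): void {
    const queue: number[] = Array.from(this.edges[0].values());
    for (let head = 0; head < queue.length; head++) {
      const node = queue[head];
      this.edges[node].forEach((child, ch) => {
        let f = this.fail[node];
        while (f !== 0 && !this.edges[f].has(ch)) f = this.fail[f];
        this.fail[child] = this.edges[f].get(ch) ?? 0;
        // Inherit words that end at the failure target ("she" also ends "he").
        this.output[child].push(...this.output[this.fail[child]]);
        queue.push(child);
      });
    }
  }

  // Single left-to-right pass over the text, reporting every banned word.
  search(text: string): Match[] {
    const matches: Match[] = [];
    let node = 0;
    for (let i = 0; i < text.length; i++) {
      const ch = text[i];
      while (node !== 0 && !this.edges[node].has(ch)) node = this.fail[node];
      node = this.edges[node].get(ch) ?? 0;
      for (const word of this.output[node]) {
        matches.push({ word, start: i - word.length + 1, end: i + 1 });
      }
    }
    return matches;
  }
}

// Usage: all overlapping matches come out of one scan.
const demoMatcher = new AhoCorasick(["he", "she", "his", "hers"]);
const demoWords = demoMatcher.search("ushers").map((m) => m.word);
// demoWords is ["she", "he", "hers"]
```

The automaton is built once when the word list loads. Each keystroke then only pays for the `search` pass.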
Before:
- multiple regex executions
- repeated scans
- increasing slowdown
After:
- single traversal
- stable performance
- responsive typing
The difference became obvious immediately.
Typing lag disappeared.
The browser stopped freezing.
Adding more banned words no longer caused noticeable input lag.
This mattered because moderation runs continuously while the user types.
Client-side moderation vs server-side moderation
Performance improved.
Trust did not.
Client-side moderation alone is insecure.
Users can bypass frontend logic easily:
- disable JavaScript
- send requests directly to the API
- modify payloads
- bypass UI restrictions
So I split moderation into two layers.
Client-side moderation
The client handles:
- instant feedback
- live moderation
- typing experience
The banned word dataset gets fetched from the server and evaluated locally.
This avoids:
- API calls on every keystroke
- backend overload
- typing latency
Without local evaluation, the server would process moderation requests continuously during typing.
That approach would scale poorly.
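A sketch of the client side under assumptions. `LocalModerator`, `WordListLoader`, and `initModerator` are hypothetical names, the loader is assumed to wrap a request to whatever endpoint serves the dataset, and a simple token lookup stands in for the real matcher.

```typescript
// Assumed loader signature: in the browser this would wrap a request to the
// endpoint that serves the dataset (the endpoint is not specified here).
type WordListLoader = () => Promise<string[]>;

class LocalModerator {
  private banned = new Set<string>();

  // Replace the local dataset after it has been downloaded.
  load(words: string[]): void {
    this.banned = new Set(words.map((w) => w.toLowerCase()));
  }

  // Runs entirely in the browser: no network call per keystroke.
  check(text: string): string[] {
    const tokens = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
    return tokens.filter((t) => this.banned.has(t));
  }
}

// Download the dataset, then keep every later check local.
async function initModerator(
  moderator: LocalModerator,
  loadWords: WordListLoader
): Promise<void> {
  moderator.load(await loadWords());
}
```

The input handler calls `check` synchronously on each change. The network is only involved when the dataset is loaded.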
Server-side moderation
The server performs final validation before publishing.
This prevents moderation bypasses.
Even if someone manipulates the frontend, the content still goes through server-side checks before storage.
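A sketch of the server-side gate, framework-free so the idea stays visible. `handlePublish`, the request shape, the `clientSaysClean` flag, and the 201/422 status codes are assumptions for illustration; the token lookup again stands in for the real matcher.

```typescript
interface PublishRequest {
  text: string;
  clientSaysClean?: boolean; // hypothetical client-supplied flag, ignored on purpose
}

interface PublishResult {
  status: 201 | 422;
  violations: string[];
}

function handlePublish(
  req: PublishRequest,
  banned: ReadonlySet<string>,
  save: (text: string) => void
): PublishResult {
  // Never trust a client-supplied verdict; re-run the check on the raw text.
  const tokens = req.text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  const violations = tokens.filter((t) => banned.has(t));

  if (violations.length > 0) return { status: 422, violations };

  // Storage happens only after the server-side check has passed.
  save(req.text);
  return { status: 201, violations: [] };
}
```

A forged request that claims to be clean still fails here, because the verdict is computed from the text alone.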
The duplication is intentional.
The client handles speed.
The server handles trust.
Handling repeated evaluations
I noticed another issue.
Users often:
- edit the same sentence repeatedly
- paste identical content
- trigger moderation for unchanged text
Running the full moderation pipeline repeatedly wastes resources.
I added moderation caching.
Previously evaluated content reuses cached results instead of re-processing the entire pipeline.
This reduced repeated work significantly, especially for the expensive moderation stages later in the pipeline.
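One way to sketch such a cache is a small least-recently-used map keyed by the exact text. The key choice, the `ModerationCache` name, the `Verdict` shape, and the default size limit are all assumptions; hashing the text or caching per paragraph would work too.

```typescript
interface Verdict {
  blocked: boolean;
  matches: string[];
}

class ModerationCache {
  private entries = new Map<string, Verdict>();
  private evaluate: (text: string) => Verdict;
  private maxEntries: number;

  constructor(evaluate: (text: string) => Verdict, maxEntries = 200) {
    this.evaluate = evaluate;
    this.maxEntries = maxEntries;
  }

  check(text: string): Verdict {
    const cached = this.entries.get(text);
    if (cached !== undefined) {
      // Re-insert to mark as most recently used (Map keeps insertion order).
      this.entries.delete(text);
      this.entries.set(text, cached);
      return cached;
    }

    // Cache miss: run the full pipeline once and remember the result.
    const verdict = this.evaluate(text);
    this.entries.set(text, verdict);

    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry: the first key in insertion order.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return verdict;
  }
}
```

The cache must be cleared whenever the banned word dataset or the moderation mode changes, otherwise stale verdicts leak through.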
Moderation is also a UX problem
Most moderation discussions focus on detection accuracy.
Performance matters equally.
If moderation:
- freezes typing
- delays publishing
- blocks valid content incorrectly
users lose trust in the platform.
A moderation system should stay fast enough to feel invisible during typing.
What I want to improve next
Improvements I have planned:
- incremental diff-based evaluation
- Web Worker-based processing
- contextual moderation
- better normalization pipeline
- trust-score-based moderation rules
The current system works well for real-time moderation.
User behavior keeps changing. Moderation systems need constant iteration.
Final takeaway
The hardest part was not detecting banned words.
The hardest part was running moderation continuously without damaging the typing experience.