Building an email/chat filter at work : ecomms 101

There are plenty of reasons for a profanity filter : maybe you’re running a site to which children have access. Or maybe you’re a big company which wishes to protect itself and its employees against reputational risk from work emails and chats, and the filter is part of your ecomms surveillance and monitoring process.

Building a great filter is a much harder task than it sounds, I’m afraid. Here are some of the lessons we learned along the way. Hope they help!


The first thing you need to determine is exactly what you wish to block :

  • Cusswords
  • Project codenames
  • Negative but non-sweary phrases like ‘rip off’
  • Credit card details
  • etc.

Once you’ve decided and compiled a list, the fun begins. To keep things really simple, let’s assume the only words we want to block are ‘ass’ and ‘blue’.


The first thing we might look at is character replacement : @$$ or 8lue or 8lu3 and so on. This is a simple job and you can create a one-to-one lookup table. We’ve even found a link for you here.
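As a rough illustration of the lookup-table idea, here is a minimal Python sketch. The table contents and the function name `normalise_characters` are our own assumptions for this example, not a complete or recommended mapping; you would extend the table to suit your own list.

```python
# Hypothetical one-to-one substitution table (an assumption for illustration).
SUBSTITUTIONS = {"@": "a", "$": "s", "8": "b", "3": "e", "0": "o", "1": "l"}

# str.maketrans builds a translation table from single-character keys.
_TABLE = str.maketrans(SUBSTITUTIONS)


def normalise_characters(text: str) -> str:
    """Lower-case the text and map look-alike characters back to letters."""
    return text.lower().translate(_TABLE)


print(normalise_characters("@$$"))   # ass
print(normalise_characters("8lu3"))  # blue
```

Note that a blanket mapping like this also rewrites legitimate digits and symbols (a price or a time, say), so it is probably safest to run it on a copy used only for matching and leave the user’s original text alone.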

Now we begin to think about character repetition. Bbbbbblue and so on. This also seems easy but is in fact a bit more complex and involves a bit of statistics and parsing.


In English, for example, there is no word containing the string ‘bbb’, so if we see a repetition like bbblue we can be certain it needs processing and can strip it back to just one ‘b’, i.e. ‘blue’. When we consider ‘sss’, however, we know that many words contain ‘ss’, especially after a vowel. We can imagine, for example, that folks might incorrectly type ‘crosssection’ and would be annoyed if we processed that to ‘crosection’. So it is not only the letter itself but also the letters before and after it that determine when, and how much, to process repetitions.
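Here is one simple way that context rule could be sketched in Python. It is a hand-written heuristic, not the statistical approach itself: the set `OFTEN_DOUBLED` and the "preceded by a vowel" test are assumptions we picked for illustration, and the input is assumed to be lower-cased already.

```python
import re

VOWELS = set("aeiou")
# Assumption: letters that are often legitimately doubled after a vowel.
OFTEN_DOUBLED = set("sltfmnpr")

# Matches a run of three or more of the same character.
_RUN = re.compile(r"(.)\1{2,}")


def collapse_repeats(text: str) -> str:
    """Shrink runs of 3+ identical characters to one, or to two when a
    commonly doubled letter follows a vowel."""
    def shrink(match):
        letter = match.group(1)
        start = match.start()
        previous = text[start - 1] if start > 0 else ""
        keep = 2 if letter in OFTEN_DOUBLED and previous in VOWELS else 1
        return letter * keep

    return _RUN.sub(shrink, text)


print(collapse_repeats("bbbbbblue"))     # blue
print(collapse_repeats("crosssection"))  # crossection
```

With this rule ‘crosssection’ keeps its double ‘s’ rather than being cut down to ‘crosection’. A real filter would likely replace the hard-coded sets with letter-pair frequencies taken from a dictionary.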

Now it is time to move on to conjoined words. If we have ‘blue’ in our filter I’d want it to trigger on bluehead, blueface, bluetastic and so on … maybe even #thisissoblue. But … if ‘ass’ is a trigger, do we really want to trigger on assume? This is what’s entertainingly known as the ‘Scunthorpe Problem’ (look at the word and I’m sure you’ll understand why). So you need to figure out a way to stem words while reducing the number of false positives, as they’ll really annoy users. A whitelist which overrides these triggers in cases like ‘assume’ is one way to approach this. You’d have to build your own whitelist, parsed from a dictionary to match your own trigger logic.
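To make the whitelist idea concrete, here is a small Python sketch. The names `BLOCKED`, `WHITELIST` and `find_triggers` are hypothetical, and the whitelist is hand-typed for the example; in practice you would generate it from a dictionary as described above.

```python
import re

BLOCKED = {"ass", "blue"}
# Hypothetical whitelist: innocent words that happen to contain a trigger.
WHITELIST = {"assume", "assess", "class", "pass"}

_WORD = re.compile(r"[a-z]+")


def find_triggers(text: str) -> list:
    """Return (word, trigger) pairs for words that contain a blocked string
    and are not on the whitelist."""
    hits = []
    for word in _WORD.findall(text.lower()):
        if word in WHITELIST:
            continue  # the whitelist overrides any substring match
        for blocked in sorted(BLOCKED):
            if blocked in word:
                hits.append((word, blocked))
    return hits


print(find_triggers("I assume the bluehead will pass"))
# [('bluehead', 'blue')]
```

Checking the whitelist on whole words keeps the override cheap, but it also means every innocent word has to be listed explicitly, which is why the whitelist needs to be built to match your own trigger logic.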

We’ve really only covered some of the basics so far. We’ve not begun grappling with things like:

  • Levenshtein distance
  • How to deal with words that sound like a trigger word even when they are spelled differently, like ‘fuk’
  • How to deal with context … ‘I ripped him off’ is wrong but ‘I ripped my jeans’ is fine
  • How to machine learn from user input and continually improve the algorithms
  • How to support other languages whose grammar and characters differ
  • How to reduce latency, so that searching and transforming text against 40,000+ words doesn’t slow down the user’s PC
  • etc.
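For readers who haven’t met Levenshtein distance, it counts the single-character insertions, deletions and substitutions needed to turn one word into another. Below is a minimal Python sketch of the standard dynamic-programming version; the helper `near_matches` and its `max_distance` default are our own assumptions for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter word on the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def near_matches(word: str, blocked: set, max_distance: int = 1) -> list:
    """Hypothetical helper: blocked words within max_distance of word."""
    return [b for b in sorted(blocked) if levenshtein(word, b) <= max_distance]


print(levenshtein("blu", "blue"))            # 1
print(near_matches("blu", {"ass", "blue"}))  # ['blue']
```

Be careful with short triggers: plenty of innocent words sit within one edit of ‘ass’ (‘ask’, for instance), so a fixed distance threshold can bring the Scunthorpe Problem straight back.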

With luck you’ve now appreciated this is quite a big job. There’s also an important trade-off that you must always consider when designing a filter : how acceptable are false positives?

Our experience in working with blue-chip firms is that, in a work environment, false positives should be avoided at all costs : stopping someone writing something or popping up a warning when they’re doing nothing wrong frustrates folks a lot.

So with the filter you’ll need to find the happy balance between really loose logic that catches 99% of offending input but generates lots of false positives … and more focused logic that catches 95% of offending input but generates very few false positives (better, we find).
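One way to put numbers on this trade-off is to run each candidate filter over a hand-labelled sample and compare the two rates. The Python sketch below assumes a hypothetical `flag` function that returns something truthy when a message should be blocked; the function name `evaluate` and the sample messages are made up for illustration.

```python
def evaluate(flag, samples):
    """samples is a list of (text, is_offending) pairs.
    Returns (catch_rate, false_positive_rate)."""
    offending = [text for text, bad in samples if bad]
    clean = [text for text, bad in samples if not bad]
    caught = sum(1 for text in offending if flag(text))
    false_alarms = sum(1 for text in clean if flag(text))
    catch_rate = caught / len(offending) if offending else 0.0
    false_positive_rate = false_alarms / len(clean) if clean else 0.0
    return catch_rate, false_positive_rate


# Made-up sample and a deliberately naive filter, just to show the shape.
samples = [
    ("what an ass", True),
    ("so blue", True),
    ("I assume so", False),
    ("hello there", False),
]
print(evaluate(lambda text: "ass" in text, samples))  # (0.5, 0.5)
```

Tracking both numbers on the same sample makes it easier to see when a looser rule buys a few extra catches at the price of many more false alarms.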


Of course … if you’d rather outsource this to us, we’d be pleased to hear from you.

We’ve overcome these kinds of challenges before and are running our software at scale within FTSE100 companies, so we have lots of experience in protecting companies against electronic communications risks via work emails and chat.