Spam filters: Making them work
23 September, 2008 09:10
3. Invest in newer technologies. Trade old, keyword-based technologies for newer ones, such as graylisting tools, says Michael Briggs, director of information technology at George Washington University Law School.
4. Enlist users to help maintain your whitelist. Users are constantly developing relationships with new clients, vendors and other contacts. If you rely on a whitelist of trusted senders, remind users to keep you informed of new contacts so their messages get through quickly and don't risk being flagged as spam.
Better yet, let users set their own spam filter parameters, says Andrew Lochart, vice president of product marketing at e-mail security vendor Proofpoint. Some business travelers, for instance, might actually want weekly airline or car rental notices.
To minimize the false positives caused by spam filters, it helps to understand how these filters work. Here are some popular techniques, in rough chronological order of their development:
Keyword-based and Bayesian filters. The earliest filters searched a message's subject line and body for particular words, such as "Viagra." More sophisticated versions use Bayesian analysis, which estimates the probability that a message is spam from how often its words have appeared in known spam versus legitimate mail.
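To make the Bayesian idea concrete, here is a minimal sketch of a naive Bayes word filter. It is an illustration under simplifying assumptions, not any vendor's implementation: the function names `train` and `spam_probability` are hypothetical, words are split on whitespace only, add-one smoothing is used, and the training set must contain at least one spam and one legitimate message.

```python
import math
from collections import Counter

def train(messages):
    """Count word occurrences per class from (text, is_spam) pairs."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return counts, totals

def spam_probability(text, counts, totals):
    """Estimate P(spam | words) with naive Bayes and add-one smoothing."""
    vocab = set(counts[True]) | set(counts[False])
    spam_words = sum(counts[True].values())
    ham_words = sum(counts[False].values())
    # Start from the prior odds: how much of the training mail was spam.
    log_odds = math.log(totals[True] / totals[False])
    for word in text.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_spam = (counts[True][word] + 1) / (spam_words + len(vocab))
        p_ham = (counts[False][word] + 1) / (ham_words + len(vocab))
        log_odds += math.log(p_spam / p_ham)
    # Convert log-odds back to a probability between 0 and 1.
    return 1 / (1 + math.exp(-log_odds))

counts, totals = train([
    ("cheap viagra offer", True),
    ("viagra discount now", True),
    ("meeting agenda attached", False),
    ("lunch meeting tomorrow", False),
])
print(spam_probability("viagra offer", counts, totals))      # above 0.5
print(spam_probability("meeting tomorrow", counts, totals))  # below 0.5
```

A filter built this way flags a message when the probability crosses a threshold. Setting that threshold higher trades missed spam for fewer false positives.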
Challenge-response. Unrecognized senders receive a reply asking them to validate themselves, typically by typing the letters and numbers that appear in an on-screen image, a test known as a CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart).
Blacklisting, whitelisting and reputation listing. The filter evaluates not the message, but the characteristics of the sender.
- Blacklists are databases of the IP addresses of known spammers. The spam filter rejects e-mail from those addresses.
- Whitelists collect the IP addresses of trusted e-mail sources, and the filter automatically accepts e-mail from those addresses. Many spam filters use both blacklists and whitelists.
- Reputation lists broaden blacklists and whitelists by considering not only the sending IP address, but also the entire domain.
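The list checks described above can be sketched in a few lines. This is a simplified illustration, not how any particular product works: the function name `classify_sender`, the sample address ranges (taken from the blocks reserved for documentation) and the choice to let the whitelist override the blacklist are all assumptions.

```python
import ipaddress

# Hypothetical lists; real deployments load these from a database or service.
WHITELIST = [ipaddress.ip_network("192.0.2.0/24")]
BLACKLIST = [ipaddress.ip_network("203.0.113.7/32")]

def classify_sender(ip, whitelist=WHITELIST, blacklist=BLACKLIST):
    """Decide what to do with mail based only on the sending IP address."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in whitelist):
        return "accept"   # trusted source: skip further filtering
    if any(addr in net for net in blacklist):
        return "reject"   # known spammer: refuse the message
    return "scan"         # unknown sender: hand off to content filters

print(classify_sender("192.0.2.15"))    # accept
print(classify_sender("203.0.113.7"))   # reject
print(classify_sender("198.51.100.1"))  # scan
```

Checking the whitelist first is one way to keep trusted senders from being caught as false positives if their address also lands on a blacklist.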
Graylisting. A system temporarily rejects e-mail from an unknown IP address, returning a temporary-failure response to the sending system. Theoretically, a "real" sender's mail server will resend the message after a delay; spamming software typically will not bother to do so.
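The graylisting decision can be sketched as follows. The `Graylist` class, the five-minute minimum delay and the reply strings are illustrative assumptions. Keying on the combination of IP address, sender and recipient, rather than on the IP address alone, is a common choice and is also an assumption here.

```python
class Graylist:
    """Temporarily reject first-time senders; accept them on a later retry."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay   # seconds a sender must wait before retrying
        self.first_seen = {}         # (ip, sender, recipient) -> first attempt time

    def check(self, ip, sender, recipient, now):
        key = (ip, sender, recipient)
        # Record the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 OK"
        # A 4xx reply tells the sending server the failure is temporary.
        return "451 Temporary failure, please try again later"

gl = Graylist()
print(gl.check("198.51.100.1", "a@example.com", "b@example.org", now=0))
print(gl.check("198.51.100.1", "a@example.com", "b@example.org", now=600))
```

In this sketch the first call returns the temporary failure and the retry ten minutes later is accepted. The cost is the same delay for every legitimate first-time sender.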
Tarpitting. A service on the mail server slows down incoming connections for as long as possible. The delay is meant to discourage spammers by forcing them to take more time to send their spam. But legitimate e-mail also takes longer.
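The core of tarpitting is a deliberate pause before each server reply. The sketch below shows only that idea. The function name `tarpit_replies`, the five-second default and the sample reply strings are assumptions, and a real mail server would apply the delay inside its connection handling.

```python
import time

def tarpit_replies(replies, delay_seconds=5.0, sleep=time.sleep):
    """Yield each server reply only after a deliberate pause."""
    for reply in replies:
        sleep(delay_seconds)  # stall the connection before answering
        yield reply

# Short delay here so the example finishes quickly.
for line in tarpit_replies(["220 mail.example.com ready", "250 OK"],
                           delay_seconds=0.1):
    print(line)
```

Because the pause applies to every connection in this sketch, it shows the drawback noted above: legitimate senders wait just as long as spammers unless the delay is limited to suspicious sources.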
Recurrent pattern detection. These systems monitor the Internet for patterns in spam and maintain and update central databases of such patterns. Company e-mail systems using RPD query the database and reject e-mail identified as spam.