The “Vegan” Diet - For Firms that Can’t Stomach Spam
Every law firm is feeling the frustration of being inundated with excessive amounts of junky, offensive or virus-infected e-mail messages. Unfortunately, there is little that we can do to prevent over-zealous or unscrupulous senders from actually sending these kinds of messages to us, so we are forced to identify and filter messages at the gateway or user desktop. Many firms are finding that the majority of their e-messaging traffic is unwanted and are looking for ways to deal with this growing problem.
At the Linux SIG business meeting during ILTA’s 2002 Annual Conference, we talked about developing a white paper that would help member firms ease the pain of being connected to the Internet. Our goal was to provide a low-cost, high-performance and flexible method of spam and virus filtering: an e-mail message cleanser that would be adaptable to any e-mail system by acting as an SMTP “proxy” to it. We code-named the project “Vegan,” a fun description of a mail system that can’t stomach spam.
The Many Faces of Spam
Before describing the Vegan e-mail system, let’s review the different types of unwanted e-mail messages that choke the gateways. These include incoming messages sent to former employees, mis-addressed messages, spam, viruses, file attachments, and outgoing mail delayed for any number of reasons. All this can overload messaging file systems, create bandwidth bottlenecks, and generally degrade the performance of the e-mail gateway.
Training—a Firm’s Best Weapon
One of the best ways to decrease the amount of spam is through education. Users should be instructed never to respond to spam. Even replying to a seemingly friendly “Remove Name from Mailing Lists” link can confirm the user’s address to the spammer. Users should also be advised to limit their use of the firm e-mail system for personal newsgroup postings and online purchases. The firm’s website should not list the e-mail addresses of attorneys and staff; rather, it should use form submissions for website-generated contacts. Nonetheless, even the most knowledgeable and e-mail-savvy users can be found by some spammers.
Identifying and Blocking Spam
There are many tools and methods available to identify and block spam. The most secure spam gateway is one that can be configured to block all mail unless the source is explicitly allowed. An example is the “TMDA” project hosted on the SourceForge website. Other e-mail gateways identify spam based on the source of a message. To do this they consult various blacklist databases (e.g., RBL or DNSBL) on the Internet, which track known spam sources and open relays. Yet another method is to examine the content of the message for common words, phrases and combinations of words, and then to score the message on where and how often they occur. An example of this type of filter is SpamAssassin, which scores messages and tags those that score above a definable threshold. Some of the newer Bayesian-based filtering software can even “learn” how to identify spam based on the expectations and decisions of the recipients.
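To make the content-scoring idea concrete, here is a minimal sketch in Python. The rule names, patterns, point values and the threshold of 5.0 are all hypothetical and chosen only for illustration; they are not SpamAssassin’s actual rule set, which is far larger and more carefully weighted.

```python
import re

# Hypothetical rules for illustration only: (rule name, pattern, points).
RULES = [
    ("FREE_OFFER", re.compile(r"\bfree\b", re.IGNORECASE), 1.5),
    ("ACT_NOW", re.compile(r"\bact now\b", re.IGNORECASE), 2.5),
    ("MANY_EXCLAMATIONS", re.compile(r"!{3,}"), 2.0),
]


def score_message(subject, body, rules=RULES):
    """Return (total points, names of matching rules) for one message."""
    text = subject + "\n" + body
    hits = [(name, points) for name, pattern, points in rules
            if pattern.search(text)]
    total = sum(points for _, points in hits)
    return total, [name for name, _ in hits]


def is_tagged(score, threshold=5.0):
    """A message is tagged as spam when its score exceeds the threshold."""
    return score > threshold


if __name__ == "__main__":
    score, matched = score_message("Act now!!!", "Get your FREE gift")
    print(score, matched, is_tagged(score))
```

Each rule contributes points independently, so no single phrase condemns a message; only the combined score is compared to the threshold.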
In addition to the different methods of identifying spam, there are a number of different installation options. Spam control can occur at the enterprise gateway or at the user desktop. The enterprise gateway filter is typically the first choice for businesses, as a single instance of a spam filter is generally cheaper and easier to manage and maintain than a filter on every desktop.
What Vegan Is and How It Works
The Vegan system is Linux-based and uses popular software, most of which is free. Sendmail is used by the majority of Internet e-mail gateways and ISPs, and is highly flexible. Also used is the MailScanner product, which controls the flow of messages into and out of the system. MailScanner passes each message to the virus scanner for virus checking and to SpamAssassin for spam checks, and it also consults its own rule sets for additional instructions on handling e-mail messages.
SpamAssassin uses an internal heuristic rule set to identify spam, and it can also consult online blacklists to compare the message source to known spam sources. The Sophos virus scanner is used for virus checking. These four packages form the heart of the Vegan system. Our white paper documents the configuration of the packages and also helps to identify many other details on configuring the Linux server for reliability and automation of various processes.
Sendmail is an MTA (Message Transfer Agent) that is feature-rich, flexible and complex. Sendmail is the first line of defense in preventing unwanted e-mail messages. In many cases we can use Sendmail to prevent such messages from even entering the system. If we know the exact source or destination of unwanted e-mail messages, Sendmail can be configured to discard them. The Vegan white paper has examples of using the Sendmail “access” database to drop messages for former employees and to drop messages from proven spam sources. This makes additional spam and virus checks on those messages unnecessary, and it saves CPU time for other activity. Vegan also provides the option of consulting various free DNSBL (DNS-based blacklist) sites for additional filtering.
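For readers curious about what consulting a DNSBL involves, the following Python sketch shows the usual convention: the octets of the sender’s IPv4 address are reversed and prepended to the blacklist’s zone, and the resulting name is looked up in DNS. A name that resolves means the address is listed. The zone dnsbl.example.org and the function names are hypothetical placeholders, and this is not how Sendmail implements the check internally.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up, e.g. 192.0.2.99 -> 99.2.0.192.<zone>."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the blacklist zone has an entry for this address.

    The resolver is injectable so the logic can be exercised without
    network access. Any lookup failure, including a DNS outage, is treated
    as "not listed" in this sketch.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    # dnsbl.example.org is a placeholder zone, not a real blacklist.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Treating lookup failures as “not listed” is a deliberate choice here: it means a blacklist outage lets mail through rather than blocking legitimate messages.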
MailScanner is the process controller that monitors and moves the messages through the various stages of filtering. MailScanner can also be used to append disclaimers and confidentiality notices to messages. It also improves reliability by restarting its processes periodically, so that memory leaks or other problems do not cause service outages.
Sophos produces an excellent virus-filtering program. Available in a Linux version, it is the only component package used by Vegan that is not free and open source. MailScanner passes the messages to Sophos for scanning and makes decisions on message routing based on the output of the scan. Using Sophos also adds a bit of “diversity” to the enterprise virus control system, so the gateway sometimes traps viruses that go undetected by the other virus scanning software in use.
SpamAssassin is a premier free GPL (General Public License) software package that identifies spam. SpamAssassin only identifies possible spam, leaving MailScanner to interpret the results of the scan and to make routing decisions.
By default, the Vegan system is configured to delete messages that score over 20 points, tag anything over six points as spam, and then forward these tagged messages to the user. Depending on the results that SpamAssassin returns, MailScanner can prepend the string “SPAM?” to the subject line. This gives users the freedom to create rules that delete the spam or move it to different folders, or to read it if they wish. This method is relatively risk-free for the IT staff, because all but the most blatant spam is forwarded to the user for personal evaluation.
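The default policy described above can be sketched in a few lines of Python. This only illustrates the decision logic; it is not MailScanner’s code or configuration syntax. The function names are hypothetical, and treating both limits as strict “greater than” comparisons is our assumption.

```python
# Thresholds and subject tag taken from the defaults described in the text.
DELETE_ABOVE = 20.0
TAG_ABOVE = 6.0
SUBJECT_TAG = "SPAM?"


def disposition(score):
    """Map a spam score to one of 'delete', 'tag' or 'deliver'."""
    if score > DELETE_ABOVE:
        return "delete"
    if score > TAG_ABOVE:
        return "tag"
    return "deliver"


def apply_policy(subject, score):
    """Return (action, subject to deliver); the subject is None when deleted."""
    action = disposition(score)
    if action == "delete":
        return action, None
    if action == "tag":
        return action, f"{SUBJECT_TAG} {subject}"
    return action, subject


if __name__ == "__main__":
    for subject, score in [("Quarterly report", 1.2), ("Win big", 8.0), ("Buy now", 25.0)]:
        print(apply_policy(subject, score))
```

Because a tagged message is still delivered, a user-side mail rule that matches the “SPAM?” prefix is all that is needed to file or delete it.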
You Can’t Argue with Success
At Lukins & Annis, P.S., we’ve been successfully running this system for over a year. The only cost associated with the system has been the purchase and maintenance of the Sophos antivirus software. The server is a Pentium 300 with 128 MB of RAM. It handles approximately 20,000 messages a week, of which approximately 50 percent is spam, at less than a one-percent average CPU load. The system is very effective. Using these methods has considerably cut down on the number of unwanted e-mail messages cluttering the system, at very little cost.
The Vegan white paper can be downloaded from ILTANet (ILTA’s members-only extranet) and is free for ILTA members. The project is peer-supported on the Linux SIG listserv, as well as in the Linux discussion area on ILTANet.
About our author...
David Nevala is the Information Systems Administrator for Lukins & Annis, P.S. in Spokane, Washington and currently serves as ILTA’s Linux SIG Chair. He can be reached at dnevala@lukins.com.