We were in Sheridan, Wyoming, halfway across the country on the way to Jill's grandmother's house. I logged into my email to find something I hadn't seen in a long time: more spam than real messages. There were a couple dozen spams in my Inbox, and only half a dozen real messages. What happened to my spam filter?
I originally thought it was just a new type of spam not yet recognized by the filter. But then I looked closer and realized that the little signature my spam filter adds to each message was missing: these messages had not even been checked. No wonder they were getting through in such large quantities!
It's only when the tools fail that you come to recognize how valuable they are. In the 5 hours the server spam filter was out, I received more than 50 spams, and many of my customers also noticed immediately. The cause of the outage was a power flicker in the nasty weather Seattle was getting that weekend, which made that server shut down. Fortunately, we had this contingency (and many others) covered for our vacation, and were able to get everything back up and running.
Get that spam out of my Inbox!
We all have to deal with spam in our email, so let's start there. There are three basic places we can currently block spam:
- In your email program
- On the email server, after it's been received
- Before the email server accepts the mail
Client-side filtering
Many email programs have some sort of spam-fighting features built in. These work to varying degrees. The spam filter built into Outlook, by most accounts, is not very effective. The one in Mozilla Thunderbird is less effective at first but improves over time, because it learns as you go. Some spam-fighting add-ins, such as ... can be very effective, because they use humans to tell the difference between spam and not spam. Basically, everyone else on the service reports their spam, and if you get a spam someone else has already identified, the software removes it from your Inbox automatically.
There are several disadvantages to blocking spam in your email program:
- You have to download all the spam before it can be filtered. This is especially a problem if you're on a slow connection, or travel to locations with varying connections.
- It only works for that single email program. It won't block spam for your web mail, or other computers you might want to use, or your Blackberry.
- It can reduce the effectiveness of server-based spam filters, especially those that rely on you telling them when they make a mistake.
If you only have a single computer, always use the same software in the same way, and don't have better options available to you through your email service, client-side filtering may be your best option. If that's the case, you'll have the most success with one of the distributed spam tagging services like CloudMark, formerly SpamTag.
Server-side filtering
For most people, the mail server is the most effective place to block spam. There are dozens of programs that scan your mail as it comes in, identify whether it's spam or not, and either divert the spam into a quarantine you can check at your leisure, or flag it so your email software can trash it.
The problem here is that your options are limited to whatever is available for the server that manages your email. This is generally completely up to your ISP or web host. There are many different anti-spam packages available for Linux servers, and a few available for Windows. These filters use different strategies for identifying spam:
- Rule-based filters. Perhaps the most popular, and effective, of these is a program called SpamAssassin. These programs have hundreds or thousands of rules to identify how "spammy" a message is. They look for particular words, phrases, email headers, and other identifying features. Each rule that matches contributes to a score. After running all the rules, if the mail has a high enough score, it's marked as spam. The downside of this approach is that you've got a lot of different settings to manage, lots of dials to turn and knobs to push. On top of that, the rules need constant updating, much like virus filters, to catch the latest generation of spam.
- Statistical filters. These are also called Bayesian filters, after the statistical calculation that lies at the heart of most of them. They work almost through brute force: they take all of the words that make up a message, find the most significant ones, and compare how often those words have appeared in spam messages versus good ones. Unlike the rule-based filters, statistical filters can learn by themselves over time, based on your particular tastes. The downside is that these filters know nothing about spam at first, and it may take some time before they're truly effective. On the plus side, however, after they've been trained, they are extremely accurate and self-correcting, with very little ongoing maintenance.
Figure 1: The Dspam Quarantine after a few minutes
- Black lists. This type of filter shares information with other people, attempting to block spam at its source. When someone sends a spam, the server they used to send the spam gets added to a "black list" that many servers on the Internet share. From that point on, all mail from the spamming server gets blocked. Not for everyone, just for those who subscribe to the black list. The big problem with these services is that lots of innocent people get blocked. If one person complains about your company's newsletter, your mail server could get blacklisted for spamming, and you'll have trouble sending mail to a large part of the Internet. This type of filter causes more problems than it solves, putting a lot of innocent people in the position of being guilty until proven innocent. Furthermore, if you happen to share a web host with someone who has sent out spam, you may be blocked even if you've never done anything wrong.
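To make the rule-based scoring idea concrete, here is a minimal sketch in Python. It is not how SpamAssassin is actually implemented: the rule names, patterns, weights, and the threshold of 5.0 are all made up for illustration. It only shows the general shape described above, where each matching rule adds to a score and a high enough total marks the message as spam.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern
    score: float


# Hypothetical rules and weights, invented for this example only.
RULES = [
    Rule("MENTIONS_VIAGRA", re.compile(r"viagra", re.I), 2.5),
    Rule("STOCK_TIP", re.compile(r"\bstock (tip|alert|pick)\b", re.I), 2.0),
    # Subject line that contains capitals and no lowercase letters at all.
    Rule("SUBJECT_ALL_CAPS",
         re.compile(r"^Subject: [^a-z\n]*[A-Z][^a-z\n]*$", re.M), 1.5),
    Rule("URGENT_CALL_TO_ACTION", re.compile(r"\bact now\b", re.I), 1.5),
]

SPAM_THRESHOLD = 5.0  # assumed cutoff; a real filter lets you tune this


def score_message(raw_message):
    """Return the total score and the names of the rules that matched."""
    matched = [rule for rule in RULES if rule.pattern.search(raw_message)]
    total = sum(rule.score for rule in matched)
    return total, [rule.name for rule in matched]


def is_spam(raw_message):
    """A message is marked as spam once its score reaches the threshold."""
    total, _ = score_message(raw_message)
    return total >= SPAM_THRESHOLD
```

Every entry in `RULES` and the threshold itself is one of the "dials" mentioned above, which is why this style of filter needs ongoing tuning and rule updates.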
Blocking spam before it arrives
A different approach to blocking spam is to assume everything is spam and only accept mail from people you know. These systems are called "white lists", the opposite of black lists: you specify a list of people who are allowed to send you mail, and the server blocks everything else. Usually these systems will have some sort of "challenge-response" mechanism, so that if you're not on the white list, you can follow a link or take some similar step that confirms you're a human before your message is forwarded for approval.
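The white list and challenge-response flow can be sketched roughly as follows. This is a toy model, not any particular product: the class name, the method names, and the choice to add a sender to the white list once they answer the challenge are all assumptions made for the example.

```python
import secrets


class ChallengeResponseGate:
    """Toy white list with a challenge step for unknown senders."""

    def __init__(self, allowed_senders):
        # Addresses are compared case-insensitively.
        self.allowed = {address.lower() for address in allowed_senders}
        self.pending = {}  # challenge token -> (sender, held message)

    def receive(self, sender, message):
        """Deliver mail from known senders; hold everything else."""
        sender = sender.lower()
        if sender in self.allowed:
            return ("deliver", None)
        # Unknown sender: hold the message and issue a one-time token,
        # which a real system would send back as a confirmation link.
        token = secrets.token_urlsafe(16)
        self.pending[token] = (sender, message)
        return ("challenge", token)

    def confirm(self, token):
        """Release the held message when the sender answers the challenge."""
        entry = self.pending.pop(token, None)
        if entry is None:
            return None  # unknown or already used token
        sender, message = entry
        self.allowed.add(sender)  # assumption: confirmed senders are trusted
        return message
```

Note where the work lands: every first-time sender has to complete the `confirm` step before their message gets anywhere near your Inbox.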
Personally, I think this approach is rude. It does transfer the work of identifying spam to the sender of each email, instead of the recipient. If all you ever get is spam, that may be fine. But if you ever hope to gain business from your web site, you'd better not force your potential customers to verify their addresses. It's akin to asking for ID before you'll talk to somebody. This approach throws potential business out with the Viagra.
Free Software of the Month: Dspam
This month we'd like to highlight our favorite spam filter: Dspam. It's a server-side pure statistical spam filter. That means it learns spam according to how you train it. All you have to do is tell it when it's wrong.
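For the curious, here is a rough idea of what "learning from training" means in a statistical filter. This is a deliberately tiny naive Bayes sketch in Python, not Dspam's actual algorithm; the class name, the tokenizer, and the smoothing choices are assumptions made for illustration.

```python
import math
import re
from collections import Counter


class TinyStatisticalFilter:
    """Toy word-frequency filter that learns only from labeled examples."""

    def __init__(self):
        # How many spam/ham messages each word has appeared in.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Each distinct word counts once per message.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def train(self, text, label):
        """Record one message; label must be "spam" or "ham"."""
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        n_spam = self.message_counts["spam"]
        n_ham = self.message_counts["ham"]
        # Add-one smoothing keeps unseen words from producing zeros.
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for word in self.tokenize(text):
            p_spam = (self.word_counts["spam"][word] + 1) / (n_spam + 2)
            p_ham = (self.word_counts["ham"][word] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp before converting to a probability to avoid overflow.
        log_odds = max(min(log_odds, 50.0), -50.0)
        return 1.0 / (1.0 + math.exp(-log_odds))
```

An untrained filter returns 0.5 for everything, which matches the point that these filters know nothing about spam at first. Telling it when it's wrong simply means calling `train` on the misclassified message with the correct label, which shifts the word counts for next time.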
We've been running Dspam for several years on our mail server. To date, it has caught over 300,000 spams. On most days a spam or four still gets through, but compared to the hundreds a day it catches, we can live with that.
Figure 2: Dspam adds a signature to the bottom of all of your mail. That's how you can tell whether the filter scanned it...
What's interesting is how the filter learns. For the last couple months, all the spams getting through to my mailbox were messages with text similar to my everyday mail, and a picture with text containing the actual spam, usually a stock market scam. After training a few dozen of these messages, I no longer get them and my inbox is quiet again. The next technique the spammers come up with will surely make it into my Inbox again for a while, but just as surely Dspam will figure it out and keep the spam headaches to a minimum.
Since the first couple weeks I used it, I've had a fairly constant number of spams in my Inbox, one or two a day, at most five. Meanwhile, the amount of spam the filter identifies and quarantines has steadily risen, starting at around 50 spams per day and lately reaching as many as 600. That's why when the spam filter went down for a few hours in Wyoming, I could tell so quickly!
Dspam is server software that runs as part of the email chain on a Linux mail server. You wouldn't install it on your laptop or desktop, and it takes quite a bit of mail server knowledge to get it running in the first place. Once it's up and running, it's super easy to use--you just drag any spam you find in your Inbox to a special folder called "Junk", and the server handles the rest. Then you periodically go to the web site quarantine to make sure it didn't catch any good mail, and to empty the quarantine. That's how we have Dspam set up, though there are many other ways it can be configured.
Dspam is available on all our email hosting accounts. If you're using Freelock for email and getting more spam than you'd like, drop us a line and we'll turn it on for you. If you're using another host or ISP for email, drop them a line to ask what they are doing to keep spam out of your inbox.
Wrapping up
Spam is a tough problem these days. Fortunately, there's a lot of help available if you know where to look.
All the major providers of free email accounts, such as GMail, Yahoo, and Hotmail, have reasonably effective spam filtering available--just make sure you have it turned on.
If you're using your ISP for email, you're stuck with the spam fighting options they make available to you. For most individuals and home users, I would suggest switching your email over to one of the free services with better filtering.
For businesses with a regular domain, your best bet is to get your email host to implement an effective spam filter, or configure your mail to relay through a spam filtering service such as Postini.
Or contact us at Freelock Computing--we provide both hosted email with spam and virus filtering, and anti-spam relay computers. We can convert an old computer you're not using anymore to a spam and virus filter for your whole company. It can sit at the edge of your network protecting your Exchange Server or whatever you want to run as your primary mail server.
Client-side filters, such as those that come with antivirus software or are built into Outlook or Thunderbird, are a last-ditch approach to fighting spam, and generally less effective. If you must go with one of these, try the built-in Junk Mail feature in Mozilla Thunderbird. Or if you must use Outlook or Outlook Express, you could subscribe to CloudMark.
Freelock News
We just became a System Integrator Partner with SpikeSource, an open source startup that puts together, tests, and supports specific configurations of software for medium and large businesses. This adds some serious enterprise muscle to our small business offerings--we now have access to pre-built content management systems, contact management systems, and several other preconfigured options that could be exactly what many of our larger customers are looking for.
Which means we're still hiring. We're looking for people with some solid Linux administration skills. See our career page for more details.
We would like to thank all our clients for their ongoing support of our business throughout the year. We really appreciate it, and wouldn't be here without your referrals and ongoing business. For 2006, we had 50 clients in all, and our business has grown by about 40% since 2005. With more people on board, we're expecting to grow even faster in 2007, but more importantly, be able to provide you with even better service.
Hope you had a great Christmas/Holiday season, and have a great new year!
About Freelock Computing
We're technology experts who provide the running rigging for your business, helping you get that edge you need to compete in today's business world. We specialize in Linux, Apache, MySQL, and PHP, what is known as the "LAMP stack," and in open source consulting.
In short, if you know anyone who needs help installing or administering a Linux server, finding and implementing free software for business, or developing custom web applications, we can help. Your referrals are much appreciated!
Until next time,
John Locke
Manager, Freelock, LLC