Due to popular demand (many would say necessity), UCO/Lick has filters in place as part of our e-mail service. The objective of this filter system is to remove two types of mail: unsolicited bulk mail (spam), and mail containing a virus. Our system uses Postfix as the primary mail service, with spam scanning provided by dspam, and virus scanning from ClamAV.
With this webpage, we hope to answer most of your questions about the filters in place on our mail system. The most critical thing that you, the UCO/Lick mail user, need to be aware of is that no system is perfect: while we filter out some 95% of incoming spam, and 99% of incoming viruses, no system will ever reach 100% effectiveness. If you have questions that are not covered in this document, please contact NICS, as always, at nics@ucolick.org.
What does the server do automatically?
There are three things that the server may do to an incoming message: reject it, store it as "maybe" spam, or deliver the message to your inbox.
In what circumstances will the server reject a message?
- Critical failure: the sender formatted the message in such a way that it is not deliverable; perhaps they sent it to a user that no longer exists, or they sent it from a domain that does not exist. This is an unrecoverable error, and the server will respond immediately with an error code that indicates why the message was not deliverable.
- Computer virus: the sender included a virus in the e-mail. The sender will not receive a notice that their message contained a virus, nor will the recipient be notified that they were sent a virus. The vast majority of e-mail-borne viruses use forged e-mail headers (among other things), so such notices would generally not be actionable in any way.
- Sender black-listed: the e-mail was sent from a source included on a domain name service-based spam-blocking list (DNSBL). The server will respond immediately with an error code that indicates why the message was not deliverable. For general information on DNSBLs, please see the DNSBL Resource website. For specific information on the lists used by UCO/Lick Observatory, please see the SBL and XBL sections of the Spamhaus Project website.
Is there anything else that would cause a message to not be delivered locally?
Mail forwarding: any user who has requested that their e-mail be forwarded to another address, either by setting up forwarding themselves or by e-mailing NICS, will receive at that forwarding address all e-mail originally addressed to their @ucolick.org address. The only exceptions are e-mails that were rejected in the circumstances described above; in particular, no spam filtering is done on forwarded e-mail.
If the incoming message does not trigger any of the above actions, the message may still get filtered as being spam.
How does the UCO/Lick spam filter work?
- All incoming messages are characterized by the spam filter system; dspam builds a set of heuristics that it applies to every incoming message, decides whether it is "ham" (real e-mail) or "spam" (unsolicited bulk e-mail) and adjusts each of the applied heuristics appropriately. If an incoming message matches enough of the spam heuristics, it will be classified as spam, and automatically stored in your spam folder. A typical user profile may have some 300,000 tokens at its disposal to characterize a message; a typical message may have 15-30 heuristics applied to it.
- If you received a message in your inbox that the spam scanner missed, you can educate it! This is the best way to ensure that your spam filter remains current and effective.
To train the filter about a missed spam message, bounce, redirect, or forward the message to spam@ucolick.org, and it will correct your personal spam profile.
- When you first start training your personal spam profile, the effectiveness of the filter will ramp up with the quality of the training you provide. A global profile is in place to assist in the early training stages, but it will never be as effective as a profile tailored to your personal e-mail.
- Messages you received more than 30 days ago cannot be re-trained.
- You should check your spam folder periodically to make sure that no real messages wound up there instead of in your inbox.
The default location for this folder, as seen via IMAP, is INBOX/spam/year, where "year" is the current year-- for example, INBOX/spam/2006.
If a message was misclassified as spam, bounce, redirect, or forward that message to notspam@ucolick.org (or ham@ucolick.org), and it will re-train your personal spam profile. Note that this will only train the spam filter; the message itself will not be relocated to your inbox. You can manipulate messages in this spam folder just like any normal e-mail folder; you can delete any spam you don't feel like keeping, and move any misclassified messages back into your inbox where they belong-- after you've trained them, of course.
- In both cases (either spam or ham), do not include multiple training errors in a single message. dspam learns one message at a time; if you include multiple messages, the filter will only train on the first message it finds. Typically, a bounce or redirect function in your mail client will not combine multiple messages, but the forward function usually will.
If at any point you feel that your personal spam profile has completely lost touch with reality, and that no amount of training will return it to an effective state, NICS can reset it for you. All that's required is that you e-mail NICS and request that your spam profile be reset. You will not be able to re-train any messages that were received before your profile is reset.
To recap: to train a spam message that mistakenly went to your inbox, send (bounce, redirect, forward) it to spam@ucolick.org. To train a real message that mistakenly landed in your spam mailbox, send it to ham@ucolick.org. You have up to 30 days to correct the spam filter's mistakes.
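The recap above can be sketched as a small shell function. This is only an illustration of the decision, not part of the mail system: the function name training_address and its two arguments are hypothetical, and it only prints the address to use rather than sending anything.

```shell
# Hypothetical helper: print the training address for a misfiled message.
#   $1: where the message wrongly landed ("inbox" or "spam")
#   $2: age of the message in days
training_address() {
    # Messages received more than 30 days ago cannot be re-trained.
    if [ "$2" -gt 30 ]; then
        echo "too old to re-train" >&2
        return 1
    fi
    case "$1" in
        inbox) echo "spam@ucolick.org" ;;  # missed spam that reached the inbox
        spam)  echo "ham@ucolick.org" ;;   # real mail filed in the spam folder
        *)     echo "unknown location: $1" >&2; return 2 ;;
    esac
}

training_address inbox 3
training_address spam 10
```

As noted earlier, notspam@ucolick.org can be used in place of ham@ucolick.org.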
If a message hasn't gone astray by now, it will be delivered to your inbox, after going through your personal procmail configuration. See below for more information on procmail.
Can I modify the spam filter for my personal e-mail?
The best way to update your spam filter is to train it regularly with any missed spam/ham messages (see above). Any additional filtering has to happen either via procmail, or in your mail client (such as Thunderbird).
We encourage you to use the system-wide spam filter instead of using any local client-side "junk mail" features. While the spam filters in some mail clients can be quite good, they are not as effective as the system-wide filter. Some of our users receive more than 2,000 spam messages per day, and the system-wide filter catches all but one or two per day.
We strongly discourage using the system-wide filter in conjunction with any spam filters on your local mail client. Either opt out of the system-wide filter, or disable your client-side junk mail feature. Using two filters that perform essentially the same type of filtering will eventually result in both filters being less effective than either one would be individually.
How can I configure my mail client to automatically hide/remove any filtered spam?
In a departure from previous systems, our new spam filtering system will automatically hide any messages that are classified as spam. If you receive spam messages in your inbox, you should re-train the spam filter as explained above.
How do I opt out of using the UCO/Lick spam filter?
Log onto any NICS-managed UNIX host, and do:
cd /home/public/username
touch .spam.to.inbox
This will cause any messages flagged as spam by dspam to be delivered to your inbox instead of your spam box. This is especially important to do if you are planning to use a client-side junk mail feature instead of our system-wide filter, but be aware that the system-wide filter will continue to insert information in your e-mail headers about its opinion of the message.
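If you want to confirm which mode you are in, a minimal sketch along these lines may help. The variable spam_flag_dir and the two function names are hypothetical; the sketch assumes your directory is /home/public/ followed by your login name, and it only reports whether the flag file exists, which is the one condition described above.

```shell
# Assumption: the flag file lives in /home/public/<your login name>.
spam_flag_dir="/home/public/$(id -un)"

# Create the empty flag file that opts you out of the spam folder.
spam_opt_out() {
    touch "$spam_flag_dir/.spam.to.inbox"
}

# Report where messages flagged as spam will be delivered.
spam_opt_status() {
    if [ -e "$spam_flag_dir/.spam.to.inbox" ]; then
        echo "flagged spam goes to inbox"
    else
        echo "flagged spam goes to spam folder"
    fi
}
```

After defining these in your shell, run spam_opt_status to see the current setting, or spam_opt_out to create the flag file.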
How do I configure procmail?
Procmail is enabled by default for all users, though you do not need to configure it unless you are interested in doing so. The default procmail behavior is to perform no additional filtering.
The configuration file that procmail will read when delivering your e-mail is /home/public/username/.procmailrc. This file is only read for messages that are not identified as spam; spam is delivered via an alternate mechanism that bypasses your personal procmail configuration. There are a few short examples and pointers in the default file, but more advanced recipes are beyond the scope of both our default file and this document. For more information about procmail, an excellent place to start is the procmail FAQ; follow the links at the end of the FAQ if you have further questions. Here's a mirror for the FAQ in case the previous link is broken. There's also all kinds of excellent procmail information over at Infinite Ink.
You can also contact NICS if you have questions about procmail; we use it fairly extensively.
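To give a feel for what a recipe looks like, here is a minimal sketch. It writes a sample recipe to a scratch file for review and does not touch your live .procmailrc. The List-Id pattern and the mailbox name "example-list" are placeholders, not anything defined on our system, and how a mailbox name maps to a folder in your mail client depends on the local setup, so check with NICS before installing anything like it.

```shell
# Write a sample recipe to a scratch file; this does NOT modify
# /home/public/username/.procmailrc.
sample_rc="${TMPDIR:-/tmp}/procmailrc.sample"

cat > "$sample_rc" <<'EOF'
# Hypothetical recipe: file mail from one mailing list into its own mailbox.
# "example-list" is a placeholder for both the header pattern and the mailbox.
:0:
* ^List-Id:.*example-list
example-list
EOF

# Show the result for review.
cat "$sample_rc"
```

The ":0:" line starts a recipe and asks procmail to use a lock file while delivering, the line beginning with "*" is a condition matched against the message headers, and the last line names the mailbox that matching messages are delivered to.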
