Ifile

How I use ifile to filter mail automatically into the right folders

Ifile is an intelligent mail filtering program which uses a Bayesian learning algorithm to categorise incoming mail automatically according to the user's preferences. It learns where the user likes incoming mail to go, without requiring a strict set of rules to be specified. It was originally designed to work with the MH mail system and the slocal filtering system, but Jason Rennie, the author, with help from the Net community, has developed it into a program of general use. I use it with mutt and procmail to filter my mail automatically.

This page explains the scripts I use in conjunction with ifile to make my mail filtering automatic. They're heavily based on Martin Macok's scripts for spam elimination, but where he only uses ifile to filter spam/non-spam email, I use it to filter into over 30 different categories.

Overview

The system works like this:

  • Mail comes in and is passed to procmail
  • Procmail runs ifile to determine which folder to put it in
  • I read mail in mutt, and move it to the correct folder if it's been incorrectly classified
  • Every night, a cron job runs which looks for misclassified mail and passes it back to ifile to 'relearn' the message into the right category

Details

Assume that you have downloaded and installed ifile and Martin Macok's scripts, and that they are to be found somewhere in your $PATH. I keep them in $HOME/bin.

I keep all my mails in a flat hierarchy under ~/Mail, with related mails in the same mailbox. It's therefore simple to build the ifile database to begin with, by running

user@host:~$ for mbox in ~/Mail/*;
> do
> [ -f "$mbox" ] && ifile.learn.mailbox "$(basename "$mbox")" "$mbox";
> done

Then, I use procmail to pass all incoming mail to Martin Macok's script ifile.inject-learn.message, which adds an X-Ifile-Hint header to the mail naming the category it belongs in. The category name is exactly the name of the folder the mail should be filed in. Here's my procmailrc.
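As a rough illustration of the filing step, the sketch below reads the X-Ifile-Hint header back out of a message, so that a delivery script could use its value as the folder name. It is not the procmailrc mentioned above. The function name ifile_hint is made up for this example, and nothing beyond a standard awk is assumed.

```shell
#!/bin/sh
# ifile_hint (hypothetical helper): print the value of the first
# X-Ifile-Hint header found in the message on stdin.
# Only the header block (the lines before the first blank line) is examined.
ifile_hint() {
    awk '
        in_body || found { next }            # skip everything once we are done
        /^$/ { in_body = 1; next }           # first blank line ends the headers
        tolower($0) ~ /^x-ifile-hint:/ {
            sub(/^[^:]*:[ \t]*/, "")         # drop the header name and leading space
            sub(/[ \t\r]+$/, "")             # drop trailing whitespace
            print
            found = 1
        }
    '
}
```

For example, folder=$(ifile_hint < some-message) would leave the category name in $folder, or an empty string if the message carries no hint.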

When I'm reading mail, whether with mutt or through IMAP, I move any misclassified mail into the correct folder. With mutt, this is a matter of a couple of keystrokes, and it doesn't require any special setup on the part of the mail client.

Every night, a cron job runs the script refile.learn, which passes my folders one-by-one through a special procmail script refile.learn.rc. This script looks for messages in the current folder which have been classified by ifile to be in a different folder (X-Ifile-Hint: other_folder) and not yet learned to the current folder (X-Ifile-Learned-To: current_folder), and passes them to Martin's script ifile.relearn.message to correct ifile's terrible misapprehension. Then, formail is called to insert the X-Ifile-Learned-To header to prevent it being processed again.
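The test that the nightly job applies to each message can be sketched in plain shell as follows. This is only an illustration of the logic described above, not the contents of refile.learn.rc. The function names header_value and needs_relearn are invented for the example, and treating a message with no X-Ifile-Hint header as "nothing to relearn" is an assumption.

```shell
#!/bin/sh
# header_value NAME (hypothetical helper): print the value of the first
# header called NAME in the message on stdin. The body is ignored.
header_value() {
    awk -v name="$1" '
        in_body || found { next }
        /^$/ { in_body = 1; next }           # first blank line ends the headers
        {
            prefix = tolower(name) ":"
            if (index(tolower($0), prefix) == 1) {
                value = substr($0, length(prefix) + 1)
                sub(/^[ \t]+/, "", value)
                sub(/[ \t\r]+$/, "", value)
                print value
                found = 1
            }
        }
    '
}

# needs_relearn FOLDER FILE (hypothetical helper): succeed if the message in
# FILE was hinted to a folder other than FOLDER and has not already been
# learned to FOLDER.
needs_relearn() {
    folder=$1
    file=$2
    hint=$(header_value X-Ifile-Hint < "$file")
    learned=$(header_value X-Ifile-Learned-To < "$file")
    # Assumption: a message with no hint at all is left alone.
    [ -n "$hint" ] && [ "$hint" != "$folder" ] && [ "$learned" != "$folder" ]
}
```

A message for which needs_relearn succeeds is the kind that would be handed to ifile.relearn.message and then stamped with the X-Ifile-Learned-To header.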

This combination of tools has classified over 96% of my last 2100 mails correctly (at the time of writing). I don't keep track of any convoluted statistics to see how many mails it misclassifies, but it does an exceptional job. Highly recommended.

All the files you need to implement my solution can be found in a combination of these three places:

Other mail filters

There are essentially two sorts of mail filters. The most common type uses a series of defined rules to determine where to file (or whether or not to filter) your mail. These can work very well, but in this age of increasing spam, they're not sufficient to detect all your spam unless you spend a great deal of time updating them.

Recently, much more effort has gone into Bayesian filters, which use statistical techniques: they look at which tokens (words) tend to occur in messages that have been grouped together, and file a new message with the group whose tokens it most resembles. They can be surprisingly effective.
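To give a feel for the idea, here is a toy naive Bayes classifier written as a shell function around awk. It is a minimal sketch and not how ifile or any of the filters below are actually implemented. The function name nb_classify and the training file format (one message per line, category first, then its words) are assumptions made for the example.

```shell
#!/bin/sh
# nb_classify TRAINING_FILE "message text" (hypothetical helper)
# TRAINING_FILE holds one example per line: a category name followed by
# the words of a message. Prints the most probable category for the text.
nb_classify() {
    training=$1
    message=$2
    awk -v msg="$message" '
        NF == 0 { next }                     # ignore blank lines
        {
            # Training: count documents per category and tokens per category.
            cat = $1
            docs[cat]++
            total_docs++
            for (i = 2; i <= NF; i++) {
                tok = tolower($i)
                count[cat, tok]++
                tokens_in[cat]++
                if (!(tok in vocab)) { vocab[tok] = 1; vocab_size++ }
            }
        }
        END {
            n = split(tolower(msg), words, /[ \t]+/)
            best = ""
            for (cat in docs) {
                # Log prior plus the sum of smoothed log likelihoods.
                # Add-one smoothing; the extra +1 in the denominator
                # reserves a slot for tokens never seen in training.
                score = log(docs[cat] / total_docs)
                for (i = 1; i <= n; i++) {
                    if (words[i] == "") continue
                    score += log((count[cat, words[i]] + 1) / (tokens_in[cat] + vocab_size + 1))
                }
                if (best == "" || score > best_score) {
                    best = cat
                    best_score = score
                }
            }
            print best
        }
    ' "$training"
}
```

Given a training file with a few lines per folder, nb_classify training.txt "some words from a new mail" prints the folder whose past messages share the most vocabulary with the new one, weighted by how common each folder is.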

Some mail filters:

  • SpamAssassin is a rules-based mail filter at the Rolls-Royce end. Apparently very effective
  • Bogofilter is Eric Raymond's Bayesian spam filter
  • Spamprobe is Brian Burton's Bayesian spam filter

Note that all of these are pure and simple (effective) spam filters, rather than general filters like ifile.


Created by jack
Last modified 2005-02-23 17:56