Friday, February 24, 2006

Bayesian Priority Filtering?

Outlook 2003 has a great feature where you can give a message a red flag for follow-up. (Gmail has pretty much the same thing, but it's a star instead.) To deal with my overflow of e-mail at work, I have a filter set up to automatically flag some e-mails so I can jump to them first. Any e-mail that's directly addressed to me and has the phrases "can you" or "could you" gets automatically flagged. I find this works really well, except that all my follow-up replies get flagged when I don't necessarily need them to be.
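For the curious, here's roughly what that rule boils down to, written as a little Python sketch. This is not how Outlook implements it; the names `should_flag` and `ASK_PATTERN` and the dictionary shape of a message are all made up for illustration.

```python
import re

# Hypothetical pattern: matches "can you" or "could you", ignoring case.
ASK_PATTERN = re.compile(r"\b(can|could) you\b", re.IGNORECASE)


def should_flag(message, my_address):
    """Return True if the message is addressed directly to me and asks for something.

    `message` is assumed to be a dict with a "to" list of addresses and a
    "body" string. That shape is an assumption, not a real mail API.
    """
    addressed_to_me = my_address.lower() in (addr.lower() for addr in message["to"])
    asks_something = ASK_PATTERN.search(message["body"]) is not None
    return addressed_to_me and asks_something


request = {"to": ["me@example.com"], "body": "Could you send me the report?"}
newsletter = {"to": ["everyone@example.com"], "body": "Can you believe these deals?"}
print(should_flag(request, "me@example.com"))     # True
print(should_flag(newsletter, "me@example.com"))  # False
```

Notice that a reply quoting the original "could you..." line still matches, which is exactly the over-flagging problem with my follow-ups.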
So this got me thinking - computers are smart. Can't my computer figure out what should be flagged and what shouldn't? Shouldn't it learn? Well, it does with spam.
Most of us know about Bayesian filtering for spam - it's probably one of the most common ways of catching it. As far as I know it's what Gmail uses, and it seems to work pretty well. Basically, instead of building a whole bunch of criteria manually to detect spam upfront, you just say 'report spam' every time you see a spam email. If something gets marked as spam by mistake, you hit 'not spam.' Every time you do that, an algorithm looks at which words tend to show up in spam messages and which show up in non-spam, and uses those odds to judge the next message. The more you identify, the better it gets.
So, connect the dots here, and you've potentially got a way of automatically prioritizing e-mail. Every time you flag (or star) something, the ol' Bayesian filter gets a little better. If it flags by mistake, you just tell it it's made a mistake. I'm sure the existing algorithms wouldn't even need to be re-written all that much for them to work for this purpose.
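To show what I mean, here's a minimal sketch of a naive Bayes "flag or don't flag" filter in Python. It's a toy under some loud assumptions: the class name `FlagFilter` and its methods are invented for this post, it only looks at individual words, and real spam filters do a lot more. Every flag or unflag you do would simply be another call to `train`.

```python
import math
import re
from collections import Counter


class FlagFilter:
    """Toy naive Bayes classifier: flagged (True) vs. not flagged (False)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.message_counts = {True: 0, False: 0}

    @staticmethod
    def tokenize(text):
        # Lowercase words only; a real filter would also use headers, senders, etc.
        return re.findall(r"[a-z']+", text.lower())

    def train(self, text, flagged):
        """Record one user decision: the message was flagged (True) or not (False)."""
        self.message_counts[flagged] += 1
        self.word_counts[flagged].update(self.tokenize(text))

    def flag_probability(self, text):
        """Estimate the probability that the user would flag this message."""
        vocabulary = set(self.word_counts[True]) | set(self.word_counts[False])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in (True, False):
            # Add-one (Laplace) smoothing so unseen words never give a zero probability.
            prior = (self.message_counts[label] + 1) / (total_messages + 2)
            total_words = sum(self.word_counts[label].values())
            score = math.log(prior)
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability, clamped to avoid overflow.
        diff = max(min(log_score[False] - log_score[True], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


flag_filter = FlagFilter()
flag_filter.train("can you review the budget report", True)
flag_filter.train("could you send the contract today", True)
flag_filter.train("fwd: hilarious video you have to see", False)
flag_filter.train("fwd: funny joke of the day", False)

print(flag_filter.flag_probability("could you review the contract") > 0.5)  # True
print(flag_filter.flag_probability("fwd: funny video") > 0.5)               # False
```

Fixing a mistake is just training again with the right answer, so the same 'report spam' / 'not spam' buttons would do the job as 'flag' / 'unflag'.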
So, to everyone at Google and Microsoft - get to it! (Don't worry, Yahoo, no one expects much from you guys anyway.) This could really change e-mail for the better. Everything that's most important would always be at the top, and those forwards with hilarious videos of singing penises would always be at the bottom.
