This post originated from an RSS feed registered with Python Buzz
by Aaron Brady.
Original Post: More Handling Mail in Python
Feed Title: insom.me.uk
Feed URL: http://feeds2.feedburner.com/insommeuk
Feed Description: Posts related to using Python. Some tricks and tips, observations, hacks, and the Brand New Things.
I have updated some hacks filed under mailtools.
This includes one new hack, a POP3 mail checker that uses Python for filtering mail.
I used to struggle with the procmail syntax for filtering mail, and now that I'm using
PINE on win32 for my primary mail platform, I don't have a good way of checking
and filtering mail.
These two combined led to getmail.py, a
Pythonic POP3 Processor (phew!). This simple script stands on the shoulders
of the standard library giants: it checks POP3 accounts and downloads their
messages into mbox files.
Aside from being written in Python, it also uses Python as its filtering
language. For each mail, it calls a function of your choice, passing it
the account details and the body of the message, and writes the mail out
according to the result (in the example below, a file object to append to).
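The filter contract described above can be sketched roughly as follows. This is not getmail.py's actual code: the `deliver` name, the "From getmail" separator line, and the use of bytes bodies (Python 3) are assumptions made for illustration only.

```python
import io
import time


def deliver(account, body, filter_func):
    """Ask filter_func where the message goes, then append it mbox-style.

    Hypothetical helper: filter_func(account, body) is assumed to return
    a writable binary file object, or None to decline the message.
    """
    out = filter_func(account, body)
    if out is None:
        return False  # the filter declined; the caller decides what happens
    # mbox separates messages with a "From " line, so body lines that
    # start the same way are escaped with ">".
    lines = body.split(b"\n")
    escaped = [b">" + ln if ln.startswith(b"From ") else ln for ln in lines]
    out.write(b"From getmail " + time.asctime().encode("ascii") + b"\n")
    out.write(b"\n".join(escaped) + b"\n\n")
    return True


if __name__ == "__main__":
    buf = io.BytesIO()
    deliver(("host", "user"), b"Subject: hi\n\nhello\n", lambda a, b: buf)
    print(buf.getvalue().decode("ascii"))
```

Passing an in-memory buffer instead of an opened mbox file makes it easy to try a filter without touching the disk.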
I'd advise looking at the admittedly messy source, but here's a quick
example script that checks a few accounts which receive mailing lists:
import getmail
import re
import sys

def default_filter(account, body):
    return open('default', 'ab')

def unknown_list_filter(account, body):
    if re.search("^X-List.*", body, re.M):
        return open('unknown_list', 'ab')

def y_groups_filter(account, body):
    # Group the alternatives so "^" and the address apply to all three headers.
    if re.search(r"^(To|Cc|From):.*@.*" +
                 r"(groups\.yahoo\.com|yahoogroups\.com)",
                 body, re.M|re.I):
        return open('y_groups', 'ab')

def to_cs_filter(account, body):
    if re.search(r"^(To|Cc|From):.*@crestsource\.com",
                 body, re.M|re.I):
        return open('cs', 'ab')

def dump_attachments_filter(account, body):
    if re.search("multipart/mixed", body, re.M):
        return getmail.null_file()  # dump it

def dump_big_mails_filter(account, body):
    if len(body) > 100000:
        return getmail.null_file()

def chain(account, body, links):
    # Return the first non-empty result from the filters, tried in order.
    for link in links:
        r = link(account, body)
        if r:
            return r

def chain1(account, body):
    return chain(account, body, [to_cs_filter,
        unknown_list_filter, default_filter])

def chain2(account, body):
    return chain(account, body, [dump_big_mails_filter,
        y_groups_filter, dump_attachments_filter,
        default_filter])

accounts = [
    ('mail1.insom.me.uk', 'insom', 'pass', chain1),
    ('mail2.insom.me.uk', 'simon', 'pass', chain2),
    ('mail3.insom.me.uk', 'mison', 'pass', 'mail3'),
]

getmail.check(accounts, out=sys.stdout)
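A note on header patterns like the ones in the example: in Python's `re`, `|` binds more loosely than concatenation, so the alternatives need to share a single group if the `^` anchor and the address part are meant to apply to all of them. A small self-contained check, using made-up messages and hypothetical names (`LOOSE`, `GROUPED`):

```python
import re

# Each alternative stands alone here: "^To", "Cc", or "From:...@crestsource.com".
LOOSE = re.compile(r"^(To)|(Cc)|(From):.*@crestsource\.com", re.M | re.I)

# One group, so the anchor and the address apply to every header name.
GROUPED = re.compile(r"^(To|Cc|From):.*@crestsource\.com", re.M | re.I)

unrelated = "To: someone@example.org\nSubject: hello\n"
wanted = "From: list@crestsource.com\nSubject: hi\n"

if __name__ == "__main__":
    print(bool(LOOSE.search(unrelated)))    # matches on the bare "To"
    print(bool(GROUPED.search(unrelated)))  # no crestsource.com address
    print(bool(GROUPED.search(wanted)))
```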
Caveats: This is a fragile program. Dropped TCP connections
will throw fatal exceptions, as will unopenable files or full filesystems.
That said, it won't DELE until it RETRs, so you shouldn't lose mail.
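The RETR-before-DELE ordering can be illustrated with a short sketch. It is not taken from getmail.py; `fetch_all` and `store` are hypothetical names, and `conn` is assumed to behave like a logged-in `poplib.POP3` object (`stat`, `retr`, `dele`, `quit`).

```python
def fetch_all(conn, store):
    """Download every message, deleting each only after store() succeeds.

    conn  -- assumed to offer the poplib.POP3 methods used below
    store -- hypothetical callable that saves one message (bytes)
    """
    count, _mailbox_size = conn.stat()
    saved = 0
    for number in range(1, count + 1):
        _resp, lines, _octets = conn.retr(number)
        store(b"\n".join(lines))  # if this raises, DELE is never sent
        conn.dele(number)
        saved += 1
    # POP3 servers only remove messages marked for deletion after QUIT.
    conn.quit()
    return saved
```

If `store` fails partway through, the exception still propagates (the fragility described above remains), but the message that failed is never marked for deletion.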
Also, and this is a big point, it reads the whole mail into memory, twice.
If you are in the habit of receiving 50 MB two-page Word
documents, then this may not be for you.
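One possible way around the memory cost, which getmail.py as described does not do, is to ask the server for message sizes with the POP3 LIST command before retrieving anything. A minimal sketch, assuming `conn` behaves like a logged-in `poplib.POP3` object under Python 3; `messages_under` is a hypothetical name:

```python
def messages_under(conn, limit):
    """Return the numbers of messages whose reported size is below limit."""
    _resp, listings, _octets = conn.list()
    wanted = []
    for entry in listings:
        # Each listing looks like b"1 1234": message number, size in octets.
        number, size = entry.decode("ascii").split()[:2]
        if int(size) < limit:
            wanted.append(int(number))
    return wanted
```

Only the messages returned here would then be fetched with RETR, so oversized mails never reach memory at all.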