Recently I ran into the problem that I had not read all the mail sent to me. My sieve rules mark all SPAM as read if its X-Spam-Score is above a certain level. This is a practical way to get it out of sight and to keep the new-mail notifications of my mail reader low. Since I don't go through every SPAM mail I receive, I have missed a few mails that were really meant for me. One solution, which I've implemented, is to parse all mail located in the SPAM folders, generate a report, and send it by mail every 24h.
The following scripts are meant to be used on mailboxes in maildir format. Please note that both scripts are quick hacks and not optimized. If you have any recommendations for improvements, do not hesitate to leave a comment or send me an email.
#!/usr/bin/env python
import email
import shelve
import datetime
import sys
import os.path

VERSION = '0.0.2'
CHECKDIR = ['cur', 'new', 'tmp']
count_spam = 0

# Callback for os.path.walk (Python 2): called once per directory.
def scan_maildir(arg, dirname, names):
    global count_spam
    # Only look at cur/new/tmp directories below a '.Junk' folder.
    if dirname[len(dirname)-3:] in CHECKDIR and dirname.find('.Junk') >= 0:
        for file in names:
            filename = os.path.join(dirname, file)
            fp = open(filename, 'r')
            if fp:
                msg = email.message_from_file(fp)
                fp.close()
                #for k in msg.keys():
                #    print '>>>', k, msg.get(k)
                id = msg.get('Message-id')
                # Record every mail only once, keyed by its Message-id.
                if id not in db_junk.keys():
                    db_junk[id] = {'Date': msg.get('Date'),
                                   'From': msg.get('From'),
                                   'To': msg.get('To'),
                                   'Subject': msg.get('Subject'),
                                   'Score': msg.get('X-Spam-Score'),
                                   'filename': filename,
                                   'lastseen': str(datetime.date.today()),
                                   'reported': False}
                    count_spam += 1

if len(sys.argv) > 1:
    maildir = sys.argv[1]
    db = shelve.open(os.path.join(maildir, 'junkmail.db'), 'c')
    try:
        if db['version'] != VERSION:
            print 'Database version mismatch!'
            print ' Database:', db['version']
            print ' Script :', VERSION
            print 'Aborted.'
            sys.exit(2)
    except KeyError:
        #print 'Initializing database version', VERSION
        db['version'] = VERSION
    try:
        db_junk = db['junk']
    except KeyError:
        db['junk'] = {}
        db_junk = db['junk']
    os.path.walk(maildir, scan_maildir, None)
    db['junk'] = db_junk # write changes to shelve
else:
    print 'Not enough parameters. Specify path to scan.'
    sys.exit(1)
db.close()
# Example script invocation:
$ ./spamsum.py ~/Maildir
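Note that os.path.walk, which spamsum.py relies on, exists only in Python 2. As a rough sketch of how the same directory test might look on Python 3 with os.walk (find_junk_files is a hypothetical name of my own, not part of the script, and comparing the directory's base name is slightly stricter than the script's check of the last three characters):

```python
import os

CHECKDIR = ('cur', 'new', 'tmp')  # maildir subdirectories, as in spamsum.py


def find_junk_files(maildir):
    """Yield paths of message files in cur/new/tmp below a '.Junk' folder."""
    for dirname, _subdirs, names in os.walk(maildir):
        # Same idea as scan_maildir: a cur/new/tmp directory whose path
        # contains '.Junk'.
        if os.path.basename(dirname) in CHECKDIR and '.Junk' in dirname:
            for name in names:
                yield os.path.join(dirname, name)
```

Each yielded path could then be opened and parsed with email.message_from_file, just as scan_maildir does.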
The first script (spamsum.py) takes only one parameter, the directory to parse. This directory is also used as the root where it stores all information about SPAM, in a file called junkmail.db. I haven't implemented anything to purge the database when mails are deleted.
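Purging could be added in several ways. The following is only a minimal sketch of one of them, assuming the scan were extended to collect the Message-id of every mail it opens into a set. Both purge_unseen and seen_ids are hypothetical names that do not exist in the script. It compares Message-ids rather than stored filenames, because maildir file names change when a mail's flags change.

```python
def purge_unseen(db_junk, seen_ids):
    """Remove entries whose Message-id was not seen in the latest scan.

    db_junk  -- dict keyed by Message-id, as built by spamsum.py
    seen_ids -- set of Message-ids found during the current scan (assumed)
    Returns the list of removed Message-ids.
    """
    # Collect the stale keys first so the dict is not modified while iterating.
    stale = [msg_id for msg_id in db_junk if msg_id not in seen_ids]
    for msg_id in stale:
        del db_junk[msg_id]
    return stale
```

It would have to be called after the directory walk and before the dictionary is written back with db['junk'] = db_junk.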
#!/usr/bin/env python
import shelve
import datetime
import sys
import os.path
import smtplib
from email.mime.text import MIMEText

VERSION = '0.0.2'
SENDERMAIL = 'spamreport@localhost'

if len(sys.argv) < 3:
    print 'Not enough parameters. Specify path to scan.'
    print 'Usage:\n\t%s <maildir> <email>' % sys.argv[0]
    sys.exit(1)

maildir = sys.argv[1]
email = sys.argv[2]
db = shelve.open(os.path.join(maildir, 'junkmail.db'), 'w')
try:
    if db['version'] != VERSION:
        print 'Database version mismatch!'
        print ' Database:', db['version']
        print ' Script :', VERSION
        print 'Aborted.'
        sys.exit(2)
except KeyError:
    # Catch only KeyError: a bare except would also swallow the
    # SystemExit raised by sys.exit(2) above.
    print 'Database is empty.'
    sys.exit(3)
try:
    db_junk = db['junk']
except KeyError:
    print 'Database is empty.'
    sys.exit(3)

sep_line = '=' * 78
spam_count = 0
line = 'Summary of unreported SPAM mails:\n%s' % sep_line
senders = []
text = [line]
# Collect every mail that has not been reported yet and mark it as reported.
for id in db_junk:
    if not db_junk[id]['reported']:
        junk = db_junk[id]
        #print id, db_junk[id]
        line = (' [ ] From: %s --> To: %s\n'
                ' Subject: %s\n'
                ' Date: %s\n'
                ' Spam-Score: %s\n') % (junk['From'], junk['To'], junk['Subject'],
                                        junk['Date'], junk['Score'])
        text.append(line)
        senders.append((junk['From'], junk['Date'], junk['Score']))
        junk['reported'] = True
        spam_count += 1
db['junk'] = db_junk # sync changes
db.close()

text.append('\n\n')
text.append('Summary of unreported SPAM sender addresses:\n%s' % sep_line)
for sender in senders:
    line = (' [ ] From: %s\n'
            ' Date: %s\n'
            ' Score: %s\n') % (sender[0], sender[1], sender[2])
    text.append(line)

msg = MIMEText('\n'.join(text))
msg['Subject'] = '%d reported Mails by GoatPr0n SPAM Report' % spam_count
msg['From'] = SENDERMAIL
msg['To'] = email
# Only send a report if there is something new to report.
if spam_count > 0:
    s = smtplib.SMTP('localhost:10025') # bypass spamassassin
    s.sendmail(SENDERMAIL, [email], msg.as_string())
    s.quit()
# Example script invocation:
$ ./spamreport.py ~/Maildir admin@localhost
This script needs to be run with two parameters. The first is the root directory where junkmail.db is located, and the second is the email address the generated report is sent to.
#!/bin/sh
set -e
PATH=/usr/bin:/usr/local/sbin
MAILDIRS=/path/to/maildirs
for maildirs in $MAILDIRS/*; do
    for maildir in $maildirs/*; do
        email=`basename $maildir`
        spamsum.py $maildir
        spamreport.py $maildir $email
    done
done
The shell script above runs both scripts for every maildir. I have placed it in /etc/cron.daily. A generated report looks like this:
Content-Transfer-Encoding: 7bit
Subject: 1 reported Mails by GoatPr0n SPAM Report
From: spamreport@localhost
To: admin@localhost
Summary of unreported SPAM mails:
==============================================================================
[ ] From: "Drugstore #1" <root@localhost> --> To: admin@localhost
Subject: *****SPAM***** Welcome, admin. Everything on -80% today
Date: Tue, 23 Mar 2010 21:56:44 -0500
Spam-Score: 10.5
Summary of unreported SPAM sender addresses:
==============================================================================
[ ] From: "Drugstore #1" <root@localhost>
Date: Tue, 23 Mar 2010 21:56:44 -0500
Score: 10.5
The email addresses in these samples have been altered, so hopefully email crawlers will use them to send the spam to themselves…