Quick solution for SPAM reports on your mailserver

Recently I ran into the problem that I hadn't read all the mail sent to me. My sieve rules mark SPAM as read if its X-Spam-Score is above a certain level. This is a practical way to get it out of sight and to keep my mail reader's new-mail notifications low. Since I don't go through every SPAM mail I receive, I have missed a few mails that were really meant for me. One solution, which I've implemented, is to parse all mail in the SPAM folders, generate a report, and send it by mail every 24 hours.

The following scripts are meant to be used on mailboxes in maildir format. Please note that both scripts are quick hacks and not optimized. If you have any recommendations for improvements, do not hesitate to leave a comment or send me an email.
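For orientation: a maildir keeps each message as a separate file in the cur, new and tmp subdirectories, and the gathering script only looks at those subdirectories when the path contains .Junk. The snippet below is a rough sketch of that traversal for Python 3, where os.path.walk no longer exists. The function name iter_junk_files is made up for this example, and the .Junk naming is taken from the script, so adjust it if your server names the folder differently.

```python
import os

# Maildir subdirectories that can hold message files.
CHECKDIR = ('cur', 'new', 'tmp')


def iter_junk_files(maildir):
    """Yield the path of every message file below a '.Junk' folder.

    Hypothetical helper: it mirrors the check in the gathering script,
    but uses os.walk instead of the Python 2 only os.path.walk.
    """
    for dirname, _subdirs, names in os.walk(maildir):
        if os.path.basename(dirname) in CHECKDIR and '.Junk' in dirname:
            for name in names:
                yield os.path.join(dirname, name)
```

Because it is a generator, it can be used directly in a loop, for example `for filename in iter_junk_files(maildir): ...`.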

Spam gathering

#!/usr/bin/env python
 
import email
import shelve
import datetime
import sys
import os.path
 
VERSION = '0.0.2'
CHECKDIR = ['cur', 'new', 'tmp']
 
count_spam = 0
 
def scan_maildir(arg, dirname, names):
    global count_spam
 
    if dirname[len(dirname)-3:] in CHECKDIR and dirname.find('.Junk') >= 0:
        for file in names:
            filename = os.path.join(dirname, file)
            fp = open(filename, 'r')
            if fp:
                msg = email.message_from_file(fp)
                fp.close()
                #for k in msg.keys():
                #    print '>>>', k, msg.get(k)
                id = msg.get('Message-id')
                if id not in db_junk.keys():
                    db_junk[id] = {'Date': msg.get('Date'),
                                   'From': msg.get('From'),
                                   'To': msg.get('To'),
                                   'Subject': msg.get('Subject'),
                                   'Score': msg.get('X-Spam-Score'),
                                   'filename': filename,
                                   'lastseen': str(datetime.date.today()),
                                   'reported': False}
                    count_spam += 1
 
if len(sys.argv) > 1:
    maildir = sys.argv[1]
 
    db = shelve.open(os.path.join(maildir, 'junkmail.db'), 'c')
    try:
        if db['version'] != VERSION:
            print 'Database version mismatch!'
            print '  Database:', db['version']
            print '  Script  :', VERSION
            print 'Aborted.'
            sys.exit(2)
    except KeyError:
        #print 'Initializing database version', VERSION
        db['version'] = VERSION
    try:
        db_junk = db['junk']
    except KeyError:
        db['junk'] = {}
        db_junk = db['junk']
 
    os.path.walk(maildir, scan_maildir, None)
    db['junk'] = db_junk # write changes to shelve
else:
    print 'Not enough parameters. Specify path to scan.'
    sys.exit(1)
db.close()

# Example script invocation:
$ ./spamsum.py ~/Maildir

The script above takes only one parameter, the directory to parse. It uses this directory as the root and stores all information about SPAM there in a file called junkmail.db. I haven't implemented anything to purge the database when mails are deleted.
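One way such a purge could look is sketched below. It assumes the scan also collects the Message-IDs it finds into a set (called seen_ids here, a name that does not exist in the script above) and then drops every database entry that was not seen. Comparing Message-IDs rather than stored filenames is a deliberate choice, because maildir files are renamed when they move from new to cur or when their flags change. The sketch is written to run on Python 3 as well.

```python
def purge_unseen(db_junk, seen_ids):
    """Delete entries whose Message-ID was not found in the current scan.

    db_junk  -- the dict stored under db['junk']
    seen_ids -- set of Message-IDs collected while scanning (assumed)
    Returns the number of entries removed.
    """
    # Build the list first, so the dict is not modified while iterating.
    stale = [msg_id for msg_id in db_junk if msg_id not in seen_ids]
    for msg_id in stale:
        del db_junk[msg_id]
    return len(stale)
```

The result still has to be written back with `db['junk'] = db_junk`, just as the script already does after scanning.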

Report generation

#!/usr/bin/env python
 
import shelve
import datetime
import sys
import os.path
import smtplib
from email.mime.text import MIMEText
 
VERSION = '0.0.2'
SENDERMAIL = 'spamreport@localhost'
 
if len(sys.argv) < 3:
    print 'Not enough parameters. Specify path to scan.'
    print 'Usage:\n\t%s <maildir> <email>' % sys.argv[0]
    sys.exit(1)
 
maildir = sys.argv[1]
email = sys.argv[2]
 
db = shelve.open(os.path.join(maildir, 'junkmail.db'), 'w')
try:
    if db['version'] != VERSION:
        print 'Database version mismatch!'
        print '  Database:', db['version']
        print '  Script  :', VERSION
        print 'Aborted.'
        sys.exit(2)
except KeyError:  # a bare except would also swallow the SystemExit raised above
    print 'Database is empty.'
    sys.exit(3)
 
try:
    db_junk = db['junk']
except KeyError:
    print 'Database is empty.'
    sys.exit(3)
 
sep_line = '=' * 78
 
spam_count = 0
 
line = 'Summary of unreported SPAM mails:\n%s' % sep_line
senders = []
text = [line]
for id in db_junk:
    if not db_junk[id]['reported']:
        junk = db_junk[id]
        #print id, db_junk[id]
        line = '   [ ] From: %s --> To: %s\n\
      Subject: %s\n\
      Date: %s\n\
      Spam-Score: %s\n' % (junk['From'], junk['To'], junk['Subject'],
                           junk['Date'], junk['Score'])
 
        text.append(line)
        senders.append((junk['From'], junk['Date'], junk['Score']))
        junk['reported'] = True
        spam_count += 1
db['junk'] = db_junk # sync changes
db.close()
 
text.append('\n\n')
text.append('Summary of unreported SPAM sender addresses:\n%s' % sep_line)
for sender in senders:
    line = '   [ ] From: %s\n\
       Date: %s\n\
       Score: %s\n' % (sender[0], sender[1], sender[2])
    text.append(line)
 
msg = MIMEText('\n'.join(text))
msg['Subject'] = '%d reported Mails by GoatPr0n SPAM Report' % spam_count
msg['From'] = SENDERMAIL
msg['To'] = email
 
if spam_count > 0:
    s = smtplib.SMTP('localhost:10025') # bypass spamassassin
    s.sendmail(SENDERMAIL, [email], msg.as_string())
    s.quit()

# Example script invocation:
$ ./spamreport.py ~/Maildir admin@localhost

This script needs to be run with two parameters. The first one is the root directory where junkmail.db is located, and the second one is the email address the generated report is sent to.
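Both scripts assign db['junk'] again after changing it. This matters because a shelve opened without writeback does not notice changes made inside a nested dict. The snippet below is a small sketch of that read, modify and reassign pattern on its own. The function name mark_all_reported is invented for this example and is not part of the scripts, and the code is written for Python 3.

```python
import shelve


def mark_all_reported(db_path):
    """Set 'reported' on every entry and return how many were changed."""
    db = shelve.open(db_path, 'c')
    try:
        junk = db.get('junk', {})  # this is a copy loaded from disk
        changed = 0
        for entry in junk.values():
            if not entry['reported']:
                entry['reported'] = True
                changed += 1
        # Without this assignment the modified copy would never be saved.
        db['junk'] = junk
    finally:
        db.close()
    return changed
```

Opening the shelve with writeback=True would avoid the explicit assignment, at the cost of caching every accessed entry in memory until the database is closed.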

Cron job

#!/bin/sh
set -e
 
PATH=/usr/bin:/usr/local/sbin
MAILDIRS=/path/to/maildirs
 
# Quote the variables so that paths containing spaces do not break the loop.
for maildirs in "$MAILDIRS"/*; do
    for maildir in "$maildirs"/*; do
        email=`basename "$maildir"`
 
        spamsum.py "$maildir"
        spamreport.py "$maildir" "$email"
    done
done

I have placed it in /etc/cron.daily.

The resulting report

Return-Transfer-Encoding: 7bit
Subject: 1 reported Mails by GoatPr0n SPAM Report
From: spamreport@localhost
To: admin@localhost

Summary of unreported SPAM mails:
==============================================================================
   [ ] From: "Drugstore #1" <root@localhost> --> To: admin@localhost
      Subject: *****SPAM***** Welcome, admin. Everything on -80% today
      Date: Tue, 23 Mar 2010 21:56:44 -0500
      Spam-Score: 10.5



Summary of unreported SPAM sender addresses:
==============================================================================
   [ ] From: "Drugstore #1" <root@localhost>
       Date: Tue, 23 Mar 2010 21:56:44 -0500
       Score: 10.5

The email addresses in these samples have been altered, so hopefully email crawlers will use them to send the spam to themselves…

Posted 2010/03/24 14:53 · Julian Knauer
blog/2010/03/24.quick.solution.for.spam.reports.on.your.mailserver.txt · Last modified: 2010/03/24 14:53 by jpk
CC Attribution-Noncommercial-Share Alike 3.0 Unported