Enhancing TIP Spamtrap Data Collection

Filed Under: Projects, TIP by buffer — Leave a comment

July 3, 2009

A new spamtrap submodule is currently under development. Its targets are spamtraps located on mailservers which I administer. Few of these mailservers generate huge amounts of spam mails and this leads to great performance troubles if you try to download them by POP3/IMAP and then parse. A different approach was thought for situations like these. In fact, I developed a small agent which has to be run on the mailserver host. This agent loops listing the spam files in the maildir and parsing them without any network-based data transfer. When it has done, it saves the interesting data in a serialized form on the filesystem (through the Python cPickle module) and assigns to this data a version number. This allows a remote agent to ask the last version and download just the missing versions. This submodule was developed using Twisted Perspective Broker directly serializing on the wire saved data and currently defines a basic authentication mechanism too. While developing this submodule I was thinking that it could be nice to use it for sharing data between researchers coming from multiple spamtraps. Suggestions are welcome!

« TIP Fast-Flux Tracking module new design Being for the benefit of Mr. Kite! »

Permanent Link