Tag Archive: Projects

Thug 0.4.0 was released on June 8th, 2012, and a huge number of really important features have been added since then. During the last two years I had a lot of fun thinking about and designing the future of the project, and I’m really proud of what Thug is now. I have to thank the many people who contributed with their suggestions, ideas, bug reports and sometimes patches. You know who you are. Thank you so much!

But I have decided that it’s time to start a new branch. Thug 0.5.0 will hopefully be released before the end of July, and the 0.5 branch will focus more on performance, scalability and efficiency optimizations.

Moreover, I decided to start writing a Know Your Tools (KYT) paper about Thug (please take a look at the KYE/KYT papers published by the Honeynet Project at https://www.honeynet.org/papers). But we were thinking about using a different approach this time. Historically, a Honeynet Project KYE/KYT committee has taken care of the quality of each paper through a strong review process. This is obviously good, but it also means the paper is not public until it is published.

But when I look at Thug I realize there are a lot of people out there who use it daily, sometimes in unexpected ways, and their feedback could be useful as well. So the idea is to start writing the paper and keep updating it in the same GitHub tree as the code (https://github.com/buffer/thug). This would allow everyone to easily contribute to the paper through GitHub pull requests. The KYE/KYT committee will still guarantee the high quality of the paper through its review work, but this is the first experiment with a collaborative paper. So if you are a Thug user and want to share some of your experiences, tips and tricks, you are welcome to contribute!

I’m glad to announce that I have publicly released a brand new low-interaction honeyclient I have been working on for a few months. The project name is Thug, and it was publicly presented during the Honeynet Project Security Workshop at Facebook HQ in Menlo Park. Please take a look at the presentation for details about Thug.

Just a few highlights about Thug:

  • DOM (almost) compliant with W3C DOM Core and HTML specifications (Level 1, 2 and partially 3) and partially compliant with W3C DOM Events and Style specifications
  • Google V8 JavaScript engine wrapped through PyV8
  • Vulnerability modules (ActiveX controls, core browser functionalities, browser plugins)
  • Currently 6 IE personalities supported
  • Hybrid static/dynamic analysis
  • MITRE MAEC native logging format
  • HPFeeds and MongoDB logging

The source code is available here.

Feedback and comments welcome.

Have fun!

A new improvement in PHoneyC DOM emulation code was committed in SVN r1624. The idea is to better emulate the DOM behaviour depending on the selected browser personality. Let’s take a look at the code starting from the personalities definition in config.py.

UserAgents = [
    (1,
     "Internet Explorer 6.0 (Windows 2000)",
     "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727)",
     "Mozilla",
     "Microsoft Internet Explorer",
     "4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727)",
     "ie60",
    ),
    (2,
     "Internet Explorer 6.1 (Windows XP)",
     "Mozilla/4.0 (compatible; MSIE 6.1; Windows XP; .NET CLR 1.1.4322; .NET CLR 2.0.50727)",
     "Mozilla",
     "Microsoft Internet Explorer",
     "4.0 (compatible; MSIE 6.1; Windows XP; .NET CLR 1.1.4322; .NET CLR 2.0.50727)",
     "ie61",
    ),
    (3,
     "Internet Explorer 7.0 (Windows XP)",
     "Mozilla/4.0 (Windows; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)",
     "Mozilla",
     "Microsoft Internet Explorer",
     "4.0 (Windows; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)",
     "ie70",
    ),
    (4,
     "Internet Explorer 8.0 (Windows XP)",
     "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; (R1 1.5); .NET CLR 1.1.4322; .NET CLR 2.0.50727)",
     "Mozilla",
     "Microsoft Internet Explorer",
     "4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; (R1 1.5); .NET CLR 1.1.4322; .NET CLR 2.0.50727)",
     "ie80",
    ),
]

It’s important to notice that a tag (e.g. ie80) was added to each personality. Taking a look at DOM/Window.py, the following code can be seen.

    def __init_methods(self):
        for attr in dir(self):
            prefix = "_Window__window_%s_" % (config.browserTag, )
            if attr.startswith(prefix):
                p = attr.split(prefix)[1]
                self.__dict__['__cx'].add_global(p, getattr(self, attr))
                self.__dict__['__cx'].execute("window.%s = %s;" % (p, p, ))

Let’s consider an example and assume the Internet Explorer 8.0 personality was selected. In this case the prefix takes the value _Window__window_ie80_. A few simple wrappers were created for each method, one per personality, as shown in the following code.

    def __window_back(self):
        """
        Returns the window to the previous item in the history.

        Syntax

        window.back()

        Parameters

        None.
        """
        pass

    def __window_ie60_back(self):
        self.__window_back()

    def __window_ie61_back(self):
        self.__window_back()

    def __window_ie70_back(self):
        self.__window_back()

    def __window_ie80_back(self):
        self.__window_back()

    def __window_firefox_back(self):
        self.__window_back()

This is quite a simple situation, but what if you want to define the addEventListener method just for the Firefox personalities and attachEvent just for the Internet Explorer ones? It’s really simple to do!

    def __window_attachEvent(self, sEvent, fpNotify):
        if dataetc.isevent(sEvent, 'window'):
            self.__dict__[sEvent] = fpNotify

    def __window_ie60_attachEvent(self, sEvent, fpNotify):
        self.__window_attachEvent(sEvent, fpNotify)

    def __window_ie61_attachEvent(self, sEvent, fpNotify):
        self.__window_attachEvent(sEvent, fpNotify)

    def __window_ie70_attachEvent(self, sEvent, fpNotify):
        self.__window_attachEvent(sEvent, fpNotify)

    def __window_ie80_attachEvent(self, sEvent, fpNotify):
        self.__window_attachEvent(sEvent, fpNotify)

    def __window_detachEvent(self, sEvent, fpNotify):
        if sEvent in self.__dict__ and self.__dict__[sEvent] == fpNotify:
            del self.__dict__[sEvent]

    def __window_ie60_detachEvent(self, sEvent, fpNotify):
        self.__window_detachEvent(sEvent, fpNotify)

    def __window_ie61_detachEvent(self, sEvent, fpNotify):
        self.__window_detachEvent(sEvent, fpNotify)

    def __window_ie70_detachEvent(self, sEvent, fpNotify):
        self.__window_detachEvent(sEvent, fpNotify)

    def __window_ie80_detachEvent(self, sEvent, fpNotify):
        self.__window_detachEvent(sEvent, fpNotify)

    def __window_addEventListener(self, type, listener, useCapture = False):
        if dataetc.isevent(type, 'window'):
            self.__dict__[type] = listener

    def __window_firefox_addEventListener(self, type, listener, useCapture = False):
        # Forward the caller's useCapture value instead of always passing False
        self.__window_addEventListener(type, listener, useCapture)

    def __window_removeEventListener(self, type, listener, useCapture = False):
        if type in self.__dict__ and self.__dict__[type] == listener:
            del self.__dict__[type]

    def __window_firefox_removeEventListener(self, type, listener, useCapture = False):
        # Forward the caller's useCapture value instead of always passing False
        self.__window_removeEventListener(type, listener, useCapture)

Moreover, this approach makes it possible to insert personality-specific code in the wrappers if needed, while implementing the method’s functionality in the higher-level __window_<method_name> method.
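To make the prefix trick easier to play with, here is a minimal, self-contained sketch of it. FakeContext is a hypothetical stand-in for the JS context (it only records the names passed to add_global), the browser tag is passed to the constructor instead of being read from config.browserTag, and the dataetc.isevent check is left out, so this is an illustration of the idea rather than the PHoneyC code.

```python
class FakeContext(object):
    """Hypothetical stand-in for the JS context: it only records globals."""
    def __init__(self):
        self.globals = {}

    def add_global(self, name, value):
        self.globals[name] = value


class Window(object):
    def __init__(self, browser_tag):
        self.cx = FakeContext()
        # Inside the class, __window_<tag>_<name> is mangled to
        # _Window__window_<tag>_<name>, so we can filter on that prefix.
        prefix = "_Window__window_%s_" % (browser_tag, )
        for attr in dir(self):
            if attr.startswith(prefix):
                self.cx.add_global(attr[len(prefix):], getattr(self, attr))

    def __window_attachEvent(self, sEvent, fpNotify):
        self.__dict__[sEvent] = fpNotify

    def __window_ie80_attachEvent(self, sEvent, fpNotify):
        self.__window_attachEvent(sEvent, fpNotify)

    def __window_addEventListener(self, type, listener, useCapture=False):
        self.__dict__[type] = listener

    def __window_firefox_addEventListener(self, type, listener, useCapture=False):
        self.__window_addEventListener(type, listener, useCapture)


ie = Window("ie80")
ff = Window("firefox")
print(sorted(ie.cx.globals))  # ['attachEvent']
print(sorted(ff.cx.globals))  # ['addEventListener']
```

Only the wrappers carrying the selected tag are exposed, so a page running under the ie80 personality never sees addEventListener.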

“Dionaea is meant to be a Nepenthes successor, embedding Python as scripting language, using libemu to detect shellcodes, supporting IPv6 and TLS” (taken from the Dionaea homepage). Besides being the most interesting project for trapping malware that exploits vulnerabilities, Dionaea supports a really cool feature which allows it to log to XMPP services, as described here. TIP now exploits this feature, receiving and storing such logs (many thanks to Markus Koetter for his help and support). Here is just an example of what happened today…

2010-08-11 10:44:21+0200 [XmlStream,client] [Malware Sample] MD5: e4736922939a028384522b17e9406474
2010-08-11 10:44:21+0200 [XmlStream,client] [Malware Sample] SHA-1: 920b67cb250abdb593b1104a9922e2468b0fe252
2010-08-11 10:44:21+0200 [XmlStream,client] [Malware Sample] PEHash: 40891becb5ec8780f1c5e51f3971c9fb2cc17dab

Another great step forward was taken. Stay tuned!

A few weeks ago I started reviewing the PHoneyC DOM emulation code and realized it was becoming hard to maintain and debug due to a huge amount of undocumented (and sometimes awful) hacks. For this reason I decided it was time to patch (and sometimes rewrite from scratch) such code. These posts will describe how the new DOM emulation code will work. The patch is not available right now since I’m still testing the code, but I plan to commit it to the PHoneyC SVN in the next few days.

In this first post we will take a look at the Window object in DOM/Window.py. During object initialization, the following code is executed.

    def __init_context(self):
        """
            Spidermonkey Context initialization.
        """
        document = Document(self)
        self.__dict__['__cx'] = self.__dict__['__rt'].new_context(alertlist = [])
        self.__dict__['__sl'] = []
        self.__dict__['__fl'] = [document]

        self.__init_properties(document)
        self.__init_methods()
        self.__finalize_context()

Let’s go into further details. First of all Window object properties are initialized through the __init_properties method.

    def __init_properties(self, document):
        self.__dict__['__cx'].add_global('window', self)
        self.__dict__['__cx'].add_global('self'  , self)
        self.__dict__['__cx'].execute("window.window = window;")

        self.__dict__['__cx'].add_global('document', document)
        self.__dict__['__cx'].execute("window.document = document;")

        self.__dict__['__cx'].add_global('location', document.location)
        self.__dict__['__cx'].execute("window.location = location;")

        self.__dict__['__cx'].add_global("ActiveXObject", ActiveXObject)

        self.__dict__['__cx'].add_global("navigator", Navigator())
        self.__dict__['__cx'].execute("window.navigator = navigator;")

        self.__dict__['__cx'].add_global("screen", unknown())
        self.__dict__['__cx'].execute("window.screen = screen;")

        if 'top_window' in self.__dict__['__root'].__dict__:
            if self.__dict__['__referrer']:
                top = self.__dict__['__referrer']
            else:
                top = self.__dict__['__root'].top_window
        else:
            top = self

        self.__dict__['__cx'].add_global("top", top)
        self.__dict__['__cx'].execute("window.top = top;")

        self.__dict__['__cx'].add_global("parent", top)
        self.__dict__['__cx'].execute("window.parent = parent;")

        self.__dict__['__cx'].add_global("history", History(document))
        self.__dict__['__cx'].execute("window.history = history;")

        self.__dict__['__cx'].execute("window.innerWidth = 400;")
        self.__dict__['__cx'].execute("window.innerHeight = 200;")

        self.__init_undefined_properties()

    def __init_undefined_properties(self):
        properties = ('external', 'opera', )

        for prop in properties:
            self.__dict__['__cx'].execute("window.%s = undefined;" % (prop, ))

The code should be straightforward to understand, and the idea behind it is really simple: it makes the attributes and methods of Python objects accessible from JS. Let’s move on to the most interesting stuff. Next, the __init_methods method is called.

    def __init_methods(self):
        for attr in dir(self):
            if attr.startswith('_Window__window'):
                p = attr.split('_Window__window_')[1]
                self.__dict__['__cx'].add_global(p, getattr(self, attr))
                self.__dict__['__cx'].execute("window.%s = %s;" % (p, p, ))

Not so easy to understand? Let’s take a look at the definition of a method.

    def __window_back(self):
        """
        Returns the window to the previous item in the history.

        Syntax

        window.back()

        Parameters

        None.
        """
        pass

This is a private class method since its name starts with __. “If you try to call a private method, Python will raise a slightly misleading exception, saying that the method does not exist. Of course it does exist, but it’s private, so it’s not accessible outside the class. Strictly speaking, private methods are accessible outside their class, just not easily accessible. Nothing in Python is truly private; internally, the names of private methods and attributes are mangled and unmangled on the fly to make them seem inaccessible by their given names.” (taken from Dive Into Python). We can access the __window_back method of the Window class by the name _Window__window_back. This is the black magic __init_methods uses for initializing methods. As a consequence, adding a new method is really easy. All you need to do is define a method named __window_<window_method_name> whose signature matches that of the method being emulated. How to emulate the method is up to you, but a simple pass could do the trick.
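The name mangling itself can be checked in a few lines of plain Python. This is just a toy class that happens to share the Window name, not the real DOM/Window.py.

```python
class Window(object):
    def __window_back(self):
        """Emulated window.back(): nothing to do."""
        pass


w = Window()

# The given name is not visible from outside the class...
print(hasattr(w, "__window_back"))          # False
# ...but the mangled one is.
print(hasattr(w, "_Window__window_back"))   # True

# Recover the JS-visible method names the same way __init_methods does.
marker = "_Window__window_"
exposed = [attr.split(marker)[1] for attr in dir(w) if attr.startswith(marker)]
print(exposed)                              # ['back']
```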

The last step happens in the __finalize_context method.

169     def __finalize_context(self):
170         self.__dict__['__cx'].execute("Event = function(){}")
171         self.__dict__['__cx'].execute("function CollectGarbage() {};")
172         self.__dict__['__cx'].execute("function quit() {};")
173         self.__dict__['__cx'].execute("function prompt() {};")
175         for clsname in dataetc.classlist:
176             inits = {'window' : self,
177                      'tagName': dataetc.classtotag(clsname),
178                      'parser' : None}
179             self.__dict__['__cx'].add_global(clsname, DOMObjectFactory(clsname, inits))

The most interesting code is in lines 175-179. First of all let’s take a look at the DOMObjectFactory code (DOM/ClassFactory.py) which is a genuine Python hack.

class DynamicDOMObject(DOMObject):
    def __init__(self):
        self.__dict__.update(self.inits)
        DOMObject.__init__(self, self.window, self.tagName, self.parser)

def DOMObjectFactory(name, initializers):
    return type(name, (DynamicDOMObject,), {'inits' : initializers})

After reading Python documentation it should be easy to understand how this code works and how it’s able to dynamically add new DOM objects to the context.

type(name, bases, dict)

Return a new type object. This is essentially a dynamic form of the class statement. The name string is the class name and becomes the __name__ attribute; the bases tuple itemizes the base classes and becomes the __bases__ attribute; and the dict dictionary is the namespace containing definitions for class body and becomes the __dict__ attribute. For example, the following two statements create identical type objects:

>>> class X(object):
...     a = 1
...
>>> X = type('X', (object,), dict(a=1))
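Putting the two pieces together, the factory can be tried outside PHoneyC with a dummy base class. In the sketch below DOMObject is a hypothetical stand-in that just stores its arguments, and the 'Image' to 'img' mapping is an assumed example of what dataetc.classtotag might return.

```python
class DOMObject(object):
    """Hypothetical stand-in for the real DOMObject base class."""
    def __init__(self, window, tagName, parser):
        self.window = window
        self.tagName = tagName
        self.parser = parser


class DynamicDOMObject(DOMObject):
    def __init__(self):
        # 'inits' is a class attribute injected by the factory below
        self.__dict__.update(self.inits)
        DOMObject.__init__(self, self.window, self.tagName, self.parser)


def DOMObjectFactory(name, initializers):
    # Build a new class called `name` at runtime, carrying its initializers
    return type(name, (DynamicDOMObject,), {'inits': initializers})


# Assumed example values: no real window or parser is involved here
Image = DOMObjectFactory('Image', {'window': None, 'tagName': 'img', 'parser': None})
img = Image()
print(Image.__name__, img.tagName)  # Image img
```

Each call to the factory returns a distinct class, so a script executing new Image() gets an object that already knows its window and tag name without the constructor taking any argument.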

What about the Window event handlers? They are handled with a different mechanism which can be fully understood just by analyzing how the new DOM emulation code preparses the pages deferring code execution until the last possible moment. I’ll analyze such feature in a future post in greater detail. Right now what you have to know is that if the handler for the event <event> is set, the Window attribute on<event> is set and contains the handler code. Once you understand it, the following code in DOM/DOM.py used for event handling should be easy to understand.

    def get_event_func(self, name, f):
        # Extract the function body: everything after the first '{',
        # with the final closing '}' removed
        begin = str(f).index('{') + 1
        s = str(f)[begin:].split('}')
        script = '}'.join(s[:-1]) + s[-1]
        return script

    def event_handler(self, window, name, f):
        if name in window.__dict__:
            try:
                script = self.get_event_func(name, f)
                window.__dict__['__cx'].execute(script)
            except Exception:
                # A broken handler must not stop the analysis
                traceback.print_exc()

    def handle_events(self, window):
        window.__dict__['__warning'] = False
        self.event_handler(window, 'onabort'         , window.onabort)
        self.event_handler(window, 'onbeforeunload'  , window.onbeforeunload)
        self.event_handler(window, 'onblur'          , window.onblur)
        self.event_handler(window, 'onchange'        , window.onchange)
        self.event_handler(window, 'onclick'         , window.onclick)
        self.event_handler(window, 'onclose'         , window.onclose)
        self.event_handler(window, 'oncontextmenu'   , window.oncontextmenu)
        self.event_handler(window, 'ondragdrop'      , window.ondragdrop)
        self.event_handler(window, 'onerror'         , window.onerror)
        self.event_handler(window, 'onfocus'         , window.onfocus)
        self.event_handler(window, 'onhashchange'    , window.onhashchange)
        self.event_handler(window, 'onkeydown'       , window.onkeydown)
        self.event_handler(window, 'onkeypress'      , window.onkeypress)
        self.event_handler(window, 'onkeyup'         , window.onkeyup)
        self.event_handler(window, 'onload'          , window.onload)
        self.event_handler(window, 'onmousedown'     , window.onmousedown)
        self.event_handler(window, 'onmousemove'     , window.onmousemove)
        self.event_handler(window, 'onmouseout'      , window.onmouseout)
        self.event_handler(window, 'onmouseover'     , window.onmouseover)
        self.event_handler(window, 'onmouseup'       , window.onmouseup)
        self.event_handler(window, 'onmozorientation', window.onmozorientation)
        self.event_handler(window, 'onpaint'         , window.onpaint)
        self.event_handler(window, 'onpopstate'      , window.onpopstate)
        self.event_handler(window, 'onreset'         , window.onreset)
        self.event_handler(window, 'onresize'        , window.onresize)
        self.event_handler(window, 'onscroll'        , window.onscroll)
        self.event_handler(window, 'onselect'        , window.onselect)
        self.event_handler(window, 'onsubmit'        , window.onsubmit)
        self.event_handler(window, 'onunload'        , window.onunload)
        self.event_handler(window, 'onpageshow'      , window.onpageshow)
        self.event_handler(window, 'onpagehide'      , window.onpagehide)
        window.__dict__['__warning'] = True
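The string handling in get_event_func is the least obvious part, so here is a small standalone sketch of the same body extraction working on a plain string. It assumes that str(f) on the handler yields the JS source of the function, and the sample handler text is made up for the example.

```python
def extract_body(source):
    """Return what lies between the first '{' and the last '}' of a
    JS function source, keeping any nested braces untouched."""
    begin = source.index('{') + 1
    end = source.rindex('}')
    # Anything after the last '}' is kept, as the split/join version does
    return source[begin:end] + source[end + 1:]


handler = "function onload() { if (a) { b(); } }"
print(repr(extract_body(handler)))  # ' if (a) { b(); } '
```

The extracted body is what gets passed to execute, which is why the handler code runs in the window context rather than being called as a function.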

Today I was in need of some fun, so I started adding a new API call which allows checking whether a domain is malicious or not. The check avoids hitting the database at all and just queries the search index. The results I got are quite surprising. Take a look, considering that code 409 means ‘Object already exists’ while code 410 means ‘Object does not exist’.

Let’s start with a benign domain not tracked by TIP.

buffer@alnitak ~ $ time wget http://xxxx.xxxx.xx/api/check/domain/test@@it/
HTTP request sent, awaiting response… 410 GONE
2010-07-22 17:46:58 ERROR 410: GONE.

real    0m0.017s
user    0m0.001s
sys    0m0.001s

Now let’s move to a malicious domain tracked by TIP.

buffer@alnitak ~ $ time wget http://xxxx.xxxx.xx/api/check/domain/hazelpay@@ru/
HTTP request sent, awaiting response… 409 CONFLICT
2010-07-22 17:47:07 ERROR 409: CONFLICT.

real    0m0.022s
user    0m0.000s
sys    0m0.002s
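A client could wrap this call along the following lines. This is only a sketch: the function names are made up, the replacement of dots with @@ is inferred from the two URLs above, and base_url is a placeholder since the real host is masked.

```python
import urllib.error
import urllib.request


def encode_domain(domain):
    # Assumption inferred from the URLs above: dots are sent as '@@'
    return domain.replace(".", "@@")


def is_tracked(status_code):
    """Map the API status code to a boolean answer."""
    if status_code == 409:      # 'Object already exists'
        return True
    if status_code == 410:      # 'Object does not exist'
        return False
    raise ValueError("unexpected status code: %r" % (status_code, ))


def check_domain(base_url, domain):
    """Query <base_url>/api/check/domain/<encoded domain>/ (hypothetical helper)."""
    url = "%s/api/check/domain/%s/" % (base_url.rstrip("/"), encode_domain(domain))
    try:
        status = urllib.request.urlopen(url).getcode()
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx answers, which are the expected ones here
        status = err.code
    return is_tracked(status)


print(encode_domain("hazelpay.ru"))  # hazelpay@@ru
```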

It’s been a really long time since I last posted about TIP. The good news is that TIP is growing really fast, mainly thanks to its modular design, which allows plugging in different kinds of tracking modules with minimum effort. In this post I’ll provide a brief overview of the new features that are already integrated and of the upcoming ones.

First of all, a new TIP Collector module named Malware was integrated. It currently handles data coming from GLSandbox, a sandbox for automated malware analysis written by Guido Landi. Beyond analyzing the behaviour of malware samples, the idea is to collect additional data coming from such analysis. One example is C&C identification, which can be automatically handed to a botnet monitoring tool for further analysis. Another example is information about domains, which could lead to the identification of new fast-flux domains. GLSandbox code is currently not public, but plans exist to release it in the near future.

A search engine was integrated in TIP in the latest version! The idea is to index the database in order to be able to search it with great efficiency and performance. Haystack was used to implement it. The first tests, done using Apache Solr (deployed as an Apache Tomcat application) as backend, confirm it works like a charm!

A new REST API was designed and implemented in order to more easily search and share data with other users and/or applications. The API was built using Django-Piston and supports OAuth authentication.

Moreover, the latest version of TIP supports Django 1.2 and drops support for previous versions (due to some incompatible changes between versions 1.1 and 1.2). It also introduces support for migrations using South, in order to make changes to the database schema more easily while developing.

A lot of new cool features, a lot of upcoming cool ones! Stay tuned!

About two months ago I started contributing to PhoneyC, a pure Python honeyclient implementation originally developed by Jose Nazario. My impression is that our development efforts are on the right track. The code can be downloaded here. If you’re interested, take a look at the different development branches and give us your feedback. Moreover, if you’re interested in technical details about PhoneyC, please read this paper by Jose Nazario.

It’s been a long time since I last wrote about TIP and its evolution. A lot of things have changed during these last months in order to make TIP more efficient and scalable, so maybe it’s worth talking about it! First of all, TIP really exploits the Twisted Plugin System as best as it can. As shown below, the Tracking Intelligence Project services are now Twisted commands implemented through the plugin system.

buffer@alnitak ~/tipproject/tip/core $ twistd --help
Usage: twistd [options]
      --savestats        save the Stats object rather than the text output of the profiler.
  -o, --no_save          do not save state on shutdown
  -e, --encrypted        The specified tap/aos/xml file is encrypted.
      --nothotshot       DEPRECATED. Don't use the hotshot profiler even if it's available.
  -n, --nodaemon         don't daemonize, don't use default umask of 0077
  -q, --quiet            No-op for backwards compatibility.
      --originalname     Don't try to change the process name
      --syslog           Log to syslog, not to file
      --euid             Set only effective user-id rather than real user-id.
  -l, --logfile=         log to a specified file, - for stdout
  -p, --profile=         Run in profile mode, dumping results to specified file
      --profiler=        Name of the profiler to use (profile, cprofile, hotshot). [default: hotshot]
  -f, --file=            read the given .tap file [default: twistd.tap]
  -y, --python=          read an application from within a Python file (implies -o)
  -x, --xml=             Read an application from a .tax file (Marmalade format).
  -s, --source=          Read an application from a .tas file (AOT format).
  -d, --rundir=          Change to a supplied directory before running [default: .]
      --report-profile=  DEPRECATED.

                         Manage --report-profile option, which does nothing currently.

      --prefix=          use the given prefix when syslogging [default: twisted]
      --pidfile=         Name of the pidfile [default: twistd.pid]
      --chroot=          Chroot to a supplied directory before running
  -u, --uid=             The uid to run as.
  -g, --gid=             The gid to run as.
      --umask=           The (octal) file creation mask to apply.
      --help-reactors    Display a list of possibly available reactor names.
      --version          Print version information and exit.
      --spew             Print an insanely verbose log of everything that happens. Useful when debugging freezes or locks in complex code.
  -b, --debug            run the application in the Python Debugger (implies nodaemon), sending SIGUSR2 will drop into debugger
  -r, --reactor=         Which reactor to use (see --help-reactors for a list of possibilities)
      --help             Display this help and exit.
    tip-fastflux         Tracking Intelligence Project Fast-Flux Tracking service.
    tip-collector        Tracking Intelligence Project Collector service
    ftp                  An FTP server.
    telnet               A simple, telnet-based remote debugging service.
    socks                A SOCKSv4 proxy service.
    manhole-old          An interactive remote debugger service.
    portforward          A simple port-forwarder.
    web                  A general-purpose web server which can serve from a filesystem or application resource.
    inetd                An inetd(8) replacement.
    xmpp-router          An XMPP Router server
    words                A modern words server
    toc                  An AIM TOC service.
    dns                  A domain name server.

This is really useful since it allows running just the needed modules and fine-tuning their behaviour, as shown below.

buffer@alnitak ~/tipproject/tip/core $ twistd tip-collector --help
Usage: twistd [options] tip-collector [options]
  -o, --one-shot                 Run the collector just one time
  -c, --concurrency-level=       Set maximum concurrency level [default: 1]
  -s, --reschedule-interval=     Set collector restart interval [default: 21600]
      --help                     Display this help and exit.

buffer@alnitak ~/tipproject/tip/core $ twistd tip-fastflux --help
Usage: twistd [options] tip-fastflux [options]
  -w, --whitelist-force-refresh  Force white-list domain refreshing at every commit
  -s, --hot-restart=             Set hot tracking process restart interval [default: 14400]
  -t, --cold-restart=            Set cold tracking process restart interval [default: 7200]
  -m, --hot-schedule=            Set hot tracking scheduling interdomain delay [default: 0.1]
  -n, --cold-schedule=           Set cold tracking scheduling interdomain delay [default: 0.2]
  -k, --cold-delay=              Set cold tracking first-start delay [default: 300]
      --help                     Display this help and exit.

Moreover, I’m definitely satisfied with the Fast-Flux Tracking module design, which is explained in the commit log reported below.

commit 9ebf0d1b8ac73997f35d70435bdd3da52da6cd5d
Author: Angelo Dell’Aera <buffer@antifork.org>
Date:   Tue Aug 4 10:04:52 2009 +0200

Fast-Flux Tracking Module Domain Queues

. Fast-Flux Tracking Module was modified in order to allow two concurrent domain queues. The first queue is used just for domains which are still known to be fluxy. This is the most I/O intensive queue since it requires most frequently database operations for storing the collected data. These blocking operations are realized through a thread pool and the tests done on the previous version of the module showed these have a detrimental impact on the domain scheduling process slowing it too much. So the second queue was added and it is used for domains not still classified as fluxy. The idea is to minimize blocking operations so if a domain is not fluxy there are no blocking operations at all. If a domain is fluxy, the collected data are saved and then the tracking path ends in such a way that when the first queue will restart it will take care of this new domain. It’s worth noting that this approach allows really frequent restarts of both queues with no destructive interference among them and with a really low memory consumption.
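The interplay between the two queues can be pictured with a toy, synchronous model. This is not the Twisted code: the known_fluxy set stands in for the database, looks_fluxy is a hypothetical classifier, and the function names are invented for the sketch.

```python
from collections import deque


def cold_pass(candidates, known_fluxy, looks_fluxy):
    """Walk the cold queue once. Only domains that turn out to be fluxy
    cause a write; all the others cost no blocking operation at all."""
    still_cold = deque()
    for domain in candidates:
        if looks_fluxy(domain):
            known_fluxy.add(domain)    # the single blocking write on this path
        else:
            still_cold.append(domain)  # keep watching it on the cold queue
    return still_cold


def hot_restart(known_fluxy):
    """Rebuild the hot queue from the domains known to be fluxy."""
    return deque(sorted(known_fluxy))


known = {"a.example"}                       # stand-in for the database
cold = deque(["b.example", "c.example"])

cold = cold_pass(cold, known, lambda d: d == "b.example")
hot = hot_restart(known)
print(list(hot), list(cold))  # ['a.example', 'b.example'] ['c.example']
```

The newly classified domain is never pushed to the hot queue directly. It is simply found there the next time the hot queue restarts, which is what keeps the two queues from interfering with each other.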

A prerelease is coming. Stay tuned!

A new spamtrap submodule is currently under development. Its targets are spamtraps located on mailservers which I administer. A few of these mailservers gather huge amounts of spam, and this leads to serious performance problems if you try to download the messages via POP3/IMAP and then parse them. A different approach was devised for situations like these: I developed a small agent which has to be run on the mailserver host. This agent loops over the spam files in the maildir and parses them without any network-based data transfer. When it is done, it saves the interesting data in serialized form on the filesystem (through the Python cPickle module) and assigns a version number to this data. This allows a remote agent to ask for the latest version and download just the missing versions. This submodule was developed using Twisted Perspective Broker, directly serializing the saved data on the wire, and it currently defines a basic authentication mechanism too. While developing this submodule I was thinking that it could be nice to use it for sharing data coming from multiple spamtraps between researchers. Suggestions are welcome!
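The versioning scheme can be sketched without any networking. The VersionedStore class, its method names and the file naming are all invented for this example, pickle is used where the Python 2 code would use cPickle, and the Perspective Broker transport and the authentication are left out.

```python
import os
import pickle
import tempfile


class VersionedStore(object):
    """Hypothetical on-disk store: one pickle file per version number."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, version):
        return os.path.join(self.directory, "%08d.pickle" % version)

    def last_version(self):
        versions = [int(name.split(".")[0])
                    for name in os.listdir(self.directory)
                    if name.endswith(".pickle")]
        return max(versions) if versions else 0

    def save(self, data):
        """Serialize data under the next version number and return it."""
        version = self.last_version() + 1
        with open(self._path(version), "wb") as fd:
            pickle.dump(data, fd)
        return version

    def missing_since(self, have):
        """Return (version, data) pairs newer than the version the caller has."""
        result = []
        for version in range(have + 1, self.last_version() + 1):
            with open(self._path(version), "rb") as fd:
                result.append((version, pickle.load(fd)))
        return result


store = VersionedStore(tempfile.mkdtemp())
store.save({"urls": ["http://spam.example/1"]})   # made-up sample data
store.save({"urls": ["http://spam.example/2"]})

# A remote agent that already holds version 1 only fetches version 2
print(store.missing_since(1))
```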
