Tag Archive: TIP


“Dionaea is meant to be a Nepenthes successor, embedding Python as scripting language, using libemu to detect shellcodes, supporting IPv6 and TLS” (taken from the Dionaea homepage). Besides being the most interesting project for trapping malware that exploits vulnerabilities, Dionaea supports a really cool feature: it can log to XMPP services, as described here. TIP now takes advantage of this feature by receiving and storing such logs (many thanks to Markus Koetter for his help and support). Here is an example of what happened today…

2010-08-11 10:44:21+0200 [XmlStream,client] [Malware Sample] MD5: e4736922939a028384522b17e9406474
2010-08-11 10:44:21+0200 [XmlStream,client] [Malware Sample] SHA-1: 920b67cb250abdb593b1104a9922e2468b0fe252
2010-08-11 10:44:21+0200 [XmlStream,client] [Malware Sample] PEHash: 40891becb5ec8780f1c5e51f3971c9fb2cc17dab
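
Log lines like the ones above are easy to turn into structured records. The following is only a minimal sketch, not TIP's actual code: the regular expression is written against the three lines shown here, and grouping hashes by timestamp is an assumption made for illustration (two samples logged in the same second would be merged).

```python
import re

# Matches "<date> <time> [<source>] [Malware Sample] <HashName>: <hex digest>".
# The layout is inferred from the three log lines above, nothing more.
SAMPLE_LINE = re.compile(
    r"^(?P<timestamp>\S+ \S+) \[[^\]]+\] \[Malware Sample\] "
    r"(?P<kind>[\w-]+): (?P<digest>[0-9a-f]+)$"
)


def parse_sample_log(lines):
    """Collect the hashes of each sample, keyed by timestamp (an assumption)."""
    samples = {}
    for line in lines:
        match = SAMPLE_LINE.match(line.strip())
        if match is None:
            continue  # not a malware-sample line, skip it
        entry = samples.setdefault(match["timestamp"], {})
        entry[match["kind"]] = match["digest"]
    return samples


LOG = """\
2010-08-11 10:44:21+0200 [XmlStream,client] [Malware Sample] MD5: e4736922939a028384522b17e9406474
2010-08-11 10:44:21+0200 [XmlStream,client] [Malware Sample] SHA-1: 920b67cb250abdb593b1104a9922e2468b0fe252
2010-08-11 10:44:21+0200 [XmlStream,client] [Malware Sample] PEHash: 40891becb5ec8780f1c5e51f3971c9fb2cc17dab
"""

if __name__ == "__main__":
    for timestamp, hashes in parse_sample_log(LOG.splitlines()).items():
        print(timestamp, hashes)
```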

Another great step forward was taken. Stay tuned!

Today I felt like having some fun, so I started adding a new API call that checks whether a domain is malicious. The check never hits the database; it only queries the search index. The results are quite surprising. Take a look, keeping in mind that code 409 means ‘Object already exists’ while code 410 means ‘Object does not exist’.

Let’s start with a benign domain not tracked by TIP.

buffer@alnitak ~ $ time wget http://xxxx.xxxx.xx/api/check/domain/test@@it/
HTTP request sent, awaiting response… 410 GONE
2010-07-22 17:46:58 ERROR 410: GONE.

real    0m0.017s
user    0m0.001s
sys    0m0.001s

Now let’s move to a malicious domain tracked by TIP.

buffer@alnitak ~ $ time wget http://xxxx.xxxx.xx/api/check/domain/hazelpay@@ru/
HTTP request sent, awaiting response… 409 CONFLICT
2010-07-22 17:47:07 ERROR 409: CONFLICT.

real    0m0.022s
user    0m0.000s
sys    0m0.002s
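
A client for this call only has to look at the HTTP status code. The sketch below is hedged: it assumes the `@@` in the URLs above stands in for the dot of the domain name, and the helper names (`domain_to_path`, `build_check_url`, `check_domain`) are hypothetical, not part of the TIP API.

```python
import urllib.error
import urllib.request

# Status codes as described in the post: 409 = object exists, 410 = it does not.
STATUS_MEANING = {409: "tracked", 410: "not tracked"}


def domain_to_path(domain):
    """Encode a domain for the URL (assumption: '.' is written as '@@')."""
    return domain.replace(".", "@@")


def build_check_url(base_url, domain):
    """Build the check URL following the pattern of the wget calls above."""
    return "%s/api/check/domain/%s/" % (base_url.rstrip("/"), domain_to_path(domain))


def check_domain(base_url, domain, timeout=5.0):
    """Return 'tracked', 'not tracked', or a note about an unexpected status."""
    url = build_check_url(base_url, domain)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            code = response.status
    except urllib.error.HTTPError as err:
        code = err.code  # 409 and 410 are both reported as HTTPError
    return STATUS_MEANING.get(code, "unexpected status %d" % code)
```

With a reachable server, `check_domain("http://xxxx.xxxx.xx", "hazelpay.ru")` would be expected to return `"tracked"` for the second example above.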

It’s been a long time since I last posted about TIP. The good news is that TIP is growing really fast, mainly thanks to its modular design, which allows different kinds of tracking modules to be plugged in with minimal effort. In this post I’ll give a brief overview of the features already integrated and the upcoming ones.

First of all, a new TIP Collector module named Malware was integrated. It currently handles data coming from GLSandbox, a sandbox for automated malware analysis written by Guido Landi. Beyond analyzing the behavior of malware samples, the idea is to collect additional data produced by such analysis. One example is C&C identification, which can be handed automatically to a botnet monitoring tool for further analysis. Another is information about domains, which could lead to the identification of new fast-flux domains. The GLSandbox code is currently not public, but there are plans to release it in the near future.

A search engine was integrated into TIP in the latest version! The idea is to index the database so that it can be searched efficiently. Haystack was used to implement it. The first tests were done using Apache Solr (deployed as an Apache Tomcat application) as the backend and confirm that it works like a charm!

A new REST API was designed and implemented to make it easier to search and share data with other users and/or applications. The API was built using Django-Piston and supports OAuth authentication.

Moreover, the latest version of TIP supports Django 1.2 and drops support for previous versions (due to some incompatible changes between versions 1.1 and 1.2). It also introduces support for migrations using South, which makes it easier to change the database schema during development.

A lot of new cool features, a lot of upcoming cool ones! Stay tuned!

It’s been a long time since I last wrote about TIP and its evolution. A lot has changed over the last few months to make TIP more efficient and scalable, so it’s worth talking about! First of all, TIP now makes the most of the Twisted Plugin System. As shown below, the Tracking Intelligence Project services are now Twisted commands implemented through the plugin system.

buffer@alnitak ~/tipproject/tip/core $ twistd --help
Usage: twistd [options]
Options:
      --savestats        save the Stats object rather than the text output of the profiler.
  -o, --no_save          do not save state on shutdown
  -e, --encrypted        The specified tap/aos/xml file is encrypted.
      --nothotshot       DEPRECATED. Don't use the hotshot profiler even if it's available.
  -n, --nodaemon         don't daemonize, don't use default umask of 0077
  -q, --quiet            No-op for backwards compatibility.
      --originalname     Don't try to change the process name
      --syslog           Log to syslog, not to file
      --euid             Set only effective user-id rather than real user-id.
  -l, --logfile=         log to a specified file, - for stdout
  -p, --profile=         Run in profile mode, dumping results to specified file
      --profiler=        Name of the profiler to use (profile, cprofile, hotshot). [default: hotshot]
  -f, --file=            read the given .tap file [default: twistd.tap]
  -y, --python=          read an application from within a Python file (implies -o)
  -x, --xml=             Read an application from a .tax file (Marmalade format).
  -s, --source=          Read an application from a .tas file (AOT format).
  -d, --rundir=          Change to a supplied directory before running [default: .]
      --report-profile=  DEPRECATED.

                         Manage --report-profile option, which does nothing currently.

      --prefix=          use the given prefix when syslogging [default: twisted]
      --pidfile=         Name of the pidfile [default: twistd.pid]
      --chroot=          Chroot to a supplied directory before running
  -u, --uid=             The uid to run as.
  -g, --gid=             The gid to run as.
      --umask=           The (octal) file creation mask to apply.
      --help-reactors    Display a list of possibly available reactor names.
      --version          Print version information and exit.
      --spew             Print an insanely verbose log of everything that happens. Useful when debugging freezes or locks in complex code.
  -b, --debug            run the application in the Python Debugger (implies nodaemon), sending SIGUSR2 will drop into debugger
  -r, --reactor=         Which reactor to use (see --help-reactors for a list of possibilities)
      --help             Display this help and exit.
Commands:
    tip-fastflux         Tracking Intelligence Project Fast-Flux Tracking service.
    tip-collector        Tracking Intelligence Project Collector service.
    ftp                  An FTP server.
    telnet               A simple, telnet-based remote debugging service.
    socks                A SOCKSv4 proxy service.
    manhole-old          An interactive remote debugger service.
    portforward          A simple port-forwarder.
    web                  A general-purpose web server which can serve from a filesystem or application resource.
    inetd                An inetd(8) replacement.
    xmpp-router          An XMPP Router server
    words                A modern words server
    toc                  An AIM TOC service.
    dns                  A domain name server.

This is really useful, since it allows running just the modules needed and fine-tuning their behaviour, as shown below.

buffer@alnitak ~/tipproject/tip/core $ twistd tip-collector --help
Usage: twistd [options] tip-collector [options]
Options:
  -o, --one-shot              Run the collector just one time
  -c, --concurrency-level=    Set maximum concurrency level [default: 1]
  -s, --reschedule-interval=  Set collector restart interval [default: 21600]
      --version
      --help                  Display this help and exit.

buffer@alnitak ~/tipproject/tip/core $ twistd tip-fastflux --help
Usage: twistd [options] tip-fastflux [options]
Options:
  -w, --whitelist-force-refresh  Force white-list domain refreshing at every commit
  -s, --hot-restart=             Set hot tracking process restart interval [default: 14400]
  -t, --cold-restart=            Set cold tracking process restart interval [default: 7200]
  -m, --hot-schedule=            Set hot tracking scheduling interdomain delay [default: 0.1]
  -n, --cold-schedule=           Set cold tracking scheduling interdomain delay [default: 0.2]
  -k, --cold-delay=              Set cold tracking first-start delay [default: 300]
      --version
      --help                     Display this help and exit.

Moreover, I’m quite satisfied with the design of the Fast-Flux Tracking module, which is explained in the commit log reported below.

commit 9ebf0d1b8ac73997f35d70435bdd3da52da6cd5d
Author: Angelo Dell’Aera <buffer@antifork.org>
Date:   Tue Aug 4 10:04:52 2009 +0200

Fast-Flux Tracking Module Domain Queues

. The Fast-Flux Tracking Module was modified to allow two concurrent domain queues. The first queue is used only for domains already known to be fluxy. This is the most I/O-intensive queue, since it requires the most frequent database operations to store the collected data. These blocking operations are performed through a thread pool, and tests on the previous version of the module showed that they have a detrimental impact on the domain scheduling process, slowing it down too much. So a second queue was added, used for domains not yet classified as fluxy. The idea is to minimize blocking operations: if a domain is not fluxy, there are no blocking operations at all. If a domain is fluxy, the collected data are saved and the tracking path ends, so that when the first queue restarts it takes care of the new domain. It’s worth noting that this approach allows very frequent restarts of both queues with no destructive interference between them and with very low memory consumption.
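
The two-queue idea from the commit log can be sketched in a few lines. This is an illustration under stated assumptions, not the module's real code: the fluxiness test (`is_fluxy` with its two thresholds), the `TwoQueueTracker` class and the `resolve`/`store` callables are all hypothetical, and keeping non-fluxy domains in the cold queue for the next pass is a guess at the behaviour.

```python
from collections import deque

# Hypothetical thresholds; the real classification criteria are not given.
MIN_DISTINCT_IPS = 5
MIN_DISTINCT_ASNS = 2


def is_fluxy(records):
    """Classify a set of (ip, asn, country) tuples with a toy heuristic."""
    ips = {ip for ip, _asn, _cc in records}
    asns = {asn for _ip, asn, _cc in records}
    return len(ips) >= MIN_DISTINCT_IPS and len(asns) >= MIN_DISTINCT_ASNS


class TwoQueueTracker:
    def __init__(self, resolve, store):
        self.resolve = resolve  # domain -> set of (ip, asn, country) tuples
        self.store = store      # (domain, records) -> None, the blocking write
        self.hot = deque()      # domains already known to be fluxy
        self.cold = deque()     # domains not yet classified as fluxy
        self.promoted = []      # fluxy domains waiting for the next hot restart

    def add(self, domain):
        self.cold.append(domain)

    def run_cold_pass(self):
        """One pass over the cold queue; only fluxy domains touch the store."""
        for _ in range(len(self.cold)):
            domain = self.cold.popleft()
            records = self.resolve(domain)
            if is_fluxy(records):
                self.store(domain, records)   # the only blocking operation
                self.promoted.append(domain)  # cold tracking path ends here
            else:
                self.cold.append(domain)      # no blocking operation at all

    def restart_hot(self):
        """A hot-queue restart picks up the newly classified domains."""
        self.hot.extend(self.promoted)
        self.promoted.clear()

    def run_hot_pass(self):
        """Fluxy domains are resolved and stored on every pass."""
        for domain in self.hot:
            self.store(domain, self.resolve(domain))
```

In the real module the store call would go through the thread pool; here it is a plain callable so that the queue logic can be read in isolation.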

A prerelease is coming. Stay tuned!

A new spamtrap submodule is currently under development. Its targets are spamtraps located on mail servers that I administer. A few of these mail servers generate huge amounts of spam mail, and this leads to serious performance problems if you try to download the messages via POP3/IMAP and then parse them. A different approach was devised for situations like these: I developed a small agent that runs on the mail server host. This agent loops over the spam files in the maildir and parses them without any network-based data transfer. When it is done, it saves the interesting data in serialized form on the filesystem (through the Python cPickle module) and assigns a version number to this data. This allows a remote agent to ask for the latest version and download just the versions it is missing. The submodule was developed using Twisted Perspective Broker, serializing the saved data directly on the wire, and it currently defines a basic authentication mechanism too. While developing this submodule, I thought it could be nice to use it for sharing data from multiple spamtraps between researchers. Suggestions are welcome!
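
The versioning scheme can be illustrated without any networking. The sketch below is an assumption-laden outline: the `SnapshotStore` class, the file naming and the `sync` helper are hypothetical, it uses Python 3's `pickle` in place of `cPickle`, and it leaves out the Perspective Broker transport and authentication entirely.

```python
import os
import pickle
import tempfile

PREFIX, SUFFIX = "spam-", ".pickle"  # hypothetical file naming


class SnapshotStore:
    """Versioned, pickled snapshots of parsed spam data in one directory."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, version):
        return os.path.join(self.directory, "%s%06d%s" % (PREFIX, version, SUFFIX))

    def versions(self):
        found = []
        for name in os.listdir(self.directory):
            if name.startswith(PREFIX) and name.endswith(SUFFIX):
                found.append(int(name[len(PREFIX):-len(SUFFIX)]))
        return sorted(found)

    def last_version(self):
        versions = self.versions()
        return versions[-1] if versions else 0

    def save(self, data):
        """Serialize data under the next version number and return it."""
        version = self.last_version() + 1
        with open(self._path(version), "wb") as handle:
            pickle.dump(data, handle, protocol=pickle.HIGHEST_PROTOCOL)
        return version

    def load(self, version):
        with open(self._path(version), "rb") as handle:
            return pickle.load(handle)

    def missing_since(self, known_version):
        """Versions a remote agent still has to download."""
        return [v for v in self.versions() if v > known_version]


def sync(store, known_version):
    """What a remote agent would fetch: only the versions it is missing."""
    return {v: store.load(v) for v in store.missing_since(known_version)}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        store = SnapshotStore(directory)
        store.save({"urls": ["http://spam.example/a"]})
        store.save({"urls": ["http://spam.example/b"]})
        print(sync(store, known_version=1))
```

Since unpickling executes whatever the data describes, a scheme like this should only ever exchange pickles between hosts that trust each other.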

A few days ago I started thinking about the scalability limits of the TIP Fast-Flux Tracking module and realized its design was really awful. The approach was based on assigning a monitoring thread to each fluxy domain. This works well when the number of threads is quite small, but not at the scale I was beginning to face. First of all, as the number of threads grows, performance degrades because of the Python Global Interpreter Lock, which limits the concurrency of a single interpreter process with multiple threads (and running the process on a multiprocessor system brings no improvement). Moreover, it’s really hard to guarantee each thread enough stack space to run without raising segmentation faults. For these reasons I decided to rewrite the module from scratch, and I’m currently testing it. The new design is simple, effective and scalable, and I have to thank Jose Nazario, Marcello Barnaba and Orlando Bassotto for the really interesting talks we had about this matter. Just one process and no monitoring threads. The code is written so as to avoid blocking calls, which makes the module truly asynchronous. However, once a domain starts being monitored, the backend database must be accessed, and that requires blocking calls. When this happens, the blocking calls are delegated to the Twisted thread pool along with a cloned copy of the collected data, so that code scalability is not compromised by unnecessary locks. Moreover, the module is now becoming a Twisted Application of its own, and the first tests using the Twisted Epoll Reactor are very encouraging. Stay tuned!
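
The "clone the data, then hand the blocking call to a thread pool" pattern is shown below with the standard library's `asyncio` swapped in for Twisted, so that the example runs without extra dependencies. Everything in it (`store_blocking`, `monitor`, the placeholder addresses) is hypothetical; in Twisted the hand-off would go through the reactor's thread pool instead of `run_in_executor`.

```python
import asyncio
import copy
import time


def store_blocking(snapshot, database):
    """Stand-in for a blocking database write; runs in a worker thread."""
    time.sleep(0.01)
    database.append(snapshot)


async def monitor(domain, database, rounds=3):
    """Track one domain on the event loop, delegating only the writes."""
    loop = asyncio.get_running_loop()
    collected = {"domain": domain, "ips": []}
    pending = []
    for i in range(rounds):
        # Placeholder for the answer of an asynchronous DNS query.
        collected["ips"].append("192.0.2.%d" % (i + 1))
        # The worker gets its own copy, so the loop can keep mutating
        # `collected` without sharing state or taking a lock.
        snapshot = copy.deepcopy(collected)
        pending.append(loop.run_in_executor(None, store_blocking, snapshot, database))
        await asyncio.sleep(0)  # yield to the other monitors
    await asyncio.gather(*pending)


async def main(domains):
    database = []  # list.append is atomic in CPython, enough for this sketch
    await asyncio.gather(*(monitor(domain, database) for domain in domains))
    return database


if __name__ == "__main__":
    for row in asyncio.run(main(["a.example", "b.example"])):
        print(row)
```

Each stored snapshot reflects the data at the moment of the hand-off, which is the point of the copy: later updates on the event loop never reach a write that is already in flight.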

Over the last few days, the inner workings of TIP have changed a lot. As soon as I plugged in the new Spamtrap module, I realized that the core engine was far from perfect. In particular, it was designed when I had no precise idea of the workload it would have to face, and this forced me to rethink it from scratch. First of all, the new implementation is based on the Twisted Application Framework. Using this infrastructure freed me from having to write a large amount of boilerplate code, by hooking the application into existing tools that manage daemonization, logging, choosing a reactor and more. Moreover, TIP is moving towards a component-based architecture, using the interfaces and adapters created by the Zope3 team to develop the submodules. The current implementation scales much better than the previous one because every time a module is scheduled, it runs inside its own subprocess controlled by the twistd master process. This design avoids any kind of memory leak issue, which is exactly why I moved to a new scheduler design. Each subprocess is independent of the others, and the main job of the master process is to synchronize the subprocesses and free resources when they complete their tasks. Another change worth mentioning concerns the Fast Flux tracking module, which is now handled as a two-pass subprocess so that resources are freed as soon as it completes the domain fluxiness classification. The first tests are running right now. Stay tuned!
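
The benefit of one subprocess per scheduled module is that the operating system reclaims all of its memory when it exits. Here is a bare-bones sketch of that idea using the standard `subprocess` module rather than twistd; the `MODULES` table and its one-line "modules" are placeholders invented for the example.

```python
import subprocess
import sys

# Placeholder "modules": each one is just a tiny Python program.
MODULES = {
    "collector": "print('collector done')",
    "fastflux": "print('fastflux done')",
}


def run_all(names, timeout=60):
    """Start one child interpreter per module, then wait for all of them."""
    children = {
        name: subprocess.Popen(
            [sys.executable, "-c", MODULES[name]],
            stdout=subprocess.PIPE,
            text=True,
        )
        for name in names
    }
    results = {}
    for name, child in children.items():
        output, _ = child.communicate(timeout=timeout)
        # Once the child has exited, everything it allocated is gone.
        results[name] = (child.returncode, output.strip())
    return results


if __name__ == "__main__":
    print(run_all(["collector", "fastflux"]))
```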

I spent the last few days working on a subtle bug in TIP that prevented the engine from being rescheduled correctly and, as a result, the information sources from being updated correctly. The bug is gone now, but I’m realizing how hard it is to always work close to the limits of the operating system and the database management system. Still, it’s a nice challenge to face every day, so I don’t think I’ll stop having fun for a while! While going crazy trying to find the bug, I introduced an interesting new feature that lets you discover the virtual domains associated with an IP address through a SOAP request to Windows Live Search. I think this feature could be quite useful at the company I work for, to handle security incidents more easily. I also spent a good amount of time creating a comfortable Web 2.0 interface for daily work. I’m not that good at Ajax and similar matters, but I’m quite satisfied with the result. Take a look at it!

screenshot

Today I came back from my Christmas holidays with the precise idea of rewriting the Fast Flux Tracking module from scratch. Over the last few days I had observed strange behavior when the number of domains to monitor exceeded a few thousand. A deep investigation of the code revealed the sad truth. When using the monitoring threads, I forgot to clean up an object related to asynchronous DNS requests at thread exit. This led to a great number of unused socket descriptors lying around, causing the process to quickly hit the operating system limit. Three lines of code were added, and everything now works fine with about 24000 domains monitored. I also think a few improvements to the module are on the way. Stay tuned!
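
The general shape of such a fix is to release the descriptor on every exit path of the thread. The snippet below is a generic illustration, not the actual three lines: `Resolver` is a hypothetical stand-in for the DNS object, and all it does is own a UDP socket.

```python
import socket
import threading


class Resolver:
    """Hypothetical stand-in for the asynchronous DNS object."""

    def __init__(self):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def close(self):
        self.sock.close()


def monitor(domain, fds_after_exit):
    resolver = Resolver()
    try:
        pass  # the query loop for `domain` would live here
    finally:
        # Without this, every finished thread leaves one descriptor open.
        resolver.close()
        fds_after_exit.append(resolver.sock.fileno())  # -1 once closed


def run_monitors(domains):
    fds_after_exit = []
    threads = [
        threading.Thread(target=monitor, args=(domain, fds_after_exit))
        for domain in domains
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return fds_after_exit
```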

Eppur si muove!

TIP (Tracking Intelligence Project) is taking its first steps. In my most beautiful dreams, TIP should be an information gathering framework whose purpose is to autonomously collect Internet threat trends. Currently, TIP closely monitors information derived from a few publicly available blacklists, thus identifying malicious domains and networks. To reach its goal, the TIP core engine was designed to be totally asynchronous in order to handle common situations where a few thousand running monitoring threads are needed. It’s a nice challenge, but something is moving. Have a look at this Fast-Flux Network that TIP is tracking right now (some information is omitted for obvious reasons).

Stay tuned!

Current Datetime:  2008-12-19 12:01:14.890779
Domain: XXXXXX.XX
set([('24.99.40.14', '7922', 'US'), ('24.170.188.201', '13343', 'US'), ('65.78.225.126', '15227', 'US'), ('70.249.156.136', '7132', 'US'), ('12.74.195.185', '7018', 'US'), ('68.80.105.44', '33287', 'US'), ('69.212.242.67', '7132', 'US'), ('75.57.204.104', '7132', 'US'), ('24.196.173.208', '20115', 'US'), ('65.102.56.213', '209', 'US'), ('71.84.127.132', '20115', 'US'), ('76.188.63.80', '11060', 'US'), ('70.230.233.165', '7132', 'US'), ('75.134.56.185', '20115', 'US'), ('68.125.30.251', '7132', 'US'), ('70.235.23.96', '7132', 'US'), ('69.183.233.1', '7132', 'US'), ('24.99.40.14', '7725', 'US'), ('65.65.115.103', '7132', 'US'), ('75.75.104.133', '21508', 'US'), ('68.80.105.44', '7922', 'US'), ('76.243.206.63', '7132', 'US'), ('76.31.181.115', '33662', 'US'), ('68.112.81.129', '19115', 'US'), ('76.100.63.146', '7922', 'US'), ('98.200.194.173', '7922', 'US'), ('65.68.29.83', '7132', 'US'), ('69.214.1.18', '7132', 'US'), ('99.4.106.71', '7132', 'US'), ('76.100.166.114', '7922', 'US'), ('70.242.120.139', '7132', 'US'), ('99.147.192.180', '7132', 'US'), ('67.38.1.229', '7132', 'US'), ('24.216.181.139', '20115', 'US'), ('65.78.225.66', '15227', 'US'), ('70.154.82.100', '6389', 'US'), ('99.14.234.37', '7132', 'US'), ('99.185.120.153', '7132', 'US'), ('208.104.118.101', '14615', 'US'), ('74.138.219.230', '36727', 'US'), ('96.28.227.194', '36727', 'US'), ('76.73.237.59', '12083', 'US'), ('70.252.189.177', '7132', 'US'), ('98.209.249.15', '33668', 'US'), ('165.166.236.74', '21766', 'US'), ('75.14.2.240', '7132', 'US'), ('70.255.31.131', '7132', 'US'), ('98.196.113.58', '33662', 'US'), ('67.190.147.1', '33652', 'US'), ('69.66.237.74', '30160', 'US'), ('75.140.65.220', '20115', 'US'), ('70.245.236.32', '7132', 'US'), ('68.92.101.61', '7132', 'US'), ('68.202.88.12', '13343', 'US'), ('64.205.9.114', '4565', 'US'), ('68.249.101.241', '7132', 'US'), ('12.74.196.251', '7018', 'US'), ('76.31.181.115', '7922', 'US'), ('76.100.166.114', '33657', 'US'), ('75.75.104.133', '7922', 'US'),
('98.196.113.58', '7922', 'US'), ('66.168.247.70', '20115', 'US'), ('76.31.18.86', '33662', 'US'), ('173.17.180.79', '6478', 'US'), ('68.88.237.35', '7132', 'US'), ('24.165.123.218', '12262', 'US'), ('66.40.18.206', '11388', 'US'), ('75.57.76.156', '7132', 'US'), ('68.46.94.202', '33287', 'US'), ('67.10.192.229', '11427', 'US'), ('72.81.245.3', '19262', 'US'), ('97.102.118.61', '10994', 'US'), ('66.61.12.107', '11060', 'US'), ('72.29.41.120', '7018', 'US'), ('70.238.63.194', '7132', 'US'), ('99.140.238.111', '7132', 'US'), ('12.174.145.169', '7018', 'US'), ('173.16.99.131', '6478', 'US'), ('68.58.0.197', '33491', 'US'), ('68.120.80.194', '7132', 'US'), ('98.140.114.227', '16810', 'US'), ('72.48.182.104', '7459', 'US'), ('70.143.32.104', '7132', 'US'), ('76.124.170.244', '7922', 'US'), ('24.10.74.199', '33651', 'US'), ('76.123.76.113', '20214', 'US'), ('76.217.109.205', '7132', 'US'), ('76.114.200.211', '33657', 'US'), ('68.114.165.229', '20115', 'US'), ('151.118.181.151', '3909', 'US'), ('98.200.194.173', '33662', 'US'), ('98.21.234.37', '7029', 'US'), ('24.151.161.136', '20115', 'US'), ('64.179.154.169', '20412', 'US'), ('99.149.194.36', '7132', 'US'), ('76.243.199.248', '7132', 'US'), ('76.27.140.172', '7725', 'US'), ('99.150.11.135', '7132', 'US'), ('64.91.14.27', '5668', 'US'), ('165.166.236.74', '2711', 'US'), ('69.14.27.151', '29737', 'US'), ('68.251.37.64', '7132', 'US'), ('68.121.22.131', '7132', 'US'), ('68.122.57.79', '7132', 'US'), ('70.242.25.29', '7132', 'US'), ('76.124.170.244', '33287', 'US'), ('69.176.46.57', '3801', 'US'), ('205.209.232.253', '13693', 'US'), ('99.139.206.54', '7132', 'US'), ('68.117.155.101', '20115', 'US'), ('98.209.249.15', '7922', 'US'), ('76.252.105.50', '7132', 'US'), ('67.197.98.249', '14615', 'US'), ('76.31.18.86', '7922', 'US'), ('76.100.63.146', '33657', 'US')])
