When Data Become Radar:
Tracing Spammers and Phishers Through the
Abuse of the Internet Infrastructure
Klaus Steding-Jessen
CERT.br / NIC.br / CGI.br
[email protected]
Wagner Meira Jr.
e-Speed / DCC / UFMG
[email protected]
APWG CeCOS IV, São Paulo, Brazil – May 11–13, 2010 – p. 1/22
Agenda
SpamPots Project Objectives
Architecture Overview
Mining Spam Campaigns
Ongoing Work
Monitoring Phishing and Fraud Abuse
References
SpamPots Project Objectives
Better understand the abuse of the Internet infrastructure by
spammers
• Measure the problem from a different point of view: abuse of
infrastructure vs. spam received at the destination
• Help develop the spam characterization research
• Measure the abuse of end-user machines to send spam
• Provide data to trusted parties
– help the constituency to identify infected machines
– identify malware and scams targeting their constituency
• Use the spam collected to improve antispam filters
• Develop better ways to
– identify phishing and malware
– identify botnets via the abuse of open proxies and relays
• Sensors at: AU, AT, BR, CL, NL, TW, US and UY
Architecture Overview
• Spammers, bots, malware, etc.
• Honeypots emulating open proxies and open relays
• Data Collection (with storage):
– collects all data periodically
– checks honeypots status
• Data Analysis (with storage):
– data mining process
– generates analysis based on spam content
• Data Warehouse
• Members Portal:
– statistics
– global distribution of spam campaigns
– sample e-mails, URLs, etc.
Case Study
• An IP address in Nigeria
• abuses a SOCKS proxy in Brazil
• to connect to an ISP in Germany
• authenticating with a stolen credential
• to send a phishing message to .uk victims
• with a link to a phony Egg bank site
• using a South African domain
• hosted at an IP address allocated to the “UK’s largest
web hosting company based in Gloucester”
Case Study (cont.)
From: "Egg Bank Plc"<[email protected]>
Subject: Online Banking Secure Message Alert!
Date: Mon, 19 Apr 2010 14:46:29 +0100
X-SMTP-Proto: ESMTPA
X-Ehlo: user
X-Mail-From: [email protected]
X-Rcpt-To: <victim1>@yahoo.co.uk
X-Rcpt-To: <victim2>@yahoo.com
X-Rcpt-To: <victim3>@yahoo.co.uk
X-Rcpt-To: <victim4>@hotmail.co.uk
(...)
X-Rcpt-To: <victimN>@aol.com
Case Study (cont.)
X-Sensor-Dstport: 1080
X-Src-Proto: SOCKS 5
X-Src-IP: 41.155.50.138
X-Src-Hostname: dial-pool50.lg.starcomms.net
X-Src-ASN: 33776
X-Src-OS: unknown
X-Src-RIR: afrinic
X-Src-CC: NG
X-Src-Dnsbl: zen=PBL (Spamhaus)
X-Dst-IP: 195.4.92.9
X-Dst-Hostname: virtual0.mx.freenet.de
X-Dst-ASN: 5430
X-Dst-Dstport: 25
X-Dst-RIR: ripencc
X-Dst-CC: DE
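The sensor annotations above can be read mechanically. The following is a minimal sketch, assuming the X-* fields are stored as ordinary RFC 822 style headers; the function name `summarize_hop` and the "cross_border" flag are illustrative and not part of the SpamPots tooling.

```python
from email import message_from_string

# Header block abridged from the case-study slides. The X-* field names are
# the ones shown there; everything else in this sketch is illustrative.
RAW = """\
X-Sensor-Dstport: 1080
X-Src-Proto: SOCKS 5
X-Src-IP: 41.155.50.138
X-Src-ASN: 33776
X-Src-CC: NG
X-Dst-IP: 195.4.92.9
X-Dst-Hostname: virtual0.mx.freenet.de
X-Dst-ASN: 5430
X-Dst-Dstport: 25
X-Dst-CC: DE

"""


def summarize_hop(raw: str) -> dict:
    """Extract source and destination facts from the sensor's X-* headers."""
    msg = message_from_string(raw)
    return {
        "proxy_port": int(msg["X-Sensor-Dstport"]),
        "proxy_proto": msg["X-Src-Proto"],
        "src": (msg["X-Src-IP"], msg["X-Src-CC"]),
        "dst": (msg["X-Dst-IP"], msg["X-Dst-CC"], int(msg["X-Dst-Dstport"])),
        # Hypothetical flag: abuser and destination server are in different countries.
        "cross_border": msg["X-Src-CC"] != msg["X-Dst-CC"],
    }


summary = summarize_hop(RAW)
print(summary)
```

Read this way, the record shows a Nigerian source reaching a German mail server on port 25 through the honeypot's SOCKS port 1080, which matches the first steps of the case study.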
Case Study (cont.)
<table width="561">
<tbody><tr><td><br><font face="Arial" size="2">
You have 1 new Security Message Alert!
<br><br>
Log In into your account to review the new credit limit
terms and conditions..<br>
</font><p><font face="Arial" size="2"><br><font face="Arial">
</font></font><font face="Arial"><a rel="nofollow" target="_blank"
href="http://www.mosaic.org.za/images/index.html">
Click here to Log In</a></font></p>
<font face="Arial">
</font><font face="Arial" size="2">
</font><p><font face="Arial" size="2"><br><br>
Egg bank Online Service<br> </font></p>
<font face="Arial" size="2">
</font><hr>
<font face="Arial" size="2">
<font color="999999" size="1"> Egg bank Security
Department</font></font></td></tr></tbody></table>
Mining Spam Campaigns
Motivation
• Spampots collect a huge volume of spam (2 million
messages/day)
• How to make sense of all this data?
– Data Mining!
– Cluster spam messages into Spam Campaigns to
isolate the traffic associated with each spammer
– Correlate spam campaign attributes to unveil different
spamming strategies
The Pattern Tree Approach
• Features are extracted from spam messages
(subject, URLs, layout etc)
• We organize them hierarchically, inserting the more
frequent features at the top levels of the tree
• Campaigns are delimited by sequences of invariant features
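The idea above can be illustrated with a small frequency-ordered tree. This is a minimal sketch loosely modelled on an FP-tree, not the authors' implementation (see the CEAS '08 paper in the references for that); the names `Node`, `build_pattern_tree`, `invariant_prefix` and the `min_share` threshold are assumptions made for the example, and the messages are invented.

```python
from collections import Counter


class Node:
    def __init__(self):
        self.count = 0      # number of messages passing through this node
        self.children = {}  # feature -> Node


def build_pattern_tree(messages):
    """messages: list of sets of (feature_type, value) pairs."""
    freq = Counter(f for m in messages for f in m)
    root = Node()
    for m in messages:
        node = root
        node.count += 1
        # Most frequent features first; ties broken by the feature itself.
        for feat in sorted(m, key=lambda f: (-freq[f], f)):
            node = node.children.setdefault(feat, Node())
            node.count += 1
    return root


def invariant_prefix(root, min_share=0.5):
    """Follow the dominant child while it covers at least min_share of its parent."""
    path, node = [], root
    while node.children:
        feat, child = max(node.children.items(), key=lambda kv: kv[1].count)
        if child.count < min_share * node.count:
            break  # features start to vary here (e.g. obfuscated URLs)
        path.append(feat)
        node = child
    return path


# Invented toy messages: same layout and subject, a different URL each time.
messages = [
    {("layout", "L1"), ("subject", "Cheap meds"), ("url", "http://a.example/x1")},
    {("layout", "L1"), ("subject", "Cheap meds"), ("url", "http://a.example/x2")},
    {("layout", "L1"), ("subject", "Cheap meds"), ("url", "http://a.example/x3")},
]
tree = build_pattern_tree(messages)
prefix = invariant_prefix(tree)
print(prefix)
```

In this toy run the layout and subject form the invariant prefix, and the tree fans out into three children at the URL level, which is where the obfuscation shows up.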
Data reduction
1. The Pattern Tree grouped 350M spam messages into
60K spam campaigns;
2. Obfuscation patterns are naturally discovered!
3. Automatically deals with new and unknown campaign
obfuscation techniques
Some Findings
Correlating campaign language, source and target
unveils spamming strategies, e.g.:
1. Campaign Source=BR ⇒ Campaign
Language=Chinese, Campaign Target=yahoo.com.tw
(confidence=87%)
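The confidence of a rule A ⇒ B is the fraction of campaigns matching A that also match B. A minimal sketch of that computation follows; the campaign records are invented for illustration and do not reproduce the 87% figure, and the helper name `confidence` is an assumption.

```python
def confidence(records, antecedent, consequent):
    """confidence(A => B) = |records matching A and B| / |records matching A|."""
    def matches(rec, cond):
        return all(rec.get(k) == v for k, v in cond.items())

    with_a = [r for r in records if matches(r, antecedent)]
    if not with_a:
        return None  # the rule is undefined when the antecedent never occurs
    with_ab = [r for r in with_a if matches(r, consequent)]
    return len(with_ab) / len(with_a)


# Invented toy campaigns, one dict of attributes per campaign.
campaigns = [
    {"source": "BR", "language": "Chinese", "target": "yahoo.com.tw"},
    {"source": "BR", "language": "Chinese", "target": "yahoo.com.tw"},
    {"source": "BR", "language": "Chinese", "target": "yahoo.com.tw"},
    {"source": "BR", "language": "English", "target": "example.com"},
    {"source": "US", "language": "English", "target": "example.com"},
]
rule_conf = confidence(
    campaigns,
    {"source": "BR"},
    {"language": "Chinese", "target": "yahoo.com.tw"},
)
print(rule_conf)  # 3 of the 4 BR campaigns match the consequent: 0.75
```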
Some Findings (2)
1. URLs are the most frequently obfuscated features in
spam; layout remains largely unchanged
2. 10% of spammers abuse both open proxies and open
relays in the same campaign
3. Spammers chain open proxies with open relays to
conceal their identities over the network
4. Windows machines abuse open proxies; Linux machines
abuse open relays
Mining Target Address Lists
1. Spamming IPs can be grouped according to the
overlap of their e-mail address lists
2. Complementary to Spam Campaign Analysis
3. Evolution of Spam Campaigns associated with the
same address list
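One way to group IPs by address-list overlap is to link any two IPs whose recipient sets are similar enough and take the connected components. The sketch below assumes Jaccard similarity and a 0.5 threshold; both are illustrative choices, not the measure used in the project, and the IPs and addresses are invented (documentation ranges).

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_ips(targets, threshold=0.5):
    """targets: dict ip -> set of recipient addresses. Returns sorted groups of IPs."""
    parent = {ip: ip for ip in targets}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(targets, 2):
        if jaccard(targets[a], targets[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for ip in targets:
        groups.setdefault(find(ip), []).append(ip)
    return sorted(sorted(g) for g in groups.values())


# Invented example data.
targets = {
    "192.0.2.1": {"a@example.com", "b@example.com", "c@example.com"},
    "192.0.2.2": {"a@example.com", "b@example.com", "d@example.com"},
    "198.51.100.7": {"x@example.org", "y@example.org"},
}
groups = group_ips(targets)
print(groups)
```

The pairwise comparison is quadratic in the number of IPs, which is fine for a sketch but would need indexing or sampling at the volumes mentioned earlier in the talk.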
[Figure: graph of spamming IP addresses grouped by the overlap of their target e-mail address lists; nodes are labeled with IP addresses (drawn with Pajek)]
Ongoing Work
1. combining the views provided by different
spampots
2. factorial design experiment to determine effects of
spampots’ parameters
3. investigating the connection between bots and open
proxies / open relays
Monitoring Phishing and Fraud Abuse
Comparing Brazilian vs. US Phishing
• Brazilian phishing dataset provided by the University of São
Paulo
• US phishing dataset provided by Jose Nazario (Arbor
Networks)
Table: Occurrence of phishing indicators in the Brazilian and US phishing datasets

Indicator              BR       US
# of phishings         9,475    4,576
IP-based URLs          5%       28%
Nonmatching URLs       3%       15%
URL Redirection        0.5%     5%
Malicious Attachment   9%       0.1%
Suspicious Text        89%      70%
Brazilian phishing is less sophisticated; could user education be
highly effective?
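Two of the indicators in the table can be checked directly on a message's HTML body. The sketch below is an assumption about how they might be defined: "IP-based URL" means the link host is a literal IP address, and "nonmatching URL" means the visible link text looks like a URL but names a different host than the href. The slides do not give the exact definitions used for the datasets, and the class and function names here are illustrative.

```python
import ipaddress
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def is_ip_host(url):
    """True when the URL's host is a literal IP address."""
    host = urlparse(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def indicators(html):
    parser = LinkCollector()
    parser.feed(html)
    ip_based = any(is_ip_host(h) for h, _ in parser.links)
    # Assumed definition: link text is itself a URL pointing at another host.
    nonmatching = any(
        t.lower().startswith(("http://", "https://"))
        and urlparse(t).hostname != urlparse(h).hostname
        for h, t in parser.links
    )
    return {"ip_based_url": ip_based, "nonmatching_url": nonmatching}


# Invented sample bodies using documentation addresses and domains.
suspicious = '<p><a href="http://192.0.2.10/login">http://bank.example/login</a></p>'
plain = '<a href="http://bank.example/">Click here</a>'
print(indicators(suspicious))
print(indicators(plain))
```

The other indicators in the table (redirection, attachments, suspicious text) need message-level or content-level checks and are left out of this sketch.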
Detecting phishing campaigns with spampots
1. we extracted phishing features from phishing datasets
2. incremental tree update algorithm to detect spam/phishing
campaigns in real time
[Figure: diagram with boxes labeled "Phishing Datasets" and "Phishing Features"]
References
• A Campaign-based Characterization of Spamming
Strategies. Pedro H. Calais Guerra, Douglas Pires, Dorgival
Guedes, Wagner Meira Jr., Cristine Hoepers, Klaus
Steding-Jessen (CEAS ’08)
• Spamming Chains: A New Way of Understanding
Spammer Behavior. Pedro H. Calais Guerra, Dorgival
Guedes, Wagner Meira Jr., Cristine Hoepers, Marcelo H. P. C.
Chaves, Klaus Steding-Jessen (CEAS ’09)
• Spam Miner: A Platform for Detecting and Characterizing
Spam Campaigns. Pedro H. Calais Guerra, Douglas Pires,
Marco Ribeiro, Dorgival Guedes, Wagner Meira Jr., Cristine
Hoepers, Marcelo H. P. C. Chaves, Klaus Steding-Jessen
(ACM KDD’09 demo paper)
References (cont.)
• Brazilian Internet Steering Committee – CGI.br
http://www.cgi.br/
• Computer Emergency Response Team Brazil – CERT.br
http://www.cert.br/
• Previous presentations about the project
http://www.cert.br/presentations/
• SpamPots Project white paper (in Portuguese)
http://www.cert.br/docs/whitepapers/spampots/