Annoying spammers with pf and spamd
Introduction
I don't like getting spam. The problem is not detecting it automatically;
that works very well with tools like
SpamAssassin and
bmf.
Even though I can automatically delete spam without reading it, the
spammers still successfully deliver their mails and get paid by
volume. I want to hurt them. They should not be able to deliver their
mails, and waste as much of their resources as possible attempting to
do so.
Tarpits
Tarpits like
spamd
are fake SMTP servers, which accept connections but don't
deliver mail. Instead, they keep the connections open and reply very
slowly. If the peer is patient enough to actually complete the SMTP
dialogue (which will take ten minutes or more), the tarpit returns
a 'temporary error' code (4xx), which indicates that the mail could
not be delivered successfully and that the sender should keep the
mail in his queue and retry later. If he does, the same procedure
repeats until, after several attempts that waste his queue space
and socket handles for several days, he gives up. The resources I have
to waste to do this are minimal.
If the sender is badly configured, an uncooperative recipient
might actually delay his entire queue handling for several
minutes each time he connects to the tarpit. And many spammers use
badly configured open relays.
Obviously, I only want known spammers to get connected to my tarpit
instead of my real MTA.
Blacklists
I can use an externally maintained list of spammers like
spamhaus to redirect senders
to the tarpit selectively. But such lists may be either too slow to
include new spamming hosts, or too aggressive for my taste. Some
blacklists will not only include single hosts, but entire networks
that contain a single spamming host, willingly hurting innocent customers
of an ISP to pressure the ISP to terminate the spammer. The blacklist
maintainers document such policies, and if I agree with them, it's my
decision to block mail from such networks by using their blacklist.
But even if I'm comfortable with blocking mail from innocent bystanders
and use the most aggressive blacklists combined, there will still be
spammers getting mails delivered to me through newly discovered open
relays. Those spam mails will of course be detected by my spam
filters, so I'd like to use these IP addresses to build my own
blacklist.
Building my own blacklist
Assume I have the following
procmail
configuration in place to detect (and file) spam:
:0fw
| /usr/local/bin/spamc
:0:
* ^X-Spam-Status: Yes
in-x-spam
Each incoming mail is piped through the spam detector. If it classifies
the mail as spam, the message gets stored in a separate file.
I could delete them instead, but I might want to check
the mails for false positives every once in a while. Once the classifier
is tuned right, there will be almost no false positives, and almost all
spam is detected. I'm reaching 99.95% accuracy here, with maybe 0.01% false
positives, which is fine for me.
Analyzing Received: headers
I'm using one additional tool,
relaydb,
to build a database of all hosts that send me mail.
This is done after the classification by the spam detector,
so I can tell the database whether the sender was sending spam or
legitimate mail.
I add the following part to my procmail configuration:
:0fw
| /usr/local/bin/spamc
:0c
* ^X-Spam-Status: Yes
| /home/dhartmei/bin/relaydb -b
:0:
* ^X-Spam-Status: Yes
in-x-spam
:0c
| /home/dhartmei/bin/relaydb -w
So, detected spam gets piped through relaydb -b (blacklist), and legitimate
mail through relaydb -w (whitelist). Note that only copies of mails get
piped through relaydb; the program never modifies or drops a mail. All it
does is build a database of hosts that sent me mail, counting spam and
legitimate mail from each one.
relaydb traverses all Received: headers in a mail from top (nearest relay) to
bottom. It only acts on valid numerical IP addresses in [] brackets, which is
the only reliable part. And it's only reliable when I trust the previous relay
in the chain, as spammers often add fake Received: headers.
So relaydb starts with the top-most relay in the header and consults its
database to see whether it is a known host, and if so, whether it sent me
legitimate mail before. If that's the case, it increases the respective
counter (spam or legitimate, as told through the -b/-w option) for that host
and continues with the next relay found in the header. If the relay is a
known spammer, traversal ends, as further headers cannot be trusted.
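The extraction step of that traversal can be sketched in a few lines of shell. This is only an illustration, not relaydb's actual parser: the function name received_ips is made up, it handles IPv4 only, and it leaves out the database lookup and the rule for ending the traversal. It reads one mail on standard input, unfolds continued header lines, and prints the bracketed address of each Received: header, nearest relay first.

```shell
#!/bin/sh
# Sketch only (hypothetical helper, not part of relaydb): print the
# bracketed IPv4 address of each Received: header, top-most relay first.
received_ips() {
    awk '
        # Emit the first [a.b.c.d] found in a completed Received: header.
        function flush_header() {
            if (tolower(hdr) ~ /^received:/ &&
                match(hdr, /\[[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\]/))
                print substr(hdr, RSTART + 1, RLENGTH - 2)
            hdr = ""
        }
        /^$/      { exit }                    # blank line: end of headers
        /^[ \t]/  { hdr = hdr " " $0; next }  # continuation of a folded header
                  { flush_header(); hdr = $0 }
        END       { flush_header() }
    '
}
```

Something like "received_ips < some-mail" would then list the candidate relays in the order the traversal visits them. Addresses that appear without [] brackets, or in the body, are ignored on purpose.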
After I run this setup for a while, relaydb has built both a blacklist
and a whitelist. One important detail is that a legitimate mail has more
weight than a spam mail. I regularly receive spam through mailing
lists. Of course, I don't consider the mailing list server a spamming
host. Yet, each spam I receive through it will increase the spam counter
for that server. Therefore, relaydb only reports hosts as blacklisted
when their spam counter is at least three times as high as the counter
for legitimate mail (and the factor can be adjusted, of course). So
a relay doesn't get blacklisted as long as it sends me legitimate mail
to compensate for spam it sends, which covers mailing list servers. But
if I get a spam from a host that never sent me anything before, that will
cause it to get blacklisted immediately (1 >= 0*3).
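The reporting rule is easy to try out by hand. The sketch below is not relaydb's code, and the input format is an assumption made for illustration: one "address spam_count ham_count" triple per line. The function name blacklisted is hypothetical, and so is the extra check that at least one spam was seen, which only matters for a host with both counters at zero.

```shell
#!/bin/sh
# Sketch of the reporting rule, not relaydb's code. Input format is
# hypothetical: one "address spam_count ham_count" triple per line.
# The optional first argument is the factor (default 3, as in the text).
blacklisted() {
    factor=${1:-3}
    awk -v factor="$factor" '
        # Listed when at least one spam was seen and spam >= factor * ham.
        $2 > 0 && $2 >= factor * $3 { print $1 }
    '
}
```

With the default factor, a host with counters 1 and 0 is listed (1 >= 0*3), one with 5 and 2 is not (5 < 6), and one with 6 and 2 is (6 >= 6).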
Completing the puzzle
Now I'm building my own blacklist, based on the evidence I've seen
myself, classified by my own spam detector configuration. The only
politics involved in someone getting blacklisted are my own; I don't
have to trust a third party to make fair decisions.
And I use this blacklist to redirect hosts to the tarpit, using pf and
some cronjobs:
$ pfctl -sn
rdr inet proto tcp from <spammers> to any port 25 -> 127.0.0.1 port 8025
$ relaydb -lb | pfctl -t spammers -T replace -f -
This requires OpenBSD 3.3 or newer.
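A cronjob for this might be wrapped as follows. This is a sketch under assumptions, not the script actually in use: the function name refresh_spammers and the RELAYDB and PFCTL variables are made up for illustration. The only addition over the plain pipe is that the table is left alone when relaydb fails or prints nothing, so a broken run cannot empty the table.

```shell
#!/bin/sh
# Hypothetical cron wrapper (a sketch, not the script from the article).
# RELAYDB and PFCTL default to the commands used in the text and may be
# overridden with full paths, since cron runs with a minimal PATH.
RELAYDB=${RELAYDB:-relaydb}
PFCTL=${PFCTL:-pfctl}

refresh_spammers() {
    tmp=$(mktemp) || return 1
    status=0
    # Only touch the table when relaydb succeeded and printed something,
    # so a failed run does not replace the table with an empty list.
    if "$RELAYDB" -lb >"$tmp" && [ -s "$tmp" ]; then
        "$PFCTL" -t spammers -T replace -f "$tmp" || status=$?
    fi
    rm -f "$tmp"
    return "$status"
}
```

A script that ends with a call to refresh_spammers would have to run as root, for instance from root's crontab at whatever interval seems reasonable. The table itself is assumed to be declared in pf.conf with a line like "table <spammers> persist" alongside the rdr rule.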
Instead of just loading the relaydb blacklist to redirect to spamd,
I could combine it with spamhaus. Or I can use the whitelist to prevent
hosts which have sent me legitimate mail before from getting redirected
to spamd due to a spamhaus listing, etc. There are many interesting
combinations.
Currently, I'm using the following script to assemble a blacklist from
relaydb and external RBLs, then add the relaydb whitelist (overriding
the external RBLs).
#!/bin/sh
# assemble blacklist from relaydb, SBL and CBL
relaydb -lb >~/spammers.tmp
cat ~/rbl/sbl/SBL.cidr | grep -v '#' | cut -f 1 >>~/spammers.tmp
cat ~/rbl/cbl/list.txt | grep -v '#' | grep -v '^:' >>~/spammers.tmp
pfctl -t spammers -Tr -f ~/spammers.tmp
# use relaydb whitelist to remove/negate entries
relaydb -lw | \
pfctl -t spammers -vvTt -f - | \
grep "^M " | grep -v "/" | \
awk '{ printf("%s\n", $2); }' | \
pfctl -t spammers -Td -f -
relaydb -lw | \
awk '{ printf("!%s\n", $1); }' | \
pfctl -t spammers -Ta -f -
spamd-setup(8) can be used to merge multiple black- and whitelists as configured in
spamd.conf(5). This also allows spamd(8) to provide a more specific error message to each sender, explaining which blacklist it matched.
And how well does it work?
I'm getting several dozen connections redirected to the tarpit per hour,
and most peers waste about ten minutes per connection, and retry
several times, for multiple days. The impact on my own resources is
minimal. Here's an example:
Aug 24 22:22:29 spamd: 213.30.181.11: connected (9)
Aug 24 22:38:08 spamd: 213.30.181.11: <s.robertson@laposte.net> -> <pf@benzedrine.ch>
Aug 24 22:52:12 spamd: 213.30.181.11: Subject: =?iso-8859-1?Q?CONGRATULATIONS!!!_YOU_HAVE_WON_THE_LOTTE
Aug 24 22:54:57 spamd: 213.30.181.11: From: "=?iso-8859-1?Q?s.robertson@laposte.net?=" <s.robertson
Aug 24 22:55:21 spamd: 213.30.181.11: To: "=?iso-8859-1?Q?s.robertson?=" <s.robertson@laposte.net>
Aug 24 22:56:52 spamd: 213.30.181.11: Body: --_=_XaM3_Bdry.1061517084.2A.222095.42.24084.52.42.1010.609
Aug 24 22:57:44 spamd: 213.30.181.11: Body: Content-Type: text/plain; charset=iso-8859-1
Aug 24 22:58:25 spamd: 213.30.181.11: Body: Content-Transfer-Encoding: quoted-printable
Aug 24 23:00:05 spamd: 213.30.181.11: Body: =0D=0A AGENCY (ACCREDITED LICENSED AGENT TO GLOBAL LOTTER
Aug 24 23:03:16 spamd: 213.30.181.11: Body: L).=0D=0A =0D=0A=0D=0A=0D=0AReference Number:9002478/347=0D
Aug 24 23:04:45 spamd: 213.30.181.11: Body: rs:214/879/551/CM44.=0D=0ASir/Madam, =0D=0AWe are pleased
Aug 24 23:06:56 spamd: 213.30.181.11: Body: f the result of the Lottery Winners International programs
Aug 24 23:09:43 spamd: 213.30.181.11: Body: th August, 2003 that your e-mail address attached to ticket
Aug 24 23:10:13 spamd: 213.30.181.11: disconnected after 2864 seconds.
This spammer got stuck for 47 minutes. Current spamd sets its socket
receive buffer size to one character, forcing the sender to send one
TCP packet for each byte of data, even if it's a non-compliant "dump
and disconnect" mailer. Of course, the spammer nearly immediately
tries to retransmit the spam. Repeatedly.
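To see how much time each peer wastes in total, the "disconnected after N seconds." lines can be added up per address. The sketch below assumes the log format exactly as shown in the excerpt above (a real syslog line may carry extra fields such as a hostname); the function name tarpit_seconds is made up for illustration.

```shell
#!/bin/sh
# Sketch: total the seconds each peer spent in the tarpit, based on the
# "disconnected after N seconds." lines in the log format shown above.
tarpit_seconds() {
    awk '
        / spamd: / && / disconnected after / {
            # Find the "spamd:" field; the peer address follows it.
            for (i = 1; i <= NF; i++)
                if ($i == "spamd:") break
            addr = $(i + 1)
            sub(/:$/, "", addr)              # strip the trailing colon
            total[addr] += $(NF - 1)         # N in "after N seconds."
        }
        END {
            for (addr in total)
                printf "%s %d\n", addr, total[addr]
        }
    ' | sort
}
```

Fed the excerpt above on standard input, this prints one line for 213.30.181.11 with a total of 2864 seconds. Over a longer log, the repeated retries of the same peer add up in the same line.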
Best of all, I regularly get spam through a mailing list and the sender
(not the mailing list server!) gets blacklisted. Then the same spammer
connects to me directly, too, as it harvested my address as well as that
of the mailing list. And it gets stuck in the tarpit. For a long time. And
many times.
Remember, I'm doing all of this not to reduce the amount of incoming
spam. That gets detected and filed very reliably, anyway. The sole purpose
is to hurt the spammers. And I'm thoroughly enjoying watching my spamd
log now, as I'm perfectly sure that each of those connections comes from
a spammer who has spammed me before.
"Spam me once, shame on you. Spam me twice, shame on me." :)
If you have questions or comments, write to
daniel@benzedrine.ch. And
all you spammers harvesting email addresses from pages like this,
please spam me. My trap is awaiting you.
Update: Since pf, spamd and relaydb have been
ported, the same setup
works on FreeBSD, see
ports
security/pf,
mail/relaydb and
mail/spamd.
History
1.9: February 3, 2015
Properly match entire 172.16/12 in address_private(), found by Alexander Latukhin