Archive for the ‘tips & tricks’ Category

MMVI Spam Filters

Thursday, February 14th, 2008

I never thought that I would one day involve myself with the fight against spam. In general I don’t like the idea of computers examining the content of our email and trying to decide for us whether we want to receive it or not. Computers are appallingly bad at interpreting human writing and images, especially if the data at hand was created with the specific purpose of fooling them.

The change came with the realisation that the vast majority of spam these days is sent from virus-infested home computers. I have extensive experience with these drones (as we call them) from my activities as an IRC operator, and it soon occurred to me that it must be possible to differentiate between these end-user computers and proper email servers. That way, it is possible to accept or deny email based on its source, rather than its content.

Besides allowing you to deny the email even before the actual body text is sent, this approach also allows you to do so with the sender still on the line. This is a very important benefit. If your filters decide to deny the delivery of the message, your server can tell this directly to the sending party. Since the From and Sender header lines of spam are usually faked, this is the only time you know for sure that you are talking back to the spammer. This way, if it was truly spam, the spammer may notice that you’re not buying it, but more importantly, if it wasn’t spam (a so-called false positive), then the legitimate sender is also notified, instantly. It may be bothersome for a legitimate sender if their email didn’t make it through, but it’s much worse if their message was just silently discarded, for it may take them a long while to realise that it didn’t arrive. Not sending notifications of failed delivery to the faked From address is very good practice. If you do send them, you’re just adding to the problem and run the risk of ending up on blacklists like backscatterers.org.

I’ve been developing the MMVI spam filters for 2 years now, so the ruleset has become quite complicated. But the 3 basic principles it is based on are:

  • denying mail from hosts with obvious generic hostnames or consumer-identifying tokens in their hostnames (for instance 123-123-123-123.isp.com or 294a7g2.adsl.isp.com).
  • denying mail from hosts that send a very wrong HELO/EHLO name in the SMTP transaction. Many home computers are behind a NAT device, making them unable to know their external IP address. As such they have difficulties with properly introducing themselves. Also, a lot of spamming software is a hastily hacked-together mess that tends to get things like this wrong.
  • denying mail from hosts that have no or incorrectly configured reverse DNS. These tend to be hosts on poorly configured networks, which seems to go hand in hand with being poorly secured as well. In many cases, you don’t want anything to do with these.
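
As a rough illustration of the three principles above, here is a minimal Python sketch. It is not the actual MMVI ruleset: the token list, the function names, the idea of what counts as a "very wrong" HELO name, and the reading of "incorrectly configured reverse DNS" as a failed forward-confirmed lookup are all assumptions made for the example. The DNS lookups are passed in as plain functions so the logic can be tried without touching the network.

```python
import ipaddress
import re

# Hypothetical patterns; a real ruleset would be much longer and more careful.
DOTTED_IP = re.compile(r"\d{1,3}[.-]\d{1,3}[.-]\d{1,3}[.-]\d{1,3}")
GENERIC_TOKENS = re.compile(
    r"(^|[.-])(adsl|dsl|cable|dialup|dyn|dynamic|pool|ppp|dhcp)([.-]|\d|$)"
)


def looks_generic(hostname):
    """Principle 1: hostname embeds an IP address or a consumer-line token."""
    host = hostname.lower().rstrip(".")
    return bool(DOTTED_IP.search(host) or GENERIC_TOKENS.search(host))


def helo_is_plausible(helo, client_ip, own_names):
    """Principle 2: reject HELO/EHLO names that cannot be right."""
    name = helo.strip().lower().rstrip(".")
    if not name:
        return False
    # An IPv4 address literal like [192.0.2.1] must match the connecting peer.
    # (IPv6 literals are not handled in this sketch and are rejected.)
    if name.startswith("[") and name.endswith("]"):
        try:
            return ipaddress.ip_address(name[1:-1]) == ipaddress.ip_address(client_ip)
        except ValueError:
            return False
    # A bare IP address without brackets is not a valid HELO name.
    try:
        ipaddress.ip_address(name)
        return False
    except ValueError:
        pass
    # Nobody else should introduce themselves with one of our own names.
    if name in own_names:
        return False
    # Anything without a dot (e.g. "localhost") is not a usable host name.
    return "." in name


def reverse_dns_ok(client_ip, ptr_lookup, a_lookup):
    """Principle 3: some PTR name of the IP must resolve back to that IP."""
    return any(client_ip in a_lookup(name) for name in ptr_lookup(client_ip))
```

With the two example hostnames from the list, `looks_generic("123-123-123-123.isp.com")` and `looks_generic("294a7g2.adsl.isp.com")` both return `True`, while a name such as `mail.example.org` passes. In a live filter, `ptr_lookup` and `a_lookup` would wrap real DNS queries.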

These rules are not black-and-white on MMVI; I’ve made a lot of attempts to redeem proper mailservers that have slight misconfigurations. The only thing I’ve not yet discussed is the lonely few who run their own mailserver off of their ADSL line at home. Being one of those, I do feel sympathy for them. To make it possible for them to keep doing so, each denial is sent with a specifically generated whitelist address, which a legitimate sender can use to whitelist their own hostname. Since both the rejected message and the whitelist request must be delivered by the same system (as they will be in the above scenario), this is very difficult to do for a spammer, who utilises thousands of systems but rebuilds the target list of email addresses centrally.
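
One way such a per-host whitelist address could work is sketched below in Python. How MMVI actually generates its addresses is not described here, so everything in the sketch is an assumption: the HMAC-based tag, the `SECRET` and `DOMAIN` values, the address format and the function names are all hypothetical. The point is only to show how the address can be tied to the rejected host, so that a request arriving from any other system fails.

```python
import hashlib
import hmac

SECRET = b"change-me"    # hypothetical server-side secret, never sent out
DOMAIN = "example.net"   # hypothetical domain of the filtering server


def whitelist_address(sender_host):
    """Derive an address that is only valid for this one sending host."""
    digest = hmac.new(SECRET, sender_host.lower().encode(), hashlib.sha256)
    return "whitelist-{}@{}".format(digest.hexdigest()[:16], DOMAIN)


def rejection_reply(sender_host):
    """SMTP reply given while the sender is still connected."""
    return (
        "550 5.7.1 Mail from {} refused; to be whitelisted, "
        "send a message to {}".format(sender_host, whitelist_address(sender_host))
    )


def accept_whitelist_request(recipient, connecting_host):
    """Accept only if the request comes from the host the address was made for."""
    expected = whitelist_address(connecting_host).encode()
    return hmac.compare_digest(recipient.lower().encode(), expected)
```

A home mailserver that was refused sees the address in the 550 reply and can mail it from the same machine, which makes `accept_whitelist_request` return `True`. A drone that forwards the address to a central list gains nothing, because whichever other drone uses it later connects from a different hostname.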

Admittedly, this system would not hold up if spammers were to adapt their software specifically against our defenses. At that point though, we could switch the whitelisting procedure to something that properly ensures human intervention, for instance a CAPTCHA-like mechanism. For now though, small as we are, working with cheap hardware and free software, we enjoy being 99+% spam-free.

Apache blackhole

Wednesday, December 12th, 2007

This tip helps limit the server resources (CPU and bandwidth) taken by worms, probes and misconfigurations hitting your Apache webserver. I’m assuming you already have virtual hosting and mod_rewrite enabled (most sites do).

Generally, requests to the server without a valid server hostname (like www.yourserver.com) will be answered by the topmost entry in your vhost configuration. In almost all cases, such a request comes either from a worm connecting to your IP address directly without knowing which sites you run, or from an unwitting user sent to your doorstep by someone else’s DNS misconfiguration. In both cases, serving out your ‘default’ website is pointless. You don’t want the worm to probe around your site looking for vulnerabilities, and if it really is a misdirected user, they’re not likely to be interested in your default site.

My solution is to create a default site that isn’t actually a site, but rather a short, simple message saying that the web address entered (if any) is wrong. This can simply be done by adding the following vhost entry at the top of the vhost configuration (in the case of apache2: in the file ‘default’ in /sites-available), just below “NameVirtualHost *”:

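<VirtualHost *>
  # Listed first, so it answers any request that matches no other vhost
  ServerName nohost
  # Short plain-text body sent with every 403 response
  ErrorDocument 403 "The website you requested was not found on this server"
  RewriteEngine on
  # Answer every request path with 403 Forbidden
  RewriteRule . - [F]
</VirtualHost>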