Archive for the ‘mmvi’ Category

The Two Servers

Wednesday, January 7th, 2009

After moving our email facilities to the new server back in July, we finally got around to moving the various websites we host in November. Having moved all mission-critical services away from the old machine enabled us to give it a serious update. Since it still ran the by-now unsupported Debian Sarge OS, with a 2.4 Linux kernel, this was somewhat overdue. With a fresh Debian Lenny install, a bit more RAM and some improvements to the cooling system, it’s now ready for action again. The package selection and configuration were mostly copied from the new server, so we now have, as far as software is concerned, two identical machines, which makes the management a lot easier.

The next step is configuring the old machine as a proper backup server. Using it as a secondary (fallback) email server is of course fairly easy, though it needs to be able to filter spam just as well as the primary, or else our excellent spam defense will break. Providing failover capabilities for other services is not always that straightforward. Dynamic websites need to have their databases in sync and mailboxes need to be consistent between (potentially concurrent) sessions as well. Because our servers are in separate datacentres, making this synchronisation work is a bit tricky. We’re determined to make this work, however, because we want to have failover capabilities beyond a single datacentre. Modern datacentres have very robust power and network infrastructures, but it is still not inconceivable that whole racks or even cages suffer an outage. The only way to avoid those problems and the risk of administrative mistakes is to spread your equipment across different locations and hosting companies.

Doing automatic failover between locations is difficult, but not impossible. Lacking hardware load-balancers, we need to use DNS to spread the load across machines and also to move clients away from failing servers. There will always be some delay there that cannot be avoided. Keeping the TTL low in your DNS zones helps, but values below 1 hour are not always honoured. The reverse problem, clients arriving at an inactive backup server, we intend to solve using tunnels between the servers. That way, the client can be served from the active machine running the authoritative database, without the need to notify or otherwise redirect the client.
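
To make the DNS side of this a bit more concrete, here is a minimal Python sketch of the decision involved. It is only an illustration under assumptions: the function names, the health-check map and the example addresses are hypothetical, and it says nothing about how the zone is actually updated or how the tunnels are set up.

```python
TTL_SECONDS = 3600  # values below one hour are not always honoured by resolvers


def addresses_to_publish(health):
    """Pick the addresses that should be in the published record set.

    `health` maps a server address to True if its last health check passed
    (hypothetical input; how the check is done is left open).
    """
    healthy = sorted(ip for ip, ok in health.items() if ok)
    # If every check fails, keep publishing all servers rather than nothing:
    # the monitoring itself may be what is broken.
    return healthy or sorted(health)


def worst_case_failover_delay(check_interval, ttl=TTL_SECONDS):
    """Rough upper bound, in seconds, before resolvers that honour the TTL
    stop handing out the address of a failed server."""
    return check_interval + ttl
```

With a check every 60 seconds and a one-hour TTL, a well-behaved client can still be sent to a dead server for about an hour, which is why the tunnels are needed for anything that cannot wait.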

MMVI Spam Filters

Thursday, February 14th, 2008

I never thought that I would one day involve myself in the fight against spam. In general I don’t like the idea of computers examining the content of our email and trying to decide for us whether we want to receive that email or not. Computers are appallingly bad at interpreting human writing and images, especially so if the data at hand was created with the specific purpose of fooling them.

The change came with the realisation that the vast majority of spam these days is sent from virus-infested home-computers. I have extensive experience with these drones (as we call them) from my activities as an IRC operator and it soon came to me that it must be possible to differentiate between these end-user computers and proper email-servers. That way, it is possible to accept or deny email based on its source, rather than its content.

Besides allowing you to deny the email even before the actual body text is sent, it also allows you to do so with the sender still on the line. This is a very important benefit. If your filters decide to deny the delivery of the message, your server can tell this directly to the sending party. Since the From and Sender header lines of spam are usually faked, this is the only time you know for sure that you are talking back to the spammer. This way, if it was truly spam, the spammer may notice that you’re not buying it, but more importantly, if it wasn’t spam (a so-called false-positive), then the legitimate sender is also notified, instantly. It may be bothersome for a legitimate sender if their email didn’t make it through, but it’s much worse if their message was just silently discarded, for it may take them a long while to realise that it didn’t arrive. Not sending notifications of failed delivery to the faked From-address is very good practice. If you do send them, you’re just adding to the problem and run the risk of ending up on blacklists like backscatterers.org.

I’ve been developing the MMVI spam filters for 2 years now, so the ruleset has become quite complicated. But the 3 basic principles it is based on are:

  • denying mail from hosts with obvious generic hostnames or consumer-identifying tokens in their hostnames (for instance 123-123-123-123.isp.com or 294a7g2.adsl.isp.com).
  • denying mail from hosts that send a very wrong HELO/EHLO name in the SMTP transaction. Many home-computers are behind a NAT device, making them unable to know their external IP-address. As such they have difficulties with properly introducing themselves. Also, a lot of spamming software is a hastily hacked-together mess that tends to get things like this wrong.
  • denying mail from hosts that have no or incorrectly configured reverse DNS. These tend to be hosts on poorly configured networks, which seems to go hand-in-hand with being poorly secured as well. In many cases, you don’t want anything to do with these.
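
As a rough illustration of these three principles, here is a small Python sketch. It is not the actual MMVI ruleset: the patterns, the function names and the idea of what counts as a "very wrong" HELO are all assumptions made for the example, and the DNS lookups themselves are left to the caller.

```python
import ipaddress
import re

# Hypothetical patterns, chosen only for this example.
GENERIC_TOKENS = re.compile(
    r"(^|[.-])(adsl|dsl|dialup|dial|dynamic|dyn|cable|ppp|pool|dhcp)([.-]|\d)",
    re.IGNORECASE,
)
EMBEDDED_IP = re.compile(r"(\d{1,3})[.-](\d{1,3})[.-](\d{1,3})[.-](\d{1,3})")


def looks_generic(hostname):
    """Principle 1: hostname embeds an IP address or a consumer-access token."""
    if GENERIC_TOKENS.search(hostname):
        return True
    match = EMBEDDED_IP.search(hostname)
    return bool(match) and all(int(octet) <= 255 for octet in match.groups())


def bad_helo(helo, client_ip):
    """Principle 2: HELO/EHLO name that cannot belong to a public mail server."""
    name = helo.strip().strip("[]")
    try:
        # An address literal is only plausible if it is the connecting address.
        return ipaddress.ip_address(name) != ipaddress.ip_address(client_ip)
    except ValueError:
        pass  # not an address literal, so treat it as a hostname
    return "." not in name or name.lower() in ("localhost", "localhost.localdomain")


def bad_reverse_dns(client_ip, ptr_name, forward_ips):
    """Principle 3: no PTR record, or the PTR name does not resolve back."""
    if not ptr_name:
        return True
    return client_ip not in forward_ips


def should_reject(client_ip, helo, ptr_name, forward_ips):
    """Combine the three checks into a single yes/no decision."""
    return (
        bad_reverse_dns(client_ip, ptr_name, forward_ips)
        or looks_generic(ptr_name)
        or bad_helo(helo, client_ip)
    )
```

Both example hostnames from the list, 123-123-123-123.isp.com and 294a7g2.adsl.isp.com, are flagged by `looks_generic`, while a name like mail.example.org is not. A real ruleset would weigh these signals against each other rather than reject on the first hit.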

These rules are not black-and-white on MMVI: I’ve made a lot of attempts to redeem proper mailservers that have slight misconfigurations. The only thing I’ve not yet discussed is the lonely few who run their own mailserver off of their ADSL line at home. Being one of those, I do feel sympathy for them. To make it possible for them to keep doing so, each denial is sent with a specifically generated whitelist address, which a legitimate sender can use to whitelist their own hostname. Since both the rejected message and the whitelist request must be delivered by the same system (as they will be in the above scenario), this is very difficult to do for a spammer, who utilises thousands of systems but rebuilds the target list of email-addresses centrally.
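
One way such a whitelist address could be generated is sketched below in Python. This is an assumption for illustration, not how MMVI actually does it: the secret, the domain, the address format and the function names are all made up. The point is only that the address is tied to the rejected host, so a request arriving from any other system does not match.

```python
import hashlib
import hmac

SECRET = b"change-me"    # hypothetical server-side secret
DOMAIN = "example.org"   # placeholder domain


def whitelist_address(rejected_host):
    """Derive a whitelist address that is specific to the rejected host."""
    digest = hmac.new(SECRET, rejected_host.lower().encode(), hashlib.sha256)
    return f"whitelist-{digest.hexdigest()[:16]}@{DOMAIN}"


def accepts_whitelist_request(recipient, delivering_host):
    """Accept only if the request is delivered by the host the address was made for."""
    expected = whitelist_address(delivering_host)
    return hmac.compare_digest(recipient.lower().encode(), expected.encode())
```

The address would be included in the text of the SMTP rejection. A drone elsewhere in the botnet that tries the same address is turned away, because the address derived for its own hostname is different.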

Admittedly, this system would not hold up if spammers were to adapt their software specifically against our defenses. At that point though, we could switch the whitelisting procedure to something that properly ensures human intervention, for instance a Captcha-like mechanism. For now though, small as we are, working with cheap hardware and free software, we enjoy being 99+% spam-free.

Dell PowerEdge R200

Wednesday, January 23rd, 2008

The Dell PowerEdge R200, a 1U rackserver, has been delivered. It’s wonderfully engineered. The case opens by just one hand-tightened screw. Inside is room for two 3.5″ hard drives, which can be serviced by just pulling out one pin and then taking out the HDD frame. Cooling is done by two sideways-blowing fans, taking the air through the tunnel-shaped CPU heatsink and a separate flow past the memory DIMMs. A single smaller fan takes care of the cooling for the PSU. The system comes configured with an advanced diagnostics program, available as a boot option from the BIOS. This program does take up 2.2 GB on two primary partitions of the first hard disk.

Both the hard drive and the slimline DVD/CDRW combo drive are SATA. This gave us a little trouble, since the Linux version we intended to use on it (Debian Etch, the current stable branch) does not properly support these. The installer was not able to read packages from the CD-ROM drive, nor did the hard disk appear in the partitioning menu. Trying the various related boot options we could find online did not solve the problem. In the end, we downloaded the Debian testing version (Lenny). This one does support both types of SATA drives, right out of the box. Lenny already seems rather stable and also offers the benefit of the most modern software versions, so we’ve decided to go ahead with it. With luck, by the time we’re ready to go live with the machine, Lenny may have reached stable.

Second server

Wednesday, December 12th, 2007

Joeri and I are looking to make the next step up in the MMVI project by adding a second server. This will give us some redundancy to prevent outages and also a lot more bandwidth so that we can hopefully put everything online that we always wanted to.

Initially we intended to build the second server ourselves (like the previous one) from used hardware. However, some very tempting offers from Dell have made us reconsider. It turns out that we can get a new, Dell-assembled and -tested machine for relatively little extra money. We’re now looking at different hosts, aiming as high as 1 TB (terabyte) of allowed data transfer per month.