Only patch if you really need to and only if you understand exactly what the patch does.
I'm a qmail enthusiast. I've even written a book about it (buy a copy!). One of the things that any qmail administrator ought to know is that, for a variety of historical reasons, qmail has a lot of available patches. There are good reasons why you might need each patch, but in general you should try to use as few as possible. The general consensus is that the more patches you use, the more likely you are to run into bugs and conflicts between patches. The general rule for patching qmail is highlighted across the top of this page. Ignore this rule at your peril.
That said, I have found a collection of patches that work for me. I have installed them for reasons I explain below, which are not necessarily reasons that matter to everyone. This is a list of the patches I use (and some others I have run across), with descriptions and reasons for using them. Most of these patches were pulled from qmail.org, which has many many more and is worth looking over if you're facing a situation you can't handle.
Some people feel that qmail has certain shortcomings (like non-conformance to RFCs). Some of them simply like complaining; others have found workarounds or developed patches to fix the problem. Trust me, the issues have been hashed over again and again and again on the qmail mailing list, and the current state of things seems to keep the most people happy. In most cases, things are the way they are on purpose (please feel free to search the qmail list archives for the explanation of any particular detail!).
These patches all work on netqmail, which you should be using anyway. While vanilla qmail is as cool and unbreachable as ever, netqmail is a convenient packaging of qmail together with some of the patches that have proven to be very important. It is not officially the same thing as qmail. For more information, go here.
One final note: some of these patches conflict (or seem to), and resolving them takes a little bit of knowledge of C. If and when I get a chance, I'll put up an über patch collecting and reconciling the patches that I use. For any others, you're on your own.
Apply each patch from the top of the source directory with the following command:
patch -p1 < /path/to/patch
4xx errors mean, in essence, "try again later". We can quibble over whether that's the smartest thing to do, but as such, qmail's behavior is consistent with the RFC. The usual complaint is to point out that RFC 2821 says (in section 5):
"To provide reliable mail transmission, the SMTP client MUST be able to try (and retry) each of the relevant addresses in this list in order, until a delivery attempt succeeds."

Qmail is able, and does so if the MX with the lowest preference number (the most-preferable one) is unreachable. The RFC does not specify the relationship between failure type and MX choice (for example, the language it uses in section 4.2.1 suggests that retrying within the same connection is acceptable), so again there, qmail is consistent with the RFC's stricture. I think there's a real point to be made out of the fact that the RFC authors require that the client must be able to try other MX entries, rather than saying that the client must always try other MX entries. It leaves plenty of room for choosing different retry policies based on the type of error.
The policy at issue here, however, concerns errors given as part of the greeting. When qmail connects to a mail server and is greeted with an error (instead of "220 Hi there!"), how should that be interpreted, and how should it be handled? What should it mean for the particular message you're trying to deliver? (This is, primarily, what Matthias Andree's patch changes.)
Some (vocal) mail administrators seem to be of the opinion that a greeting error is a reasonable thing to use to indicate an overload situation (for example, that the server is overwhelmed by a spam attack and cannot handle additional email at the moment). But consider: is this a reasonable thing to do? If a server cannot (or will not) accept email, and this is known at connection time, why accept the connection? It wastes bandwidth, it wastes server resources, it wastes time. Why would anyone use scarce resources, in the middle of being overloaded, to tell senders about it? Why accept a connection when you cannot accept email? It's more efficient to simply refuse the connection.

Imagine if taxicabs worked on the same principle. When they're hired and full, they cannot accept new riders. The easiest and most direct way of not accepting new riders is to ignore the folks on the sidewalk waving at the taxi. The idea that the taxi would pull over to tell them "sorry, I'm busy" seems downright goofy (almost as if the taxi driver is taunting the people on the sidewalk). Similarly, if a server is overloaded, the idea that it would accept connections for the sole purpose of telling the sender "sorry, I'm busy" also seems goofy. If you're busy, you should be using your resources to do your job rather than using them to tell everyone how terribly busy you are. So it seems reasonable to conclude that if a server is willing to accept new connections, then it's probably not overloaded.
However, the more important issue is this: when a server KNOWS it cannot accept email for whatever reason, what should it do? First, it needs to decide whether it wants the delivery attempt to be retried immediately or later. And, in either case, should the sender re-contact the most-preferable MX (i.e. the one that is currently overloaded) or a less-preferable MX when it retries? And once the answers to those questions are determined, what is the correct (and/or most reliable) means of expressing that intent?
Answering the latter question requires answering another question first: what do backup MX records mean? Overload situations (or, more typically, spammers) are NOT the reason to have secondary/backup MX servers (with higher MX priorities). If you have multiple servers available to deal with high load, why not assign them all the same priority and use them ALL during low-load situations as well (e.g. to decrease latency)? Waiting for one server to become overloaded before involving another (that you already had available) is a lousy management policy because it leads to wasted resources (read: wasted money) during normal-load situations (i.e. most of the time). Additionally, overload is particularly undesirable because it leads to slow response time; it's better to avoid overload completely—if you can—by using all the resources at your disposal. Using backup servers to handle overload is not impossible, obviously, because some people do it, but it is not a smart use of resources.
So, what IS the reason to have secondary MX servers? Since SMTP-compliant senders are already required to queue undeliverable messages and retry later, the primary benefit of a backup MX is to reduce latency in catastrophic situations. For example, you may have an arrangement where you can tell the backup MX to deliver all of the messages it's holding for you in a single block immediately after your primary email system comes back online. That way the messages get delivered as soon as you're ready, rather than waiting for the myriad senders to realize that you're back online and retry; depending on their retry schedules, that could take hours.

So what would you use as a backup MX? If it's just another server you have in the machine room, there's no reason not to use it as one of your primary mail servers. To reiterate, there has to be a reason why the backup is less preferable and is not part of your primary mail system. By knowing that reason, we can know what sort of penalty is associated with using a backup MX, and how much effort should be put into avoiding paying that penalty.
In my opinion, using the MX preference ordering system as an overload failover system is a bad setup. The best way to handle overload is to avoid ever being overloaded, rather than by arranging a failover. Additionally, it seems wiser to assume that the admin is smart and isn't using the MX ordering as a bad overload compensation technique. We ought to assume that the admin is using the MX preference for a good reason (such as "the backup is an arrangement we have with another company in case of emergencies") rather than a bad reason (such as "the admin doesn't know what he's doing"). Thus, it is reasonable to avoid using the backup MX records unless the primary cannot be contacted at all.
Now then, if one of the primary MX servers cannot accept additional email and intends for deliveries to be retried, what is the most effective and efficient means of ensuring that deliveries are retried to the next-most-preferable MX record? Simple: refuse to accept the connection. Accepting the connection just to emit an error message is not only wasteful, but unreliable. At best, it depends on a particular interpretation of a "proposed standard" revision of the email RFC that not everyone agrees on.
For what it's worth, qmail always tries the most-preferable MX first. Unreachable IPs, however, get added to a one-hour do-not-try list (see the qmail-tcpto man page; it's slightly more complicated, but not much). If the lowest-numbered (most-preferable) MX was unreachable, qmail won't retry that IP for an hour, and so will go to the higher-numbered MXs instead. After the one-hour timeout, qmail will allow itself to retry the most-preferable MX to see if it came back. Since the standard retry schedule has two retries in less than an hour (one about 6 minutes after the first attempt, and the next about 26 minutes after the first attempt), if the most-preferable MX was unreachable, the next two retries will go to a higher-numbered MX, but the fourth delivery attempt will start with the most-preferable MX again.
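To make that timing concrete, here is a toy model in shell. It is only a sketch of the behavior described above, not qmail's code: the quadratic retry offsets (0, 400, 1600 and 3600 seconds) are my understanding of the stock schedule for remote deliveries and should be treated as an assumption, and the function names are made up for illustration.

```sh
#!/bin/sh
# Toy model: which MX does each delivery attempt start with, if the primary
# was found unreachable on the very first attempt?
# Assumptions (not read from qmail's source): attempt N happens
# (20*(N-1))^2 seconds after the first one, and an unreachable IP is
# skipped for 3600 seconds.

TCPTO_WINDOW=3600

# retry_offset N -> seconds after the first attempt at which attempt N runs
retry_offset() {
    n=$(( $1 - 1 ))
    echo $(( 20 * n * 20 * n ))
}

# first_mx OFFSET -> "primary" or "backup", given that the primary was put
# on the do-not-try list at offset 0
first_mx() {
    if [ "$1" -gt 0 ] && [ "$1" -lt "$TCPTO_WINDOW" ]; then
        echo backup
    else
        echo primary
    fi
}

for attempt in 1 2 3 4; do
    offset=$(retry_offset "$attempt")
    echo "attempt $attempt at +${offset}s starts with the $(first_mx "$offset") MX"
done
```

In this model attempts 2 and 3 fall inside the one-hour window and go straight to the backup, while attempt 4 lands on the edge of the window and starts with the primary again. Real timing is fuzzier than a four-line loop, so treat the exact boundary with suspicion.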
Once upon a time, back in 1996, there was a really unfortunate bug in the most popular DNS server software (BIND 4.9.3): it did not respond correctly to "CNAME" requests (that is to say, requests for any CNAME data about a particular domain name). This is critical information that an email server needs to know to do its job. Thankfully, there was a way to work around the problem: "ANY" requests. These requests ask the DNS server, essentially, for any and ALL information it has about the domain name in question, including CNAME information.
These ANY queries have two big problems:
This patch reverts that workaround, and uses only CNAME requests instead of ANY requests. Even for big domains that use a CNAME redirection, this answer is tiny.
Technically, using this patch risks being unable to deliver mail to anyone using such an ancient version of BIND. Back in 2002, that was less than 2% of all DNS servers. Today? It's hard to imagine anyone still does.
user@remoteaddress@thismailserver, for example, or user%remoteaddress@thismailserver (aka "the percent hack"). Unfortunately, qmail doesn't reject all of these attempts out of hand, but instead accepts them and generates a bounce message. This behavior is technically valid, but is unwise: it can be used to create bounce-spam. It's important to state: this does NOT make qmail a relay... but it can be used as a bounce-spam source (though the content of the bounce is not entirely dictated by the sender). Some automated relay testing software assumes that if the message is accepted then it will be delivered/relayed instead of bounced (or black-holed), and as a result will provide an inaccurate diagnosis: that your server is an open relay. Such relay testers are incorrect. However, as a result, you may get onto one or two blacklists that are based on such relay checks, even if you delete such email messages. To avoid this hassle, use this patch, written by Russell Nelson, to reject such relay attempts. This patch precludes using the percent hack, but you shouldn't be allowing that anyway, so it's no big loss. (local copy) (qmail.org)
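To show what kind of address is being talked about, here is a small shell sketch that flags the two forms mentioned above. It is not taken from the patch (which works inside qmail-smtpd and may check more than this); the function name is hypothetical and the rule is deliberately simple.

```sh
#!/bin/sh
# Hypothetical helper, not the patch's code: flag recipient addresses whose
# local part smuggles in a second destination, as in
# user@remoteaddress@thismailserver or user%remoteaddress@thismailserver.

looks_like_relay_attempt() {
    local_part=${1%@*}        # everything before the last "@"
    case $local_part in
        *@*|*%*) return 0 ;;  # an extra "@" or a percent hack
        *)       return 1 ;;
    esac
}

for addr in 'user@thismailserver' \
            'user@remoteaddress@thismailserver' \
            'user%remoteaddress@thismailserver'; do
    if looks_like_relay_attempt "$addr"; then
        echo "reject: $addr"
    else
        echo "accept: $addr"
    fi
done
```

Rejecting at SMTP time, rather than accepting and bouncing later, is the whole point: no bounce is ever generated, so there is nothing for a spammer or a naive relay tester to latch onto.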
make cert before running make setup check). Also, you must create a cron job to rebuild the certs daily (because otherwise, over time, an attacker could figure out what they are). Commonly, when someone indicates that they want qmail to support SSL/STARTTLS they will be referred to a project like mailfront. While mailfront is a worthy project, it doesn't solve the entire problem. Specifically, it doesn't enable qmail to use SSL for sending mail to other servers that support STARTTLS (this is a problem of privacy; but keep in mind that if the email is being relayed, it may be transmitted via an unencrypted communication later, so if you're really worried, use PGP). This patch, however, does enable qmail to do that. (local copy) (inoa.net)
More recently, Amitai Schlier pulled together a selection of useful tools for doing recipient validation using a variety of criteria. You can see his work here.
Using such a patch, it is trivial to implement things like three-tuple greylisting (based on RECIPIENT, SENDER, and TCPREMOTEIP). You can also, as Soffian suggests, use a script that queries another server to see if the recipient is valid. I like this script in part because I can use different verification techniques depending on the domain (for example, I can do one kind of check for lists.memoryhole.net, and another for memoryhole.net itself). I often hear questions on the qmail mailing list that could be solved simply and easily with this (relatively trivial) patch, and every time, I am re-impressed with the power and flexibility that this patch provides. It doesn't get enough credit.
Here's how it works: when the environment variable RCPTCHECK is set, qmail-smtpd will execute the program specified in that variable. Before the program is executed, the recipient in question is stored in the RECIPIENT environment variable, and the sender is stored in the SENDER environment variable. The exit code of the specified program determines whether qmail views the recipient as valid or not. Possible exit codes include:
A trivial example script would be something like this:
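    #!/bin/sh
    GoodRecipient=0
    BadRecipient=100
    Error=120

    # Accept only recipients listed in goodrcptto (one address per line).
    # -F -x -i: compare whole lines literally and ignore case, so that "."
    # is not a wildcard and partial addresses do not match.
    if grep -Fxi -- "$RECIPIENT" /var/qmail/control/goodrcptto >/dev/null; then
        exit $GoodRecipient
    else
        exit $BadRecipient
    fi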
Here's a slightly more complex example:
    #!/bin/sh
    GoodRecipient=0
    BadRecipient=100
    Error=120

    # Split the address into local part and domain
    User=$( echo "$RECIPIENT" | cut -d@ -f1 )
    Domain=$( echo "$RECIPIENT" | cut -d@ -f2- )

    if ! type id >/dev/null 2>&1 ; then
        # id program not in PATH
        exit $Error
    fi

    # Accept the recipient only if the local part is a system account
    if id "$User" >/dev/null 2>&1 ; then
        exit $GoodRecipient
    else
        exit $BadRecipient
    fi
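The per-domain idea mentioned earlier (one kind of check for lists.memoryhole.net, another for memoryhole.net) can be sketched along the same lines. This is an illustration rather than the script I actually run: the two check_* helpers are placeholders, LIST_FILE is a made-up variable, and the exit codes are simply the ones used in the examples above.

```sh
#!/bin/sh
# Sketch of a per-domain RCPTCHECK script. The check_* helpers are
# placeholders for whatever validation each domain really needs.
GoodRecipient=0
BadRecipient=100
Error=120

# Placeholder: accept only addresses listed (one per line) in a file.
check_listed() {
    [ -r "$2" ] || return "$Error"
    if grep -Fxiq -- "$1" "$2"; then return "$GoodRecipient"; fi
    return "$BadRecipient"
}

# Placeholder: accept any local part that is a system account.
check_system_user() {
    if id "${1%%@*}" >/dev/null 2>&1; then return "$GoodRecipient"; fi
    return "$BadRecipient"
}

# check_recipient ADDRESS -> returns one of the exit codes above
check_recipient() {
    domain=$(printf '%s' "${1##*@}" | tr 'A-Z' 'a-z')
    case $domain in
        lists.memoryhole.net)
            check_listed "$1" "${LIST_FILE:-/var/qmail/control/goodrcptto}" ;;
        memoryhole.net)
            check_system_user "$1" ;;
        *)
            return "$BadRecipient" ;;
    esac
}

# qmail-smtpd sets RECIPIENT; only act when it is present.
if [ -n "${RECIPIENT:-}" ]; then
    check_recipient "$RECIPIENT"
    exit $?
fi
```

Keeping the dispatch in one case statement means adding a domain is a two-line change, and each domain's check can be tested on its own from the command line by setting RECIPIENT by hand.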
A basic example of how to do greylisting with this patch is here (txt) (based on Kelly French's script). Note that you'd want to combine that script with some other kind of recipient validation as well, and that it needs a cron job to clean up after itself.
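For readers who just want the shape of the idea, the core of three-tuple greylisting fits in one short function. This is a minimal sketch of my own, not the linked script: the function name and state directory are hypothetical, it relies on date +%s being available, and it only prints a decision instead of exiting, because which exit code produces a temporary failure is something to confirm against the patch you actually applied.

```sh
#!/bin/sh
# Minimal three-tuple greylisting sketch (not Kelly French's script).
# State is one small file per (IP, sender, recipient) tuple holding the
# time the tuple was first seen.

# greylist_decision DIR IP SENDER RECIPIENT MIN_DELAY -> "defer" or "accept"
greylist_decision() {
    dir=$1
    min_delay=$5
    # cksum just turns the tuple into a safe, fixed-form file name.
    key=$(printf '%s|%s|%s' "$2" "$3" "$4" | cksum | tr -cd '0-9')
    now=$(date +%s)
    if [ -f "$dir/$key" ]; then
        first_seen=$(cat "$dir/$key")
        if [ $(( now - first_seen )) -ge "$min_delay" ]; then
            echo accept
            return 0
        fi
    else
        echo "$now" > "$dir/$key"   # first sighting: remember when
    fi
    echo defer
}

demo=$(mktemp -d)
# First sighting of this tuple
greylist_decision "$demo" 192.0.2.1 sender@example.com rcpt@example.net 300
# Same tuple again, with a minimum delay of 0 that is already satisfied
greylist_decision "$demo" 192.0.2.1 sender@example.com rcpt@example.net 0
rm -rf "$demo"
```

In a real RCPTCHECK script the three values would come from TCPREMOTEIP, SENDER and RECIPIENT. Nothing here ever deletes the state files, which is exactly why a cleanup cron job is needed.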
Integrating that script into your qmail setup is simple: rename the qmail-remote program to qmail-remote.orig and put that script in as qmail-remote (make sure it's readable and executable by everyone—note that that is different from the original qmail-remote permissions). The script uses two programs to do its job: the dktest program that comes with libdomainkeys and dkimsign.pl that comes with Perl's Mail::DKIM module. (That script expects that dkimsign.pl accepts the --key argument; if yours does not, you can use this simple patch (txt) to modify your dkimsign.pl so that it does accept the --key argument, or download the copy below.)
If you're interested in verifying both DKIM and DomainKey signatures, a similar script that can be used in much the same way as Russ Nelson's program is here (txt). This script relies on dktest as well, but also requires a script called dkimverify.pl. A similarly named script comes with Mail::DKIM, but is not particularly useful; I wrote one that generates some useful headers, which is available below.
Generic copies of those scripts can be had here: dkimsign.pl (txt) and dkimverify.pl (txt)
todo") and the delivery side. In vanilla qmail, the program that spawns email delivery agents to empty the delivery side of the queue is the same program that drains the ingestion side of the queue. It must recognize that new email has been added to the mail ingestion queue, parse it, generate the necessary metadata, and place the message into the mail delivery queue. When email comes into the server extremely quickly, qmail can sometimes spend so much time draining the ingestion queue that deliveries don't get scheduled and the email starts accumulating in the "ingest" side of the mail queue. If this is sustained, most of your email queue may be in an undeliverable state in the todo queue rather than the delivery queue (note that this is only a problem when you get massive amounts of email at the same time: many messages per second, with a system-specific threshold). This behavior is often referred to as "silly qmail syndrome." André Oppermann wrote a patch to solve the problem. It creates a separate qmail-todo program, whose only job is processing the ingestion/todo queue. This allows draining the todo queue to happen asynchronously and thus does not prevent deliveries from occurring at the same time. The code of the solution is a bit complicated, and is not worth applying unless you are experiencing "silly qmail syndrome." This patch is referred to as the ext_todo patch. (local copy) (nrg4u.com)
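Before reaching for this patch, it is worth measuring whether mail really is piling up in todo. The sketch below is a rough check under some assumptions: that the queue lives in the usual place (/var/qmail/queue), that it has the usual todo and mess sub-directories, and that you run it as a user allowed to read them. QMAIL_QUEUE_DIR is a made-up variable for pointing it elsewhere.

```sh
#!/bin/sh
# Rough backlog check: how many queued messages have not yet been moved
# from the todo (ingestion) side to the delivery side?

# count_files DIR -> number of regular files under DIR (0 if DIR is missing)
count_files() {
    if [ -d "$1" ]; then
        find "$1" -type f | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# todo_backlog QUEUE_DIR -> one-line summary
todo_backlog() {
    todo=$(count_files "$1/todo")
    total=$(count_files "$1/mess")
    echo "$todo of $total queued messages are still waiting in todo"
}

todo_backlog "${QMAIL_QUEUE_DIR:-/var/qmail/queue}"
```

If the first number stays near zero even when the server is busy, the ext_todo patch is solving a problem you don't have. Using find rather than ls means the count still works if the todo directory has been split into sub-directories.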
todo queue, an inefficient filesystem that has trouble with large directories may restrict the speed of todo processing significantly. Such filesystems were standard back in 1998, and qmail had to work around this problem for the majority of the mail queue. It did so by creating many sub-directories (in essence, hash buckets), to limit the number of messages that would likely be in any one directory. For whatever reason, this workaround wasn't applied to the todo queue. The best way to address this is to use a filesystem capable of handling directories with large numbers of files. Ext3, for instance, uses a hash-based data structure to implement directories, which handles large numbers of files in a single directory with ease. But upgrading/modernizing your filesystem is not always possible, depending on your situation. Russ Nelson wrote a patch to make qmail use the same multi-directory hashing system in the todo portion of the queue that it already uses in its main queue. This patch is referred to as the "big-todo" patch. There's no reason to apply this patch unless you really, really need it, because getting a better filesystem (or enabling the right features on the filesystem you're already using) is a more efficient option (though using the multi-directory scheme won't really cause any *trouble* on newer filesystems either). This patch can also require that your queue be rebuilt, so applying it to a running system can be a bit of a pain. (local copy) (qmail.org)
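The "hash bucket" idea is simple enough to show in a few lines of shell. This is a sketch under assumptions, not an excerpt from qmail: I am assuming the bucket is the numeric message id modulo the split count, that 23 is the split count (I believe that is the stock conf-split value, but check your own build), and that big-todo lays out todo the same way as the main queue, as its description suggests.

```sh
#!/bin/sh
# Sketch of the hash-bucket layout: spread message files over a fixed
# number of sub-directories so that no single directory grows huge.
# SPLIT=23 and the modulo rule are assumptions, see the text above.

SPLIT=23

# bucket_path SUBDIR ID -> relative path of the file for message ID
bucket_path() {
    echo "$1/$(( $2 % SPLIT ))/$2"
}

bucket_path mess 12345
# Where a split todo directory would put the same message
bucket_path todo 12345
```

With 23 buckets, a queue of 100,000 messages averages a little over 4,000 files per directory instead of 100,000 in one, which is the whole benefit on a filesystem that searches directories linearly.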