
Gmail, DKIM, and DomainKeys

I recently spent a bunch of time trying to resolve some delivery problems we had with Gmail. Some of it was dealing with idiosyncratic issues associated with our mail system, and some of it, well, might benefit others.

In our mail system, we use qmail-qfilter and some custom scripts to manipulate incoming mail, along with a custom shell script I wrote to manipulate outbound mail. Inbound mail, prior to this, was prepended with three new headers: DomainKey-Status, DKIM-Status (and friends), and X-Originating-IP. Outbound mail was signed with both a DomainKey and a DKIM signature. All of my DomainKey-based manipulation was based on libdomainkeys and, in particular, their dktest utility. Yes, that library is technically out-of-date, but for a long time there were more DomainKey-compliant servers out there than DKIM-compliant servers, so… it made sense. The DKIM-based manipulation is all based on Perl’s Mail::DKIM module, which seems to be quite the workhorse.

Our situation was this: we have several users who use Gmail as a kind of “back-end” for their mail on this server. All of their mail needs to be forwarded to Gmail, and when they send from Gmail, it uses SMTP-AUTH to relay their mail through our server. This means that their outgoing mail is signed first by Gmail, then by us. The domain of the outgoing signature is defined by the sender.

So, first problem: we use procmail to forward mail. This means that all mail sent to these Gmail users got re-transmitted with a return address of nobody@our-domain.com (the procmail default). Thus, we signed all of this relayed mail (because the sender was from one of the domains we have a secret key for). This became a problem because all the spam sent to these users got relayed, and signed, and so we got blamed for it (thus causing Gmail to blacklist us occasionally).

Gmail has a few recommendations on this subject. Their first recommendation is to stop changing the return address (which is exactly the opposite of the recommendation of SPF supporters, I’d like to point out). They also suggest doing our own spam detection and putting “SPAM” in the subject of messages our system thinks are spam. I used Gmail’s recommended solution (which would also prevent us from signing outbound spam), adding the following to our procmailrc:

SENDER=`formail -c -x Return-Path`

This caused new problems. All of a sudden, mail wasn’t getting through to some of the Gmail users AT ALL. Gmail would permanently reject the messages with the error message:

555 5.5.2 Syntax error. u18si57222290ibk.46

It turns out that messages sent from the Gmail users often had multiple Return-Path headers. The same is true of messages from many mailing lists (including Google Apps mailing lists). This means that formail would dutifully print out a multi-line response, which would then feed garbage (more or less) into the sendmail binary, producing invalid syntax, which is why Gmail was rejecting messages. On top of that, formail doesn’t strip off the surrounding wockas (angle brackets), which caused sendmail to encode the Return-Path header incorrectly, like this:

Return-Path: <<mailinglist@somedomain.com>

This reflects what would happen during the SMTP conversation with Gmail’s servers: the double-wockas would be there as well, which is, officially, invalid SMTP syntax. The solution we’re using now is relatively trivial and works well:

SENDER=`formail -c -x Return-Path | head -n 1 | tr -d '<>'`

Let me re-iterate that, because it’s worth being direct. Using Gmail’s suggested solution caused messages to DISAPPEAR. IRRETRIEVABLY.

Granted, that was my fault for not testing it first. But still, come on Google. That’s a BAD procmail recommendation.
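
For anyone with a similar setup, here is a minimal sketch of how the same extraction could be wrapped with a sanity check, so that a strange Return-Path fails loudly instead of being handed to sendmail. It is not what our procmailrc contains: the function name `first_return_path` is made up for illustration, and it assumes the extraction happens in a small helper script, since a procmailrc can't define shell functions.

```shell
# Hypothetical helper: reads the output of `formail -c -x Return-Path` on
# stdin and prints a single bare address. Only the first line is used, and
# angle brackets, spaces, and tabs are stripped. If what remains does not
# contain an "@", nothing is printed and the function returns non-zero.
first_return_path() {
    addr=$(head -n 1 | tr -d '<> \t')
    case $addr in
        *@*) printf '%s\n' "$addr" ;;
        *)   return 1 ;;
    esac
}

# Possible use inside a wrapper script (message on stdin):
#   SENDER=$(formail -c -x Return-Path | first_return_path) || exit 75
```

Note that a bounce message with an empty return path (`<>`) is treated as a failure here, so the caller has to decide what to do in that case. The exit code 75 in the comment is the conventional "temporary failure" value from sysexits.h, chosen so the message would be retried rather than lost, but that choice is an assumption about how the wrapper is invoked.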

There were a few more problems I had to deal with, relating to DomainKeys and DKIM, but these are somewhat idiosyncratic to our mail system (though they may be of interest to folks with a similar setup). Here I should explain that when you send from Gmail through another server via SMTP-AUTH, Gmail signs the message with its own key, adding both a DKIM and a DomainKeys header. This is DESPITE the fact that the Return-Path is for a non-Gmail domain, but because the Sender is a gmail.com address, this behavior is completely legitimate and within the specified behavior of DKIM.

The first problem I ran into was that, without a new Return-Path, the dktest utility from DomainKeys would refuse to sign messages that had already been signed (in this case, by Gmail). Not only that, but it would refuse in a very bad way: instead of spitting out something that looks like a DomainKey-Signature: header, it would spit out an error message. Thus, unless my script was careful about only appending things that start with DomainKey-Signature: (which it wasn’t), I would get message headers that looked like this:

Message-Id: <4d275412.6502e70a.3bf6.0f6dSMTPIN_ADDED@mx.google.com>
do not sign email that already has a dksign unless Sender was found first
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gmail.com; h=mime-version

That’s an excerpt, but you can see the problem. It spit an invalid header (the error) into the middle of my headers. This kind of thing made Gmail mad, and rightly so. It made me mad too. So mad, in fact, that I removed libdomainkeys from my toolchain completely. Yes, I could have added extra layers to my script to detect the problem, but that’s beside the point: that kind of behavior by a tool like that is malicious.
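
For the record, the "extra layer" I decided not to bother with could look something like the sketch below. The function name `only_dk_signature` is hypothetical, and it assumes the signer writes either a header or an error message to standard output (which is what the excerpt above suggests). It passes the output through only when it has the shape of a single DomainKey-Signature header, possibly folded across several lines.

```shell
# Hypothetical filter: reads a signer's output on stdin. Prints it unchanged
# only if the first line starts with "DomainKey-Signature:" and every later
# line is a folded continuation (starts with a space or tab). Otherwise it
# prints nothing and returns non-zero, so the caller can skip the header.
only_dk_signature() {
    awk '
        NR == 1 && !/^DomainKey-Signature:/ { bad = 1 }
        NR > 1  && !/^[ \t]/               { bad = 1 }
        { lines[NR] = $0 }
        END {
            if (bad || NR == 0) exit 1
            for (i = 1; i <= NR; i++) print lines[i]
        }
    '
}
```

The whole output is buffered before anything is printed, so an error message that shows up after a valid-looking first line still results in no output at all.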

The second problem I ran into is, essentially, an oversight on my part. My signing script chose a domain (correctly, I might add), and then handed the signing script a filename for the private key of that domain. HOWEVER, since I didn’t explicitly tell it what domain the key was for, it attempted to discover the domain based on the other headers in the message (such as Return-Path and Sender). This auto-discovery was only accurate for users like myself who don’t use Gmail to relay mail through our server. But for messages from Gmail users, who relay via SMTP-AUTH, the script would detect that the mail’s sender was a Gmail user (similar problems would arise for mailing lists, depending on their sender-rewriting behavior). So what it would do is assume that the key it had been handed was for that sender’s domain (i.e. gmail.com), and would create an invalid signature. This, thankfully, was easy to fix: merely adding an explicit --domain=$DOMAIN argument to feed to the signing script fixed the issue. But it was a weird one to track down! It’s worth pointing out that the libdomainkeys dktest utility does not provide a means of doing this.
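
The idea behind the fix is that the domain used to pick the key is the same domain handed to the signer, with nothing left to guess from the headers. A rough sketch of that, under assumptions: the function name `sign_args_for`, the `--key=` flag, and the one-file-per-domain key layout are all invented for illustration (only `--domain=` comes from my actual script).

```shell
# Hypothetical helper: given a domain and a key directory, print the
# arguments for the signing script, one per line. The key file is looked up
# as <keydir>/<domain>.private (an assumed layout). If there is no readable
# key for that domain, print a note on stderr and return non-zero rather
# than letting the signer guess a domain from the message headers.
sign_args_for() {
    domain=$1
    keydir=$2
    keyfile="$keydir/$domain.private"
    if [ ! -r "$keyfile" ]; then
        echo "no private key for $domain" >&2
        return 1
    fi
    printf '%s\n' "--key=$keyfile" "--domain=$domain"
}
```

With this shape, a message relayed from a Gmail user still gets signed for our domain, because the domain comes from the key lookup and not from the Sender header.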

Anyway, at long last, mail seems to be flowing to my Gmail users once again. Thank heaven!




This page contains a single entry from the blog posted on January 10, 2011 2:04 PM.


Creative Commons License
This weblog is licensed under a Creative Commons License.