Computers Archives

June 27, 2005

A Spam Idea

Something I’ve been noticing, though it obviously isn’t a serious problem on a system as small as the one I administer, is that spam bounces frequently fill up the queue.

What happens is that some spammer sends spam to one of my legitimate users (illegitimate users are rejected right away, which may make gathering legitimate names easy, but avoids illegitimate bounces). Some of my legitimate users have their mail forwarded to verizon, which enjoys refusing to accept email. Of course, my policy on MY email server is to accept all mail: my spam filters (e.g. spamassassin) can be wrong and therefore are only advisory. When verizon refuses to accept the mail, my mail server is stuck holding the bag: I can’t deliver it, and I usually can’t bounce it either. I’d like to just dispose of it, but my queue lifetime is 7 days, so I have to keep it in the queue for 7 days while qmail realizes that verizon is never going to accept it, and the return address doesn’t exist.

So here’s my thought: have two qmail installs, one with a queue lifetime of 7 days, one with a queue lifetime of 1 day. Then, put a script in the QMAILQUEUE chain that decides which qmail-queue to use (the 7-day one or the 1-day one) based on whatever I want (i.e. the X-Spam-Score header, or the $RECIPIENT envariable).
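A sketch of what such a QMAILQUEUE dispatcher might look like. The paths, the five-point threshold, and keying off the X-Spam-Score header are all assumptions for illustration; a real wrapper would buffer the message from descriptor 0, inspect it, and then exec the chosen qmail-queue binary with both descriptors intact.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// Pull the X-Spam-Score value out of the header section; 0.0 if absent.
double spam_score(const std::string &message) {
    std::istringstream in(message);
    std::string line;
    const std::string key = "X-Spam-Score:";
    while (std::getline(in, line)) {
        if (line.empty() || line == "\r") break; // blank line ends the headers
        if (line.compare(0, key.size(), key) == 0)
            return std::strtod(line.c_str() + key.size(), nullptr);
    }
    return 0.0;
}

// Choose between the hypothetical 1-day-lifetime and 7-day-lifetime installs.
const char *pick_queue(double score) {
    return score >= 5.0 ? "/var/qmail-1day/bin/qmail-queue"
                        : "/var/qmail/bin/qmail-queue";
}
```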

Probably won’t happen on the WOPR here, because, of course, people are skittish about that kind of fiddling (read: it’s a production machine), which means it’ll basically never get done. Oh well - but it was a thought.

December 27, 2005


I found some ancient fonts I created once. It’d be a shame for them to be forgotten in the mists of time. So, here they are:

(six font download links, each in Mac and Win versions)

I also found my old “Poetry & Prose Archive”… it’s childish in places, but still was something I invested a lot of time in back in the day. Rather than let it fade, I’m making it live forever, here, unaltered from exactly how it was back in 1998. Let’s pretend it’s not embarrassing. :) By the way, I highly doubt any of the email addresses on that site still work.

Finally, an old MOD site I put together. No comment.

September 7, 2006

Why Are Compilers So Idiotic?

It astonishes me how idiotic some compilers can be.

I’ve been working with some unusual compilers recently: the PGI compilers from the Portland Group, and the XL compilers from IBM. I’ve been attempting to get them to compile some standard libraries for testing parallel operations. For example, I want them both to compile LAM-MPI, and BLAS, and LAPACK.

In fighting with them to get them to function, I’ve discovered all sorts of quirks. To vent my frustrations, I am documenting them here. Specifically, IBM’s XL compilers today.

I’m compiling things with the arguments -O4 -qstrict -qarch=ppc970 -Q -qcpluscmt. This takes some explanation. I’m using -O4 instead of -O5 because with the latter, the LAPACK libraries segfault. That’s right. Fortran code, with nary a pointer in sight, segfaults. How that happens is beyond me. The -qarch=ppc970 flag is because, without it, code segfaults. What does this mean? This means that the compiler can’t figure out what cpu it’s running on (which, hey, I’ll give them a pass on that one: I’m not running this compiler on a “supported” distribution of Linux) and is inserting not only bad code but bad pointer dereferences (HUH?!?).

When compiling LAPACK, you’re going to discover that the standard Fortran function ETIME, which LAPACK uses, doesn’t exist in XL-world. Instead, they decided it would be more useful to have an ETIME_ function. See the underscore? That was a beautiful addition, wasn’t it? I feel better already.

While compiling LAM with any sort of interesting optimization (the benefits of which are unclear in LAM’s case), you’re going to discover that XL’s -qipa flag (which is implicitly turned on by -O3 and above) can cause extremely long compile times for some files. How extreme? I’m talking over an hour on a 2GHz PPC with several gigabytes of RAM. But don’t worry! Even though it looks like the compiler is stuck in an infinite loop, it’s really not, and will actually finish if you give it enough time. Or you could just compile LAM without optimization; it’s up to you.

Next nifty discovery: some genius at IBM decided that all inline functions MUST be static. They have to be, otherwise the world just comes apart at the seams. Nevermind the fact that the C standard defines the inline keyword as a hint to the compiler, and specifically forbids the compiler from changing the semantics of the language. What does this matter? A common, sensible thing a library can do is to define two init functions, like so:

inline int init_with_options(int foo, int bar, int baz)
{
        /* ... stuff ... */
        return n;
}

int init(void)
{
        return init_with_options(0, 0, 0);
}

Now what do you suppose the author of such code intends? You guessed it! He wants to let the compiler know that dumping the contents of init_with_options() into init() is a fine thing to do. The author is not trying to tell the compiler “nobody will ever call init_with_options().” But that’s what the XL compilers think the author is saying. Better still, the documentation for XL explains that there’s a compiler option that may help: -qnostaticinline. “Wow!” you say to yourself, “that sounds promising!” Nope. The option doesn’t seem to do a thing. You should have been clued in by the fact that the documentation says that that option is on by default. No, sorry, all inline functions are static, and there’s nothing you can do about it. If you didn’t want them static, you shouldn’t have given such hints to the compiler.

Here’s another good one: what does the user mean when he specifies the -M compiler flag? Well, let’s think about this. The documentation for that option says:

Creates an output file that contains information to be included in a “make” description file. This is equivalent to specifying -qmakedep without a suboption.

Now, what do you think -M really does? Oh, it does what it says, alright: creates a .d file. But it also doesn’t stop the compiler from actually attempting to COMPILE the file. So, now that you’ve got your dependencies built so that you know in what order to compile things, you tell make to have another go at building things. But what’s this? It’s already been compiled (incorrectly!)! Joy! My workaround is to run the compiler like so:

rm -f /tmp/foo
xlc -M -c -o /tmp/foo file.c

Now, when gcc and other compilers handle the -M flag, they spit out dependencies to stdout, rather than creating a file. Many complex Makefiles that you really don’t want to go mutzing with rely on that behavior. How do we get XL to do the same? Here’s one you wouldn’t have suspected: -MF/dev/stdout. What’s the -MF flag supposed to do? Modify an existing Makefile, that’s what. See, isn’t that an excellent idea?

Speaking of excellent ideas, IBM decided that the C language needed some extensions. And I can’t begrudge them that; everybody does it. Among the extensions they added was a __align() directive, along the lines of sizeof(), that allows you to specify the alignment of variables that you create. You’d use it like so:

int __align(8) foo;

Unfortunately, in the standard pthread library headers, there are several structs defined that look like this:

struct something {
        void * __func;
        int __align;
};

You see the problem? Of course, there’s no way to tell XL to turn off the __align() extension. You would think that using -qlanglvl might do it, because it supposedly allows you to specify “strict K&R C conformance”. You’d be wrong. Your only option is to edit the headers and rename the variable.

Another way in which XL tries to be “intelligent” but just ends up being idiotic is its handling of GCC’s __extension__ keyword. For example, in the pthreads headers, there is a function that looks like this:

int pthread_cancel_exists(void)
{
        static void * pointer =
                __extension__ (void *) pthread_cancel;
        return pointer != 0;
}

The reason GCC puts __extension__ there is because pthread_cancel may not exist, and it wants to have pointer be set to null in that case. Normally, however, if you attempt to point to a symbol that doesn’t exist, you’ll get a linking error. XL, of course, barfs when it sees this, but not in the way you think. XL attempts to be smart and recognize common uses of __extension__. Somehow, somewhere, the error you get will be:

found ), expected {

Really makes you wonder what the heck it thinks is going on there, doesn’t it? The solution? Remove “__extension__” and it works fine.
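For reference, the mechanism GCC’s headers are leaning on here is the weak-symbol dance, and it can be reproduced without __extension__ at all. A minimal sketch (the function name is hypothetical; assumes a GCC-compatible compiler on an ELF platform):

```cpp
// A weak declaration of a function that is never defined anywhere. The
// program still links, and the unresolved symbol's address compares equal
// to null at run time -- the same trick the pthread headers play with
// pthread_cancel.
extern "C" void maybe_absent_function(void) __attribute__((weak));

bool function_exists(void) {
    return maybe_absent_function != 0;
}
```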

That’s all for now. I’m sure I’ll find more.

November 19, 2006

Are DNS-RBLs Illegal?

There was a discussion on the DJBDNS mailing list recently (very short, because of the characters involved) about email and DNS-based blacklists. The discussion is here.

The basics of it go like this: DNS-based blocklists (DNSBLs, or DNS-RBLs) are publicly available lists of “known-bad” IP addresses. How this generally works is that somebody (SpamCop or MAPS, for example) will use some method (frequently known only to them) to determine whether or not a given IP address belongs to a spammer, and will then publish that information. Other people (for example, Notre Dame) use that information to make decisions about whether to accept or reject mail from those IP addresses. There are a zillion of these lists, and many people use them as a short-cut: looking up the IP address in one of these lists is much easier than doing fancy content-analysis like we do, particularly when they’ve got a high volume of email.
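Mechanically, a DNSBL lookup is just a DNS query: reverse the IP’s octets, append the list’s zone, and ask for an A record; getting an answer means “listed.” A sketch of the name construction (the zone below is a placeholder, not a real list):

```cpp
#include <sstream>
#include <string>

// Build the DNS name queried for a DNSBL lookup: 127.0.0.2 checked
// against the (placeholder) zone "dnsbl.example.org" is looked up as
// "2.0.0.127.dnsbl.example.org".
std::string dnsbl_query(const std::string &ipv4, const std::string &zone) {
    std::istringstream in(ipv4);
    std::string octet, reversed;
    while (std::getline(in, octet, '.'))
        reversed = octet + "." + reversed; // prepend each octet to reverse
    return reversed + zone;
}
```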

The claim, which I don’t believe, is that these things are essentially illegal. The reason people say these blacklists are illegal is that they do not discriminate between non-spam and spam, but merely block everything from a given host. The justification is usually Exactis v. MAPS, where a spammer (Exactis) sued the MAPS blacklist for listing them, claiming that, among other things, MAPS was abusing monopoly power and violating America’s anti-monopoly laws. What happened was that Exactis got a preliminary injunction (i.e. MAPS had to take them out of the blacklist), and then MAPS settled out of court.

To get more information, I asked my brother, who is a lawyer (passed the bar exam, currently gets paid for his legal services, etc.) to take a look and tell me what he thought.

The first thing he points out is how to think about this preliminary injunction business. He says:

Just FYI, a temporary restraining order and a preliminary injunction are essentially the same, but with important differences. A TRO is completely ex parte, and the other side gets no say in the matter, and is only granted when the harm is so immediate that the time it takes to get the other side to appear in court will damage the person requesting the TRO, and you also have to list and certify the efforts you’ve taken to inform the other side and to try to secure their appearance in court. A TRO expires in ten days, if not sooner. A preliminary injunction lasts through the final decision of the court as to the merits of the case.

So why was the preliminary injunction granted in this case? Well, the standard that the Tenth Circuit Court uses to decide whether to grant a preliminary injunction is defined as follows:

In the Tenth Circuit, preliminary relief is warranted upon a showing (i) that Plaintiff faces irreparable harm, (ii) that the prospective harm to Plaintiff outweighs any damage Defendants might sustain without an injunction, (iii) that injunctive relief is not adverse to the public interest, and (iv) that the case presents serious, substantial, and difficult questions as to the merits, as to make the issues ripe for litigation and deserving of more deliberate investigation. Walmer v. U.S. Dept. of Defense, 52 F.3d 851, 854 (10th Cir. 1995).

In other words, a preliminary injunction:

is not a decision on the merits or even that the merits are to be considered. Rather, the standard is a balancing of the risks. In this case, the balancing was easy, given the low individual harm cause by spam, and the high alleged damage to Exactis. Thus, comparing negligible (if any) harm to MAPS (since their business is built more on reputation than individual screenings) and high potential harm to Exactis, the granting of the preliminary injunction was fairly routine.

Indeed, for an example of a nearly identical situation where the preliminary injunction was DENIED, look no further than Media3 v. MAPS.

That doesn’t mean that the denial (Media3) or the imposition (Exactis) is in any way a judgment on the merits of the case. You don’t even have to show that you have a “valid legal argument”, as Dean Anderson claims. Instead, you have to have a “non-frivolous” legal argument. However, according to my brother:

The standard for frivolity is so low that only a limited group of arguments qualify (the only one I know of being an argument against paying the income tax). Attorneys are supposed to advance any claim where they can make “a good faith argument for an extension, modification, or reversal of existing law”. (That comes from the American Bar Association’s Model Rules of Professional Conduct, Rule 3.1)

More importantly, for that very reason, the very idea of citing a temporary restraining order or preliminary injunction in other legal action is ludicrous. In most jurisdictions, the only things you can cite are “published” final decisions—that is, in the Exactis case, since it was settled out of court, there was no final decision (also, municipal court decisions are not citable, because they tend to be extremely specific to the facts of the case), thus it cannot be cited. You can cite an injunction later on in the same court proceeding, but not in a different court proceeding. So, not only can you NOT cite the Exactis injunction, but it’s, quite specifically, not even a judgment on the merits of the case.

My brother puts it slightly better:

There are two different things involved in a case: facts and law. An injunction is based on the ALLEGED facts (so they may not even be the “real” facts), and a weighing of the alleged harms. The injunction reserves the decision as to the LAW in the case for later, and is intended to preserve the status quo through trial until the law can be decided. So it’s pretty pointless to say to a court that a fact-based injunction issued in a different case has any bearing on the issues of law in a different case. There’s just no relevance there.

But, while that’s a very thorough explanation of why the Exactis v. MAPS case is, essentially, irrelevant, we’re still left with the question of whether MAPS-style blacklists are legal or not. Well, there’s still the question of why MAPS settled the case, but that’s kinda beside the point. It could be that they knew it wouldn’t be worth the lawyer fees, or just didn’t want to fight it out. It could be that they flipped a coin. Who knows? Only MAPS, and maybe their lawyer.

Indeed, antitrust cases are almost never decided against the company in question.

So what about the legality of spam blacklists? Well… there’s not a whole lot of definitive law on the matter. But, for example, check out MAPS v. BlackIce. Now, note that this decision is totally un-cite-able. It’s unpublished, and it’s the California court interpreting a Federal statute (considered a “non-controlling” decision), BUT, unlike the Exactis case, it is actually a decision. And what does it say? From page 6, section B:

Mail Abuse argues the Communications Decency Act provides a complete defense to this action… . Mail Abuse is asserting §230(c)(2) as a defense. Under §230(c):

“(1) No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider. (2) No provider or user of an interactive computer service shall be held liable on account of: (A) any action voluntarily taken in good faith to restrict access to or availability of material that the provider or user considers to be obscene, lewd, lascivious, filthy, excessively violent, harassing, or otherwise objectionable, whether or not such material is constitutionally protected; or (B) any action taken to enable or make available to information content providers or others the technical means to restrict access to material described in paragraph (1).”

Sounds pretty good for MAPS, eh? Next:

The next inquiry, then, is whether spam is “harassing” or “otherwise objectionable” material under §230(c)(2)(A). This is an undecided question of law. One federal court, in dicta, noted blockage of unsolicited bulk e-mail was “encouraged” by §230(c)(2). (America Online v. (S.D.W.Va 1999) 49 F.Supp.2d 851, 855, 864 (dismissing tortious interference with contractual relations and prospective economic advantage claims) (implying unsolicited bulk e-mail is “harassing” or “otherwise objectionable”).)

Whether spam is “harassing” or “otherwise objectionable” is likely an issue that will be resolved in the federal courts. But given the state of law before the court, the court’s conclusion that §230 encourages the blocking of unsolicited bulk e-mail seems correct.

Okay, but that’s been established already, right? I mean, we pretty much accept that blocking spam is legal; what about the collateral damage?

Black Ice contends its e-mails were solicited, and therefore not spam. It argues this factual dispute (i.e., whether its e-mails were solicited or unsolicited) cannot be resolved at the demurrer stage. Black Ice fails to read the entire section. Section 230(c)(2)(A) provides immunity for any good faith effort to block content. Any good faith but unintentional blockage of non-spam is therefore also afforded immunity.

(emphasis mine) Aha! This, I think, is a pretty convincing argument that DNS-RBLs are legal. They are (or can be) part of a good faith effort to block spam. The case goes on to go into the details of this particular block, and eventually rules that MAPS was not acting in good faith because of some details of what went on, but the general idea is still valid, and paves the way for black lists (provided that they act in “good faith”).

April 23, 2007


This is the coolest looking super-computer, ever.

August 21, 2007


Some half-crazed moron at Microsoft, in an attempt to be helpful, made an idiotic decision.

Of what do I speak? Microsoft Entourage attempts to be both convenient and pretty by replacing apostrophes (') with curly quotes (’). Ordinarily, I wouldn’t complain. I like curly-quotes as much as the next guy, and I regularly use a vim plugin called UniCycle to achieve the same effect. HOWEVER, Entourage knows that it only wants to send text email in the ISO-8859-1 (aka “Latin1”) character set, which does not contain a curly-quote. This presents the age-old conundrum: “wanna curly quote, can’t have a curly quote”. So Entourage must choose a different character from the ISO-8859-1 character set to use instead of the curly quote. The obvious choice would be the apostrophe ('); people are used to it, and after all it is a quote! But what does Entourage choose? A superscript 1, like this: ¹

What goon came up with this? A superscript 1, in most fonts (except at very small sizes) looks nothing like a quotation mark. It looks like the number one! Which is exactly what it is! Yes, it’s in the Latin1 character set (0xB9) but, let’s be honest here, how many fonts do you suppose have a superscript one character but NOT an apostrophe? Or a curly quote? Besides looking stupid, Microsoft isn’t actually improving their compatibility!

But, I suppose, what did I expect from such an “innovative” company?

P.S. I have no idea why superscript 1 gets to be its own character in the Latin1/ISO-8859-1 character set. Seems silly to me, but then, so does ¤.

September 20, 2007

I wish I had a C-based lock-free hash table...

I recently stumbled across a Google tech talk video by a man named Cliff Click Jr. He works for a company named Azul, and he has a blog here. This tech talk was all about a lock-free hash table that he’d invented for Azul. The video is here.

Lock free hash table? Heck yeah I want a lock free hash table!

Sadly, Cliff’s implementation is Java-only, and relies on some Java memory semantics, but it’s on sourceforge if anyone’s interested.

So, as I began reading up on the subject, I discovered that he’s not the only one interested. In fact, there’s another fellow who has a C-based library here. Only problem? IT’S NOT ACTUALLY LOCK FREE!!! At least, not yet. At the moment it’s a pthread/mutex-based hash table that happens to have all the pthreads stuff ifdef’d out (joy). There are other people out there who talk about it. A fellow from IBM named Maged M. Michael has a paper about how to do lock-free hash tables, and he even has a patent on his particular method, but no implementations appear to be available. Chris Purcell wrote a paper on the topic, which contains pseudocode, but yet again, no implementation.

So it would appear that if I want a lock-free hash table, I’m going to have to implement it myself. But boy, it gets me giddy just thinking about it. :) Pthreads, you’re going down!

October 3, 2007

Come *on*, Apple

This is just petty, but Apple? What’s up with libtoolize? I know, I know, you decided you wanted to call it glibtoolize, and that’s fine! That’s fine, I don’t mind. But why did you distribute an autoreconf that still believed in libtoolize? That’s just dumb.

October 9, 2007

Concurrent Hash Table Tricks

So, I’m working on qthreads (which is open-sourced, but currently lacks a webpage), and thinking about its Unix implementation.

The Unix implementation emulates initialization-free synchronization (address locks and FEBs) by storing addresses in a hash table (well, okay, a striped hash table, but if we make the stripe 1, then it’s just a hash table). Let’s take the simplest: address locks. The semantics of the hash table at the moment are really pretty basic: if an address is in the hash, it’s locked. If it’s not in the hash, it’s not locked. The hash is the cp_hashtable from libcprops, a library which I appreciate greatly for giving C programmers easy access to high-quality basic data structures (I’ve contributed some significant code to it as well). Anyway, the downside of using this hash table is that it’s a bottleneck. The table is made thread-safe by simply wrapping it in a lock, and every operation (lock and unlock) requires locking the table to either insert an entry or remove an entry.
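For concreteness, here’s a minimal sketch of that scheme using C++11 primitives instead of cprops and pthreads (this is not the actual qthreads code): presence of an address in the set means “locked,” and every single operation funnels through the one table mutex, which is exactly the bottleneck.

```cpp
#include <condition_variable>
#include <mutex>
#include <unordered_set>

class AddressLocks {
    std::mutex table_lock;            // the single lock wrapping the table
    std::condition_variable cv;
    std::unordered_set<const void *> locked; // membership == "locked"
public:
    void lock(const void *addr) {
        std::unique_lock<std::mutex> guard(table_lock);
        // Wait until nobody else holds this address, then claim it.
        cv.wait(guard, [&] { return locked.count(addr) == 0; });
        locked.insert(addr);
    }
    void unlock(const void *addr) {
        std::lock_guard<std::mutex> guard(table_lock);
        locked.erase(addr);
        cv.notify_all(); // wake anyone spinning on this (or any) address
    }
    bool is_locked(const void *addr) {
        std::lock_guard<std::mutex> guard(table_lock);
        return locked.count(addr) != 0;
    }
};
```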

So how could we do this with a more concurrent hash table? I’ve seen two hash table APIs that are concurrent: the lock-free hash in Java that I talked about previously, and the concurrent_hash_map from Intel’s Thread Building Blocks library (which, given that it’s in C++, is something I can actually use).

The way the TBB hash works is that you can perform three basic operations on your hash: find(), insert(), and erase(). When you do either of the first two operations, you can lock that entry in the hash and prevent others from getting at it, or you can access it read-only. The erase function merely takes a key and removes it from the hash table, giving you no access to whatever might have been deleted from the hash table. Worse yet, you cannot erase something that you currently have a lock on, even if it’s a write lock!

Using this hash the way that I currently use the cprops hash is thus impossible. Why? Because erasing things from the TBB hash is ALWAYS a race condition. Put another way, all TBB hash erase operations are “blind erase” operations, when what you really want is “erase if it’s still in an erasable state”. You can never be certain that erasing an entry from the hash table is a good idea, because you can never be certain that someone else didn’t add something important to that entry in between the time that you decided the entry was erasable and the time you actually erased it. If I insert a value (to “lock” an address, say), I can associate that value with a queue of waiting threads (i.e. other threads that also want to lock that address), but I can never erase that entry in the hash table! The reason is that since I can’t erase something that I have access to (i.e. have a write-lock on), there’s a race condition between me fetching the contents of that hash table entry and me removing that entry from the hash table.

A different approach to this might be to simply never remove entries from the hash table, and to simply say that if the associated list of threads is empty (or NULL), then the lock is unlocked. That would work well, except for that tiny little problem of the hash table eternally growing and never reclaiming memory from unused entries. So, if I had an application that created lots of locks all over the place (i.e. inserted lots of different entries into the hash), but never had more than a handful locked (i.e. in the hash) at a time, I’d be wasting memory (and possibly, LOTS of it).

Is there another way to use such a hash table to implement locks more efficiently? I don’t know, but I don’t think so (I’d love to be proved wrong). Any way you slice it, you come back to the problem of deleting things that are in a deletable state, but not knowing if it’s safe to do so.

The Azul Java-only hash is an interesting hash that behaves differently. It is based upon compare-and-swap (CAS) atomic operations. Thus, for a given key, you can atomically read the contents of a value, but there’s no guarantee that that value isn’t changed the MOMENT you read it. Deleting an entry, in this case, means swapping a tombstone marker into place where the entry’s value is supposed to be, which you can avoid doing if that value changed before you got to the swap part (the C of the CAS). Thus, after you’ve extracted the last thread that’d locked that address (i.e. you’ve set the value to NULL) you can avoid marking a thing as “deleted” when it has really just been re-locked because if the value changed to non-NULL (and the compare part of the CAS fails), you can simply ignore the failure and assume that whoever changed it knew what they were doing. Thus, you CAN safely delete elements from the hash table. Better still, it easily integrates with (and may even require) a lock-free CAS-based linked list for queueing blocked threads. (You may be saying to yourself “um, dude, a hash table entry with a tombstone as a value is still taking up memory”, and I say to you: yeah? so? they get trimmed out of the hash table whenever the hash table is resized, thereby being an awesome idea.)
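The step that matters here, “delete only if nobody re-locked it,” can be sketched with a single CAS on the entry’s value slot. This is just the state machine described above, not Click’s actual implementation:

```cpp
#include <atomic>

// One hash entry's value slot holds either a waiter-queue pointer
// ("locked"), nullptr ("unlocked"), or TOMBSTONE ("logically deleted").
// TOMBSTONE is just a unique sentinel address.
static int tombstone_sentinel;
void *const TOMBSTONE = &tombstone_sentinel;

// Retire an entry we believe is unlocked. If another thread re-locked the
// address in the meantime (slot != nullptr), the compare fails and we
// leave their value untouched -- no lost updates, no blind erase.
bool try_delete(std::atomic<void *> &slot) {
    void *expected = nullptr;
    return slot.compare_exchange_strong(expected, TOMBSTONE);
}
```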

And, as I think about it, forcing users to do blind erases makes Intel TBB hash tables ALMOST unusable for an entire class of problems and/or algorithms. That category of algorithms is any algorithm that needs to delete entries that could potentially be added back at any time. They really ought to provide an equivalent of a CAS: let the user say “delete this hash entry if the value is equal to this”.

I say “ALMOST unusable” because it’s fixable. Consider the ramifications of having control over the comparison and equivalence functions: a key can be associated with a “deletable” flag that provides much of the needed functionality. With such a flag, the result of any find() operation can be considered invalid not only if it returns false but also if the deletable flag associated with the result’s key is true. Essentially, finding something in the hash becomes:

while (hash.find(result, &findme) && result->first->deletable) {
    ; // spin: the entry we found is being deleted, so look again
}

It’s an extra layer of indirection, and can cause something to spin once or twice, but it works. Your comparison struct functions must then be something like this:

typedef struct evilptrabstraction {
    bool deletable;
    void * key;
} epa_s;

typedef epa_s * epa;

struct EPAHashCompare {
    static size_t hash(const epa &x) {
        return (size_t)x->key; // or a more complex hash
    }
    static bool equal(const epa &x, const epa &y) {
        if (x->deletable && y->deletable) return true;
        if (x->deletable || y->deletable) return false;
        return x->key == y->key;
    }
};

Note that anything marked deletable is equivalent, but doesn’t match anything non-deletable. Thus, safely deleting something becomes the following (assuming findme is a version of the epa struct not marked deletable):

accessor *result = new accessor();

bool found = hash.find(*result, &findme);
while (found && (*result)->first->deletable) {
    found = hash.find(*result, &findme);
}

if (found) {
    (*result)->first->deletable = true;
    delete result; // release the lock
    findme.deletable = true;
    hash.erase(&findme); // the blind erase now matches only deletable entries
} else {
    delete result;
}

This opens the question of inserting safely, though, because during the insertion process, your inserted object might have already existed, and if it already existed, it may have been in the process of being deleted (i.e. it might have been marked as deleted). There’s the potential that your “freshly-inserted” item got marked deletable if it was already in the hash. So how do you insert safely?

bool inserted = hash.insert(result, insertme);
// !inserted means insertme was already in the hash
while (!inserted && result->first->deletable) {
    inserted = hash.insert(result, insertme);
}
if (!inserted) delete insertme;

Note that we can’t simply toggle the deletable mark, because an erase() operation may already be waiting for the hash value, and it doesn’t expect that the key for the item may have changed while it was waiting for the item to be locked (so changing the deletable flag won’t stop it from being erased). The downside, of course, is that popularly erased/re-inserted items may cause a fair bit of memory churn, but that’s unavoidable with the TBB’s bare-bones erase() interface.

November 27, 2007

Moving Parts are Evil

I recently was doing some work on the computer of an elderly friend of mine, and had a bit of a scare with a hard drive that appeared to have failed. Turns out the boot block had been corrupted somehow, which was easy enough to fix from another computer (yay Linux!). Anyway, this made me stick my nose into S.M.A.R.T. statistics on hard drives. There’s a nice little tool for OSX that sits in the menu bar and keeps an eye on your disks for you (SMARTReporter). I figured there had to be something similar for Windows. In the “free” department, there’s very little available that’s worth beans, but I was able to find something called HDD Health. No sooner had I installed it than it started telling me that the Seek Error Rate was fluctuating wildly (generally it would go from 100 to 200 and back again every couple minutes). This was rather sudden! I got worried about the health of the drive, and started backing things up on it… then I looked it up on the internet. Apparently this is a common thing with Western Digital drives (which is what this computer had): their Seek Error Rate tends to fluctuate like that, and it doesn’t mean anything at all. The general recommendation seems to be “download the diagnostic tools from Western Digital; those will be authoritative”. So I did, and they said the drive was in perfect health.

Well, so much for being worried!

It does seem to speak to the temperamental (and largely useless) nature of S.M.A.R.T. statistics. Thing to keep in mind: they don’t always mean much.

January 11, 2008

Apple's Compiler Idiocy

This is something that’s been bugging me for a while here, and I might as well write it down since I finally found a solution.

I have an atomic-increment function. To make it actually atomic, it uses assembly. Here’s the PPC version:

static inline int atomic_inc(int * operand)
{
    int retval;
    register unsigned int incrd = incrd; // silence initialization complaints
    asm volatile ("1:\n\t"
                  "lwarx  %0,0,%1\n\t" /* reserve operand into retval */
                  "addi   %2,%0,1\n\t" /* increment */
                  "stwcx. %2,0,%1\n\t" /* un-reserve operand */
                  "bne-   1b\n\t" /* if it failed, try again */
                  "isync" /* make sure it wasn't all just a dream */
                  :"=&r" (retval)
                  :"r" (operand), "r" (incrd));
    return retval;
}

Now, what exactly is wrong with that, eh? Nothing: it works great on Linux. Stock GCC compiles it just fine, as does the PGI compiler, IBM's compiler, and Intel's compiler.

Apple’s compiler? Here’s the error I get:

gcc -c test.c
/var/tmp/ccqu2RmV.s:5949:Parameter error: r0 not allowed for parameter 2 (code as 0 not r0)

Okay, so, some kind of monkey business is going on. What does this look like in the .S file?

    lwarx r0,0,r2
    addi   r3,r0,1
    stwcx. r3,0,r2
    bne-   1b
    mr r3,r0

It decided %0 (retval) was going to be r0! Even though that's apparently not allowed! (FYI, it's the addi that generates the error.)

The correct workaround is to use the barely documented “b” option, like this:

static inline int atomic_inc(int * operand)
{
    int retval;
    register unsigned int incrd = incrd; // silence initialization complaints
    asm volatile ("1:\n\t"
                  "lwarx  %0,0,%1\n\t" /* reserve operand into retval */
                  "addi   %2,%0,1\n\t" /* increment */
                  "stwcx. %2,0,%1\n\t" /* un-reserve operand */
                  "bne-   1b\n\t" /* if it failed, try again */
                  "isync" /* make sure it wasn't all just a dream */
                  :"=&b" (retval) /* note the b instead of the r */
                  :"r" (operand), "r" (incrd));
    return retval;
}

That ensures, on PPC machines, that the value is a “base” register (aka not r0).

How gcc on Linux gets it right all the time, I have no idea. But it does.

March 12, 2008

Sorting Spaces

There seems to be some disagreement, at Apple Computer, about exactly what the definition of the word “ignore” is. From the “sort” man page:

-d Sort in `phone directory’ order: ignore all characters except letters, digits and blanks when sorting.

What does that suggest to you? Well, let’s compare it to the GNU “sort” man page:

-d, --dictionary-order
    consider only blanks and alphanumeric characters

So you’d THINK, right, that sorting with these two options would be equivalent, right?


Here’s a simple list:

- foo 1
- foo1

How should these things be sorted when the -d option is in effect? You’ve got a conundrum: is a space sorted BEFORE a number or AFTER a number?

Curse you, alphabet! You’re never around when I need you!

And, of course, BSD and GNU answer that question differently. On GNU, the answer is AFTER, on BSD the answer is BEFORE! Oh goody.

Here’s a better way if you need the sorting results to be the same on both BSD and GNU: replace all spaces with something else non-alpha-numeric that isn’t used in the file (such as an underscore, or an ellipsis, or an em-dash). Then sort with -ds (no last-minute saving throws!), then replace the underscore (or whatever) with a space again.

And if you need it to be consistent on OSX platforms too, make it a -dfs sort (so that capitals and lower-case are considered the same).
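Here's that dance in miniature, using an underscore as the stand-in character and pinning the locale to C so the result is identical everywhere (the input lines are made up for illustration):

```shell
# Replace spaces, sort dictionary-style, restore spaces.
# LC_ALL=C fixes the collation so GNU and BSD sort agree.
printf 'foo bar\nfoo1\n' | LC_ALL=C tr ' ' '_' | sort -d | tr '_' ' '
# prints "foo1" then "foo bar"
```

With -d in effect the underscore is ignored entirely, so the comparison comes down to "foo1" vs "foobar", which every sort orders the same way in the C locale.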

March 13, 2008

w3m and MacPorts

For whatever reason, w3m refuses to build on my Intel OSX box with the latest boehmgc library. To get it to build, you must forcibly downgrade to boehmgc 6.8 or 6.7 or something earlier.

Also, I noticed that w3m isn’t marked as depending on gdk-pixbuf. Strictly speaking, it doesn’t, but it does if you have --enable-image=x11. :P Add this to your Portfile:

depends_lib lib:libgccpp.1:boehmgc bin:gdk-pixbuf-config:gdk-pixbuf

Also, it seems that either w3m or gdk-pixbuf-config appends an extra library to the config line for gdk-pixbuf-config (essentially, they specify -lgdk_pixbuf AND -lgdk_pixbuf_xlib). That extra library causes build problems for w3m; you can fix it by editing /opt/local/bin/gdk-pixbuf-config and removing the -lgdk_pixbuf from what it prints out (however, if you use other software that uses gdk-pixbuf-config, you may need to put it back once w3m has finished building).
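If you'd rather script that edit than do it by hand, a sed substitution does the trick; note the pattern matches the space-delimited -lgdk_pixbuf only, so -lgdk_pixbuf_xlib (which w3m actually needs) survives. The flag string below is illustrative, not the real gdk-pixbuf-config output:

```shell
# Simulated `gdk-pixbuf-config --libs` output; strip only -lgdk_pixbuf.
echo '-L/opt/local/lib -lgdk_pixbuf -lgdk_pixbuf_xlib -lgdk-1.2' \
    | sed 's/ -lgdk_pixbuf / /'
# prints: -L/opt/local/lib -lgdk_pixbuf_xlib -lgdk-1.2
```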

My Bashrc

There are few things that, over my time using Unix-like systems, I have put more cumulative effort into than into my configuration files. I've been tweaking them since the day I discovered them, attempting to make my environment more and more to my liking. I have posted them on my other website (here), but it occurred to me that they've gotten sufficiently hoary and complex that a walkthrough might help someone other than myself.

Anyway, my bashrc is first on the list. (Or, if you like, the pure text version.)

The file is divided into several (kinda fuzzy) sections:
- Initialization & Other Setup
- Useful Functions
- Loading System-wide Bashrc
- Behavioral Settings
- Environment Variables
- Character Set Detection
- Aliases
- Tab-completion Options
- Machine-local settings
- Auto-logout

Let's take them one at a time.

Initialization & Other Setup

Throughout my bashrc, I use a function I define here ( dprint ) to allow me to quickly turn on debugging information, which includes printing the seconds-since-bash-started variable ( SECONDS ) in case something is taking too long and you want to find the culprit. Yes, my bashrc has a debug mode. This is essentially controlled by the KBWDEBUG environment variable. Then, because this has come in useful once or twice, I allow myself to optionally create a ~/.bashrc.local.preload file which is sourced now, before anything else. Here's the code:


function dprint {
    if [[ "$KBWDEBUG" == "yes" && "$-" == *i* ]]; then
        #date "+%H:%M:%S $*"
        echo $SECONDS $*
    fi
}
dprint alive
if [ -r "${HOME}/.bashrc.local.preload" ]; then
    dprint "Loading bashrc preload"
    source "${HOME}/.bashrc.local.preload"
fi

Useful Functions

This section started with some simple functions for PATH manipulation. Then those functions got a little more complicated, then I wanted some extra functions for keeping track of my config files (which were now in CVS), and then they got more complicated...

You'll notice something about these functions. Bash (these days) will accept function declarations in this form:

function fname()
{
    do stuff
}

But that wasn't always the case. To maintain compatibility with older bash versions, I avoid the uselessly cosmetic parens and I make sure that the curly-braces are on the same line, like so:

function fname {
    do stuff
}

Anyway, the path manipulation functions are pretty typical — they're similar to the ones that Fink uses, but slightly more elegant. The idea is based on these rules of PATH variables:

  1. Paths must not have duplicate entries
  2. Paths are faster if they don't have symlinks in them
  3. Paths must not have "." in them
  4. All entries in a path must exist (usually)

There are two basic path manipulation functions: add_to_path and add_to_path_first. They do predictable things — the former appends something to a given path variable (e.g. PATH or MANPATH or LD_LIBRARY_PATH ) unless it's already in that path, and the latter function prepends something to the given PATH variable (or, if it's already in there, moves it to the beginning). Before they add a value to a path, they first check it to make sure it exists, is readable, that I can execute things that are inside it, and they resolve any symlinks in that path (more on that in a moment). Here's the code (ignore the reference to add_to_path_force in add_to_path for now; I'll explain shortly):

function add_to_path {
    local folder="${2%%/}"
    [ -d "$folder" -a -x "$folder" ] || return
    folder=`( cd "$folder" ; \pwd -P )`
    add_to_path_force "$1" "$folder"
}

function add_to_path_first {
    local folder="${2%%/}"
    [ -d "$folder" -a -x "$folder" ] || return
    folder=`( cd "$folder" ; \pwd -P )`
    # in the middle, move to front
    if eval '[[' -z "\"\${$1##*:$folder:*}\"" ']]'; then
        eval "$1=\"$folder:\${$1//:\$folder:/:}\""
    # at the end
    elif eval '[[' -z "\"\${$1%%*:\$folder}\"" ']]'; then
        eval "$1=\"$folder:\${$1%%:\$folder}\""
    # no path
    elif eval '[[' -z "\"\$$1\"" ']]'; then
        eval "$1=\"$folder\""
    # not in the path
    elif ! eval '[[' -z "\"\${$1##\$folder:*}\"" '||' \
      "\"\$$1\"" '==' "\"$folder\"" ']]'; then
        eval "export $1=\"$folder:\$$1\""
    fi
}

Then, because I was often logging into big multi-user Unix systems (particularly Solaris systems) with really UGLY PATH settings that had duplicate entries, often included ".", not to mention directories that either didn't exist or that I didn't have sufficient permissions to read, I added the function verify_path. All this function does is separates a path variable into its component pieces, eliminates ".", and then reconstructs the path using add_to_path, which handily takes care of duplicate and inaccessible entries. Here's that function:

function verify_path {
    # separating cmd out is stupid, but is compatible
    # with older, buggy, bash versions (2.05b.0(1)-release)
    local cmd="echo \$$1"
    local arg="`eval $cmd`"
    local dir
    eval "$1=\"\""
    while [[ $arg == *:* ]] ; do
        dir="${arg%%:*}"
        arg="${arg#*:}"
        if [ "$dir" != "." -a -d "$dir" -a \
          -x "$dir" -a -r "$dir" ] ; then
            dir=`( \cd "$dir" ; \pwd -P )`
            add_to_path "$1" "$dir"
        fi
    done
    if [ "$arg" != "." -a -d "$arg" -a -x "$arg" -a -r "$arg" ] ; then
        arg=`( cd "$arg" ; \pwd -P )`
        add_to_path "$1" "$arg"
    fi
}

Finally, I discovered XFILESEARCHPATH — a path variable that requires a strange sort of markup (it's for defining where your app-defaults files are for X applications). This wouldn't work for add_to_path, so I created add_to_path_force that still did duplicate checking but didn't do any verification of the things added to the path.

function add_to_path_force {
    if eval '[[' -z "\$$1" ']]'; then
        eval "export $1='$2'"
    elif ! eval '[[' \
        -z "\"\${$1##*:\$2:*}\"" '||' \
        -z "\"\${$1%%*:\$2}\"" '||' \
        -z "\"\${$1##\$2:*}\"" '||' \
        "\"\${$1}\"" '==' "\"$2\"" ']]'; then
        eval "export $1=\"\$$1:$2\""
    fi
}

I mentioned that I resolved symlinks before adding directories to path variables. This is a neat trick I discovered due to the existence of pwd -P and subshells. pwd -P will return the "real" path to the folder you're in, with all symlinks resolved. And it does so very efficiently (without actually resolving symlinks — it just follows all the ".." records). Since you can change directories in a subshell (i.e. between parentheses) without affecting the parent shell, a quick way to transform a folder's path into a resolved path is this: ( \cd "$folder"; pwd -P). I put the backslash in there to use the shell's builtin cd, just in case I'd somehow lost my mind and aliased cd to something else.
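A quick self-contained demonstration of that trick, using a throwaway directory from mktemp:

```shell
# Resolve a symlinked directory without changing the current shell's cwd.
tmp=$(mktemp -d)
mkdir "$tmp/real"
ln -s real "$tmp/link"

folder="$tmp/link"
resolved=$( \cd "$folder" && \pwd -P )
echo "$resolved"    # ends in /real, not /link

rm -rf "$tmp"
```

The parent shell's working directory is untouched; only the subshell ever cd'd anywhere.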

And then, just because it was convenient, I added another function: have, which detects whether a binary is accessible or not:

function have { type "$1" &>/dev/null ; }

Then I had to confront file paths, such as the MAILCAP variable. A lot of the same logic (i.e. add_to_path_force), but entry validation is different:

function add_to_path_file {
    local file="${2}"
    [ -f "$file" -a -r "$file" ] || return
    # realpath alias may not be set up yet
    file=`realpath_func "$file"`
    add_to_path_force "$1" "$file"
}

You'll note the realpath_func line in there. realpath is a program that takes a filename or directory name and resolves the symlinks in it. Unfortunately, realpath is a slightly unusual program; I've only ever found it on OSX (it may be on other BSDs). But, with the power of my pwd -P trick, I can fake most of it. The last little piece (resolving a file symlink) relies on a tool called readlink ... but I can fake that too. Here are the two functions:

function readlink_func {
    if have readlink ; then
        readlink "$1"
    #elif have perl ; then # seems slower than alternative
    #    perl -e 'print readlink("'"$1"'") . "\n"'
    else
        \ls -l "$1" | sed 's/[^>]*-> //'
    fi
}

function realpath_func {
    local input="${1}"
    local output="/"
    if [ -d "$input" -a -x "$input" ] ; then
        # All too easy...
        output=`( cd "$input"; \pwd -P )`
    else
        # sane-itize the input to the containing folder
        local fname="${input##*/}"
        if [[ $input == */* ]] ; then
            input="${input%/*}"
        else
            input="."
        fi
        if [ ! -d "$input" -o ! -x "$input" ] ; then
            echo "$input is not an accessible directory" >&2
            return 1
        fi
        output="`( cd "$input" ; \pwd -P )`/"
        # output is now the realpath of the containing folder
        # so all we have to do is handle the fname (aka "input")
        input="$fname"
        if [ -L "$output$input" ] ; then
            input="`readlink_func "$output$input"`"
            while [ "$input" ] ; do
                if [[ $input == /* ]] ; then
                    output="/"
                    input="${input#/}"
                elif [[ $input == ../* ]] ; then
                    output="`( cd "$output.." ; \pwd -P )`/"
                    input="${input#../}"
                elif [[ $input == ./* ]] ; then
                    input="${input#./}"
                elif [[ $input == */* ]] ; then
                    output="$output${input%%/*}/"
                    input="${input#*/}"
                else
                    output="$output$input"
                    input=""
                fi
                if [ -L "${output%%/}" ] ; then
                    if [ "$input" ] ; then
                        input="`readlink_func "${output%%/}"`/$input"
                    else
                        input="`readlink_func "${output%%/}"`"
                    fi
                    output="${output%/*}/"
                fi
            done
        else
            output="$output$input"
        fi
    fi
    echo "${output%%/}"
}

Loading System-wide Bashrc

This section isn't too exciting. According to the man page:

When bash is invoked as an interactive login shell, or as a non-interactive shell with the --login option, it first reads and executes commands from the file /etc/profile, if that file exists. After reading that file, it looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads and executes commands from the first one that exists and is readable.

SOME systems have a version of bash that appears not to obey this rule. And some systems put crucial configuration settings in /etc/bashrc (why?!?). And some systems even do something silly like use /etc/bashrc to source ~/.bashrc (I did this myself, once upon a time, when I knew not-so-much). I've decided that this behavior cannot be relied upon, so I explicitly source these files myself. The only interesting bit is that I added a workaround so that systems that use /etc/bashrc to source ~/.bashrc won't get into an infinite loop. There's probably a lot more potential trouble here that I'm ignoring. But here's the code:

if [[ -r /etc/bashrc && $SYSTEM_BASHRC != 1 ]]; then
    dprint " - loading /etc/bashrc"
    . /etc/bashrc
    export SYSTEM_BASHRC=1
fi

Behavioral Settings

This is basic stuff, but after you get used to certain behaviors (such as whether * matches . and ..), you often get surprised when they don't work that way on other systems. Some of this is because I found a system that did it another way by default; some is because I decided I like my defaults and I don't want to be surprised in the future.

The interactive-shell-detection here is nice. $- is a variable set by bash containing a set of letters indicating certain settings. It always contains the letter i if bash is running interactively. So far, this has been quite backwards-compatible.

shopt -s extglob # Fancy patterns, e.g. +()
# only interactive
if [[ $- == *i* ]]; then
    dprint setting the really spiffy stuff
    shopt -s checkwinsize # don't get confused by resizing
    shopt -s checkhash # if hash is broken, doublecheck it
    shopt -s cdspell # be tolerant of cd spelling mistakes
fi

Environment Variables

There are a slew of standard environment variables that bash defines for you (such as HOSTNAME). There are even more standard environment variables that various programs pay attention to (such as EDITOR and PAGER). And there are a few others that are program-specific (such as PARINIT and CVSROOT).

Before I get going, though, let me show you a secret. Ssh doesn't like transmitting information from client to server shell... the only reliable way to do it that I've found is the TERM variable. So... I smuggle info through that way, delimited by colons. Before I set any other environment variables, first, I find my smuggled information:

if [[ $TERM == *:* && ( $SSH_CLIENT || $SSH_TTY || $SSH_CLIENT2 ) ]] ; then
    dprint "Smuggled information through the TERM variable!"
    term_smuggling=( ${TERM//:/ } )
    export TERM=${term_smuggling[0]}
    export SSH_LANG=${term_smuggling[1]}
    unset term_smuggling
fi

I begin by setting GROUPNAME and USER in a standard way:

if [[ $OSTYPE == solaris* ]] ; then
    idout=(`/bin/id -a`)
    [[ $USER == ${idout[0]} ]] && USER="UnknownUser"
    unset idout
else
    [[ -z $GROUPNAME ]] && GROUPNAME="`id -gn`"
    [[ -z $USER ]] && USER="`id -un`"
fi

Then some standard things (MAILPATH is used by bash to check for mail, that kind of thing), including creating OS_VER and HOST to allow me to identify the system I'm running on:

# I tote my own terminfo files around with me
[ -d ~/.terminfo ] && export TERMINFO=~/.terminfo/
[ "$TERM_PROGRAM" == "Apple_Terminal" ] && \
    export TERM=nsterm-16color

add_to_path_file MAILPATH /var/spool/mail/$USER
add_to_path MAILPATH $HOME/Maildir/
[[ -z $MAILPATH ]] && unset MAILCHECK
[[ -z $HOSTNAME ]] && \
    export HOSTNAME=`/bin/hostname` && echo 'Fake Bash!'
[ -z "$OS_VER" ] && OS_VER=$( uname -r )
OS_VER=(${OS_VER//./ })
PARINIT="rTbgq B=.,?_A_a P=_s Q=>|}+"


I've also gotten myself into trouble in the past with UMASK being set improperly, so it's worth setting manually. Additionally, to head off trouble, I make it hard to leave myself logged in as root on other people's systems accidentally:

if [[ $GROUPNAME == $USER && $UID -gt 99 ]]; then
    umask 002
else
    umask 022
fi

if [[ $USER == root ]] ; then
    [[ $SSH_CLIENT || $SSH_TTY || $SSH_CLIENT2  ]] && \
        export TMOUT=600 || export TMOUT=3600
fi

if [[ -z $INPUTRC && ! -r $HOME/.inputrc && -r /etc/inputrc ]]; then
    export INPUTRC=/etc/inputrc
fi

It is at this point that we should pause and load anything that was in /etc/profile, just in case it was left out (and, if it's in there, maybe it should override what we've done so far):


if [[ -r /etc/profile && -z $SYSTEM_PROFILE ]]; then
    dprint "- loading /etc/profile ... "
    . /etc/profile
    export SYSTEM_PROFILE=1
fi

Now I set my prompt (but only if this is an interactive shell). There are several details here (obviously). The first is that, if I'm logged into another system, I want to see how long I've been idle. So I include a timestamp whenever I'm logged into a remote system. I also added color to my prompt in two ways, which has been very useful. First, it changes the color of the $ at the end of the prompt to red if the last command didn't exit cleanly. Second, remote systems have yellow prompts, whenever I'm root I have a red prompt, and I created commands to flip between a few other colors (blue, purple, cyan, green, etc.) in case I find that useful to quickly distinguish between terminals. Anyway, here's the code:

if [[ $- == *i* ]]; then
    if [[ $TERM == xterm* || $OSTYPE == darwin* ]]; then
        # This puts the term information into the title
        PSterminfo='\[\e]2;\u@\h: \w\a\]'
    fi
    PSparts[3]='(\d \T)\n'
    PSparts[2]='[\u@\h \W]'
    PSparts[1]='\$ '
    PScolors[2]='\[\e[34m\]' # Blue
    PScolors[3]='\[\e[35m\]' # Purple
    PScolors[4]='\[\e[36m\]' # Cyan
    PScolors[5]='\[\e[32m\]' # Green
    PScolors[6]='\[\e[33m\]' # Yellow
    PScolors[100]='\[\e[31m\]' # Bad
    PScolors[0]='\[\e[0m\]' # Reset
    if [[ $USER == root ]] ; then
        PScolors[1]='\[\e[31m\]' # Red
    elif [[ $SSH_CLIENT || $SSH_TTY || $SSH_CLIENT2 ]] ; then
        PScolors[1]="${PScolors[6]}" # yellow
        if [[ $HOSTNAME == marvin ]] ; then
            PScolors[1]="${PScolors[5]}" # green
        fi
    else
        PScolors[1]="${PScolors[0]}" # default
        unset PSparts[3] # no timestamp on local shells
    fi
    function bashrc_genps {
        if [ "$1" -a "${PScolors[$1]}" ] ; then
            PSgood="$PSterminfo${PSparts[3]}${PScolors[$1]}${PSparts[2]}${PScolors[0]}${PSparts[1]}"
        else
            PSgood="$PSterminfo${PSparts[3]}${PSparts[2]}${PSparts[1]}"
        fi
        PSbad="${PSgood%"${PSparts[1]}"}${PScolors[100]}${PSparts[1]}${PScolors[0]}"
        PS1="$PSgood"
    }
    bashrc_genps 1
    function safeprompt {
        export PS1='{\u@\h \W}\$ '
        unset PROMPT_COMMAND
    }
    alias stdprompt='bashrc_genps 1'
    alias blueprompt='bashrc_genps 2'
    alias purpleprompt='bashrc_genps 3'
    alias cyanprompt='bashrc_genps 4'
    alias greenprompt='bashrc_genps 5'
    alias whiteprompt='bashrc_genps'
    # this is executed before every prompt is displayed
    # it changes the prompt based on the preceding command
    export PROMPT_COMMAND='[ $? = 0 ] && PS1=$PSgood || PS1=$PSbad'
fi

Now I set up the various paths. Note that it doesn't matter if these paths don't exist; they'll be checked and ignored if they don't exist:

verify_path PATH
add_to_path PATH "/usr/local/sbin"
add_to_path PATH "/usr/local/teTeX/bin"
add_to_path PATH "/usr/X11R6/bin"
add_to_path PATH "$HOME/bin"
add_to_path_first PATH "/sbin"

add_to_path_first PATH "/bin"
add_to_path_first PATH "/usr/sbin"
add_to_path_first PATH "/opt/local/bin"
add_to_path_first PATH "/usr/local/bin"

if [[ $OSTYPE == darwin* ]] ; then
    add_to_path PATH "$HOME/.conf/darwincmds"

    # The XFILESEARCHPATH (for app-defaults and such)
    # is a wonky kind of path
    [ -d /opt/local/lib/X11/app-defaults/ ] && \
        add_to_path_force XFILESEARCHPATH /opt/local/lib/X11/%T/%N
    [ -d /sw/etc/app-defaults/ ] && \
        add_to_path_force XFILESEARCHPATH /sw/etc/%T/%N
    add_to_path_force XFILESEARCHPATH /private/etc/X11/%T/%N
fi

verify_path MANPATH
add_to_path MANPATH "/usr/man"
add_to_path MANPATH "/usr/share/man"
add_to_path MANPATH "/usr/X11R6/man"
add_to_path_first MANPATH "/opt/local/share/man"
add_to_path_first MANPATH "/opt/local/man"
add_to_path_first MANPATH "/usr/local/man"
add_to_path_first MANPATH "/usr/local/share/man"

verify_path INFOPATH
add_to_path INFOPATH "/usr/share/info"
add_to_path INFOPATH "/opt/local/share/info"

And now there are STILL MORE environment variables to set. This final group may rely on some of the previous paths being set (most notably, PATH).

export PAGER='less'
have vim && export EDITOR='vim' || export EDITOR='vi'
if [[ -z $DISPLAY && $OSTYPE == darwin* ]]; then
    processes=`ps ax`
    if [[ $processes == *xinit* || $processes == *quartz-wm* ]]; then
        export DISPLAY=:0
    else
        unset DISPLAY
    fi
    unset processes
fi
if [[ $HOSTNAME == wizard ]] ; then
    dprint Wizards X forwarding is broken
    unset DISPLAY
fi
export TZ="US/Central"
if [ "${BASH_VERSINFO[0]}" -le 2 ]; then
    export HISTCONTROL=ignoreboth
else
    export HISTCONTROL="ignorespace:erasedups"
fi
export HISTIGNORE="&:ls:[bf]g:exit"
export GLOBIGNORE=".:.."
export CVS_RSH=ssh
export BASH_ENV=$HOME/.bashrc
add_to_path_file MAILCAPS $HOME/.mailcap
add_to_path_file MAILCAPS /etc/mailcap
add_to_path_file MAILCAPS /usr/etc/mailcap
add_to_path_file MAILCAPS /usr/local/etc/mailcap
export EMAIL=''
export GPG_TTY=`tty`
export RSYNC_RSH="ssh -2 -c arcfour -o Compression=no -x"
if [ -d /opt/local/include -a -d /opt/local/lib ] ; then
    export CPPFLAGS="-I/opt/local/include $CPPFLAGS"
    export LDFLAGS="-L/opt/local/lib $LDFLAGS"
fi
if have glibtoolize ; then
    have libtoolize || export LIBTOOLIZE=glibtoolize
fi

One little detail that I rather like is the fact that xterm's window title often tells me exactly what user I am on what machine I am, particularly when I'm ssh'd into another host. This little bit of code ensures that this happens:

if [[ $TERM == xterm* || $OSTYPE == darwin* ]]; then
    export PROMPT_COMMAND='echo -ne "\033]0;${USER}@${HOSTNAME/.*/}: ${PWD/${HOME}/~}\007"'
fi

Character Set Detection

I typically work in a UTF-8 environment. MacOS X (my preferred platform for day-to-day stuff) has made this pretty easy with really excellent UTF-8 support, and Linux has come a long way (to parity, as far as I can tell) in the last few years. Most of my computing is done via a uxterm (aka. xterm with UTF-8 capability turned on), but I also occasionally dabble in other terminals (sometimes without realizing it). Despite the progress made, however, not all systems support UTF-8, and neither do all terminals. Some systems, including certain servers I've used, simply don't have UTF-8 support installed, even though they're quite capable of it.

The idea is that the LANG environment variable is supposed to reflect the language you want to use and character set your terminal can display. So, this is where I try and figure out what LANG should be.

The nifty xprop trick here is from a vim hint I found. I haven't used it for very long, but so far it seems to be a really slick way of finding out what sort of environment your term is running in, even if it hasn't set the right environment variables (e.g. LANG).
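In case you're curious what that query hands back, here's the shape of it, parsed the same way as below; the output line is canned, since there's no X server in a blog-sized example:

```shell
# Canned output of:
#   xprop -id $WINDOWID -f WM_LOCALE_NAME 8s ' $0' -notype WM_LOCALE_NAME
wmlocal_output='WM_LOCALE_NAME "en_US.UTF-8"'

# Word-split it, then let `eval echo` strip the surrounding quotes.
__bashrc__wmlocal=( $wmlocal_output )
LANG_guess=`eval echo ${__bashrc__wmlocal[1]}`
echo "$LANG_guess"    # en_US.UTF-8
```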

One of the more annoying details of this stuff is that ssh doesn't pass LANG (or any other locale information) along when you connect to a remote server. Granted, there are good reasons for this (just because my computer is happy when LANG=en_US.utf-8 doesn't mean any server I connect to would be), but at the same time, shouldn't the remote server be made aware of my local terminal's capabilities? Imagine if I connected to a server that defaults to Japanese, but I want it to know that I use English! Remember how I smuggled that information through in TERM and stuck it in the SSH_LANG variable? Here's where it becomes important.

I've also fiddled with different variations of this code to make it as compatible as possible. So far, this should work with Bash 2.05b and up... though that makes it slightly awkward-looking.

As a final note here, I discovered that less is capable of handling multibyte charsets (at least, recent versions of it are), but for whatever reason it doesn't always support LANG and other associated envariables. It DOES however support LESSCHARSET...

Anyway, here's the code:

if [[ -z $LANG ]] ; then
    dprint no LANG set
    if [[ $WINDOWID ]] && have xprop ; then
        dprint querying xprop
        __bashrc__wmlocal=(`xprop -id $WINDOWID -f WM_LOCALE_NAME 8s ' $0' -notype WM_LOCALE_NAME`)
        export LANG=`eval echo ${__bashrc__wmlocal[1]}`
        unset __bashrc__wmlocal
    elif [[ $OSTYPE == darwin* ]] ; then
        dprint "I'm on Darwin"
        if [[ ( $SSH_LANG && \
            ( $SSH_LANG == *.UTF* || $SSH_LANG == *.utf* ) || \
            $TERM_PROGRAM == Apple_Terminal ) && \
            -d "/usr/share/locale/en_US.UTF-8" ]] ; then
            export LANG='en_US.UTF-8'
        elif [ -d "/usr/share/locale/en_US" ] ; then
            export LANG='en_US'
        else
            export LANG=C
        fi
    elif [[ $TERM == linux || $TERM_PROGRAM == GLterm ]] ; then
        if [ -d "/usr/share/locale/en_US" ] ; then
            export LANG='en_US'
        else
            export LANG=C # last resort
        fi
    elif [[ $SSH_LANG == C ]] ; then
        export LANG=C
    elif have locale ; then
        dprint "checking locale from big list (A)"
        locales=`locale -a`
        locales="${locales//[[:space:]]/|}" # not +() because that's slow
        if [[ en_US.utf8 == @($locales) ]] ; then
            export LANG='en_US.utf8'
        elif [[ en_US.utf-8 == @($locales) ]] ; then
            export LANG='en_US.utf-8'
        elif [[ en_US.UTF8 == @($locales) ]] ; then
            export LANG='en_US.UTF8'
        elif [[ en_US.UTF-8 == @($locales) ]] ; then
            export LANG='en_US.UTF-8'
        elif [[ en_US == @($locales) ]] ; then
            export LANG='en_US'
        else
            export LANG=C
        fi
        unset locales
    fi
else
    dprint "- LANG IS ALREADY SET! ($LANG)"
    if [[ $SSH_LANG && $SSH_LANG != $LANG ]]; then
        if [[ $SSH_LANG == C ]] ; then
            export LANG=C
        elif have locale ; then
            dprint "checking locale from big list (B)"
            locales=`locale -a`
            locales="${locales//[[:space:]]/|}" # not +() because that's slow
            if [[ $SSH_LANG == @(${locales}) ]] ; then
                dprint "- SSH_LANG is a valid locale, resetting LANG"
                export LANG=$SSH_LANG
            else
                dprint "- SSH_LANG is NOT a valid locale"
                wantutf8=no
                if [[ $SSH_LANG == *.@(u|U)@(t|T)@(f|F)?(-)8 ]] ; then
                    wantutf8=yes
                    if [[ ! $LANG == *.@(u|U)@(t|T)@(f|F)?(-)8 ]] ; then
                        dprint "- want utf-8, but LANG is not utf8, unsetting"
                        unset LANG
                    fi
                else
                    dprint "- don't want utf-8"
                fi
                if [[ ! $LANG || ! $LANG == @($locales) ]] ; then
                    if [ "$wantutf8" = yes ] ; then
                        dprint "- finding a utf8 LANG"
                        if [[ en_US.utf8 == @($locales) ]] ; then
                            export LANG='en_US.utf8'
                        elif [[ en_US.utf-8 == @($locales) ]] ; then
                            export LANG='en_US.utf-8'
                        elif [[ en_US.UTF8 == @($locales) ]] ; then
                            export LANG='en_US.UTF8'
                        elif [[ en_US.UTF-8 == @($locales) ]] ; then
                            export LANG='en_US.UTF-8'
                        elif [[ en_US == @($locales) ]] ; then
                            export LANG='en_US'
                        else
                            export LANG=C
                        fi
                    else
                        dprint "- finding a basic LANG"
                        if [[ en_US == @($locales) ]] ; then
                            export LANG='en_US'
                        else
                            export LANG=C
                        fi
                    fi
                fi
                unset wantutf8
            fi
            unset locales
        fi
    else
        dprint "- ... without SSH_LANG, why mess with it?"
    fi
fi
dprint - LANG is $LANG
if [[ $LANG == *.@(u|U)@(t|T)@(f|F)?(-)8 ]] ; then
    export LESSCHARSET=utf-8
fi


Aliases

This is where a lot of the real action is, in terms of convenience settings. Like anyone who uses a computer every day, I type a lot; and if I can avoid it, so much the better. (I'm a lazy engineer.)

Sometimes I can't quite get what I want out of an alias. In csh aliases can specify what to do with their arguments. In bash, aliases are really more just shorthand — "pretend I really typed this" kind of stuff. Instead, if you want to be more creative with argument handling, you have to use functions (it's not a big deal, really). Here's a few functions I added just because they're occasionally handy to have the shell do for me:

function exec_cvim {
    /Applications/ -g "$@" &
}

function darwin_locate { mdfind "kMDItemDisplayName == '$@'wc"; }
if [[ $- == *i* && $OSTYPE == darwin* && ${OS_VER[0]} -ge 8 ]] ; then
    alias locate=darwin_locate
fi

function printargs { for F in "$@" ; do echo "$F" ; done ; }
function psq { ps ax | grep -i $@ | grep -v grep ; }
function printarray {
    for ((i=0;$i<`eval 'echo ${#'$1'[*]}'`;i++)) ; do
        echo $1"[$i]" = `eval 'echo ${'$1'['$i']}'`
    done
}
alias back='cd $OLDPWD'
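A minimal illustration of that alias-versus-function difference (greet and greet_fn are made-up names): an alias just pastes text, so arguments can only dangle off the end, while a function can put "$1" wherever it likes:

```shell
alias greet='echo hello'    # "greet world" expands to: echo hello world

function greet_fn { echo "hello, ${1:-anonymous}!" ; }

greet_fn world    # prints: hello, world!
greet_fn          # prints: hello, anonymous!
```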

There are often a lot of things that I just expect to work. For example, when I type "ls", I want it to print out the contents of the current directory. In color if possible, without if necessary. It often annoys me, on Solaris systems, when the working version of ls is buried in the path, while a really lame version is up in /bin for me to find first. Here's how I fix that problem:

# GNU ls check
if [[ $OSTYPE == darwin* ]]; then
    dprint "- DARWIN ls"
    alias ls='/bin/ls -FG'
    alias ll='/bin/ls -lhFG'
elif have colorls ; then
    dprint "- BSD colorls"
    alias ls='colorls -FG'
    alias ll='colorls -lhFG'
else
    __kbwbashrc__lsfound=no
    __kbwbashrc__lsarray=(`\type -ap ls`)
    for ((i=0;$i<${#__kbwbashrc__lsarray[*]};i=$i+1)) ; do
        if ${__kbwbashrc__lsarray[$i]} --version &>/dev/null ; then
            dprint "- found GNU ls: ${__kbwbashrc__lsarray[$i]}"
            alias ls="${__kbwbashrc__lsarray[$i]} --color -F"
            alias ll="${__kbwbashrc__lsarray[$i]} --color -F -lh"
            __kbwbashrc__lsfound=yes
            break
        fi
    done
    if [ "$__kbwbashrc__lsfound" == no ] ; then
        if ls -F &>/dev/null ; then
            dprint "- POSIX ls"
            alias ls='ls -F'
            alias ll='ls -lhF'
        else
            alias ll='ls -lh'
        fi
    fi
    unset __kbwbashrc__lsarray __kbwbashrc__lsfound
fi

Similar things are true of make and sed and such. I've gotten used to GNU's version, and if they exist on the machine I'd much rather automatically use them than have to figure out whether it's really called gnused or gsed or justtowasteyourtimesed all by myself:

if [[ $OSTYPE == linux* ]] ; then
    # actually, just Debian, but this works for now
    alias gv="gv --watch --antialias"
else
    alias gv="gv -watch -antialias"
fi
if have gsed ; then
    alias sed=gsed
elif have gnused ; then
    alias sed=gnused
fi
if have gmake ; then
    alias make=gmake
elif have gnumake ; then
    alias make=gnumake
fi

The rest of them are mostly boring, with one exception:

alias macfile="perl -e 'tr/\x0d/\x0a/'"
have tidy && alias tidy='tidy -m -c -i'
have vim && alias vi='vim'
alias vlock='vlock -a'
alias fastscp='scp -c arcfour -o Compression=no' # yay speed!
alias startx='nohup ssh-agent startx & exit'
alias whatlocale='printenv | grep ^LC_'
alias fixx='xauth generate $DISPLAY'
alias whatuses='fuser -v -n tcp'
alias which=type
alias ssh='env TERM="$TERM:$LANG" ssh'
have realpath || alias realpath=realpath_func
if have readlink ; then
    unset -f readlink_func
else
    alias readlink=readlink_func
fi
if [[ $OSTYPE == darwin* ]]; then
    alias top='top -R -F -ocpu -Otime'
    alias cvim='exec_cvim'
    alias gvim='exec_cvim'
fi

Did you note that ssh alias? Heh.

Tab-completion Options

Bash has had, for a little while at least, the ability to do custom tab-completion. This is really convenient (for example, when I've typed cvs commit and I hit tab, bash can know that I really just want to tab-complete files that have been changed). However, I won't bore you with a long list of all the handy tab-completions that are out there. Most of mine are just copied from here anyway. But I often operate in places where that big ol' bash-completion file can be in multiple places. Here's the simple little loop I use. You'll notice that it only does the loop after ensuring that bash is of recent-enough vintage:

completion_options=(/etc/bash_completion /usr/local/etc/bash_completion \
                    "$HOME/.bash_completion")  # candidate locations; adjust to taste
if [[ $BASH_VERSION && -z $BASH_COMPLETION && $- == *i* ]] ; then
    bash=${BASH_VERSION%.*}; bmajor=${bash%.*}; bminor=${bash#*.}
    if [ $bmajor -eq 2 -a $bminor '>' 04 ] || [ $bmajor -gt 2 ] ; then
        for bc in "${completion_options[@]}" ; do
            if [[ -r $bc ]] ; then
                dprint Loading the bash_completion file
                export COMP_CVS_ENTRIES=yes
                source "$bc"
                [ "$BASH_COMPLETION" ] && break
            fi
        done
    fi
    unset bash bminor bmajor
fi
unset completion_options

Machine-local settings

You'd be surprised how useful this can be sometimes. Sometimes I need machine-specific settings. For example, on some machines there's a PGI compiler I want to use, and maybe it needs some environment variable set. Rather than put it in the main bashrc, I just put that stuff into ~/.bashrc.local and have it loaded:

dprint checking for bashrc.local in $HOME
if [ -r "${HOME}/.bashrc.local" ]; then
    dprint Loading local bashrc
    source "${HOME}/.bashrc.local"
fi


Lastly, it is sometimes the case that the TMOUT variable has been set, either by myself, or by a sysadmin who doesn't like idle users (on a popular system, too many idle users can unnecessarily run you out of ssh sockets, for example). In any case, when my time is limited, I like being aware of how much time I have left. So I have my bashrc detect the TMOUT variable and print out a big banner so that I know what's up and how much time I have. Note that bash can do simple math all by itself with the $(( )) construction. Heheh. Anyway:

if [[ $TMOUT && "$-" == *i* ]]; then
    days=$((TMOUT/86400))
    hours=$(((TMOUT%86400)/3600))
    minutes=$(((TMOUT%3600)/60))
    seconds=$((TMOUT%60))
    echo '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'
    echo You will be autologged out after:
    echo -e -n '\t'
    [[ $days != 0 ]] && echo -n "$days days "
    [[ $hours != 0 ]] && echo -n "$hours hours "
    [[ $minutes != 0 ]] && echo -n "$minutes minutes "
    [[ $seconds != 0 ]] && echo -n "$seconds seconds "
    echo ... of being idle.
    unset days hours minutes seconds
fi


While I'm at it, I suppose I should point out that I also have a ~/.bash_logout file that's got some niceness to it. If it's the last shell, it clears sudo's cache, empties the console's scrollback buffer, and clears the screen. Note: DO NOT PUT THIS IN YOUR BASHRC. You wouldn't like it in there.

if [ "$SHLVL" -eq 1 ] ; then
    sudo -k
    type -P clear_console &>/dev/null && clear_console 2>/dev/null
    clear
fi

And that's about it! Of course, I'm sure I'll add little details here and there and this blog entry will become outdated. But hopefully someone finds my bashrc useful. I know I've put a lot of time and effort into it. :)

April 8, 2008

Leopard - Finally!

So, I upgraded to MacOS 10.5 recently (from 10.4). Those of you who know me will doubtless be thinking “my god, man, what took so long?!?”, and that’s a longer story than I want to get into right now. Suffice to say: we’re rocking and rolling now!

My impressions of the new OS are pretty favorable. I’ve read all the complaints about the UI changes, and they have some merit. By the time I upgraded, Apple had already released 10.5.2, which addressed many of the more unfortunate problems for people like me who put /Applications into the Dock.

I really DO like the “Fan” icon display, though, particularly for the new “Downloads” folder. Creating a folder just for downloads is something I could have done years ago, of course, but I hadn’t - everything downloaded to the Desktop, which inevitably became incredibly cluttered. But I love the new approach, and part of what makes it especially useful is that things in the “Fan” display can be dragged to the trash. HA! I love it! It’s the little things that make me happy. :)

The new X11 is a bit of a pain in the butt. I’d become very used to using xterm - or more precisely, uxterm - for all my terminal needs (which is to say, for 90% of what I do with my computer). That’s not so tenable now, particularly since Apple has apparently decided that uxterm was just too useful a shell script to let stand. I am keeping a copy of that shell script (which just runs xterm with all the necessary utf-8 flags and sets the LANG appropriately) handy, just in case, but for the time being I’ve decided to migrate to Apple’s Terminal full time. Undoubtedly, it’s still not as fast as uxterm, but since getting an Intel iMac, I don’t really notice anymore (on the old dual 500Mhz G4, it was definitely noticeable).

For migrating, I’ve had to create my own nsterm-16color terminfo file (which I keep in ~/.terminfo/n/nsterm-16color ) in order to ensure that all the features I want work properly. I stole the file from ncurses 5.6, and modified it to add correct dual-mode swapping ( smcup=\E7\E[?47h, rmcup=\E[2J\E[?47l\E8 ) and then to support the home and end keys ( khome=\E[H, kend=\E[F ). These are things that the native OSX dtterm/xterm/xterm-color/whatever terminfo settings don’t do correctly. ( WHY???) …And then, of course, I had to fix the key mapping of pageup/shift-pageup and pagedown/shift-pagedown and all the related keys, but that was easy to do in Terminal.app’s preferences. The defaults are sensible, just not for folks who are used to xterm’s behavior. I also re-discovered that I hate Terminal.app’s default blue (a dark, almost-midnight blue), and much prefer having a lighter one. Thankfully I’m not the only one - Ciarán Walsh’s update to the TerminalColors plugin is solid and works well.

Other than that, things have been pretty smooth. I haven’t experienced any really strange compatibility problems — in large part, I think, because I keep my system pretty up-to-date, so I already had the “Leopard-compatible” versions of all the software I use (and all the Unix applications seem to work flawlessly without even needing a recompile - huzzah for that!).

The one application that needed SERIOUS fiddling is VirtualBox. They have an OSX version, but only in beta form. I use it mostly so I can provide sensible Windows XP support to relatives who have computer questions (and for doing browser compatibility tests). I had been using Beta 2 (1.4.6), which had worked flawlessly for my needs. Unfortunately, Beta 2 isn’t compatible with Leopard, so an upgrade to the latest (Beta 3) was necessary. THIS beta seems to have a few problems. For one thing, it can’t understand all the old machine definitions (so when upgrading, make sure you don’t have any important system snapshots or saved machine state that you need). However, it does understand the old disk files, so it’s a simple matter to create a new machine definition using the old disk. The new machine still won’t BOOT, though, and it took me an hour or so of fiddling to figure out how to fix it.

There are two major problems that crop up. First: they changed the default IDE controller for Windows XP guests. The old default was PIIX3; the new default is PIIX4. Either one will work, and if you install XP from scratch on a newly created XP machine, it will work with the PIIX4 controller just fine. But if you’re booting from an XP that was created with Beta 2 (i.e. a Windows XP installation that thinks you have a PIIX3 controller), it will blue-screen and reboot immediately after displaying the Microsoft logo: not good. Fixing it is easy, though: just change the IDE controller for your XP machine in the machine settings dialog.

The second problem is that the network doesn’t work. Actually, that’s not true: the network works just fine, it’s DNS resolution that doesn’t work (but one looks a lot like the other when you’re not paying close attention to error messages). For whatever reason, when your XP system uses DHCP to get its network information, the information it receives from VirtualBox is wrong. Specifically, VirtualBox tells the guest to resolve DNS names by contacting the wrong address; it should be pointing at the same address as the router. Fixing this was just a matter of changing Windows’ network configuration to use a custom DNS server (the router’s address) rather than the one supplied by DHCP. Annoying, but nothing terrible.

The only other stumbling block in Leopard that I’ve come across is the iChat-vs-Internet-Sharing problem that other people have discovered. Essentially, if you have enabled Internet Sharing, iChat can’t do video conferencing. Something to do with being able to remap ports… the explanations I’ve read are rather vague. It’s not especially important to me, but came up when I was trying to demonstrate the virtues of Leopard to Emily.

Which reminds me: the new iChat is MUCH better for talking to multiple people at the same time. The “tabbed” chatting interface is terrific. The vaunted “Spaces” (virtual desktops) are nice, and implemented well, but I gotta say that I’ve gotten used to having just one desktop these days (I use Exposé a lot). Getting used to having the extra desktops will probably take a while.

Two more features I noticed were the Quick View (in Finder, press the space bar to quickly view something) and Web Clips (in Safari, you can take a snippet of a webpage and turn it into a Dashboard widget). Quick View is pretty great, especially for folders full of PDFs, because you can leave it up and keep navigating around the Finder (the contents of the Quick View window will track whatever you select in the Finder), but since I don’t spend much time in the Finder, it’s of limited use. If I could integrate it with my ~/.mailcap file, now THAT would be awesome. Web Clips are not quite as great as they could be. For one thing, they don’t refresh quickly (but they DO refresh—at first I didn’t think they did—and in the worst case, you can click on them and press Ctrl-R to force the issue), and for another, they can’t scale — many of the things I want to clip are large graphics that I wish to monitor. If OSX could scale clips down for me, that would make them much more useful.

Which reminds me — one new feature of Leopard that I adore is their new built-in VNC viewer. It may not actually be VNC, but that’s fine by me — it’s blazing fast, and best of all, it scales the screen down so that you can easily control a screen that’s larger than the one you have. Chicken of the VNC used to be a must-have application for me, but Leopard’s built-in screen viewer is much better for what I usually want to do (which is control the iMac upstairs from the laptop down on the couch).

April 24, 2008

YAASI: Yet Another Anti-Spam Idea

Branden and I had an idea to help with the spam problem on our system, and it’s proven particularly effective. How effective? Here are the graphs from the last year of email on my system. Can you tell when I started using the system?

If you want to see the live images, check here.

The idea is based on the following observations: certain addresses on my domain ONLY get spam. This is generally because they either don’t exist or because I stopped using them; for example, spammers often send email to addresses that haven’t been valid for years. Branden and I also both use the user-tag@domain scheme, so we get a lot of disposable addresses that way. These addresses are such that we know for certain that anyone sending email to them is a spammer. Some of these addresses were already being rejected as invalid; some we hadn’t gotten around to invalidating yet.

By simply rejecting emails sent to those addresses, we were able to reduce the spam load on our domains by a fair bit, and the false-positive rate is nil. But we took things a step further, because spammers rarely send only one message: often they will send spam to both invalid AND valid addresses.
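The recipient check itself is trivial. Here's a minimal sketch of the decision in C; the address list and names are invented for illustration, not the actual list running on our server:

```c
#include <string.h>

/* Hypothetical honeypot list: addresses that only ever receive spam. */
static const char *const known_bad[] = {
    "old-account@example.org",
    "never-existed@example.org",
};

/* Returns 1 if this recipient is a honeypot (reject the message and
 * jail the connecting IP); 0 means deliver normally. */
static int is_honeypot(const char *rcpt)
{
    for (size_t i = 0; i < sizeof(known_bad) / sizeof(known_bad[0]); i++)
        if (strcmp(rcpt, known_bad[i]) == 0)
            return 1;
    return 0;
}
```

In practice this check would run at SMTP RCPT time, so the rejection happens before the message body is ever accepted.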

If I view those known-bad addresses as, essentially, honeypots, I can say: aha! Any IP sending to a known-bad address is a spammer, and I can refuse (with a permanent fail) any email from that IP for some short time. I started with 5 minutes, but have moved to an exponentially increasing timeout system: each additional spam increases the length of the timeout (5 minutes for the first spam, 6 for the second, 8 for the third, and so on). Longer-term bans, which result from the exponentially increasing timeout, are made more efficient via the equivalent of /etc/hosts.deny. I haven’t gotten into maintaining my spammer database much yet, but I think this may not be terribly important (I’ll explain in a moment).
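The schedule isn't anything fancy. Here's one formula that matches the 5/6/8 progression above (this exact function is my illustration, not necessarily the code on the server):

```c
/* Hypothetical jail-time schedule: 5 minutes for the first offense,
 * with the increment over a 4-minute base doubling each time
 * (5, 6, 8, 12, 20, ...).  spam_count is the number of honeypot
 * hits seen from a given IP.  (Counts past ~30 would overflow the
 * shift; a real implementation would cap the ban length.) */
static unsigned int jail_minutes(unsigned int spam_count)
{
    if (spam_count == 0)
        return 0;                      /* never offended: no jail */
    return 4u + (1u << (spam_count - 1));
}
```

Any exponential works for the purpose; the point is just that one-off offenders get a slap on the wrist while repeat offenders end up banned for days.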

One of the best parts of the system is that it is fast: new spammers that identify themselves by sending to honeypot addresses get blocked quickly and without my intervention. So far this has been particularly helpful in eliminating spam spikes. Another feature that I originally thought would be useful, but hasn’t really appeared to be (yet) is that it allows our multiple domains to share information about spam sources. Thus far, however, our domains seem to be plagued by different spammers.

Now, interestingly, about a week after we started using the system, our database of known spammers was wiped out (it’s kept in /tmp, and we rebooted the system). Result? No noticeable change in effectiveness. How’s that for a result? And, as you can see from the graph above, there’s no obvious change in spam blocking over the course of a month that would indicate that the long-term history is particularly useful, so it may be sufficient to keep a much shorter history. Maybe only a week is necessary, maybe two weeks; I haven’t decided yet (and, as there hasn’t yet been much of a speed penalty, there’s no pressure to establish a cutoff). Given that most spam is sent from botnets with dynamic IPs, this isn’t particularly surprising behavior. Our two domains have been using this filter for a month so far. The week before we started using it, the busier domain averaged around 262 emails per hour; the week after instituting the filter, the average was around 96 per hour (a 60+% reduction!). The other domain averaged 70 emails per hour before the filter; since we started using it, that number is down to 27.4 per hour (also a 60+% reduction). We have recorded spams from over 33,000 IPs, most of which only ever sent one or two spams. We typically have between 100 and 150 IPs that are “in jail” at any one time (at this moment: 143), and most of those (at this moment 134) are blocked for sending more than ten spams (114 of them have a timeout measured in days rather than minutes).

Now, granted, I know that by simply dropping 60% of all connections we’d get approximately the same results. But I think our particular technique is superior to that because it’s based on known-bad addresses. Anyone who doesn’t send to invalid addresses will never notice the filter.

The biggest potential problem that I can see with this system is that of spammers who have taken over a normally friendly host, such as Gmail spam. I’ve waffled on this potential problem: on the one hand, Gmail has so many outbound servers that it’s unlikely to get caught (a couple bad emails won’t have much of a penalty). Thus far, I’ve seen a few yahoo servers in Japan sending us spam, but no Gmail servers. On the other hand, as long as I simply use temporary failures (at least for good addresses), and as long as ND doesn’t retry in the same order every time, messages will get through.

I’ve also begun testing a “restricted sender” feature to work with this. For example, I have an address that I use exclusively for a single online account; the only senders allowed to use that address are the ones from that service’s own domain (i.e. in case I forget my password). If anyone from any other domain attempts that address, well, then I know the sending IP is a spammer and I can treat it as if it were a known-bad address. Not applicable to every email address, obviously, but it’s a start.

It’s been pointed out that this system is, in some respects, a variant on greylisting. The major difference is that it’s a penalty-based system, rather than a “prove yourself worthy by following the RFC” system, and I like that a bit better. I’m somewhat tempted to define some bogus honeypot address and sign it up for spam (via some unscrupulous mailing list or similar), but given that part of the benefit here is due to spammers trying both valid and invalid addresses, I think it would probably just generate lots of extra traffic and not achieve anything particularly useful.

Now, this technique is simply one of many; it’s not sufficient to guarantee a spam-free inbox. I use it in combination with several other antispam techniques, including a greet-delay system and a frequently updated SpamAssassin setup. But check out the difference it’s made in our CPU utilization:

Okay, so, grand scheme of things: knocking the CPU use down three percentage points isn’t huge, but knocking it down by 50%? That sounds better, anyway. And as long as it doesn’t cause problems by making valid email disappear (possible, but rather unlikely), it seems to me to be a great way to cut my spam load relatively easily.

June 27, 2008

I Hate Procmail

Its error handling is CRAP.

I am coming to this realization because I recently lost a BUNCH of messages because of a bad delivery path (I told procmail to pipe messages to a non-existent executable). So what did procmail do? According to its log:

/bin/sh: /tmp/dovecot11/libexec/dovecot/deliver: No such file or directory
procmail: Error while writing to "/tmp/dovecot11/libexec/dovecot/deliver"

Well, sure, that’s to be expected, right? So what happened to the email? VANISHED. Into the bloody ether.

Of course, determining that the message vanished is trickier than just saying “hey, it’s not in my mailbox.” Oh no, there’s a “feature”, called ORGMAIL. What is this? According to the procmailrc documentation (*that* collection of wisdom):

ORGMAIL     Usually the system  mailbox  (ORiGinal  MAIL‐
            box).   If,  for  some  obscure  reason (like
            ‘filesystem full’)  the  mail  could  not  be
            delivered, then this mailbox will be the last
            resort.  If procmail fails to save  the  mail
            in  here  (deep,  deep  trouble :-), then the
            mail will bounce back to the sender.

And so where is THAT? Why, /var/mail/$LOGNAME of course, where else? And if LOGNAME isn’t set for some reason? Or what if ORGMAIL is unset? Oh, well… nuts to you! Procmail will use $SENDMAIL to BOUNCE THE EMAIL rather than just try again later. That’s what they mean by “deep, deep trouble.” Notice the smiley face? Here’s why the manual has a smiley-face in it: to mock your pain.

But here’s the real crux of it: procmail doesn’t see delivery errors as FATAL. If one delivery instruction fails, it’ll just keep going through the procmailrc, looking for anything else that might match. In other words, the logic of your procmailrc has to take into account the fact that sometimes mail delivery can fail. If you fail to do this, your mail CAN end up in RANDOM LOCATIONS, depending on how messages that were supposed to match earlier rules fare against later rules.

If you want “first failure bail” behavior (which makes the most sense, in my mind), you have to add an extra rule after EVERY delivery instruction. For example:

:0 H
* ^From: .*fred@there\.com
| $HOME/bin/deliver-fred # (or whatever your delivery program is)

:0 e # handle failure
{
    EXITCODE=75 # set a non-zero exit code
    HOST # This causes procmail to stop, obviously
}

You agree that HOST means “stop processing and exit”, right? Obviously. That’s procmail for you. Note that that second clause has gotta go after EVERY delivery instruction. I hope you enjoy copy-and-paste.

Another way to handle errors, since successful delivery does stop procmail, is to add something like that to the end of your procmailrc, like so:

:0: # catch-all default delivery
$DEFAULT

# If we get this far, there must have been an error
EXITCODE=75
HOST
Of course, you could also send the mail to /dev/null at that point, but unsetting the HOST variable (which is what listing it does) does the same thing faster. Intuitive, right? Here’s my smiley-face:


August 18, 2008

More Compiler Complaints: Sparc Edition

Unlike my previous whining about compilers, this one I have no explanation for. It’s not me specifying things incorrectly, it’s just the compiler being broken.

So, here’s the goal: atomically increment a variable. On a Sparc (specifically, SparcV9), the function looks something like this:

static inline int atomic_inc(int * operand)
{
    register uint32_t oldval, newval;
    newval = *operand;
    do {
        oldval = newval;
        __asm__ __volatile__ ("cas [%1], %2, %0"
            : "=&r" (newval)
            : "r" (operand), "r"(oldval)
            : "cc", "memory");
    } while (oldval != newval);
    return oldval+1;
}
Seems trivial, right? We use the CAS instruction (compare and swap). Conveniently, whenever the comparison fails, it stores the value of *operand in the second register (i.e. %0 aka newval), so there are no extraneous memory operations in this little loop. Right? Right. Does it work? NO.

Let’s take a look at the assembly that the compiler (gcc) generates with -O2 optimization:

save    %sp, -0x60, %sp
ld      [%i0], %i5      /* newval = *operand; */
mov     %i0, %o1        /* operand is copied into %o1 */
mov     %i5, %o2        /* oldval = newval; */
cas     [%o1], %o2, %o0 /* o1 = operand, o2 = newval, o0 = ? */
restore %i5, 0x1, %o0

Say what? Does that have ANYTHING to do with what I told it? Nope! %o0 is never even initialized, but somehow it gets used anyway! What about the increment? Nope! It was optimized out, apparently (which, in fairness, is probably because we didn’t explicitly list it as an input). Of course, gcc is awful, you say! Use SUN’s compiler! Sorry, it produces the exact same output.

But let’s be a bit more explicit about the fact that the newval register is an input to the assembly block:

static inline int atomic_inc(int * operand)
{
    register uint32_t oldval, newval;
    newval = *operand;
    do {
        oldval = newval;
        __asm__ __volatile__ ("cas [%1], %2, %0"
            : "=&r" (newval)
            : "r" (operand), "r"(oldval), "0"(newval)
            : "cc", "memory");
    } while (oldval != newval);
    return oldval+1;
}

Now, Sun’s compiler complains: warning: parameter in inline asm statement unused: %3. Well gosh, isn’t that useful; way to recognize the fact that "0" ties that input to output %0! But at least gcc leaves the add operation in:

save    %sp, -0x60, %sp
ld      [%i0], %i5      /* oldval = *operand; */
mov     %i0, %o1        /* operand is copied to %o1 */
add     %i5, 0x1, %o0   /* newval = oldval + 1; */
mov     %i5, %o2        /* oldval is copied to %o2 */
cas     [%o1], %o2, %o0
restore %i5, 0x1, %o0

Yay! The increment made it in there, and %o0 is now initialized to something! But what happened to the do{ }while() loop? Sorry, that was optimized away, because gcc doesn’t recognize that newval can change values, despite the fact that it’s listed as an output!

Sun’s compiler will at least leave the while loop in, but will often use the WRONG REGISTER for comparison (such as %i2 instead of %o0).

But check out this minor change:

static inline int atomic_inc(int * operand)
{
    register uint32_t oldval, newval;
    do {
        newval = *operand;
        oldval = newval;
        __asm__ __volatile__ ("cas [%1], %2, %0"
            : "=&r" (newval)
            : "r" (operand), "r"(oldval), "0"(newval)
            : "cc", "memory");
    } while (oldval != newval);
    return oldval+1;
}

See the difference? Rather than using the output of the cas instruction (newval), we’re throwing it away and re-reading *operand no matter what. And guess what suddenly happens:

save     %sp, -0x60, %sp
ld       [%i0], %i5           /* oldval = *operand; */
add      %i5, 0x1, %o0        /* newval = oldval + 1; */
mov      %i0, %o1             /* operand is copied to %o1 */
mov      %i5, %o2             /* oldval is copied to %o2 */
cas      [%o1], %o2, %o0
cmp      %i5, %o0             /* if (oldval != newval) */
bne,a,pt %icc, atomic_inc+0x8 /* then go back and try again */
ld       [%i0], %i5
restore  %i5, 0x1, %o0

AHA! The while loop returns! And best of all, both GCC and Sun’s compiler suddenly, magically, (and best of all, consistently) use the correct registers for the loop comparison! It’s amazing! For some reason this change reminds the compilers that newval is an output!

It’s completely idiotic. So, we can get it to work… but we have to be inefficient in order to do it, because otherwise (inexplicably) the compiler refuses to acknowledge that our output register can change.

In case you’re curious, the gcc version is:
sparc-sun-solaris2.10-gcc (GCC) 4.0.4 (gccfss)
and the Sun compiler is:
cc: Sun C 5.9 SunOS_sparc 2007/05/03

June 10, 2009

More Compiler Complaints: PGI Edition

Continuing my series of pointless complaining about compiler behavior (see here and here for the previous entries), I recently downloaded a trial version of PGI’s compiler to put in my Linux virtual machine to see how that does compiling qthreads. There were a few minor things that it choked on that I could correct pretty easily, and some real bizarre behavior that seems completely broken to me.

Subtle Bugs in My Code

Let’s start with the minor mistakes it found in my code that other compilers hadn’t complained about:

static inline uint64_t qthread_incr64(
           volatile uint64_t *operand, const int incr)
{
  union {
    uint64_t i;
    struct {
      uint32_t l, h;
    } s;
  } oldval, newval;
  register char test;
  do {
    oldval.i = *operand;
    newval.i = oldval.i + incr;
    __asm__ __volatile__ ("lock; cmpxchg8b %1\n\t setne %0"
        : "=r" (test), "+m" (*operand)
        : "a" (oldval.s.l), "d" (oldval.s.h),
          "b" (newval.s.l), "c" (newval.s.h)
        : "memory");
  } while (test);
  return oldval.i;
}

Seems fairly straightforward, right? Works fine on most compilers, but the PGI compiler complains that “%sli” is an invalid register. Really obvious error, right? Right? (I don’t really know what the %sli register is for either). Turns out that because setne requires a byte-sized register, I need to tell the compiler that I want a register that can be byte-sized. In other words, that "=r" needs to become "=q". Fair enough. It’s a confusing error, and thus annoying, but I am technically wrong (or at least I’m providing an incomplete description of my requirements) here so I concede the ground to PGI.

Unnecessary Pedantry

And then there are places where PGI is simply a bit more pedantic than it really needs to be. For example, it generates an error when you implicitly cast a volatile struct foo * into a void * when calling into a function. Okay, yes, the pointers are different, but… most compilers allow you to implicitly convert just about any pointer type into a void * without kvetching, because you aren’t allowed to dereference a void pointer (unless you cast again, and if you’re casting, all bets are off anyway), thus it’s a safe bet that you want to work on the pointer rather than what it points to. Yes, technically PGI has made a valid observation, but I disagree that their observation rises to the level of “warning-worthy” (I have no argument if they demote it to the sort of thing that shows up with the -Minform=inform flag).
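For the curious, here's the shape of code PGI flags; the names are invented for illustration. (Strictly speaking, the implicit conversion below also discards the volatile qualifier, so a diagnostic is defensible by the letter of the standard; the complaint is about rating it warning-worthy rather than informational.)

```c
struct foo { int x; };

/* A function with a generic void * parameter, as many C APIs have. */
static int first_field(void *p)
{
    return ((struct foo *)p)->x;
}

static int demo(void)
{
    volatile struct foo f = { 42 };
    /* Implicit volatile struct foo * -> void * conversion: this is the
     * call PGI warns about; most compilers accept it quietly or with a
     * dropped-qualifier note at most. */
    return first_field((void *)&f);
}
```

An explicit `(void *)` cast, as shown, silences everyone; the disagreement is only over how loudly the implicit form deserves to be flagged.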

Flat-out Broken

But there are other places where PGI is simply wrong/broken. For example, if I have (and use) a #define like this:

#define PARALLEL_FUNC(initials, type, shorttype, category) \
type qt_##shorttype##_##category (type *array, size_t length, int checkfeb) \
{ \
  struct qt##initials arg = { array, checkfeb }; \
  type ret; \
  qt_realfunc(0, length, sizeof(type), &ret, \
    qt##initials##_worker, \
    &arg, qt##initials##_acc, 0); \
  return ret; \
}
PARALLEL_FUNC(uis, aligned_t, uint, sum);

PGI will die! Specifically, it complains that struct qtuisarg does not exist, and that an identifier is missing. In other words, it blows away the whitespace following initials so that this line:

struct qt##initials arg = { array, checkfeb }; \

is interpreted as if it looked like this:

struct qt##initials##arg = { array, checkfeb }; \

But at least that’s easy to work around: rename the struct so that it has a _s at the end! Apparently PGI is okay with this:

struct qt##initials##_s arg = { array, checkfeb }; \

::sigh:: Stupid, stupid compiler. At least it can be worked around.
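To see what a conforming preprocessor does with that macro shape, here's a stripped-down version (names invented) that compiles fine under gcc and clang: the pasted token and the following identifier stay separate, exactly as the whitespace dictates.

```c
/* qt##initials pastes into qtuis; the space before 'arg' must survive,
 * so this declares 'struct qtuis arg' -- the expansion PGI got wrong. */
#define DECLARE_ARG(initials) struct qt##initials arg = { 42 }

struct qtuis { int x; };

static int paste_demo(void)
{
    DECLARE_ARG(uis);   /* expands to: struct qtuis arg = { 42 }; */
    return arg.x;
}
```

Token pasting is defined to operate on the two adjacent preprocessing tokens only; the whitespace after the pasted result is not the preprocessor's to delete.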

Thwarting The Debugger

PGI is also bad at handling static inline functions in headers. How bad? Well, first of all, the DWARF2 symbols it generates (the default) are incorrect. It gets the line numbers right but the file name wrong. For example, if I have an inline function in qthread_atomics.h on line 75, and include that header in qt_mpool.c, and then use that function on line 302, the DWARF2 symbols generated will claim that the function is on line 75 of qt_mpool.c (which isn’t even correct if we assume that it’s generating DWARF2 symbols based on the pre-processed source! and besides which, all the other line numbers are from non-pre-processed source). You CAN tell it to generate DWARF1 or DWARF3 symbols, but then it simply leaves out the line numbers and file names completely. Handy, no?

Everyone Else is Doing It…

Here’s another bug in PGI… though I suppose it’s my fault for outsmarting myself. So, once upon a time, I (think I) found that some compilers require assembly memory references to be within parentheses, while others require them to be within brackets. Unfortunately I didn’t write down which ones did what, so I don’t remember if I was merely being over-cautious in my code, or if it really was a compatibility problem. Nevertheless, I frequently do things like this:

static inline uint32_t atomic_incr(volatile uint32_t *op, const int incr) {
  uint32_t retval = incr;
  __asm__ __volatile__ ("lock; xaddl %0, %1"
    :"=r"(retval)
    :"m"(*op), "0"(retval)
    :"memory");
  return retval;
}

Note that weird "m"(*op) construction? That was my way of ensuring that the right memory reference syntax was automatically used, no matter what the compiler thought it was. So, what does PGI do in this instance? It actually performs the dereference! In other words, it behaves as if I had written:

static inline uint32_t atomic_incr(volatile uint32_t *op, const int incr) {
  uint32_t retval = incr;
  __asm__ __volatile__ ("lock; xaddl %0, (%1)"
    :"=r"(retval)
    :"r"(*op), "0"(retval)
    :"memory");
  return retval;
}

when what I really wanted was:

static inline uint32_t atomic_incr(volatile uint32_t *op, const int incr) {
  uint32_t retval = incr;
  __asm__ __volatile__ ("lock; xaddl %0, (%1)"
    :"=r"(retval)
    :"r"(op), "0"(retval)
    :"memory");
  return retval;
}

See the difference? <sigh> Again, it’s not hard to fix so that PGI does the right thing. And maybe I was being too clever in the first place. But dagnabit, my trick should work! And, more pointedly, it DOES work on other compilers (gcc and icc at the bare minimum, and I’ve tested similar things with xlc).

November 13, 2009


Once upon a time, in 2002, the BSD folks had this genius plan: make the standard C qsort() function safe to use in reentrant code by creating qsort_r() and adding an argument (a pointer to pass to the comparison function). So they did, and it was good.

Five years later, in 2007, the GNU libc folks said to themselves “dang, those BSD guys are smart, I wish we had qsort_r()”. Then some idiot said: WAIT! We cannot simply use the same prototype as the BSD folks; they use an evil license! We can’t put that into GPL’d code! So the GNU libc folks solved the problem by reordering the arguments.

And now we have two, incompatible, widely published versions of qsort_r(), which both do the exact same thing: crash horribly if you use the wrong argument order.


Okay, here’s an alternate history:

… Then some lazy idiot said: WAIT! The existing qsort_r() is a poor design that requires a second implementation of qsort()! If we throw out compatibility with existing qsort_r() code, we can implement qsort() as a call to qsort_r() and no one will ever know!


Either way, we all lose.

(I have no argument with the alternate history point… but why’d you have to call it the exact same thing??? Call it qsort_gnu() or something! Make it easy to detect the difference!)

June 16, 2010

PGI Compiler Bug

I ran across another PGI compiler bug that bears noting because it was so annoying to track down. Here’s the code:

static inline uint64_t qthread_cas64(
           volatile uint64_t *operand,
           const uint64_t newval,
           const uint64_t oldval)
{
    uint64_t retval;
    __asm__ __volatile__ ("lock; cmpxchg %1,(%2)"
        : "=&a"(retval) /* store from RAX */
        : "r"(newval),
          "r"(operand),
          "a"(oldval) /* load into RAX */
        : "cc", "memory");
    return retval;
}
Now, both GCC and the Intel compiler will produce code you would expect; something like this:

mov 0xffffffffffffffe0(%rbp),%r12
mov 0xffffffffffffffe8(%rbp),%r13
mov 0xfffffffffffffff0(%rbp),%rax
lock cmpxchg %r12,0x0(%r13)
mov %rax,0xfffffffffffffff8(%rbp)

In essence, that’s:

  1. copy the newval into register %r12 (almost any register is fine)
  2. copy the operand into register %r13 (almost any register is fine)
  3. copy the oldval into register %rax (as I specified with “a”)
  4. execute the ASM I wrote (the compare-and-swap)
  5. copy register %rax to the variable I specified

Here’s what PGI produces instead:

mov 0xffffffffffffffe0(%rbp),%r12
mov 0xffffffffffffffe8(%rbp),%r13
mov 0xfffffffffffffff0(%rbp),%rax
lock cmpxchg %r12,0x0(%r13)
mov %eax,0xfffffffffffffff8(%rbp)

You notice the problem? That last step became %eax, so only the lower 32-bits of my 64-bit CAS get returned!

The workaround is to do something stupid: be more explicit. Like so:

static inline uint64_t qthread_cas64(
           volatile uint64_t *operand,
           const uint64_t newval,
           const uint64_t oldval)
{
    uint64_t retval;
    __asm__ __volatile__ ("lock; cmpxchg %1,(%2)\n\t"
            "mov %%rax,(%0)"
        :
        : "r"(&retval), /* store from RAX */
          "r"(newval),
          "r"(operand),
          "a"(oldval) /* load into RAX */
        : "cc", "memory");
    return retval;
}

This is stupid because it requires an extra register; it becomes this:

mov 0xfffffffffffffff8(%rbp),%rbx
mov 0xffffffffffffffe0(%rbp),%r12
mov 0xffffffffffffffe8(%rbp),%r13
mov 0xfffffffffffffff0(%rbp),%rax
lock cmpxchg %r12,0x0(%r13)
mov %rax,(%rbx)

Obviously, not a killer (since it can be worked around), but annoying nevertheless.

A similar error happens in this code:

uint64_t retval;
__asm__ __volatile__ ("lock xaddq %0, (%1)"
    :"+r" (retval)
    :"r" (operand)
    :"memory");
It would appear that PGI completely ignores the bitwidth of output data!

January 10, 2011

Gmail, DKIM, and DomainKeys

I recently spent a bunch of time trying to resolve some delivery problems we had with Gmail. Some of it was dealing with idiosyncratic issues associated with our mail system, and some of it, well, might benefit others.

In our mail system, we use qmail-qfilter and some custom scripts to manipulate incoming mail, along with a custom shell script I wrote to manipulate outbound mail. Inbound mail, prior to this, was prepended with three new headers: DomainKey-Status, DKIM-Status (and friends), and X-Originating-IP. Outbound mail was signed with both a DomainKey and a DKIM signature. All of my DomainKey-based manipulation was based on libdomainkeys and, in particular, their dktest utility. Yes, that library is technically out-of-date, but for a long time there were more DomainKey-compliant servers out there than DKIM-compliant servers, so… it made sense. The DKIM-based manipulation is all based on Perl’s Mail::DKIM module, which seems to be quite the workhorse.

Our situation was this: we have several users that use Gmail as a kind of “back-end” for their mail on this server. All of their mail needs to be forwarded to Gmail, and when they send from Gmail, it uses SMTP-AUTH to relay their mail through our server. This means that their outgoing mail is signed first by Gmail, then by us. The domain of the outgoing signature is defined by the sender.

So, first problem: we use procmail to forward mail. This means that all mail that got sent to these Gmail users got re-transmitted with a return-address of (the procmail default). Thus, we signed all of this relayed mail (because the sender was from one of the domains we have a secret-key for). This became a problem because all spam that got sent to these users got relayed, and signed, and so we got blamed for it (thus causing gmail to blacklist us occasionally).

Gmail has a few recommendations on this subject. Their first recommendation is to stop changing the return address (which is exactly the opposite of the recommendation of SPF-supporters, I’d like to point out). They also suggest doing our own spam detection and putting “SPAM” in the subject of messages our system thinks are spam. I used Gmail’s recommended solution (which would also prevent us from signing outbound spam), adding the following line to our procmailrc:

SENDER=`formail -c -x Return-Path`

This caused new problems. All of a sudden, mail wasn’t getting through to some of the Gmail users AT ALL. Gmail would permanently reject the messages with the error message:

555 5.5.2 Syntax error. u18si57222290ibk.46

It turns out that messages sent from the Gmail users often had multiple Return-Path headers. The same is true of messages from many mailing lists (including Google Apps mailing lists). This means that formail would dutifully print out a multi-line response, which would then feed garbage (more or less) into the sendmail binary, thereby causing invalid syntax, which is why Gmail was rejecting messages. On top of that, formail doesn’t strip off the surrounding wockas, which caused sendmail to encode the Return-Path header incorrectly, like this:

Return-Path: <<>

This reflects what would happen during the SMTP conversation with Gmail’s servers: the double-wockas would be there as well, which is, officially, invalid SMTP syntax. The solution we’re using now is relatively trivial and works well:

SENDER=`formail -c -x Return-Path | head -n 1 | tr -d '<>'`

Let me re-iterate that, because it’s worth being direct. Using Gmail’s suggested solution caused messages to DISAPPEAR. IRRETRIEVABLY.

Granted, that was my fault for not testing it first. But still, come on Google. That’s a BAD procmail recommendation.

There were a few more problems I had to deal with, relating to DomainKeys and DKIM, but these are somewhat idiosyncratic to our mail system (though they may be of interest for folks with a similar setup). Here I should explain that when you send from Gmail through another server via SMTP-AUTH, Gmail signs the message with its own key, both with a DKIM and with a DomainKeys header. This is DESPITE the fact that the Return-Path is for a non-gmail domain; but because the Sender is a Gmail address, this behavior is completely legitimate and within the specified behavior of DKIM.

The first problem I ran into was that, without a new Return-Path, the dktest utility from DomainKeys would refuse to sign messages that had already been signed (in this case, by Gmail). Not only that, but it would refuse in a very bad way: instead of spitting out something that looks like a DomainKey-Signature: header, it would spit out an error message. Thus, unless my script was careful about only appending things that start with DomainKey-Signature: (which it wasn’t), I would get message headers that looked like this:

Message-Id: <>
do not sign email that already has a dksign unless Sender was found first
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed;; h=mime-version

That’s an excerpt, but you can see the problem. It spit an invalid header (the error) into the middle of my headers. This kind of thing made Gmail mad, and rightly so. It made me mad too. So mad, in fact, that I removed libdomainkeys from my toolchain completely. Yes, I could have added extra layers to my script to detect the problem, but that’s beside the point: that kind of behavior by a tool like that is malicious.

The second problem I ran into is, essentially, an oversight on my part. My signing script chose a domain (correctly, I might add), and then handed the signing utility a filename for the private key of that domain. HOWEVER, since I didn’t explicitly tell it what domain the key was for, it attempted to discover the domain based on the other headers in the message (such as Return-Path and Sender). This auto-discovery was only accurate for users like myself who don’t use Gmail to relay mail through our server. But for messages from Gmail users, who relay via SMTP-AUTH, the script would detect that the mail’s sender was a Gmail user (similar problems would arise for mailing lists, depending on their sender-rewriting behavior). So what it would do is assume that the key it had been handed was for that sender’s domain, and would create an invalid signature. This, thankfully, was easy to fix: merely adding an explicit --domain=$DOMAIN argument to feed to the signing script fixed the issue. But it was a weird one to track down! It’s worth pointing out that the libdomainkeys dktest utility does not provide a means of doing this.

Anyway, at long last, mail seems to be flowing to my Gmail users once again. Thank heaven!

July 5, 2011

Google Breaks its own DKIM Signatures

So, Google, vaunted tech company that it is, seems to be doing something rather unfortunate. One of my friends/users, who uses Gmail as a repository for his email, recently notified me that email sent to him from other Gmail accounts showed up as “potentially forged”. Interestingly, this only happened for email that was sent from Gmail to an external server (i.e. mine) that then got relayed back to Gmail. Examining the “raw original”, here’s the differences:

  1. The relayed body has an extra newline at the end (this may be an artifact of Gmail’s view-raw-message feature)
  2. The relayed copy quotes the display-name in the From header (or any other email header with a display-name)
  3. The relayed copy strips off the weekday name from the Date header

Now, since this doesn’t happen to messages sent from-Gmail-to-Gmail directly, and I’m very certain that my email server isn’t doing it either (I sniffed the outbound SMTP traffic to prove it), I’m guessing that this message… “normalization”, for lack of a better term… is a function of their ingress filter. But all of those changes are enough to invalidate the DKIM signature that Gmail generated… or, I suppose, anyone else’s DKIM signature.


Come on, Google, get your act together.

June 8, 2012

A Use for Volatile in Multi-threaded Programming

As anyone who has done serious shared-memory parallel coding knows, the volatile keyword in C (and C++) has a pretty bad rap. There is no shortage of people trying to explain why this is the case. Take for example this article from the original author of Intel Threaded Building Blocks (a man who, I assure you, knows what he’s talking about): Volatile: Almost Useless for Multithreaded Programming. There are others out there who decry volatile, and their arguments are right to varying degrees. The heart of the issue is that first, volatile is EXPLICITLY IGNORABLE in the C specification, and second, that it provides neither ordering guarantees nor atomicity. Let me say that again, because I’ve had this argument:

Volatile provides NEITHER ordering guarantees NOR atomicity.

But I’m not here to talk about that; I want to talk about a place where I found it to be critical to correctness (as long as it isn’t ignored by the compiler, in which case creating correct code is painful). Honestly, I was quite surprised about this, but it makes sense in retrospect.

I needed a double-precision floating point atomic increment. Most increments, of the __sync_fetch_and_add() variety, operate exclusively on integers. So, here’s my first implementation (just the x86_64 version, without the PGI bug workarounds):

double qthread_dincr(double *operand, double incr)
{
  union {
    double   d;
    uint64_t i;
  } oldval, newval, retval;
  do {
    oldval.d = *operand;
    newval.d = oldval.d + incr;
    __asm__ __volatile__ ("lock; cmpxchgq %1, (%2)"
      : "=a" (retval.i)
      : "r" (newval.i), "r" (operand),
        "0" (oldval.i)
      : "memory");
  } while (retval.i != oldval.i);
  return oldval.d;
}

Fairly straightforward, right? But this has a subtle race condition in it. The dereference of operand gets translated to the following assembly:

movsd (%rcx), %xmm0
movd (%rcx), %rax

See the problem? In the assembly, it’s actually dereferencing operand TWICE; and under contention, that memory location could change values between those two instructions. Now, we might pause to ask: why is it doing that? We only told it to go to memory ONCE; why would it go twice? Well, a certain amount of that is unfathomable. Memory accesses are usually slow, so you’d think the compiler would try to avoid them. But apparently sometimes it doesn’t, and technically, dereferencing non-volatile memory multiple times is perfectly legal. The point is, this is what happened when compiling with basically every version of gcc 4.x right up through the latest gcc 4.7.1.

In any event, there are two basic ways to fix this problem. The first would be to code more things in assembly; either the entire loop or maybe just the dereference. That’s not an appealing option because it requires me to pick which floating point unit to use (SSE versus 387 versus whatever fancy new stuff comes down the pike), and I’d rather let the compiler do that. The second way to fix it is to use volatile. If I change that dereference to this:

oldval.d = *(volatile double *)operand;

Then the assembly it generates looks like this:

movsd (%rcx), %xmm0
movd %xmm0, %rax

Problem solved! As long as the compiler doesn’t ignore the volatile cast, at least…

So, for those who love copy-and-paste, here’s the fixed function:

double qthread_dincr(double *operand, double incr)
{
  union {
    double   d;
    uint64_t i;
  } oldval, newval, retval;
  do {
    oldval.d = *(volatile double *)operand;
    newval.d = oldval.d + incr;
    __asm__ __volatile__ ("lock; cmpxchgq %1, (%2)"
      : "=a" (retval.i)
      : "r" (newval.i), "r" (operand),
        "0" (oldval.i)
      : "memory");
  } while (retval.i != oldval.i);
  return oldval.d;
}

(That function will not work in the PGI compiler, due to a compiler bug I’ve talked about previously.)

April 29, 2013

Building GCC 4.8 on RHEL5.8

I didn’t find this solution anywhere else on the internet, so I figured I’d post it here…

When building GCC 4.8.0 on an up-to-date RHEL 5.8 system, the build died, complaining that in the file libstdc++-v3/libsupc++/unwind-cxx.h there’s a macro (PROBE2) that cannot be expanded. It complains with errors like this:

In file included from ../../../../libstdc++-v3/libsupc++/unwind-cxx.h:41:0,
                 from ../../../../libstdc++-v3/libsupc++/
../../../../libstdc++-v3/libsupc++/ In function ‘void __cxxabiv1::__cxa_throw(void*, std::type_info*, void (*)(void*))’:
../../../../libstdc++-v3/libsupc++/unwind-cxx.h:45:34: error: unable to find string literal operator ‘operator"" _SDT_S’
 #define PROBE2(name, arg1, arg2) STAP_PROBE2 (libstdcxx, name, arg1, arg2)

The most helpful answer I could find was this one which didn’t actually HELP so much as point a good finger at the culprit: SystemTap. Never heard of it? Neither had I. Their headers are apparently not compatible with C++11, and need to be updated. Don’t have root? Heh, have fun with that.

Of course, telling GCC to ignore SystemTap is not possible, as far as I can tell, unless SystemTap happened to be installed in an unusual place. So, instead, we have to resort to convincing GCC that it’s not installed. Unfortunately, that can get tricky. What I ended up having to do was edit x86_64-unknown-linux-gnu/libstdc++-v3/config.h and comment out the line that says

#define HAVE_SYS_SDT_H 1

…so that it reads this instead:

/*#define HAVE_SYS_SDT_H 1*/

It’s not a good solution, of course, because I’m coming along behind the configuration logic and changing some of the answers without ensuring that there weren’t conditional decisions made on that basis, AND since GCC builds itself several times to bootstrap into a clean, optimized product, you have to make that edit multiple (three) times. Basically, this is a horrible horrible hack around the problem. BUT, this works, is simple, and gets me a working compiler.

January 20, 2014

Automake 1.13's Parallel Harness

When GNU released Automake 1.13, they made some interesting (and unfortunate) decisions about their test harness.

Automake has (and has had since version 1.11, back in May 2009) two different test harnesses: a serial one and a parallel one. The serial one is simpler, but the parallel one has a lot of helpful features. (In this case, parallel means running multiple tests at the same time, not that the harness is good for tests that happen to themselves be parallel.) Importantly, these two harnesses are mutually incompatible (e.g. the use of TEST_ENVIRONMENT is often necessary in the serial harness and will seriously break the parallel harness). The default test harness, of course, has always been the serial one for lots of very good reasons, including backwards compatibility. However, starting in automake 1.13 (which is now the standard on up-to-date Ubuntu, and will become common), whether it’s a good idea or a bad idea, the default test harness behavior is now the parallel one. There is a way to specify that you want the original behavior (the “serial-tests” option to automake), however, that option was only added to automake 1.12 (released in April 2012), and since automake aborts when it sees an unsupported option, that path doesn’t really provide much backward compatibility.

Now, it’s clear that the automake authors think the parallel test harness is the way of the future, and is probably what people should be using for new projects. However, this has an impact on backwards compatibility. For instance, the version of automake that comes with RHEL 5.8 (which is still supported and relatively common) is 1.9.6 - which was released waaay back in July 2005!

One option for dealing with the latest wrinkle in automake is to adopt the parallel test harness, make 1.11 the required minimum version, and simply abandon support for developing on older systems. That may or may not be a viable option. Another option is to support both harnesses somehow, e.g. by providing two different Makefile.am files (or more, depending on how complex your test folder tree is) and swapping between them via some sort of script (e.g., which many people use). This, however, is a maintenance nightmare. And the third option is to stick with the serial test harness, attempt detection of the automake version in the configure script, and conditionally define the “serial-tests” option only when necessary. This will maintain compatibility with old versions, but is somewhat fragile.

An example of this latter approach: assuming your AM_INIT_AUTOMAKE line looks like this:

AM_INIT_AUTOMAKE([1.9 no-define foreign])

…to make it look like this:

AM_INIT_AUTOMAKE([1.9 no-define foreign]
  m4_ifdef([AM_EXTRA_RECURSIVE_TARGETS], [serial-tests]))

February 26, 2015

VMWare Workstation autostart vmware-user

I didn’t find this solution anywhere on the internet, so I figured I’d post it here, even if only for the next time I need it.

I have a copy of VMWare Workstation 10 that I use on a Windows laptop at work to run Ubuntu. In general, it works quite well - it does certain things faster than VirtualBox, and has a few key features I need for work. However, every time I have to re-install the VMware tools, it forgets how to automatically run the vmware-user binary at login. This program, for those who are unfamiliar, turns on a bunch of the VMWare extras, such as making sure that it knows when to resize the screen (e.g. when I maximize Linux to full-screen), among other things.

Now, sure, this program can be run by hand (as root), but it’s the principle of the thing.

The trick, as it turns out, is permissions. This program needs to be run by root, so it’s actually a link to a setuid binary in /usr/lib/vmware-tools/bin64. All the files in /usr/lib/vmware-tools/ are installed, for whatever reason, with the wrong permissions. The following set of commands will fix it:

sudo find /usr/lib/vmware-tools/ -type d -exec chmod o+rx {} \;
sudo find /usr/lib/vmware-tools/ -type f -perm -g+rx ! -perm -o+rx -exec chmod o+rx {} \;
sudo find /usr/lib/vmware-tools/ -type f -perm -g+r ! -perm -o+r -exec chmod o+r {} \;

But that’s not enough! There’s another directory that’s been installed improperly: /etc/vmware-tools/! I’m less confident about blanket making the contents of this directory readable by ordinary users, but the following two commands seemed to be enough to make things work:

sudo chmod go+r /etc/vmware-tools/vmware-tools*
sudo chmod go+r /etc/vmware-tools/vmware-user.*

Hopefully, that helps someone other than just me.

About Computers

This page contains an archive of all entries posted to Kyle in the Computers category. They are listed from oldest to newest.


Creative Commons License
This weblog is licensed under a Creative Commons License.
Powered by Movable Type 3.34