November 13, 2009

qsort_r

Once upon a time, in 2002, the BSD folks had this genius plan: make the standard C qsort() function safe to use in reentrant code by creating qsort_r() and adding an argument (a pointer to pass to the comparison function). So they did, and it was good.

Five years later, in 2007, the GNU libc folks said to themselves “dang, those BSD guys are smart, I wish we had qsort_r()”. Then some idiot said: WAIT! We cannot simply use the same prototype as the BSD folks; they use an evil license! We can’t put that into GPL’d code! So the GNU libc folks solved the problem by reordering the arguments.

And now we have two, incompatible, widely published versions of qsort_r(), which both do the exact same thing: crash horribly if you use the wrong argument order.

<sigh>

Okay, here’s an alternate history:

… Then some lazy idiot said: WAIT! The existing qsort_r() is a poor design that requires a second implementation of qsort()! If we throw out compatibility with existing qsort_r() code, we can implement qsort() as a call to qsort_r() and no one will ever know!

<sigh>

Either way, we all lose.

(I have no argument with the alternate history point… but why’d you have to call it the exact same thing??? Call it qsort_gnu() or something! Make it easy to detect the difference!)
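For the record, here is a minimal sketch of what code has to do to cope with both flavors. The helper names (compare_ints, compare_ints_bsd, sort_ints) are made up for illustration, and the platform test is an assumption: it treats Apple systems as BSD-flavored and everything else as GNU-flavored, which will not hold on every libc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* glibc only declares qsort_r() when this is set */
#endif
#include <stddef.h>
#include <stdlib.h>

/* Shared comparison logic: ctx points to an int that is nonzero when the
 * caller wants descending order. */
static int compare_ints(const void *a, const void *b, void *ctx)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    int order = (x > y) - (x < y);
    return *(const int *)ctx ? -order : order;
}

#if defined(__APPLE__)
/* BSD flavor (assumed here for Apple systems): the context pointer comes
 * BEFORE the comparator in the call, and it is the comparator's FIRST
 * argument. */
static int compare_ints_bsd(void *ctx, const void *a, const void *b)
{
    return compare_ints(a, b, ctx);
}

static void sort_ints(int *values, size_t count, int descending)
{
    qsort_r(values, count, sizeof *values, &descending, compare_ints_bsd);
}
#else
/* GNU flavor, restated from <stdlib.h> for reference: the comparator comes
 * first, the context pointer last, and the comparator receives the context
 * as its LAST argument. */
void qsort_r(void *base, size_t nmemb, size_t size,
             int (*compar)(const void *, const void *, void *), void *arg);

static void sort_ints(int *values, size_t count, int descending)
{
    qsort_r(values, count, sizeof *values, compare_ints, &descending);
}
#endif
```

Calling sort_ints(v, n, 0) sorts ascending and sort_ints(v, n, 1) sorts descending on either flavor. The point is that the wrapper is needed at all: both the call and the comparator signature differ, so nothing short of a platform check gets it right.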

June 10, 2009

More Compiler Complaints: PGI Edition

Continuing my series of pointless complaining about compiler behavior (see here and here for the previous entries), I recently downloaded a trial version of PGI’s compiler to put in my Linux virtual machine to see how that does compiling qthreads. There were a few minor things that it choked on that I could correct pretty easily, and some real bizarre behavior that seems completely broken to me.

Subtle Bugs in My Code

Let’s start with the minor mistakes it found in my code that other compilers hadn’t complained about:

static inline uint64_t qthread_incr64(
           volatile uint64_t *operand, const int incr)
{
  union {
    uint64_t i;
    struct {
      uint32_t l, h;
    } s;
  } oldval, newval;
  register char test;
  do {
    oldval.i = *operand;
    newval.i = oldval.i + incr;
    __asm__ __volatile__ ("lock; cmpxchg8b %1\n\t setne %0"
        :"=r"(test)
        :"m"(*operand),
         "a"(oldval.s.l),
         "d"(oldval.s.h),
         "b"(newval.s.l),
         "c"(newval.s.h)
        :"memory");
  } while (test);
  return oldval.i;
}

Seems fairly straightforward, right? Works fine on most compilers, but the PGI compiler complains that “%sli” is an invalid register. Really obvious error, right? Right? (I don’t really know what the %sli register is for either). Turns out that because setne requires a byte-sized register, I need to tell the compiler that I want a register that can be byte-sized. In other words, that "=r" needs to become "=q". Fair enough. It’s a confusing error, and thus annoying, but I am technically wrong (or at least I’m providing an incomplete description of my requirements) here so I concede the ground to PGI.
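To isolate the constraint issue, here is a minimal sketch (not the qthreads code) of a setne result landing in a "=q" operand. The function name values_differ is hypothetical, and the non-x86 branch is only there so the snippet compiles elsewhere. As GCC documents it, "q" asks for a register that has a byte-sized form, which is exactly what setne needs.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if a != b, 0 otherwise. */
static inline int values_differ(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned char test;
    __asm__ ("cmpl %2, %1\n\t"   /* sets the zero flag when a == b */
             "setne %0"          /* writes 1 into a BYTE register if not equal */
             : "=q"(test)        /* "q": a register with a byte-sized form */
             : "r"(a), "r"(b)
             : "cc");
    return test;
#else
    /* Fallback for other architectures or compilers: plain C. */
    return a != b;
#endif
}
```

With "=r" instead, the compiler is free to pick a register that has no byte-sized name in 32-bit mode, and then the assembler has nothing sensible to emit for setne.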

Unnecessary Pedantry

And then there are places where PGI is simply a bit more pedantic than it really needs to be. For example, it complains when you implicitly convert a volatile struct foo * into a void * when calling into a function. Okay, yes, the pointers are different (the volatile qualifier gets dropped), but… most compilers allow you to implicitly convert just about any pointer type into a void * without kvetching, because you aren’t allowed to dereference a void pointer (unless you cast again, and if you’re casting, all bets are off anyway), thus it’s a safe bet that you want to work on the pointer rather than what it points to. Yes, technically PGI has made a valid observation, but I disagree that their observation rises to the level of “warning-worthy” (I have no argument if they demote it to the sort of thing that shows up with the -Minform=inform flag).
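For what it's worth, there are two ways to keep a pedantic compiler quiet here, sketched below with made-up names (struct foo's contents, is_null, legacy_identity, and same_object are all hypothetical): either let the callee accept a qualified void pointer, or spell out the cast at the call site.

```c
#include <stddef.h>

struct foo { int value; };

/* Option 1: the callee accepts a qualified void pointer, so a
 * volatile struct foo * converts implicitly without losing anything. */
static int is_null(const volatile void *p)
{
    return p == NULL;
}

/* Option 2: the callee takes a plain void * and cannot be changed. */
static void *legacy_identity(void *p)
{
    return p;
}

static int same_object(volatile struct foo *f)
{
    /* The explicit cast drops volatile on purpose, which is the part a
     * pedantic compiler wants to see written down. */
    return legacy_identity((void *)f) == (void *)f;
}
```

Neither option changes what the program does; both just make the dropped (or preserved) qualifier visible in the source.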

Flat-out Broken

But there are other places where PGI is simply wrong/broken. For example, if I have (and use) a #define like this:

#define PARALLEL_FUNC(initials, type, shorttype, category) \
type qt_##shorttype##_##category (type *array, size_t length, int checkfeb) \
{ \
  struct qt##initials arg = { array, checkfeb }; \
  type ret; \
  qt_realfunc(0, length, sizeof(type), &ret, \
    qt##initials##_worker, \
    &arg, qt##initials##_acc, 0); \
  return ret; \
}
PARALLEL_FUNC(uis, aligned_t, uint, sum);

PGI will die! Specifically, it complains that struct qtuisarg does not exist, and that an identifier is missing. In other words, it blows away the whitespace following initials so that this line:

struct qt##initials arg = { array, checkfeb }; \

is interpreted as if it looked like this:

struct qt##initials##arg = { array, checkfeb }; \

But at least that’s easy to work around: rename the struct so that it has a _s at the end! Apparently PGI is okay with this:

struct qt##initials##_s arg = { array, checkfeb }; \

::sigh:: Stupid, stupid compiler. At least it can be worked around.
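Here is a self-contained sketch of the workaround pattern, cut down so it compiles on its own. The macro DEFINE_SUM and everything it generates are hypothetical stand-ins for the qthreads macro; the only point is that every pasted token is followed by another ## rather than by whitespace.

```c
#include <stddef.h>

/* Each use of the macro argument inside an identifier is pasted on BOTH
 * sides (qt ## initials ## _s), so the identifier never ends at a macro
 * argument followed by a space. */
#define DEFINE_SUM(initials, type) \
struct qt##initials##_s { const type *array; size_t length; }; \
static type qt##initials##_sum(const type *array, size_t length) \
{ \
    struct qt##initials##_s arg = { array, length }; \
    type total = 0; \
    for (size_t i = 0; i < arg.length; i++) \
        total += arg.array[i]; \
    return total; \
}

/* Expands to struct qtuis_s and the function qtuis_sum(). */
DEFINE_SUM(uis, unsigned int)
```

After the expansion, qtuis_sum(values, count) adds up an array of unsigned ints. A conforming preprocessor handles the original spelling just as well, so this is purely defensive.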

Thwarting The Debugger

PGI is also bad at handling static inline functions in headers. How bad? Well, first of all, the DWARF2 symbols it generates (the default) are incorrect. It gets the line numbers right but the file name wrong. For example, if I have an inline function in qthread_atomics.h on line 75, and include that header in qt_mpool.c, and then use that function on line 302, the DWARF2 symbols generated will claim that the function is on line 75 of qt_mpool.c (which isn’t even correct if we assume that it’s generating DWARF2 symbols based on the pre-processed source! and besides which, all the other line numbers are from non-pre-processed source). You CAN tell it to generate DWARF1 or DWARF3 symbols, but then it simply leaves out the line numbers and file names completely. Handy, no?

Everyone Else is Doing It…

Here’s another bug in PGI… though I suppose it’s my fault for outsmarting myself. So, once upon a time, I (think I) found that some compilers require assembly memory references to be within parentheses, while others require them to be within brackets. Unfortunately I didn’t write down which ones did what, so I don’t remember if I was merely being over-cautious in my code, or if it really was a compatibility problem. Nevertheless, I frequently do things like this:

static inline uint32_t atomic_incr(volatile uint32_t *op, const int incr) {
  uint32_t retval = incr;
  __asm__ __volatile__ ("lock; xaddl %0, %1"
    :"=r"(retval)
    :"m"(*op), "0"(retval)
    :"memory");
  return retval;
}

Note that weird "m"(*op) construction? That was my way of ensuring that the right memory reference syntax was automatically used, no matter what the compiler thought it was. So, what does PGI do in this instance? It actually performs the dereference! In other words, it behaves as if I had written:

static inline uint32_t atomic_incr(volatile uint32_t *op, const int incr) {
  uint32_t retval = incr;
  __asm__ __volatile__ ("lock; xaddl %0, (%1)"
    :"=r"(retval)
    :"r"(*op), "0"(retval)
    :"memory");
  return retval;
}

when what I really wanted was:

static inline uint32_t atomic_incr(volatile uint32_t *op, const int incr) {
  uint32_t retval = incr;
  __asm__ __volatile__ ("lock; xaddl %0, (%1)"
    :"=r"(retval)
    :"r"(op), "0"(retval)
    :"memory");
  return retval;
}

See the difference? Again, it’s not hard to fix so that PGI does the right thing. And maybe I was being too clever in the first place. But dagnabit, my trick should work! And, more pointedly, it DOES work on other compilers (gcc and icc at the bare minimum, and I’ve tested similar things with xlc).
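For comparison, here is a minimal sketch of how the same fetch-and-add is often written so that the compiler is told everything the instruction does. This is a variation, not the fix that went into qthreads: the name atomic_incr_sketch is hypothetical, the "+r" and "+m" constraints declare both operands as read and written, and the fallback uses the GCC/Clang __atomic_fetch_add builtin, which is an assumption about the toolchain.

```c
#include <stdint.h>

/* Atomically adds incr to *op and returns the value *op held before. */
static inline uint32_t atomic_incr_sketch(volatile uint32_t *op, int incr)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t retval = (uint32_t)incr;
    __asm__ __volatile__ ("lock; xaddl %0, %1"
        : "+r"(retval),   /* in: the increment, out: the old value of *op */
          "+m"(*op)       /* memory operand, both read and written */
        :
        : "memory", "cc");
    return retval;
#else
    /* Assumed fallback: compiler builtin with the same semantics. */
    return __atomic_fetch_add(op, (uint32_t)incr, __ATOMIC_SEQ_CST);
#endif
}
```

The "m" constraint still leaves the choice of addressing syntax to the compiler, which was the whole point of the trick. It just no longer pretends that the memory operand is input-only.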

May 5, 2009

Muslim Demographics

I recently got sent a link to the Muslim Demographics video on YouTube. It’s pretty alarmist, so I composed an email response. Since it might be interesting to have available to Google, here are my thoughts, as an antidote to the panic the video is peddling.

Note that snopes.com has their own page discussing this video. They don’t address the accuracy of the facts presented, but it’s an interesting read nevertheless.


It is probably true that there is an aspect of evolution to religion. If we think of religion as a gene, the dominant religion will be defined (over the long run) by the extent to which it benefits or promotes reproduction, just as any piece of DNA does. Of course, given evangelism, it may be more apt to think of religion as similar to a virus that gets passed from person to person, but I doubt that’d be a popular viewpoint. In any case, of course, religion isn’t a gene, because we’d have to assume that people stick with the religion they’re born with. Christianity started with just 12 Christians, after all, and every last one of them was male (and thus couldn’t pass on any genes without help from an originally non-Christian woman).

That said, I’m skeptical of this video’s claims. For one thing, practically no sources are cited (2.11 is the bare minimum to maintain a society? Says who? How do we know?). For another, it’s interesting to note what gets left out. For example, the video says the Muslim population in the UK skyrocketed from 82,000 to several million, but according to the CIA World Factbook, Muslims are a whopping 2.7% of the UK population. In the Netherlands, the CIA World Factbook says that Muslims are a crushing 5.8% of the population. Not only that, but it says that the Netherlands has a fertility rate of 1.66. Woo! Scary!

Now that I come to think of it… let me look this up.

Country   Video Claimed Fertility Rate   CIA Factbook Fertility Rate
France    1.8                            1.98
England   1.6                            1.66
Greece    1.3                            1.37
Germany   1.3                            1.41
Italy     1.2                            1.31
Spain     1.1                            1.31
EU        1.38                           1.51

Yikes - they didn’t get a single one correct! The closest was England, and even there they rounded in the wrong direction.

So should we really believe that French Muslims have a birth rate of 8.1? Think about that, EIGHT kids on average, which means that for every childless Muslim woman, there’s another out there with SIXTEEN KIDS. That sounds totally plausible, right?

The Population Reference Bureau (who I’ve never heard of, but they were linked to by about.com) says that in Austria, Muslim women had a fertility rate of 3.1 in 1981, but that by 2001 the rate was a mere 2.3 (apparently they didn’t get the memo from their French kindred). That reflects the falling fertility rates in the Muslim countries the immigrants came from. For example, in Turkey the fertility rate dropped from 3.3 in 1985 to 2.2 in 2003. According to the CIA World Factbook, Turkey’s current fertility rate is 2.21. In Morocco it fell from 4.5 to 2.5 in the same time period (CIA says 2.51). And get this: in Iran, it fell from 5.6 to 2.1 in 2003. The CIA World Factbook currently pegs the Iranian fertility rate at a paltry 1.71!

1.71! 1.71! The Iranian culture cannot survive! The US fertility rate is 2.05! We will crush them with our progeny! MUAHAHAHAHA!

Oh, wait, does that not serve the purpose of getting people riled up?

I’m thinking of a word… fearmongering! That’s the one.

It reminds me a lot of grade school. I remember playing with a bunch of computer “simulations” that showed that the world population was exploding and that we’d run out of food by the turn of the millennium. The reason they were wrong is that they made assumptions without realizing it. For example, they assumed that food production would stay constant, and that fertility rates would stay constant. Surprise! They didn’t.

And THAT reminds me of another quote:

Scientists have shown that the moon is moving away at a tiny, although measurable distance from the earth every year. If you do the math, you can calculate that 65 million years ago, the moon was orbiting at a distance of about 35 feet from the earth’s surface. This would explain the death of the dinosaurs … the tallest ones, anyway.

Don’t believe everything you see on YouTube. ;)

April 27, 2009

SlingLink Killed My Network!

This isn’t the most accurate title, but…

I’ve got a SlingLink Turbo that I use for connecting my Macs upstairs to my cable modem downstairs. I went with a network-over-powerline option, because I’ve been having all kinds of intermittent interference problems with my wireless connectivity. So, rather than running an extra-long patch cable up the stairs and taping it down to the carpet, I went the SlingLink route. It seems to be designed specifically for SlingBox applications, but it forwards plain ol’ ethernet signals, and it’s about $40 cheaper than the NetGear equivalents. Huzzah for getting a bargain!

First impression: fabulous! I went along happily for weeks, enjoying my newfound reliable network connection. Then I tried downloading the latest Ubuntu ISO images via BitTorrent, and within five to ten minutes, the internet connection went offline. If I went downstairs and turned the cable modem off-and-on again, the internet would come back. For five to ten minutes. Then it’d go down again.

Surely, I say to myself, that’s a cable-modem problem, right?

I had to have the tech guys from Time Warner’s Cable group come out (twice!) before I finally figured out that it wasn’t their fault (the first time they said they replaced the splitter, and presto, the network was fine! I didn’t go after the ISO again for a while so…). Turns out I didn’t need to restart the cable modem, all I had to do was restart the SlingLink node and I’d get another five to ten minutes out of it. But it ONLY happens when BitTorrent is running; otherwise, the network connection is rock solid!

Weird, no?

So, to experiment, I tried limiting the BitTorrent connections: no dice. Then I tried limiting the BitTorrent bandwidth and all of a sudden the network would stay up. Somewhere between 100Kb/s and 150Kb/s is the cutoff. Something about BitTorrent’s bandwidth seems either to confuse the SlingLink node or to trigger some sort of antiviral cutoff in the SlingLink hardware (either way is annoying). For the record, it’s not a pure bandwidth issue: I can transfer files over the SlingLink network at speeds of over 400Kb/s. As soon as I introduce BitTorrent, though… down she goes.

Maybe it’s a packet-size issue. Maybe it’s a connection-tracking issue. I have no idea. But at least now I know that SlingLink has its limitations. And now, so do you.