August 2008 Archives

August 18, 2008

More Compiler Complaints: Sparc Edition

Unlike my previous whining about compilers, I have no explanation for this one. It’s not me specifying things incorrectly; it’s just the compiler being broken.

So, here’s the goal: atomically increment a variable. On a Sparc (specifically, SparcV9), the function looks something like this:

#include <stdint.h>  /* uint32_t */

static inline int atomic_inc(int * operand)
{
    register uint32_t oldval, newval;
    newval = *operand;
    do {
        oldval = newval;
        newval++;
        /* cas [operand], oldval, newval */
        __asm__ __volatile__ ("cas [%1], %2, %0"
            : "=&r" (newval)
            : "r" (operand), "r"(oldval)
            : "cc", "memory");
    } while (oldval != newval);
    return oldval+1;
}

Seems trivial, right? We use the CAS instruction (compare and swap). Conveniently, whether or not the comparison succeeds, it stores the old value of *operand in the destination register (i.e. %0 aka newval), so when the comparison fails we already have the fresh value and there are no extraneous memory operations in this little loop. Right? Right. Does it work? NO.
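For reference, here is roughly what cas does, modelled in plain C. This is only an illustrative sketch: cas_model is a made-up name, and the model is NOT atomic, whereas the real instruction performs all of these steps as one atomic operation. The point is just to show which register ends up holding what.

```c
#include <stdint.h>

/* Non-atomic model of SPARC V9 "cas [rs1], rs2, rd" (illustration only).
 * mem plays the role of [rs1], cmp is rs2, and *rd is the destination
 * register. cas_model is a hypothetical helper, not a real API. */
static void cas_model(uint32_t *mem, uint32_t cmp, uint32_t *rd)
{
    uint32_t old = *mem;   /* word currently in memory */
    if (old == cmp)
        *mem = *rd;        /* comparison succeeded: store rd's value */
    *rd = old;             /* rd always receives the old memory word */
}
```

So with rd preloaded with oldval + 1 and rs2 holding oldval, getting oldval back in rd means the increment was stored; anything else is the fresh value of *operand to retry with.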

Let’s take a look at the assembly that the compiler (gcc) generates with -O2 optimization:

save    %sp, -0x60, %sp
ld      [%i0], %i5      /* newval = *operand; */
mov     %i0, %o1        /* operand is copied into %o1 */
mov     %i5, %o2        /* oldval = newval; */
cas     [%o1], %o2, %o0 /* o1 = operand, o2 = oldval, o0 = ? */
ret
restore %i5, 0x1, %o0

Say what? Does that have ANYTHING to do with what I told it? Nope! %o0 is never even initialized, but somehow it gets used anyway! What about the increment? Nope! It was optimized out, apparently (which, in fairness, is probably because we didn’t explicitly list newval as an input). Of course, gcc is awful, you say! Use Sun’s compiler! Sorry, it produces the exact same output.

But let’s be a bit more explicit about the fact that the newval register is an input to the assembly block:

static inline int atomic_inc(int * operand)
{
    register uint32_t oldval, newval;
    newval = *operand;
    do {
        oldval = newval;
        newval++;
        __asm__ __volatile__ ("cas [%1], %2, %0"
            : "=&r" (newval)
            : "r" (operand), "r"(oldval), "0"(newval)
            : "cc", "memory");
    } while (oldval != newval);
    return oldval+1;
}

Now, Sun’s compiler complains: warning: parameter in inline asm statement unused: %3. Well gosh, isn’t that useful; way to recognize the fact that "0" ties that input to the same register as output %0! But at least gcc leaves the add operation in:

save    %sp, -0x60, %sp
ld      [%i0], %i5      /* oldval = *operand; */
mov     %i0, %o1        /* operand is copied to %o1 */
add     %i5, 0x1, %o0   /* newval = oldval + 1; */
mov     %i5, %o2        /* oldval is copied to %o2 */
cas     [%o1], %o2, %o0
ret
restore %i5, 0x1, %o0

Yay! The increment made it in there, and %o0 is now initialized to something! But what happened to the do{ }while() loop? Sorry, that was optimized away, because gcc doesn’t recognize that newval can change values, despite the fact that it’s listed as an output!

Sun’s compiler will at least leave the while loop in, but will often use the WRONG REGISTER for comparison (such as %i2 instead of %o0).

But check out this minor change:

static inline int atomic_inc(int * operand)
{
    register uint32_t oldval, newval;
    do {
        newval = *operand;
        oldval = newval;
        newval++;
        __asm__ __volatile__ ("cas [%1], %2, %0"
            : "=&r" (newval)
            : "r" (operand), "r"(oldval), "0"(newval)
            : "cc", "memory");
    } while (oldval != newval);
    return oldval+1;
}

See the difference? Rather than using the output of the cas instruction (newval), we’re throwing it away and re-reading *operand no matter what. And guess what suddenly happens:

save     %sp, -0x60, %sp
ld       [%i0], %i5           /* oldval = *operand; */
add      %i5, 0x1, %o0        /* newval = oldval + 1; */
mov      %i0, %o1             /* operand is copied to %o1 */
mov      %i5, %o2             /* oldval is copied to %o2 */
cas      [%o1], %o2, %o0
cmp      %i5, %o0             /* if (oldval != newval) */
bne,a,pt %icc, atomic_inc+0x8 /* then go back and try again */
ld       [%i0], %i5
ret
restore  %i5, 0x1, %o0

AHA! The while loop returns! And best of all, both gcc and Sun’s compiler suddenly, magically, and consistently use the correct registers for the loop comparison! It’s amazing! For some reason this change reminds the compilers that newval is an output!

It’s completely idiotic. So, we can get it to work… but we have to be inefficient in order to do it, because otherwise (inexplicably) the compiler refuses to acknowledge that our output register can change.

In case you’re curious, the gcc version is:
sparc-sun-solaris2.10-gcc (GCC) 4.0.4 (gccfss)
and the Sun compiler is:
cc: Sun C 5.9 SunOS_sparc 2007/05/03
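
For comparison, the loop shape the first attempt was aiming for (reuse the value handed back by a failed compare-and-swap instead of re-reading *operand) can be sketched with the __atomic_compare_exchange_n builtin. Treat this as a sketch under assumptions, not a fix for the compilers above: the __atomic builtins only exist in much newer GCC releases (and in Clang), and atomic_inc_builtin is a name invented for this example.

```c
#include <stdint.h>

/* Sketch only: relies on the GCC/Clang __atomic builtins, which the
 * compiler versions listed in this post do not provide.
 * atomic_inc_builtin is a hypothetical name. */
static inline uint32_t atomic_inc_builtin(uint32_t *operand)
{
    /* One initial load; failed exchanges refresh oldval for us. */
    uint32_t oldval = __atomic_load_n(operand, __ATOMIC_RELAXED);

    /* On failure the builtin writes the current *operand into oldval,
     * so the loop needs no explicit re-read of memory. */
    while (!__atomic_compare_exchange_n(operand, &oldval, oldval + 1,
                                        0 /* strong exchange */,
                                        __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
        ;

    return oldval + 1;     /* the incremented value, as in the post */
}
```

Because the builtin reports the current value through its second argument on failure, the compiler knows that oldval can change across iterations, which is exactly the fact the inline-asm versions struggled to get across.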


Creative Commons License
This weblog is licensed under a Creative Commons License.