
More Compiler Complaints: Sparc Edition

Unlike my previous whining about compilers, this one I have no explanation for. It’s not me specifying things incorrectly, it’s just the compiler being broken.

So, here’s the goal: atomically increment a variable. On a Sparc (specifically, SparcV9), the function looks something like this:

static inline int atomic_inc(int * operand)
{
    register uint32_t oldval, newval;
    newval = *operand;
    do {
        oldval = newval;
        newval++;
        __asm__ __volatile__ ("cas [%1], %2, %0"
            : "=&r" (newval)
            : "r" (operand), "r"(oldval)
            : "cc", "memory");
    } while (oldval != newval);
    return oldval+1;
}

Seems trivial, right? We use the CAS instruction (compare and swap). Conveniently, cas always leaves the value it found in *operand in its destination register (%0, aka newval), so whenever the comparison fails we already have the fresh value and there are no extraneous memory operations in this little loop. Right? Right. Does it work? NO.
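To make the intended register traffic concrete, here is a rough single-threaded model of what cas does, written in plain C. It is only a sketch: the names cas_model and atomic_inc_model are made up for illustration, the model is not atomic, and it assumes the documented SPARC V9 behavior (compare rs2 with memory, store rd on a match, and load the old memory value into rd either way).

```c
#include <assert.h>
#include <stdint.h>

/* Single-threaded model of "cas [addr], rs2, rd".
 * Illustrative only: the real instruction does all of this atomically.
 * cas_model is a hypothetical helper, not part of any library. */
static uint32_t cas_model(uint32_t *addr, uint32_t rs2, uint32_t rd)
{
    uint32_t old = *addr;   /* value currently in memory */
    if (old == rs2)
        *addr = rd;         /* comparison succeeded: rd goes to memory */
    return old;             /* rd receives the old memory value either way */
}

/* The loop from the post, written against the model instead of inline asm. */
static uint32_t atomic_inc_model(uint32_t *operand)
{
    uint32_t oldval, newval;
    assert(operand != 0);
    newval = *operand;
    do {
        oldval = newval;
        /* newval is both an input (oldval + 1) and an output (old memory) */
        newval = cas_model(operand, oldval, oldval + 1);
    } while (oldval != newval);
    return oldval + 1;
}
```

The point of the model is that the third operand is read and written by the instruction, which is exactly the property the compiler has to be told about.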

Let’s take a look at the assembly that the compiler (gcc) generates with -O2 optimization:

save    %sp, -0x60, %sp
ld      [%i0], %i5      /* newval = *operand; */
mov     %i0, %o1        /* operand is copied into %o1 */
mov     %i5, %o2        /* oldval = newval; */
cas     [%o1], %o2, %o0 /* o1 = operand, o2 = oldval, o0 = ? */
ret
restore %i5, 0x1, %o0

Say what? Does that have ANYTHING to do with what I told it? Nope! %o0 is never even initialized, but somehow it gets used anyway! What about the increment? Nope! It was optimized out, apparently (which, in fairness, is probably because we listed newval only as an output, never as an input). Of course, gcc is awful, you say! Use Sun’s compiler! Sorry, it produces the exact same output.

But let’s be a bit more explicit about the fact that the newval register is an input to the assembly block:

static inline int atomic_inc(int * operand)
{
    register uint32_t oldval, newval;
    newval = *operand;
    do {
        oldval = newval;
        newval++;
        __asm__ __volatile__ ("cas [%1], %2, %0"
            : "=&r" (newval)
            : "r" (operand), "r"(oldval), "0"(newval)
            : "cc", "memory");
    } while (oldval != newval);
    return oldval+1;
}

Now, Sun’s compiler complains: warning: parameter in inline asm statement unused: %3. Well gosh, isn’t that useful; way to recognize the fact that "0" ties that input to the same register as output %0! But at least gcc leaves the add operation in:

save    %sp, -0x60, %sp
ld      [%i0], %i5      /* oldval = *operand; */
mov     %i0, %o1        /* operand is copied to %o1 */
add     %i5, 0x1, %o0   /* newval = oldval + 1; */
mov     %i5, %o2        /* oldval is copied to %o2 */
cas     [%o1], %o2, %o0
ret
restore %i5, 0x1, %o0

Yay! The increment made it in there, and %o0 is now initialized to something! But what happened to the do{ }while() loop? Sorry, that was optimized away, because gcc doesn’t recognize that newval can change values, despite the fact that it’s listed as an output!

Sun’s compiler will at least leave the while loop in, but will often use the WRONG REGISTER for comparison (such as %i2 instead of %o0).
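For comparison, the more conventional way to declare a register that an instruction both reads and writes in gcc-style inline asm is the "+r" modifier, rather than pairing "=&r" with a matching "0" input. The sketch below shows that form. It is untested on the gccfss and Sun C versions discussed here, so whether they handle it any better is an open question. The name atomic_inc_rw is hypothetical, and the non-SPARC branch (using the __sync_val_compare_and_swap builtin found in newer gcc and clang) is only there so the sketch builds on other machines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical variant of atomic_inc using a read-write ("+r") operand. */
static inline uint32_t atomic_inc_rw(volatile uint32_t *operand)
{
    uint32_t oldval, newval;
    assert(operand != 0);
    newval = *operand;
    do {
        oldval = newval;
        newval = oldval + 1;
#if defined(__sparc__) && defined(__GNUC__)
        /* "+r": newval goes in as the value to store and comes back
         * holding whatever cas found in *operand. */
        __asm__ __volatile__ ("cas [%1], %2, %0"
            : "+r" (newval)
            : "r" (operand), "r" (oldval)
            : "cc", "memory");
#else
        /* Assumption: a compiler providing this builtin, which also
         * returns the old memory value, like cas does in %0. */
        newval = __sync_val_compare_and_swap(operand, oldval, newval);
#endif
    } while (oldval != newval);
    return oldval + 1;
}
```

Either branch keeps the property the original loop was after: a failed compare hands back the current value of *operand, so no separate reload is needed before retrying.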

But check out this minor change:

static inline int atomic_inc(int * operand)
{
    register uint32_t oldval, newval;
    do {
        newval = *operand;
        oldval = newval;
        newval++;
        __asm__ __volatile__ ("cas [%1], %2, %0"
            : "=&r" (newval)
            : "r" (operand), "r"(oldval), "0"(newval)
            : "cc", "memory");
    } while (oldval != newval);
    return oldval+1;
}

See the difference? Rather than using the output of the cas instruction (newval), we’re throwing it away and re-reading *operand no matter what. And guess what suddenly happens:

save     %sp, -0x60, %sp
ld       [%i0], %i5           /* oldval = *operand; */
add      %i5, 0x1, %o0        /* newval = oldval + 1; */
mov      %i0, %o1             /* operand is copied to %o1 */
mov      %i5, %o2             /* oldval is copied to %o2 */
cas      [%o1], %o2, %o0
cmp      %i5, %o0             /* if (oldval != newval) */
bne,a,pt %icc, atomic_inc+0x8 /* then go back and try again */
ld       [%i0], %i5
ret
restore  %i5, 0x1, %o0

AHA! The while loop returns! And best of all, both gcc and Sun’s compiler suddenly, magically, and consistently use the correct registers for the loop comparison! It’s amazing! For some reason this change reminds the compilers that newval is an output!

It’s completely idiotic. So, we can get it to work… but we have to be inefficient in order to do it (an extra load of *operand on every retry), because otherwise (inexplicably) the compiler refuses to acknowledge that our output register can change.

In case you’re curious, the gcc version is:
sparc-sun-solaris2.10-gcc (GCC) 4.0.4 (gccfss)
and the Sun compiler is:
cc: Sun C 5.9 SunOS_sparc 2007/05/03

About

This page contains a single entry from the blog posted on August 18, 2008 6:01 PM.

Creative Commons License
This weblog is licensed under a Creative Commons License.