the register constraint for the address to be accessed did not convey
that the asm can access the pointed-to object. as far as the compiler
could tell, the result of the asm was just a pure function of the
address and the values passed in, and thus the asm could be hoisted
out of loops or omitted entirely if the result was not used.
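
as a minimal sketch of the constraint style described above (not the code from this patch): the function name touch_and_return is hypothetical, and the asm template is left empty so it builds with gcc or clang on any target, where a real one would hold the lwarx/stwcx. loop. the "+m"(*p) operand is what tells the compiler the asm may read and write the pointed-to object.

```c
/* hypothetical illustration only; names are not from the patch */
static inline int touch_and_return(int *p, int v)
{
    int t = v;
    __asm__ __volatile__(
        ""                    /* empty template: emits no instructions */
        : "+r"(t),            /* value held in a register, read and written */
          "+m"(*p)            /* the object at *p is read and written by the asm */
        :
        : "memory");
    return t;
}
```

with only "r"(p) as an input, the compiler would know the address is used but not that *p itself is accessed; listing *p as a "+m" operand makes that dependency explicit.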
" stwcx. %3, 0, %1\n"
" bne- 1b\n"
"1: \n"
- : "=&r"(t) : "r"(p), "r"(t), "r"(s) : "cc", "memory" );
+ : "=&r"(t), "+m"(*p) : "r"(t), "r"(s) : "cc", "memory" );
return t;
}