this error was only found by reading the code, but it seems to have
been causing gcc to produce wrong code in malloc: the same register
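was used for the output and the high word of the input. the first
bsf writes the output operand before the second bsf reads the high
word, so the output needs an earlyclobber ("=&r") constraint to keep
gcc from assigning both to one register. in principle this could have
caused an infinite loop searching for an available bin, but in
practice most x86 models seem to implement the "undefined" result of
the bsf instruction as "unchanged".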
{
int r;
__asm__( "bsf %1,%0 ; jnz 1f ; bsf %2,%0 ; addl $32,%0\n1:"
- : "=r"(r) : "r"((unsigned)x), "r"((unsigned)(x>>32)) );
+ : "=&r"(r) : "r"((unsigned)x), "r"((unsigned)(x>>32)) );
return r;
}
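
for reference, a minimal standalone sketch of the fixed routine, so the effect of the earlyclobber constraint can be checked outside of malloc. the function name first_set_bit_64 and the non-x86 fallback branch are assumptions made for illustration and are not part of the patched source; only the asm statement mirrors the "+" line above. the result is not meaningful for x == 0.

```c
#include <stdint.h>

/* hypothetical name, for illustration only: index of the lowest set
 * bit of a 64-bit value. the result is not meaningful for x == 0. */
static inline int first_set_bit_64(uint64_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	int r;
	/* "=&r" (earlyclobber): the first bsf writes %0 before %2 is
	 * read, so the compiler must not place r in the same register
	 * as either input operand. */
	__asm__( "bsf %1,%0 ; jnz 1f ; bsf %2,%0 ; addl $32,%0\n1:"
		: "=&r"(r) : "r"((unsigned)x), "r"((unsigned)(x>>32)) );
	return r;
#else
	/* portable fallback (an assumption, not from the source):
	 * scan upward from bit 0 until a set bit is found. */
	int r;
	for (r = 0; r < 64 && !((x >> r) & 1); r++);
	return r;
#endif
}
```

the case that exercises the second bsf is a value whose low 32 bits are all zero, such as (uint64_t)1 << 32, since only then is the high word read after the output register has already been written.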