other archs use asm for the thread pointer load, so making that asm
volatile is sufficient to inform the compiler that it has a "side
effect" (crashing or giving the wrong result if the thread pointer was
not yet initialized) that prevents reordering. however, powerpc and
or1k have dedicated general purpose registers for the thread pointer
and did not need to use any asm to access it; instead, "local register
variables with a specified register" were used. there is no
specification for ordering constraints on this type of usage, and
presumably use of the thread pointer could be reordered across its
initialization.

to impose an ordering, I have added empty volatile asm blocks that
name the "local register variable with a specified register" as an
output operand.
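
for illustration only, a minimal portable sketch of the same idea. it is not part of the patch: the names fake_tp, init_tp and read_tp_like are hypothetical, and a plain variable stands in for the dedicated register so the code builds with gcc or clang on any target. since no register is pinned here, the operand is in/out ("+r") rather than output-only ("=r"); an output-only operand on an ordinary variable would leave its value indeterminate.

```c
/* hypothetical stand-in for the thread pointer register; the real code
 * binds tp to r10 (or1k) or r2 (powerpc) instead of loading a global. */
static long fake_tp;

/* hypothetical initializer, standing in for thread pointer setup */
static void init_tp(long v)
{
    fake_tp = v;
}

static long read_tp_like(void)
{
    long tp = fake_tp;
    /* empty volatile asm: emits no instructions, but the compiler must
     * treat tp as rewritten by the statement and may not delete it.
     * "+r" keeps the incoming value in the register unchanged. */
    __asm__ __volatile__ ("" : "+r" (tp));
    return tp;
}
```

calling init_tp(42) and then read_tp_like() returns 42: the asm statement does not change the value, it only makes the read opaque to the optimizer.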
__asm__ __volatile__ ("l.ori %0, r10, 0" : "=r" (tp) );
#else
register char *tp __asm__("r10");
+ __asm__ __volatile__ ("" : "=r" (tp) );
#endif
return (struct pthread *) (tp - sizeof(struct pthread));
}
__asm__ __volatile__ ("mr %0, 2" : "=r"(tp) : : );
#else
register char *tp __asm__("r2");
+ __asm__ __volatile__ ("" : "=r" (tp) );
#endif
return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
}