in practice this was probably a non-issue, because the necessary
barrier almost certainly exists in kernel space -- implementing signal
delivery without such a barrier seems impossible -- but for the sake
of correctness, it should be done here too.

in principle, without a barrier, it is possible that the thread to be
cancelled does not see the store of its cancellation flag performed by
another thread. this affects both the case where the signal arrives
before entering the critical program counter range from __cp_begin to
__cp_end (in which case both the signal handler and the inline check
fail to see the value which was already stored) and the case where the
signal arrives during the critical range (in which case the signal
handler should be responsible for cancellation, but when it does not
see the cancellation flag, it assumes the signal is spurious and
refuses to act on it).

in the fix, the barrier is placed only in the signal handler, not in
the inline check at the beginning of the critical program counter
range. if the signal handler runs before the critical range is
entered, it will of course take no action, but its barrier will ensure
that the inline check subsequently sees the store. if on the other
hand the inline check runs first, it may miss seeing the store, but
the subsequent signal handler in the critical range will act upon the
cancellation request. this strategy avoids adding a memory barrier in
the common, non-cancellation code path.
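
the two orderings described above can be sketched with a toy,
single-threaded model. this is not musl code: struct model, its
fields, and the model_* functions are hypothetical names invented for
illustration, and "visible" merely stands in for whether the target
thread can observe the canceller's store. without the barrier the
model assumes the worst case, where the store is never observed.

```c
#include <assert.h>
#include <stdbool.h>

/* toy model only; none of these names exist in musl */
struct model {
	bool stored;   /* canceller has stored the cancellation flag */
	bool visible;  /* target thread can observe that store */
	bool in_range; /* target is between __cp_begin and __cp_end */
	bool acted;    /* cancellation has been acted upon */
};

/* models the signal handler, with or without the added barrier */
static void model_handler(struct model *m, bool use_barrier)
{
	if (use_barrier) m->visible = m->stored; /* stands in for a_barrier() */
	if (!m->visible) return;                 /* signal looks spurious */
	if (m->in_range) m->acted = true;        /* act only in the range */
}

/* models the inline check made on entry to the critical range */
static void model_inline_check(struct model *m)
{
	if (m->visible) m->acted = true;
	m->in_range = true;
}

/* returns whether the cancellation request ends up being honored */
bool model_run(bool handler_first, bool use_barrier)
{
	struct model m = { .stored = true };
	if (handler_first) {
		model_handler(&m, use_barrier);
		assert(!m.acted); /* outside the range: handler takes no action */
		model_inline_check(&m);
	} else {
		model_inline_check(&m);
		model_handler(&m, use_barrier);
	}
	return m.acted;
}
```

with use_barrier set, model_run reports the request as honored for
either ordering; with it clear, the request is lost in both, which
corresponds to the two failure cases in the second paragraph.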

 	const char *ip = ((char **)&uc->uc_mcontext)[CANCEL_REG_IP];
 	extern const char __cp_begin[1], __cp_end[1];
+	a_barrier();
 	if (!self->cancel || self->canceldisable) return;
 	_sigaddset(&uc->uc_sigmask, SIGCANCEL);