fix incorrect __hwcap seen in dynamic-linked __set_thread_area
the bug fixed in commit
b82cd6c78d812d38c31febba5a9e57dbaa7919c4 was
mostly masked on arm because __hwcap was zero at the point of the call
from the dynamic linker to __set_thread_area, causing the access to
libc.auxv to be skipped and kuser_helper versions of TLS access and
atomics to be used instead of the armv6 or v7 versions. however, on
kernels with kuser_helper removed for hardening it would crash.
since __set_thread_area potentially uses __hwcap, it must be
initialized before the function is called. move the AT_HWCAP lookup
from stage 3 to stage 2b.