move thread pointer setup to beginning of dynamic linker stage 3
this allows the dynamic linker itself to run with a valid thread
pointer, which is a prerequisite for stack protector on archs where
the ssp canary is stored in TLS. it will also allow us to remove some
remaining runtime checks for whether the thread pointer is valid.
as long as the application and its libraries do not require additional
size or alignment, this early thread pointer will be kept and reused
at runtime. otherwise, a new static TLS block is allocated after
library loading has finished and the thread pointer is switched over.