make arm crt_arch.h compatible with thumb code generation
compilers targeting armv7 may be configured to produce thumb2 code
instead of arm code by default, and in the future we may wish to
support targets where only the thumb instruction set is available.
the changes made here avoid operating directly on the sp register,
which is not possible in thumb code, and address an issue with the way
the address of _DYNAMIC is computed.
previously, the relative address of _DYNAMIC was stored with an
additional offset of -8 versus the pc-relative add instruction, since
on arm the pc register evaluates to ".+8". in thumb code, it instead
evaluates to ".+4". both are two (normal-size) instructions beyond "."
in the current execution mode, so the numbered label 2 used in the
relative address expression is simply moved two instructions ahead to
be compatible with both instruction sets.