the 64-bit push reads not only the 32-bit return address but also the
first 32 signal mask bits. if any were nonzero, the return address
obtained will be invalid.
at some point storage of the return address should probably be moved
to follow the saved mask so that there's plenty of room and the same code
can be used on x32 and regular x86_64, but for now I want a fix that
does not risk breaking x86_64, and this simple re-zeroing works.
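
to illustrate the failure, here is a minimal C model of what the push sees. it is only a sketch: the struct layout and the names slot_model, push64, rezero_high and demo are hypothetical, not taken from the source, and the little-endian composition of the 8-byte value is written out explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* hypothetical model: on x32 the 4-byte return address slot is
   immediately followed by the saved signal mask. */
struct slot_model {
    uint32_t ret_addr; /* 32-bit return address */
    uint32_t mask_lo;  /* first 32 bits of the saved signal mask */
};

/* value an 8-byte little-endian load starting at the slot would see,
   which is what a 64-bit push of that slot puts on the stack. */
uint64_t push64(const struct slot_model *s)
{
    return ((uint64_t)s->mask_lo << 32) | s->ret_addr;
}

/* effect of storing a 32-bit zero at offset 4 of the pushed value:
   the upper half is cleared and only the return address remains. */
uint64_t rezero_high(uint64_t pushed)
{
    return pushed & 0xffffffffu;
}

/* worked example: a nonzero mask corrupts the pushed value until the
   upper half is cleared again. */
uint64_t demo(void)
{
    struct slot_model s = { 0x00401000u, 0x5u };
    assert(push64(&s) != s.ret_addr);
    return rezero_high(push64(&s));
}
```

with an all-zero mask the two values already agree, which would explain why the problem only shows up when some of the first 32 signals are blocked.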
call setjmp@PLT
pushq 64(%rbx)
+ movl $0, 4(%rsp)
mov %rbx,%rdi
mov %eax,%esi
mov 72+8(%rbx),%rbx