Even though Apple refers to Procedure Call Standard for ARM Architecture
(AAPCS), they apparently adhere to custom version that doesn't follow
stack alignment constraints in the said standard. [Why or why? If it's
vendor lock-in thing, then it would be like worst spot ever.] And since
bsaes-armv7 relied on standard alignment, it became problematic to
execute the code on iOS.
Reviewed-by: Rich Salz <rsalz@openssl.org>
vmov @XMM[4],@XMM[15] @ just in case ensure that IV
vmov @XMM[5],@XMM[0] @ and input are preserved
bl AES_decrypt
- vld1.8 {@XMM[0]}, [$fp,:64] @ load result
+ vld1.8 {@XMM[0]}, [$fp] @ load result
veor @XMM[0], @XMM[0], @XMM[4] @ ^= IV
vmov @XMM[15], @XMM[5] @ @XMM[5] holds input
vst1.8 {@XMM[0]}, [$rounds] @ write output