From 02a73e2bed57cea55e3defa3ae040f8f166e327e Mon Sep 17 00:00:00 2001
From: Andy Polyakov
Date: Mon, 4 Jul 2011 11:20:33 +0000
Subject: [PATCH] s390x-gf2m.pl: commentary update (final performance numbers turned to be higher).

---
 crypto/bn/asm/s390x-gf2m.pl | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/crypto/bn/asm/s390x-gf2m.pl b/crypto/bn/asm/s390x-gf2m.pl
index eb389b323a..cd9f13eca2 100644
--- a/crypto/bn/asm/s390x-gf2m.pl
+++ b/crypto/bn/asm/s390x-gf2m.pl
@@ -12,17 +12,18 @@
 # The module implements bn_GF2m_mul_2x2 polynomial multiplication used
 # in bn_gf2m.c. It's kind of low-hanging mechanical port from C for
 # the time being... gcc 4.3 appeared to generate poor code, therefore
-# the effort. The module delivers 55%-90% improvement on haviest ECDSA
-# verify and ECDH benchmarks for 163- and 571-bit keys on z990, and
-# 25%-30% - on z196(*). This is for 64-bit build. In 32-bit "highgprs"
-# case improvement is even higher, for example on z990 it was measured
-# 80%-150%. ECDSA sign is modest 9%-12% faster. Keep in mind that
-# these coefficients are not ones for bn_GF2m_mul_2x2 itself, as not
-# all CPU time is burnt in it...
+# the effort. And indeed, the module delivers 55%-90%(*) improvement
+# on haviest ECDSA verify and ECDH benchmarks for 163- and 571-bit
+# key lengths on z990, 30%-55%(*) - on z10, and 70%-110%(*) - on z196.
+# This is for 64-bit build. In 32-bit "highgprs" case improvement is
+# even higher, for example on z990 it was measured 80%-150%. ECDSA
+# sign is modest 9%-12% faster. Keep in mind that these coefficients
+# are not ones for bn_GF2m_mul_2x2 itself, as not all CPU time is
+# burnt in it...
 #
-# (*) Though no improvement could be measured if compared to code
-# generated by gcc 4.1. Keep in mind that z196 is out-of-order
-# execution core and is better at executing poor code.
+# (*) gcc 4.1 was observed to deliver better results than gcc 4.3,
+# so that improvement coefficients can vary from one specific
+# setup to another.
 
 $flavour = shift;
 
-- 
2.25.1