author Andy Polyakov <appro@openssl.org> 2011-07-04 11:20:33 +0000
committer Andy Polyakov <appro@openssl.org> 2011-07-04 11:20:33 +0000
commit 02a73e2bed57cea55e3defa3ae040f8f166e327e
tree 057c53352d3b437a3d29af3f3f1a056f87d8eef4 /crypto
parent c540aa2fb1b744342fa8bc0c3215a07d5d272abe
s390x-gf2m.pl: commentary update (final performance numbers turned out to be
higher).
Diffstat (limited to 'crypto')
-rw-r--r-- crypto/bn/asm/s390x-gf2m.pl 21
1 files changed, 11 insertions, 10 deletions
diff --git a/crypto/bn/asm/s390x-gf2m.pl b/crypto/bn/asm/s390x-gf2m.pl
index eb389b323a..cd9f13eca2 100644
--- a/crypto/bn/asm/s390x-gf2m.pl
+++ b/crypto/bn/asm/s390x-gf2m.pl
@@ -12,17 +12,18 @@
# The module implements bn_GF2m_mul_2x2 polynomial multiplication used
# in bn_gf2m.c. It's kind of low-hanging mechanical port from C for
# the time being... gcc 4.3 appeared to generate poor code, therefore
-# the effort. The module delivers 55%-90% improvement on haviest ECDSA
-# verify and ECDH benchmarks for 163- and 571-bit keys on z990, and
-# 25%-30% - on z196(*). This is for 64-bit build. In 32-bit "highgprs"
-# case improvement is even higher, for example on z990 it was measured
-# 80%-150%. ECDSA sign is modest 9%-12% faster. Keep in mind that
-# these coefficients are not ones for bn_GF2m_mul_2x2 itself, as not
-# all CPU time is burnt in it...
+# the effort. And indeed, the module delivers 55%-90%(*) improvement
+# on haviest ECDSA verify and ECDH benchmarks for 163- and 571-bit
+# key lengths on z990, 30%-55%(*) - on z10, and 70%-110%(*) - on z196.
+# This is for 64-bit build. In 32-bit "highgprs" case improvement is
+# even higher, for example on z990 it was measured 80%-150%. ECDSA
+# sign is modest 9%-12% faster. Keep in mind that these coefficients
+# are not ones for bn_GF2m_mul_2x2 itself, as not all CPU time is
+# burnt in it...
#
-# (*) Though no improvement could be measured if compared to code
-# generated by gcc 4.1. Keep in mind that z196 is out-of-order
-# execution core and is better at executing poor code.
+# (*) gcc 4.1 was observed to deliver better results than gcc 4.3,
+# so that improvement coefficients can vary from one specific
+# setup to another.
$flavour = shift;