summaryrefslogtreecommitdiffstats
path: root/crypto/sha
diff options
context:
space:
mode:
authorAndy Polyakov <appro@openssl.org>2015-03-03 22:05:25 +0100
committerAndy Polyakov <appro@openssl.org>2015-04-02 09:51:24 +0200
commit5a27a20be3c67c2ba5f0258b563bcfe41b1befe1 (patch)
tree3825c1a58b587cf282d4cd6f4fb697ea219dff3f /crypto/sha
parent3d5bb773ecd78f75984fb096bb0be7808d3dc18d (diff)
aes/asm/aesv8-armx.pl: optimize for Cortex-A5x.
ARM has optimized Cortex-A5x pipeline to favour pairs of complementary AES instructions. While modified code improves performance of post-r0p0 Cortex-A53 performance by >40% (for CBC decrypt and CTR), it hurts original r0p0. We favour later revisions, because one can't prevent future from coming. Improvement on post-r0p0 Cortex-A57 exceeds 50%, while new code is not slower on r0p0, or Apple A7 for that matter. [Update even SHA results for latest Cortex-A53.] Reviewed-by: Richard Levitte <levitte@openssl.org> (cherry picked from commit 94376cccb4ed5b376220bffe0739140ea9dad8c8)
Diffstat (limited to 'crypto/sha')
-rw-r--r--crypto/sha/asm/sha1-armv8.pl2
-rw-r--r--crypto/sha/asm/sha512-armv8.pl4
2 files changed, 3 insertions, 3 deletions
diff --git a/crypto/sha/asm/sha1-armv8.pl b/crypto/sha/asm/sha1-armv8.pl
index deb1238d36..9c091031b0 100644
--- a/crypto/sha/asm/sha1-armv8.pl
+++ b/crypto/sha/asm/sha1-armv8.pl
@@ -14,7 +14,7 @@
#
# hardware-assisted software(*)
# Apple A7 2.31 4.13 (+14%)
-# Cortex-A53 2.19 8.73 (+108%)
+# Cortex-A53 2.24 8.03 (+97%)
# Cortex-A57 2.35 7.88 (+74%)
#
# (*) Software results are presented mostly for reference purposes.
diff --git a/crypto/sha/asm/sha512-armv8.pl b/crypto/sha/asm/sha512-armv8.pl
index bd7a0a5662..adc273e81b 100644
--- a/crypto/sha/asm/sha512-armv8.pl
+++ b/crypto/sha/asm/sha512-armv8.pl
@@ -14,7 +14,7 @@
#
# SHA256-hw SHA256(*) SHA512
# Apple A7 1.97 10.5 (+33%) 6.73 (-1%(**))
-# Cortex-A53 2.38 15.6 (+110%) 10.1 (+190%(***))
+# Cortex-A53 2.38 15.5 (+115%) 10.0 (+150%(***))
# Cortex-A57 2.31 11.6 (+86%) 7.51 (+260%(***))
#
# (*) Software SHA256 results are of lesser relevance, presented
@@ -25,7 +25,7 @@
# (***) Super-impressive coefficients over gcc-generated code are
# indication of some compiler "pathology", most notably code
# generated with -mgeneral-regs-only is significanty faster
-# and lags behind assembly only by 50-90%.
+# and the gap is only 40-90%.
$flavour=shift;
$output=shift;