path: root/crypto/sha
Age | Commit message | Author
2017-09-09 | sha/asm/keccak1600-armv8.pl: fix return value buglet and ... | Andy Polyakov
... script data load. On a related note, an attempt was made to merge rotations with logical operations. The ARM ISA has merged rotate-and-logical instructions that could be used here, and they were used to improve keccak1600-armv4 performance, but not here. Although this approach gave an improvement on Cortex-A53 proportional to the reduction in instruction count, ~8%, it did not work out on non-Cortex cores, presumably because they break merged instructions into separate μ-ops, which results in a higher *operations* count. X-Gene and Denver went ~20% slower and Apple A7 40% slower. The optimization was therefore dismissed. Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-08-27 | MSC_VER <= 1200 isn't supported; remove dead code | Rich Salz
VisualStudio 6 and earlier aren't supported. Reviewed-by: Andy Polyakov <appro@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4263)
2017-08-16 | sha/asm/keccak1600-armv4.pl: optimize for Thumb-2. | Andy Polyakov
Reduce per-round instruction count in Thumb-2 case by 16%. This is achieved by folding ldr/str pairs to their double-word counterparts. Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-08-12 | sha/asm/keccak1600-avx512.pl: fix buglet in SHA3_squeeze tail. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-08-02 | sha/asm/keccak1600-armv4.pl: improve non-NEON performance by ~10%. | Andy Polyakov
This is achieved mostly by a ~10% reduction in the number of instructions per round, thanks to a) switching to the KECCAK_2X variant; b) merging almost half of the rotations with logical instructions. Performance is improved on all observed processors except Cortex-A15, which can exploit more parallelism and executes the original code in the same amount of time. Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/4057)
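The "merge of rotations with logical instructions" mentioned above can be pictured with a small model. This is only an illustrative sketch in Python, not the module's code: the helper names `ror32`, `separate` and `merged` are hypothetical, and the assembly in the comments merely indicates the kind of ARM instruction each step stands for (a data-processing instruction whose second operand is rotated by the barrel shifter).

```python
MASK32 = 0xFFFFFFFF


def ror32(x, n):
    """Rotate a 32-bit word right by n bits."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32


def separate(acc, lane, n):
    """Two steps: materialize the rotated word, then XOR it in."""
    rotated = ror32(lane, n)     # ror  tmp, lane, #n
    return acc ^ rotated         # eor  acc, acc, tmp


def merged(acc, lane, n):
    """One step: the rotation rides along as the XOR's shifted operand."""
    return acc ^ ror32(lane, n)  # eor  acc, acc, lane, ror #n
```

Both functions return the same value; the merged form simply stands for one instruction instead of two, which is where a lower per-round instruction count would come from.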
2017-08-01 | sha/keccak1600.c: choose more sensible default parameters. | Andy Polyakov
"More" refers to the fact that we make an active BIT_INTERLEAVE choice in some specific cases. Update commentary accordingly. Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-30 | Fix typo in sha1-thumb.pl | Xiaoyin Liu
Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4056)
2017-07-25 | sha/keccak1600.c: build and make it work with strict warnings. | Andy Polyakov
Reviewed-by: Paul Dale <paul.dale@oracle.com> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3943)
2017-07-24 | sha/asm/keccak1600-avx512.pl: improve performance by 17%. | Andy Polyakov
The improvement results from combining data layout ideas from the Keccak Code Package and the initial version of this module. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-21 | sha/asm/keccak1600-avx512.pl: absorb bug-fix and minor optimization. | Andy Polyakov
Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-21 | x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results. | Andy Polyakov
"Optimize" is in quotes because it is rather a "salvage operation" for now. The idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. The former affects the choice of the AES-NI code path specific to Silvermont (Knights Landing is of Silvermont "ancestry"), and the 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. AVX-512 results cover even Skylake-X :-) Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-15 | sha/asm/keccak1600-avx2.pl: optimized remodelled version. | Andy Polyakov
The new register usage pattern achieves slightly better performance, though not as much as I hoped for. Performance is believed to be limited by irreconcilable write-back conflicts, rather than lack of computational resources or data dependencies. Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-15 | sha/asm/keccak1600-avx2.pl: remodel register usage. | Andy Polyakov
This gives much more freedom to rearrange instructions. This is an unoptimized version, provided for reference. Compare it to the initial 29724d0e15b4934abdf2d7ab71957b05d1a28256 to see the key difference. Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-10 | Optimize sha/asm/keccak1600-avx2.pl. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-10 | Add sha/asm/keccak1600-avx2.pl. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-07 | Add sha/asm/keccak1600-avx512.pl. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/3861)
2017-07-03 | sha/keccak1600.c: internalize KeccakF1600 and simplify SHA3_absorb. | Andy Polyakov
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
2017-07-03 | sha/asm/keccak1600-x86_64.pl: close gap with Keccak Code Package. | Andy Polyakov
[Also typo and readability fixes. Ryzen result is added.] Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
2017-07-03 | sha/asm/keccak1600-s390x.pl: typo and readability, minor size optimization. | Andy Polyakov
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
2017-07-03 | x86_64 assembly pack: fill some blanks in Ryzen results. | Andy Polyakov
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
2017-06-29 | Add sha/asm/keccak1600-s390x.pl. | Andy Polyakov
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-06-29 | sha/asm/keccak1600-x86_64.pl: add CFI directives. | Andy Polyakov
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-06-29 | sha/asm/keccak1600-x86_64.pl: optimize by re-ordering instructions. | Andy Polyakov
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-06-29 | sha/asm/keccak1600-x86_64.pl: remove redundant moves. | Andy Polyakov
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-06-29 | Add sha/asm/keccak1600-x86_64.pl. | Andy Polyakov
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-06-24 | sha/asm/keccak1600-mmx.pl: optimize for Atom and add comparison data. | Andy Polyakov
Curiously enough, out-of-order Silvermont benefited most from the optimization, 33%. [The originally mentioned "anomaly" turned out to be a misreported frequency scaling problem. Correct results were collected under an older kernel.] Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/3739)
2017-06-24 | Add sha/asm/keccak1600-mmx.pl, x86 MMX module. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/3739)
2017-06-21 | sha/asm/sha512p8-ppc.pl: add POWER8 performance data. | Andy Polyakov
[skip ci] Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3705)
2017-06-21 | Add Keccak-1600 modules for PPC64 and POWER8. | Andy Polyakov
[skip ci] Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3705)
2017-06-21 | Add sha/asm/keccak1600-c64x.pl | Andy Polyakov
[skip ci] Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/3708)
2017-06-15 | Add sha/asm/keccak1600-armv8.pl. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-08 | sha/asm/keccak1600-armv4.pl: switch to more efficient bit interleaving algorithm. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-08 | sha/keccak1600.c: switch to more efficient bit interleaving algorithm. | Andy Polyakov
[Also bypass sizeof(void *) == 8 check on some platforms.] Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-06 | sha/asm/keccak1600-armv4.pl: add NEON code path. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-06 | sha/asm/keccak1600-armv4.pl: add SHA3_absorb and SHA3_squeeze. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-06 | sha/asm/keccak1600-armv4.pl: optimization based on profiler feedback. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-06 | Add sha/asm/keccak1600-armv4.pl. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-05 | sha/keccak1600.c: add #ifdef KECCAK1600_ASM. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-05 | sha/keccak1600.c: reduce temporary storage utilization even further. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-05 | sha/keccak1600.c: add another 1x variant. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-05 | sha/keccak1600.c: add ARM-specific "reference" tweaks. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-05-30 | sha/keccak1600.c: implement lane complementing transform | Andy Polyakov
...as discussed in section 2.2 of "Keccak implementation overview". [skip ci] Reviewed-by: Rich Salz <rsalz@openssl.org>
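The Boolean identity behind lane complementing can be sketched as follows. This is a minimal illustration in Python, not the transform as implemented in sha/keccak1600.c: the function names are hypothetical, and which lanes the real code keeps in complemented form is not shown here. The sketch only demonstrates that if one input lane is stored complemented, a chi-style term a ^ (~b & c) can be computed without a NOT, at the price of the result also coming out complemented.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def complement(x):
    """Bitwise NOT restricted to a 64-bit lane."""
    return ~x & MASK64


def chi_plain(a, b, c):
    """One chi-style output lane: a ^ (~b & c), using an explicit NOT."""
    return a ^ (complement(b) & c)


def chi_complemented(a, b, c_bar):
    """Same term with c supplied already complemented (c_bar = ~c).

    Since ~b & c == ~(b | ~c), the value a ^ (b | c_bar) is the
    complement of chi_plain(a, b, c), and no NOT is executed here.
    """
    return a ^ (b | c_bar)
```

For any 64-bit a, b and c, `chi_complemented(a, b, complement(c))` equals `complement(chi_plain(a, b, c))`, so the NOT is traded for bookkeeping about which lanes are currently held in complemented form.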
2017-05-30 | sha/keccak1600.c: implement bit interleaving optimization. | Andy Polyakov
This targets 32-bit processors and is discussed in section 2.1 of "Keccak implementation overview". Reviewed-by: Rich Salz <rsalz@openssl.org>
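The idea of bit interleaving can be sketched as follows. This is an illustrative Python model, not the code from sha/keccak1600.c, and all function names are hypothetical. A 64-bit lane is split into a word holding its even-numbered bits and a word holding its odd-numbered bits; a 64-bit rotation then becomes two 32-bit rotations, which is what makes the representation attractive on 32-bit processors. The bit-by-bit loops are written for clarity, not speed.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def rol32(x, n):
    """Rotate a 32-bit word left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def rol64(x, n):
    """Rotate a 64-bit lane left by n bits (reference for comparison)."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def to_interleaved(lane):
    """Split a lane into (even_bits, odd_bits), each a 32-bit word."""
    even = odd = 0
    for i in range(32):
        even |= ((lane >> (2 * i)) & 1) << i
        odd |= ((lane >> (2 * i + 1)) & 1) << i
    return even, odd


def from_interleaved(even, odd):
    """Inverse of to_interleaved."""
    lane = 0
    for i in range(32):
        lane |= ((even >> i) & 1) << (2 * i)
        lane |= ((odd >> i) & 1) << (2 * i + 1)
    return lane


def rol64_interleaved(even, odd, n):
    """Rotate an interleaved lane left by n using only 32-bit rotations."""
    n %= 64
    k = n // 2
    if n % 2 == 0:
        # Even distance: each half rotates by n/2 and stays in place.
        return rol32(even, k), rol32(odd, k)
    # Odd distance: the halves swap roles, and the old odd bits
    # travel one position further than the old even bits.
    return rol32(odd, k + 1), rol32(even, k)
```

Converting a lane with `to_interleaved`, rotating with `rol64_interleaved` and converting back with `from_interleaved` gives the same result as `rol64` on the original lane.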
2017-05-11 | Remove filename argument to x86 asm_init. | David Benjamin
The assembler already knows the actual path to the generated file and, in other perlasm architectures, is left to manage debug symbols itself. Notably, in OpenSSL 1.1.x's new build system, which allows a separate build directory, converting .pl to .s as the scripts currently do results in the wrong paths. This also avoids inconsistencies from some of the files using $0 and some passing in the filename. Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Andy Polyakov <appro@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3431)
2017-05-11 | Cleanup - use e_os2.h rather than stdint.h | Richard Levitte
Not exactly everywhere, but in those source files where stdint.h is included conditionally, or where it will be eventually. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3447)
2017-05-05 | sha/sha512.c: fix formatting. | Andy Polyakov
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-03-29 | More typo fixes | FdaSilvaYY
Fix some comments too [skip ci] Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3069)
2017-03-22 | x86_64 assembly pack: add some Ryzen performance results. | Andy Polyakov
Reviewed-by: Tim Hudson <tjh@openssl.org>
2017-02-28 | Clean up references to FIPS | Emilia Kasper
This removes the fips configure option. This option is broken as the required FIPS code is not available. FIPS_mode() and FIPS_mode_set() are retained for compatibility, but FIPS_mode() always returns 0, and FIPS_mode_set() can only be used to turn FIPS mode off. Reviewed-by: Stephen Henson <steve@openssl.org>
2017-02-15 | sha/asm/*-x86_64.pl: add CFI annotations. | Andy Polyakov
Reviewed-by: Rich Salz <rsalz@openssl.org>