path: root/src/crypto/zinc
Age | Commit message | Author
2019-05-29 | zinc: update copyright | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-05-29 | blake2s: shorten ssse3 loop | Samuel Neves
This (mostly) preserves the performance (as measured on Haswell and *lake) of last commit, but it drastically reduces code size.
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-05-29 | blake2s,chacha: latency tweak | Samuel Neves
In every odd-numbered round, instead of operating over the state

    x00 x01 x02 x03
    x05 x06 x07 x04
    x10 x11 x08 x09
    x15 x12 x13 x14

we operate over the rotated state

    x03 x00 x01 x02
    x04 x05 x06 x07
    x09 x10 x11 x08
    x14 x15 x12 x13

The advantage here is that this requires no changes to the 'x04 x05 x06 x07' row, which is in the critical path. This results in a noticeable latency improvement of roughly R cycles, for R diagonal rounds in the primitive.

In the case of BLAKE2s, which I also moved from requiring AVX to only requiring SSSE3, we save approximately 30 cycles per compression function call on Haswell and Skylake. In other words, this is an improvement of ~0.6 cpb.

This idea was pointed out to me by Shunsuke Shimizu, though it appears to have been around for longer.

Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
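The equivalence of the two layouts described above can be checked with a small scalar model. The sketch below is not the assembly from this commit: it models each SIMD register as a Python list of four 32-bit words, uses the ChaCha quarter-round only (BLAKE2s mixes in message words and is not modeled), and all function names are hypothetical.

```python
import random

MASK = 0xFFFFFFFF

def rotl32(v, n):
    """Rotate a 32-bit word left by n bits."""
    return ((v << n) | (v >> (32 - n))) & MASK

def quarter_rows(a, b, c, d):
    """ChaCha quarter-round applied lane-wise to four 4-word rows."""
    a = [(x + y) & MASK for x, y in zip(a, b)]
    d = [rotl32(x ^ y, 16) for x, y in zip(d, a)]
    c = [(x + y) & MASK for x, y in zip(c, d)]
    b = [rotl32(x ^ y, 12) for x, y in zip(b, c)]
    a = [(x + y) & MASK for x, y in zip(a, b)]
    d = [rotl32(x ^ y, 8) for x, y in zip(d, a)]
    c = [(x + y) & MASK for x, y in zip(c, d)]
    b = [rotl32(x ^ y, 7) for x, y in zip(b, c)]
    return a, b, c, d

def rot(row, n):
    """Rotate a 4-word row left by n lanes (negative n rotates right)."""
    n %= 4
    return row[n:] + row[:n]

def diagonal_classic(a, b, c, d):
    # Shuffle rows b, c, d so the diagonals line up as columns,
    # run the quarter-round, then undo the shuffles.
    a, b, c, d = quarter_rows(a, rot(b, 1), rot(c, 2), rot(d, 3))
    return a, rot(b, -1), rot(c, -2), rot(d, -3)

def diagonal_tweaked(a, b, c, d):
    # Shuffle rows a, c, d instead; row b (x04..x07) is left untouched.
    a, b, c, d = quarter_rows(rot(a, -1), b, rot(c, 1), rot(d, 2))
    return rot(a, 1), b, rot(c, -1), rot(d, -2)

rng = random.Random(0)
rows = [[rng.getrandbits(32) for _ in range(4)] for _ in range(4)]
same = diagonal_classic(*rows) == diagonal_tweaked(*rows)
print(same)  # True: both layouts process the same four diagonals
```

Both variants feed the same diagonals to the quarter-round; only the lane in which each diagonal sits differs, which is why the results match after the shuffles are undone.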
2019-05-29 | zinc: arm64: use cpu_get_elf_hwcap accessor for 5.2 | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-03-27 | blake2s: remove outlen parameter from final | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-03-27 | blake2s: simplify | Samuel Neves
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-02-03 | noise: store clamped key instead of raw key | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
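The log does not show what "clamped" means in code. Assuming it refers to the standard Curve25519 scalar clamping from RFC 7748, a minimal sketch looks like this (the function name is hypothetical and this is not the kernel implementation):

```python
def clamp_curve25519(raw: bytes) -> bytes:
    """Return a clamped copy of a 32-byte Curve25519 private key."""
    if len(raw) != 32:
        raise ValueError("Curve25519 private keys are 32 bytes")
    key = bytearray(raw)
    key[0] &= 248                     # clear the three lowest bits
    key[31] = (key[31] & 127) | 64    # clear the top bit, set the second-highest
    return bytes(key)
```

Clamping is idempotent, so storing the clamped key once gives the same result as clamping the raw key on every use.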
2019-02-03 | chacha20poly1305: permit unaligned strides on certain platforms | Jason A. Donenfeld
The map allocations required to fix this are mostly slower than unaligned paths.
Reported-by: Louis Sautier <sbraz@gentoo.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-01-23 | global: normalize -> clamp | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-01-07 | global: update copyright | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-12-07 | chacha20: do not define unused asm function | Jason A. Donenfeld
This causes RAP to be unhappy, and we're not using it anyway.
Reported-by: Ivan J. <parazyd@dyne.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-12-07 | chacha20,poly1305: simplify perlasm fanciness | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-19 | chacha20,poly1305: do not use xlate | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-17 | poly1305: make frame pointers for auxiliary calls | Samuel Neves
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-15 | chacha20,poly1305: don't do compiler testing in generator and remove xor helper | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-15 | poly1305: cleanup leftover debugging changes | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-15 | poly1305: only export neon symbols when in use | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-15 | chacha20,poly1305: fix up for win64 | Samuel Neves
These don't help us, but it is important to keep this working for when it's re-added to cryptogams.
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-15 | perlasm: avoid rep ret | Jason A. Donenfeld
The original hardcodes returns as .byte 0xf3,0xc3, aka "rep ret". We replace this by "ret". "rep ret" was meant to help with AMD K8 chips, cf. http://repzret.org/p/repzret. It makes no sense to continue to use this kludge for code that won't even run on ancient AMD chips.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
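The substitution described above can be illustrated on generated assembly text. The helper below is hypothetical and only a sketch: the commit itself edits the perlasm sources rather than post-processing their output, and the sample input is invented for the demonstration.

```python
import re

# Matches a whole line that hardcodes "rep ret" as raw bytes,
# keeping the leading indentation in group 1.
REP_RET = re.compile(
    r"^([ \t]*)\.byte[ \t]+0xf3[ \t]*,[ \t]*0xc3[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)

def replace_rep_ret(asm: str) -> str:
    """Rewrite hardcoded 'rep ret' byte sequences as a plain 'ret'."""
    return REP_RET.sub(r"\1ret", asm)

sample = "\tpop %rbx\n\t.byte 0xf3,0xc3\n"
print(replace_rep_ret(sample))
```

The encoded instruction shrinks from two bytes (0xf3 0xc3) to one (0xc3); the 0xf3 prefix has no architectural effect on the return itself.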
2018-11-15 | poly1305: specialize to wireguard | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-15 | chacha20: specialize to wireguard | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-15 | perlasm: cleanup whitespace | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-15 | poly1305: adjust to kernel | Samuel Neves
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-14 | chacha20: cleaner function declarations | Samuel Neves
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-14 | chacha20: normalize names | Samuel Neves
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-14 | chacha20: fixup win64 stack offsets | Samuel Neves
We don't need to do this for kernel purposes, but it's polite to leave things unbroken.
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-14 | chacha20: simplify stack unwinding on ChaCha20_ctr32 | Samuel Neves
objtool did not quite understand the stack arithmetic employed here.
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-14 | chacha20: use DRAP idiom | Samuel Neves
This effectively means swapping the usage of %r9 and %r10 globally.
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-14 | chacha20: add hchacha_ssse3 | Samuel Neves
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-14 | chacha20: begin adapting to kernel setting | Samuel Neves
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-14 | chacha20,poly1305: switch to perlasm originals on x86_64 | Samuel Neves
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-14 | chacha20,poly1305: use CONFIG_KERNEL_MODE_NEON in .pl on arm | Jason A. Donenfeld
While Andy is right to desire a separation between compiler defines and project defines, there are simply too many odd kernel configurations and we require testing for CONFIG_KERNEL_MODE_NEON.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-14 | chacha20,poly1305: switch to perlasm originals on mips and arm | Jason A. Donenfeld
We also separate out Eric Biggers' Cortex A7 implementation into its own file.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-13 | global: various formatting tweeks | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-27 | curve25519-x86_64: this was relicensed to BSD-3-Clause upstream | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-27 | poly1305-donna64: mark large constants as ULL | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-07 | crypto: clean up remaining .h->.c | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-07 | crypto: use BIT(i) & bitmap instead of (bitmap >> i) & 1 | Jason A. Donenfeld
Pros: clearer if you're not familiar with the shift idiom, uses kernel macro.
Cons: doesn't work any more if the lvalue ever ceases to be a bool.
Neutral: generates the same machine code.
Suggested-by: Sultan Alsawaf <sultanxda@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
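The "cons" point above can be seen in a small model. This is a Python sketch rather than kernel C: `BIT` here mimics the kernel macro as `1 << i`, and the two helper names are hypothetical.

```python
def BIT(i: int) -> int:
    """Model of the kernel's BIT() macro: a mask with only bit i set."""
    return 1 << i

def bit_by_shift(bitmap: int, i: int) -> int:
    """Old idiom: always yields exactly 0 or 1."""
    return (bitmap >> i) & 1

def bit_by_mask(bitmap: int, i: int) -> int:
    """New idiom: yields 0 or BIT(i), so it is only a truth value."""
    return BIT(i) & bitmap

bitmap = 0b1010
print(bit_by_shift(bitmap, 3))       # 1
print(bit_by_mask(bitmap, 3))        # 8
print(bool(bit_by_mask(bitmap, 3)))  # True
```

The two forms agree once the result is converted to a boolean, which is what assignment to a C `bool` does implicitly. Stored in an integer instead, the mask form keeps the value `BIT(i)` rather than 1.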
2018-10-07 | crypto: disable broken implementations in selftests | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-06 | crypto: test all SIMD combinations | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-06 | global: rename include'd C files to be .c | Jason A. Donenfeld
This is done by 259 other files in the kernel tree:

    linux $ rg '#include.*\.c' -l | wc -l
    259

Suggested-by: Sultan Alsawaf <sultanxda@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
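For readers without ripgrep, the count above can be approximated with a short script. This is a rough sketch with a hypothetical function name: unlike `rg`, it does not honor `.gitignore` or skip hidden and binary files, so the number it reports on a kernel tree may differ from 259. Like the original pattern, the regex also matches longer suffixes such as `.cc`.

```python
import os
import re
import tempfile

INCLUDE_C = re.compile(rb"#include.*\.c")

def count_files_including_c(root: str) -> int:
    """Count files under root with at least one line matching the pattern."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    if any(INCLUDE_C.search(line) for line in handle):
                        count += 1
            except OSError:
                continue  # unreadable file: skip it
    return count

# Tiny demonstration on a throwaway directory with invented file names.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "glue.c"), "w") as f:
        f.write('#include "impl-generic.c"\n')
    with open(os.path.join(tmp, "other.c"), "w") as f:
        f.write("#include <linux/kernel.h>\n")
    demo_count = count_files_including_c(tmp)

print(demo_count)  # 1
```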
2018-10-04 | curve25519-arm: rearrange multiplications for better in-order performance | Jason A. Donenfeld
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-04 | curve25519-arm: writeback to base register when possible | Jason A. Donenfeld
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-03 | blake2s: always put a simd, even if not use()'d | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-03 | simd: introduce useful disabling macro | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-03 | curve25519-arm: adjust comment | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-03 | curve25519-arm: use new simd api | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-02 | chacha20-arm: use proper reteq macro instead of bxeq | Jason A. Donenfeld
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-02 | global: change BUG_ON to WARN_ON | Jason A. Donenfeld
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-02 | poly1305: document rationale for base 2^26->2^64/32 conversion | Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
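The log entry names the conversion but does not show it. As a rough illustration, reading "2^64/32" as "base 2^64 or base 2^32, depending on word size" (an assumption), a Poly1305 accumulator held as five 26-bit limbs can be repacked as follows. The function names are hypothetical, and the sketch uses Python's arbitrary-precision integers rather than the carry-propagating word arithmetic a kernel implementation would need.

```python
def limbs26_to_int(h):
    """Combine five base-2^26 limbs (least significant first) into one integer."""
    return sum(limb << (26 * i) for i, limb in enumerate(h))

def int_to_limbs(value, bits, count):
    """Split an integer into `count` limbs of `bits` bits, least significant first."""
    mask = (1 << bits) - 1
    return [(value >> (bits * i)) & mask for i in range(count)]

def base26_to_base64(h):
    """Repack a 5x26-bit accumulator into three 64-bit words."""
    return int_to_limbs(limbs26_to_int(h), 64, 3)

def base26_to_base32(h):
    """Repack a 5x26-bit accumulator into five 32-bit words."""
    return int_to_limbs(limbs26_to_int(h), 32, 5)

h26 = [0x3FFFFFF] * 5  # the value 2^130 - 1
print([hex(w) for w in base26_to_base64(h26)])
```

Because the limbs are summed before being split again, limbs that have grown slightly past 26 bits (unreduced carries) are still handled correctly in this model.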