summaryrefslogtreecommitdiffhomepage
path: root/src/crypto
AgeCommit message (Collapse)Author
2018-09-10blake2s-x86_64: fix whitespace errorsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-10poly1305: switch to donnaJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-08poly1305: rewrite self tests from scratchJason A. Donenfeld
This removes the old cruft and makes things a bit more idiomatic. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-06compat: move simd.h from crypto to compat since it's going upstreamJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-06crypto: use CRYPTOGAMS licenseJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-06curve25519: arm: do not modify sp directlyJason A. Donenfeld
Thumb doesn't like this. Reported-by: Roman Mamedov <rm@romanrm.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-04global: prefer sizeof(*pointer) when possibleJason A. Donenfeld
Suggested-by: Sultan Alsawaf <sultanxda@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-03crypto: import zincJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28curve25519-arm: prefix immediates with #Jason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28curve25519-arm: do not waste 32 bytes of stackJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28curve25519-arm: use ordinary prolog and epilogueSamuel Neves
Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28curve25519-arm: add spaces after commasJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28curve25519-arm: cleanups from lkmlJason A. Donenfeld
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28curve25519-arm: reformatJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28curve25519-x86_64: let the compiler decide when/how to load constantsSamuel Neves
Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28curve25519-hacl64: use formally verified C for comparisonsJason A. Donenfeld
The previous code had been proved in Z3, but this new code from upstream KreMLin is directly generated from the F*, which is preferable. The assembly generated is identical. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28crypto: use unaligned helpersJason A. Donenfeld
This is not useful for WireGuard, but for the general use case we probably want it this way, and the speed difference is mostly lost in the noise. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-07curve25519-hacl64: correct u64_gte_maskSamuel Neves
Remove signed right shifts. Previously u64_gte_mask was only correct for x < 2^63. Z3 script proving correctness: >>> from z3 import * >>> >>> x = BitVec("x", 64) >>> y = BitVec("y", 64) >>> >>> t = LShR(x^((x^y)|((x-y)^y)), 63) - 1 >>> >>> prove(If(UGE(x, y), BitVecVal(-1, 64), BitVecVal(0, 64)) == t) proved Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-07curve25519-hacl64: simplify u64_eq_maskSamuel Neves
Avoid signed right shift. Z3 script showing equivalence: >>> from z3 import * >>> >>> x = BitVec("x", 64) >>> y = BitVec("y", 64) >>> >>> # Before ... x_ = ~(x ^ y) >>> x_ &= x_ << 32 >>> x_ &= x_ << 16 >>> x_ &= x_ << 8 >>> x_ &= x_ << 4 >>> x_ &= x_ << 2 >>> x_ &= x_ << 1 >>> x_ >>= 63 >>> >>> # After ... y_ = x ^ y >>> y_ = y_ | -y_ >>> y_ = LShR(y_, 63) - 1 >>> >>> prove(x_ == y_) proved Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-07chacha20: use memmove in case buffers overlapJason A. Donenfeld
Suggested-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-07curve25519-x86_64: avoid use of r12Jason A. Donenfeld
This causes problems with RAP and KERNEXEC for PaX, as r12 is a reserved register. Suggested-by: PaX Team <pageexec@freemail.hu> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-06crypto: move simd context to specific typeJason A. Donenfeld
Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-07-31main: add missing chacha20poly1305 headerJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-07-28curve25519-x86_64: tighten reductions modulo 2^256-38Samuel Neves
At this stage the value if C[4] is at most ((2^256-1) + 38*(2^256-1)) / 2^256 = 38, so there is no need to use a wide multiplication. Change inspired by Andy Polyakov's OpenSSL implementation. Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-07-28curve25519-x86_64: simplify the final reduction by adding 19 beforehandSamuel Neves
Correctness can be quickly verified with the following z3py script: >>> from z3 import * >>> x = BitVec("x", 256) # any 256-bit value >>> ref = URem(x, 2**255 - 19) # correct value >>> t = Extract(255, 255, x); x &= 2**255 - 1; # btrq $63, %3 >>> u = If(t != 0, BitVecVal(38, 256), BitVecVal(19, 256)) # cmovncl %k5, %k4 >>> x += u # addq %4, %0; adcq $0, %1; adcq $0, %2; adcq $0, %3; >>> t = Extract(255, 255, x); x &= 2**255 - 1; # btrq $63, %3 >>> u = If(t != 0, BitVecVal(0, 256), BitVecVal(19, 256)) # cmovncl %k5, %k4 >>> x -= u # subq %4, %0; sbbq $0, %1; sbbq $0, %2; sbbq $0, %3; >>> prove(x == ref) proved Change inspired by Andy Polyakov's OpenSSL implementation. Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-07-28curve25519-x86_64: tighten the x25519 assemblySamuel Neves
The wide multiplication by 38 in mul_a24_eltfp25519_1w is redundant: (2^256-1) * 121666 / 2^256 is at most 121665, and therefore a 64-bit multiplication can never overflow. Change inspired by Andy Polyakov's OpenSSL implementation. Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-22simd: add missing headerJason A. Donenfeld
Suggested-by: Shlomi Steinberg <shlomi@shlomisteinberg.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-22poly1305: give linker the correct constant data section sizeJason A. Donenfeld
Otherwise these constants will be merged wrong or excluded, and we'll wind up with wrong calculations. While bfd (the normal kernel linker) doesn't seem to mind, recent versions of gold do bad things. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-20poly1305: add missing string.h headerJason A. Donenfeld
Reported-by: Peter Korsgaard <peter@korsgaard.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-17simd: no need to restore fpu state when no preemptionJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-17simd: encapsulate fpu amortization into nice functionsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-14chacha20poly1305: use slow crypto on -rt kernels on arm tooJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-13chacha20poly1305: use slow crypto on -rt kernelsJason A. Donenfeld
In rt kernels, spinlocks call schedule(), which means preemption can't be disabled. The FPU disables preemption. Hence, we can either restructure things to move the calls to kernel_fpu_begin/end to be really close to the actual crypto routines, or we can do the slower lazier solution of just not using the FPU at all on -rt kernels. This patch goes with the latter lazy solution. The reason why we don't place the calls to kernel_fpu_begin/end close to the crypto routines in the first place is that they're very expensive, as it usually involves a call to XSAVE. So on sane kernels, we benefit from only having to call it once. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-02chacha20: add missing include to headerJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-31poly1305: mips: compute S on flyRené van Dorst
This reduces memory access and the total opaque size. Signed-off-by: René van Dorst <opensource@vdorst.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-31crypto: consistent constificationJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-31chacha20poly1305: combine stack variables into unionJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-31chacha20poly1305: split up into separate filesJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-29curve25519: x86_64: make symbol staticJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-29curve25519: x86_64: satisfy sparseJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-18chacha20poly1305: add mips32 implementationRené van Dorst
Signed-off-by: René van Dorst <opensource@vdorst.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-13chacha20poly1305: make gcc 8.1 happySamuel Neves
GCC 8.1 does not know about the invariant `0 <= ctx->num < POLY1305_BLOCK_SIZE`. This results in a warning that `memcpy(ctx->data + num, inp, len);` may overflow the `data` field, which is correct for arbitrary values of `num`. To make the invariant explicit we ensure that `num` is in the required range. An alternative would be to change `ctx->num` to a 4-bit bitfield at the point of declaration. This changes the code from `test ebp, ebp; jz end` to `and ebp, 15; jz end`, which have identical performance characteristics. Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-04-18poly1305: do not place constants in different sectionsJason A. Donenfeld
We're referencing these constants as one contiguous blob, so if there's any merging that goes on with other constants elsewhere (such as the kernel's current poly1305 implementation that we hope to replace), then these will be reordered and have the wrong values. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-04-16blake2s: remove unused helperJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-04-05chacha20poly1305: put magic constant behind macroJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-03-09curve25519: precomp const correctnessJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-03-09curve25519: memzero in batchesJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-03-09curve25519: use cmov instead of xor for cswapJason A. Donenfeld
Also add cselect optimization. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-03-09curve25519: use precomp implementation instead of sandy2xJason A. Donenfeld
It's faster and doesn't use the FPU. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-03-02crypto: read only after initJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>