wireguard-linux-compat - WireGuard Linux compat

Age	Commit message (Collapse)	Author
2018-06-22	simd: add missing header	Jason A. Donenfeld
	Suggested-by: Shlomi Steinberg <shlomi@shlomisteinberg.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-22	poly1305: give linker the correct constant data section size	Jason A. Donenfeld
	Otherwise these constants will be merged wrong or excluded, and we'll wind up with wrong calculations. While bfd (the normal kernel linker) doesn't seem to mind, recent versions of gold do bad things. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-20	poly1305: add missing string.h header	Jason A. Donenfeld
	Reported-by: Peter Korsgaard <peter@korsgaard.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-17	simd: no need to restore fpu state when no preemption	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-17	simd: encapsulate fpu amortization into nice functions	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-14	chacha20poly1305: use slow crypto on -rt kernels on arm too	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-13	chacha20poly1305: use slow crypto on -rt kernels	Jason A. Donenfeld
	In rt kernels, spinlocks call schedule(), which means preemption can't be disabled. The FPU disables preemption. Hence, we can either restructure things to move the calls to kernel_fpu_begin/end to be really close to the actual crypto routines, or we can do the slower lazier solution of just not using the FPU at all on -rt kernels. This patch goes with the latter lazy solution. The reason why we don't place the calls to kernel_fpu_begin/end close to the crypto routines in the first place is that they're very expensive, as it usually involves a call to XSAVE. So on sane kernels, we benefit from only having to call it once. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-02	chacha20: add missing include to header	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-31	poly1305: mips: compute S on fly	René van Dorst
	This reduces memory access and the total opaque size. Signed-off-by: René van Dorst <opensource@vdorst.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-31	crypto: consistent constification	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-31	chacha20poly1305: combine stack variables into union	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-31	chacha20poly1305: split up into separate files	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-29	curve25519: x86_64: make symbol static	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-29	curve25519: x86_64: satisfy sparse	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-18	chacha20poly1305: add mips32 implementation	René van Dorst
	Signed-off-by: René van Dorst <opensource@vdorst.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-13	chacha20poly1305: make gcc 8.1 happy	Samuel Neves
	GCC 8.1 does not know about the invariant `0 <= ctx->num < POLY1305_BLOCK_SIZE`. This results in a warning that `memcpy(ctx->data + num, inp, len);` may overflow the `data` field, which is correct for arbitrary values of `num`. To make the invariant explicit we ensure that `num` is in the required range. An alternative would be to change `ctx->num` to a 4-bit bitfield at the point of declaration. This changes the code from `test ebp, ebp; jz end` to `and ebp, 15; jz end`, which have identical performance characteristics. Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-04-18	poly1305: do not place constants in different sections	Jason A. Donenfeld
	We're referencing these constants as one contiguous blob, so if there's any merging that goes on with other constants elsewhere (such as the kernel's current poly1305 implementation that we hope to replace), then these will be reordered and have the wrong values. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-04-16	blake2s: remove unused helper	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-04-05	chacha20poly1305: put magic constant behind macro	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-03-09	curve25519: precomp const correctness	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-03-09	curve25519: memzero in batches	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-03-09	curve25519: use cmov instead of xor for cswap	Jason A. Donenfeld
	Also add cselect optimization. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-03-09	curve25519: use precomp implementation instead of sandy2x	Jason A. Donenfeld
	It's faster and doesn't use the FPU. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-03-02	crypto: read only after init	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-02-14	blake2s: use union instead of casting	Jason A. Donenfeld
	This deals with alignment more easily and also helps squelch a clang-analyzer warning. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-02-01	curve25519: replace fiat64 with faster hacl64	Jason A. Donenfeld
	This reverts commit da4ff396cc5d5e0ff21f9ecbc2f951c048c63fff and adds some optimizations to hacl64. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-02-01	curve25519: replace hacl64 with fiat64	Jason A. Donenfeld
	For now, it's faster: hacl64: 109782 cycles per call fiat64: 108984 cycles per call It's quite possible this commit will be reverted with nice changes from INRIA, though. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-30	chacha20poly1305: better buffer alignment	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-30	chacha20poly1305: use existing rol32 function	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-19	poly1305: add poly-specific self-tests	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-18	curve25519-fiat32: uninline certain functions	Jason A. Donenfeld
	While this has a negative performance impact on x86_64, it has a positive performance impact on smaller machines, which is where we're actually using this code. For example, an A53: Before: fiat32: 228605 cycles per call After: fiat32: 188307 cycles per call Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-18	curve25519: wire up new impls and remove donna	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-18	curve25519: resolve symbol clash between fe types	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-18	curve25519: import 64-bit hacl-star implementation	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-18	curve25519: import 32-bit fiat-crypto implementation	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-18	curve25519: modularize implementation	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-18	poly1305: remove indirect calls	Samuel Neves
	Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-03	global: year bump	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-13	crypto: compile on UML	Jason A. Donenfeld
	We basically just don't use FPU in UML. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-11	chacha20poly1305: wire up avx512vl for skylake-x	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-11	chacha20: avx512vl implementation	Samuel Neves
	Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-11	poly1305: fix avx512f alignment bug	Samuel Neves
	Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-11	chacha20poly1305: cleaner generic code	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-09	blake2s-x86_64: fix spacing	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-09	global: add SPDX tags to all files	Greg Kroah-Hartman
	It's good to have SPDX identifiers in all files as the Linux kernel developers are working to add these identifiers to all files. Update all files with the correct SPDX license identifier based on the license text of the project or based on the license in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Modified-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-03	chacha20-arm: fix with clang -fno-integrated-as.	David Benjamin
	The __clang__-guarded #defines cause gas to complain if clang is passed -fno-integrated-as. Emitting .syntax unified when those are used fixes this. Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Kurt Roeckx <kurt@roeckx.be> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-03	poly1305: update x86-64 kernel to AVX512F only	Samuel Neves
	Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-11-28	curve25519: explictly depend on AS_AVX	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-11-28	curve25519: modularize dispatch	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-11-26	blake2s: tweak avx512 code	Samuel Neves
	This is not as ideal as using zmm, but zmm downclocks. And it's not as fast single-threaded as using the gathers. But it is faster when multithreaded, which is what WireGuard is doing. Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>