Age | Commit message (Collapse) | Author |
|
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
This causes problems with RAP and KERNEXEC for PaX, as r12 is a
reserved register.
Suggested-by: PaX Team <pageexec@freemail.hu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
At this stage the value if C[4] is at most ((2^256-1) + 38*(2^256-1)) / 2^256 = 38,
so there is no need to use a wide multiplication.
Change inspired by Andy Polyakov's OpenSSL implementation.
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Correctness can be quickly verified with the following z3py script:
>>> from z3 import *
>>> x = BitVec("x", 256) # any 256-bit value
>>> ref = URem(x, 2**255 - 19) # correct value
>>> t = Extract(255, 255, x); x &= 2**255 - 1; # btrq $63, %3
>>> u = If(t != 0, BitVecVal(38, 256), BitVecVal(19, 256)) # cmovncl %k5, %k4
>>> x += u # addq %4, %0; adcq $0, %1; adcq $0, %2; adcq $0, %3;
>>> t = Extract(255, 255, x); x &= 2**255 - 1; # btrq $63, %3
>>> u = If(t != 0, BitVecVal(0, 256), BitVecVal(19, 256)) # cmovncl %k5, %k4
>>> x -= u # subq %4, %0; sbbq $0, %1; sbbq $0, %2; sbbq $0, %3;
>>> prove(x == ref)
proved
Change inspired by Andy Polyakov's OpenSSL implementation.
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
The wide multiplication by 38 in mul_a24_eltfp25519_1w is redundant:
(2^256-1) * 121666 / 2^256 is at most 121665, and therefore a 64-bit
multiplication can never overflow.
Change inspired by Andy Polyakov's OpenSSL implementation.
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Also add cselect optimization.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
It's faster and doesn't use the FPU.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|