perf: optimize scalar multiplications and multi-scalar multiplications circuits via lattice reductions #1697
Pull request overview
This PR migrates gnark's scalar decomposition hints from the eisenstein package to the new lattice package in gnark-crypto, implementing lattice-based rational reconstruction following the approach from "Fast elliptic curve scalar multiplications in SN(T)ARK circuits" (EEMP 2025). The new approach provides proven bounds from LLL lattice reduction theory instead of heuristic bounds, enabling tighter bit-width bounds for decomposed scalars.
Changes:
- Renamed hint functions: `halfGCD` → `rationalReconstruct` and `halfGCDEisenstein` → `rationalReconstructExt`
- Reduced bit bounds from `r.BitLen()/4 + 9` to `(r.BitLen()+3)/4 + 2`, saving ~7 iterations in scalar multiplication loops
- Updated imports to use `github.com/consensys/gnark-crypto/algebra/lattice` instead of the eisenstein package
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| std/algebra/native/twistededwards/hints.go | Reimplemented rationalReconstruct hint using lattice.RationalReconstruct with proper sign handling and overflow computation |
| std/algebra/native/twistededwards/point.go | Updated hint call from halfGCD to rationalReconstruct |
| std/algebra/native/twistededwards/curve_test.go | Added benchmark for constraint counting |
| std/algebra/native/sw_bls12377/hints.go | Reimplemented rationalReconstructExt using lattice.RationalReconstructExt for 4-part decomposition |
| std/algebra/native/sw_bls12377/g1.go | Updated hint call, bounds calculation, and comments to reflect new LLL-proven bounds |
| std/algebra/native/sw_bls12377/g1_test.go | Added benchmark for constraint counting |
| std/algebra/emulated/sw_emulated/hints.go | Reimplemented both rationalReconstruct and rationalReconstructExt for emulated field arithmetic |
| std/algebra/emulated/sw_emulated/point.go | Updated hint calls, bounds calculation, and comments |
| std/algebra/emulated/sw_emulated/point_test.go | Added benchmarks for multiple curve configurations |
Hmm, something strange is happening right now -- testing with And I think the issue is that we're using
func (p *Point) phi(api frontend.API, p1 *Point, curve *CurveParams, endo *EndoParams) *Point {
xy := api.Mul(p1.X, p1.Y)
yy := api.Mul(p1.Y, p1.Y)
f := api.Sub(1, yy)
f = api.Mul(f, endo.Endo[1])
g := api.Add(yy, endo.Endo[0])
g = api.Mul(g, endo.Endo[0])
h := api.Sub(yy, endo.Endo[0])
p.X = api.DivUnchecked(f, xy) // <---- here
p.Y = api.DivUnchecked(g, h)
return p
}
which is unconstrained by the
// DivUnchecked returns i1 / i2
// If i1 == i2 == 0, the return value (0) is unconstrained.
DivUnchecked(i1, i2 Variable) Variable
Here the test engine silently returns 0, and so does the R1CS solver (it could return anything, though), but the PLONK solver explicitly fails. So I think the GLV path in twistededwards still doesn't handle edge cases. Additionally, imo in another PR we should make the test engine stricter so that it panics explicitly in case we have
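To make the unconstrained case concrete, here is a small standalone Go model (not gnark code). It assumes, for illustration only, that an unchecked division is enforced by the single relation q·i2 = i1 over a prime field. The modulus 97 and the helper name `countQuotients` are hypothetical. Under that assumption it counts how many quotients the relation accepts.

```go
package main

import "fmt"

// p is a toy prime modulus chosen for illustration (assumption, not a gnark field).
const p = 97

// countQuotients returns how many q in [0, p) satisfy q*i2 == i1 (mod p),
// i.e. how many witnesses an unchecked-division relation would accept.
// Inputs are expected to be in [0, p).
func countQuotients(i1, i2 int) int {
	n := 0
	for q := 0; q < p; q++ {
		if (q*i2)%p == i1%p {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countQuotients(6, 3)) // regular case: a single valid quotient
	fmt.Println(countQuotients(0, 0)) // 0/0: every field element passes
	fmt.Println(countQuotients(5, 0)) // nonzero/0: no quotient passes
}
```

In the 0/0 case the relation leaves the solver free to pick any value, which would be consistent with the different solver behaviours described in the comment above.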
Made the test engine stricter in #1734. It is merged now and could merge
Thanks for the stricter test engine in #1734. I merged master and it immediately surfaced the issue. The problem was in
ivokub
left a comment
I have reviewed the emulated cases and am still reviewing the 2-chains, so I'm posting my comments for now. I'm not confident that the changes are correct. In particular, we seem to trust the hinted scalar mul result before we constrain it, and later we may also switch back to the hinted result without constraining it for particular edge cases (scalar=0, for example).
Also for
func TestScalarMulFakeGLVUnsafeS1Fails(t *testing.T) {
assert := test.NewAssert(t)
p256 := elliptic.P256()
s := big.NewInt(1)
px, py := p256.Params().Gx, p256.Params().Gy
unsafeCircuit := ScalarMulFakeGLVTest[emulated.P256Fp, emulated.P256Fr]{}
completeCircuit := ScalarMulFakeGLVEdgeCasesTest[emulated.P256Fp, emulated.P256Fr]{}
witness := ScalarMulFakeGLVTest[emulated.P256Fp, emulated.P256Fr]{
S: emulated.ValueOf[emulated.P256Fr](s),
Q: AffinePoint[emulated.P256Fp]{
X: emulated.ValueOf[emulated.P256Fp](px),
Y: emulated.ValueOf[emulated.P256Fp](py),
},
R: AffinePoint[emulated.P256Fp]{
X: emulated.ValueOf[emulated.P256Fp](px),
Y: emulated.ValueOf[emulated.P256Fp](py),
},
}
err := test.IsSolved(&unsafeCircuit, &witness, testCurve.ScalarField())
assert.Error(err)
completeWitness := ScalarMulFakeGLVEdgeCasesTest[emulated.P256Fp, emulated.P256Fr]{
S: witness.S,
P: witness.Q,
R: witness.R,
}
err = test.IsSolved(&completeCircuit, &completeWitness, testCurve.ScalarField())
assert.NoError(err)
}
So this fails to solve for
ivokub
left a comment
There is still the completeness issue for sw_emulated.scalarMulFakeGLV
@@ -1229,18 +1247,20 @@ func (c *Curve[B, S]) scalarMulFakeGLV(Q *AffinePoint[B], s *emulated.Element[S]
panic(err)
This still doesn't handle all edge cases when WithCompleteArithmetic is set. See the test:
func TestScalarMulFakeGLVUnsafeS1Fails(t *testing.T) {
assert := test.NewAssert(t)
p256 := elliptic.P256()
s := big.NewInt(1)
px, py := p256.Params().Gx, p256.Params().Gy
unsafeCircuit := ScalarMulFakeGLVTest[emulated.P256Fp, emulated.P256Fr]{}
completeCircuit := ScalarMulFakeGLVEdgeCasesTest[emulated.P256Fp, emulated.P256Fr]{}
witness := ScalarMulFakeGLVTest[emulated.P256Fp, emulated.P256Fr]{
S: emulated.ValueOf[emulated.P256Fr](s),
Q: AffinePoint[emulated.P256Fp]{
X: emulated.ValueOf[emulated.P256Fp](px),
Y: emulated.ValueOf[emulated.P256Fp](py),
},
R: AffinePoint[emulated.P256Fp]{
X: emulated.ValueOf[emulated.P256Fp](px),
Y: emulated.ValueOf[emulated.P256Fp](py),
},
}
err := test.IsSolved(&unsafeCircuit, &witness, testCurve.ScalarField())
assert.Error(err)
completeWitness := ScalarMulFakeGLVEdgeCasesTest[emulated.P256Fp, emulated.P256Fr]{
S: witness.S,
P: witness.Q,
R: witness.R,
}
err = test.IsSolved(&completeCircuit, &completeWitness, testCurve.ScalarField())
assert.NoError(err)
}
I think we should revert to using complete arithmetic when the option is set, instead of using a dummy point to shift. As the dummy point is known beforehand, it would otherwise always be possible, imo, to craft an edge case that makes the circuit incomplete.
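The concern about a publicly known dummy point can be illustrated with a toy model outside of gnark. The sketch below assumes a tiny curve y² = x³ + 7 over F_17 and a hypothetical helper `incompleteAdd` that implements only the chord formula. None of these names or parameters come from the PR. It shows that an input chosen equal to the known shift point hits the case the incomplete formula cannot handle.

```go
package main

import "fmt"

// p is a toy field modulus; the curve is y^2 = x^3 + 7 over F_17 (assumption).
const p = 17

type Point struct{ X, Y int }

// mod reduces a into [0, p), also for negative a.
func mod(a int) int { return ((a % p) + p) % p }

// inv returns a^(p-2) mod p (Fermat's little theorem), valid for a != 0 mod p.
func inv(a int) int {
	res, base, e := 1, mod(a), p-2
	for e > 0 {
		if e&1 == 1 {
			res = res * base % p
		}
		base = base * base % p
		e >>= 1
	}
	return res
}

// incompleteAdd uses the chord formula only. It reports ok=false when the
// x-coordinates coincide (a == b or a == -b), where the slope is undefined.
func incompleteAdd(a, b Point) (Point, bool) {
	if a.X == b.X {
		return Point{}, false
	}
	l := mod((b.Y - a.Y) * inv(b.X-a.X))
	x := mod(l*l - a.X - b.X)
	y := mod(l*(a.X-x) - a.Y)
	return Point{x, y}, true
}

func main() {
	dummy := Point{1, 5}  // publicly known shift point (hypothetical)
	honest := Point{2, 7} // an unrelated input point on the same curve

	_, ok := incompleteAdd(dummy, honest)
	fmt.Println("unrelated input handled:", ok)

	_, ok = incompleteAdd(dummy, dummy) // input chosen equal to the dummy
	fmt.Println("crafted input handled:", ok)
}
```

This is only a model of the argument: because the shift point is fixed in advance, an input colliding with it (or its negation) can always be chosen, whereas complete formulas have no such exceptional inputs.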
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Description
This PR migrates gnark's scalar decomposition hints from the `eisenstein` package to the new `lattice` package in Consensys/gnark-crypto#799, following the lattice-based rational reconstruction approach from "Fast elliptic curve scalar multiplications in SN(T)ARK circuits" by Eagen-ElHousni-Masson-Piellard (https://eprint.iacr.org/2025/933.pdf).
The new approach provides proven bounds from LLL lattice reduction theory, replacing heuristic bounds. This allows tighter bit-width bounds for the decomposed scalars, reducing circuit constraints.
The PR also revisits the complete arithmetic path to make it more constraint-optimized.
Changes
Hint Renames
- `halfGCD` → `rationalReconstruct` (2-part decomposition using `lattice.RationalReconstruct`)
- `halfGCDEisenstein` → `rationalReconstructExt` (4-part decomposition using `lattice.RationalReconstructExt`)
Tighter Bounds
The number of bits for decomposed scalars has been reduced:
- Before: `r.BitLen()/4 + 9` (heuristic with large safety margin)
- After: `(r.BitLen()+3)/4 + 2` (proven bound from LLL: outputs < 1.25·r^(1/4))
This saves ~7 iterations in the scalar multiplication loop.
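The two bound formulas can be compared directly. The sketch below evaluates both expressions quoted above for a few illustrative scalar-field bit lengths. The helper names `oldBound` and `newBound` and the list of bit lengths are assumptions made for the example.

```go
package main

import "fmt"

// oldBound is the previous heuristic bit bound: r.BitLen()/4 + 9.
func oldBound(bitLen int) int { return bitLen/4 + 9 }

// newBound is the LLL-derived bit bound: (r.BitLen()+3)/4 + 2.
func newBound(bitLen int) int { return (bitLen+3)/4 + 2 }

func main() {
	// Illustrative bit lengths only (assumption); integer division as in Go.
	for _, n := range []int{253, 254, 255, 256, 377, 384} {
		o, nw := oldBound(n), newBound(n)
		fmt.Printf("bitLen=%d old=%d new=%d saved=%d\n", n, o, nw, o-nw)
	}
}
```

With integer division the saving is 6 or 7 loop iterations depending on the bit length modulo 4, which matches the "~7" quoted above.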
Affected Packages
- `std/algebra/emulated/sw_emulated` - Emulated short Weierstrass curves G1
- `std/algebra/emulated/sw_bls12381`, `std/algebra/emulated/sw_bn254` and `std/algebra/emulated/sw_bw6761` - emulated G2
- `std/algebra/native/sw_bls12377` - Native BLS12-377 G1 and G2
- `std/algebra/native/twistededwards` - Native twisted Edwards curves
Type of change
How has this been tested?
All existing tests pass:
How has this been benchmarked?
Constraint Counts (Plonk/SCS)
G1 scalar multiplication:
G2 scalar multiplication:
G1 MSM of size 2 :
Applications:
Discussion
1. Hint Computation Time
2-part decomposition
4-part decomposition (RationalReconstructExt) - GLV curves
The new approach is slower for hint computation (4D) because it runs LLL reduction from scratch rather than using 2-step Eisenstein half-GCD. However, hint computation happens outside the prover and is negligible compared to proof generation time. The constraint reduction provides a net benefit.
3-part decomposition (MultiRationalReconstruct) - 2 scalars
6-part decomposition (MultiRationalReconstructExt) - 2 scalars
2. `logup` vs `Mux`
For G1 we can do a 4-MSM. For G2 we can leverage the Frobenius as a second endomorphism: we can apply it to all and get an 8-MSM, or to half and get a 6-MSM. But with big tables `Mux` becomes the bottleneck, so we can try with `logup`.
G1 BLS12-381:
G2 BLS12-381:
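The trade-off between loop length and table size can be sketched with a toy count. The model below is an assumption made for illustration and is not taken from the PR's benchmarks: it supposes that a d-dimensional method uses a joint table of 2^d entries and that a `Mux` lookup costs roughly linearly in the table size. It deliberately ignores the per-iteration doubling and addition cost, which is what favours shorter loops. The helper names are hypothetical.

```go
package main

import "fmt"

// iterations is the loop length for an n-bit scalar split into d sub-scalars.
func iterations(n, d int) int { return (n + d - 1) / d }

// tableEntries is the size of a joint lookup table over d points,
// assuming one entry per subset of the points (2^d entries).
func tableEntries(d int) int { return 1 << d }

func main() {
	const n = 256 // illustrative scalar size in bits (assumption)
	for _, d := range []int{2, 4, 6, 8} {
		it, tab := iterations(n, d), tableEntries(d)
		// it*tab is a rough proxy for total lookup work with a linear-cost mux.
		fmt.Printf("d=%d iterations=%d table=%d lookupWork=%d\n", d, it, tab, it*tab)
	}
}
```

Under these assumptions the loop shrinks linearly in d while the table grows exponentially, so the lookup work rises quickly for d = 6 and d = 8. Real constraint counts depend on the full cost of each iteration and on the lookup technique used.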
The 4D GLV+FakeGLV method with `Mux` remains optimal for single scalar multiplication on both G1 and G2. Higher-dimensional methods (6D, 8D) using the ψ endomorphism don't reduce constraints because the `Mux`/`logup` overhead outweighs the benefits of fewer loop iterations, even with logup optimization.
3. MSM
According to [EEMP25], we can turn an `MSM(2,n)` verification (i.e. an MSM of size 2 with scalars of n bits) into an `MSM(3,2n/3)` or `MSM(6,n/3)` verification. We implemented this for the native (SW and tEd) and emulated (SW) cases with `Mux` and `logup` (for native). In all scenarios the existing algorithms were better, except for:
- `MSM(3,2n/3)` with `LogUp`
- `MSM(3,2n/3)` with `Mux`
- `MSM(6,n/3)` with `Mux` (Bandersnatch).
Checklist:
`golangci-lint` does not output errors locally
Note
High Risk
High risk because it rewrites core elliptic-curve scalar multiplication verification logic and hint interfaces (including complete-arithmetic edge cases) across multiple curves; any mistake can silently break proof soundness or curve arithmetic correctness.
Overview
Optimizes elliptic-curve scalar multiplication verification in circuits by replacing Eisenstein/half-GCD based scalar decompositions with LLL-backed `lattice` rational reconstruction (`rationalReconstruct`, `rationalReconstructExt`) and tightening sub-scalar bit bounds.
Updates emulated SW code to use the new hints (including new G2 hints) and introduces a new GLV+fakeGLV-based `G2.ScalarMul` path for `bls12-381`, `bn254`, and `bw6-761`, with precomputed generator constants and expanded edge-case handling under `algopts.WithCompleteArithmetic()`.
Refactors native SW `bls12-377` to use lattice reconstruction for GLV+fakeGLV checks, adds a hint-backed complete `G1` joint-scalar-mul verifier, and adjusts arithmetic to avoid overflow via emulated scalar-field checks.
Improves MSM/joint-scalar-mul behavior in `sw_emulated` (prefer `ScalarMul`+Add for non-GLV, bias-point strategy, stricter edge-case/soundness handling) and updates twisted Edwards `DoubleBaseScalarMul` to choose between GLV and non-GLV optimized paths; adds extensive new tests/benchmarks and updates `internal/stats/latest_stats.csv` accordingly.
Written by Cursor Bugbot for commit 986d261.