-
Notifications
You must be signed in to change notification settings - Fork 767
Description
Prerequisites
Please put an X between the brackets as you perform the following steps:
- Check that your issue is not already filed:
https://github.com/leanprover/lean4/issues - Reduce the issue to a minimal, self-contained, reproducible test case.
Avoid dependencies to Mathlib or Batteries. - Test your test case against the latest nightly release, for example on
https://live.lean-lang.org/#project=lean-nightly
(You can also use the settings there to switch to “Lean nightly”)
Description
Using ↑x in a term (that has an explicit expected type) at a point where the expected type of ↑x is not yet known can cause unnecessary attempts at unification that can be very slow.
Context
This was noticed because it can cause apply?/exact? to time out unexpectedly. This can be a serious drag on QoL when working interactively, as it can render apply?/exact? almost unusable. (And there is no work-around by writing terms differently; the terms are generated internally by apply?/exact?.)
Further investigation isolated the issue described here.
See #mathlib4 > apply? failure @ 💬, #mathlib4 > Coercion triggers timeout @ 💬, and #lean4 > Coercion and unification @ 💬 (and further threads linked there) on Zulip for more details.
The issue lean4#2831 seems related. It appears that the slowness observed there is caused by the same phenomenon.
Steps to Reproduce
Here is a Mathlib-free example that shows that minor differences in the code can make a big difference in elaboration time. (The heartbeat counts are with v4.20.0-rc5 on the web editor.)
@[irreducible] def A : Type := Unit
@[irreducible] def B : Type := Unit
unseal B in
@[coe] def AtoB (_a : A) : B := ()
instance : Coe A B where coe := AtoB
def dontUnfoldMe {α : Type} (x y : α) : Nat → List Bool → Nat
| 0, _ => 0
| n+1, l => dontUnfoldMe x y n (true::l) + dontUnfoldMe x y n (false::l)
theorem dontUnfoldMe_eq {α : Type} (x y : α) : ∀ n l, dontUnfoldMe x y n l = 0
| 0, _ => rfl
| n+1, l => by rw [dontUnfoldMe, dontUnfoldMe_eq x y n, dontUnfoldMe_eq x y n]
set_option trace.profiler true
set_option trace.profiler.useHeartbeats true
-- set_option trace.Meta.isDefEq true
-- set_option trace.Elab.coe true
-- 71k profiler heartbeats
example (b : B) (a : A) : dontUnfoldMe b (AtoB a) 14 [] = 0 :=
dontUnfoldMe_eq b (↑a) 14 [] -- fast
-- 103165k (25787k for 12) profiler heartbeats
example (a : A) (b : B) : dontUnfoldMe (AtoB a) b 14 [] = 0 :=
dontUnfoldMe_eq (↑a) b 14 [] -- slow (~ 2^m)
-- 71k profiler heartbeats
example (a : A) (b : B) : dontUnfoldMe (AtoB a) b 14 [] = 0 :=
(dontUnfoldMe_eq (↑a) b 14 [] :) -- fast
-- 25k profiler heartbeats
example (a : A) (b : B) : dontUnfoldMe (AtoB a) b 14 [] = 0 :=
dontUnfoldMe_eq .. -- fast
-- 152555k (10), 50889k (9) profiler heartbeats
example (a : A) (b : B) : dontUnfoldMe (AtoB a) b 10 [] = 0 :=
dontUnfoldMe_eq (↑a) b .. -- slower (~ 3^m)
-- 30495k (7), 10473k (6) profiler heartbeats
-- fails (type mismatch)
example (a : A) (b : B) : dontUnfoldMe (AtoB a) b 7 [] = 0 :=
dontUnfoldMe_eq (↑a) .. -- slower (~ 3^m)
-- 25k profiler heartbeats
example (a : A) (b : B) : dontUnfoldMe (AtoB a) b 14 [] = 0 :=
dontUnfoldMe_eq _ b .. -- fast
-- 66k profiler heartbeats
example (a : A) (b : B) : dontUnfoldMe (AtoB a) b 14 [] = 0 :=
dontUnfoldMe_eq (α := B) a .. -- fastInterpretation:
- When the expected type of
↑ais known when this term is elaborated, the coercionAtoBis found, and↑ais replaced byAtoB a. This leads to an (expected) fast final type-check. - When the expected type of
↑ais not (yet) known,↑agets replaced by?m.abc : ?m.xyz, where?m.abcis an unassignable metavariable (and?.xyzis assignable). In particular?m.abcdoes not unify withAtoB a. When Lean tries to unify the type ofdontUnfoldMe_eq ...with the expected typedontUnfoldMe (AtoB a) n 14 [], this forces it to keep unfoldingdontUnfoldMeall the way down, until unification either succeeds (because the metavariable gets eliminated in the end) or fails. - However, before we try to unify the types, we actually know the expected type of
↑a(because it is determined by the typeBofb), so Lean could go back and check for the coercion to be used, thus avoiding the slow unification attempt. - When the given term is
(dontUnfoldMe_eq (↑a) b 14 [] :), it is elaborated without expected type; at the end of that, Lean goes back to↑aand finds the coercion, resulting in the correct type without unsassignable metavariables.
Compare (parts of) the trace for (dontUnfoldMe_eq (↑a) b 14 [] :):
[step] [44453.000000] expected type: dontUnfoldMe (↑a) b 14 [] = 0, term
(dontUnfoldMe_eq (↑a) b 14 [] :) ▼
# Elaborate the definition without expected type first
[] [5440.000000] expected type: <not-available>, term
dontUnfoldMe_eq (↑a) b 14 [] ▼
[] [216.000000] expected type: ?m.263087, term
(↑a) ▼
[] [90.000000] expected type: ?m.263087, term
↑a
# At this point, we don't know yet which type `↑a` is supposed to have
# --> replace by `?m.263088 : ?m.263087` where `?m.263088` is unassignable
[Meta.isDefEq] [7.000000] ✅️ ?m.263087 =?= ?m.263087
[] [569.000000] expected type: B, term
b ▼
[Meta.isDefEq] [287.000000] ✅️ ?m.263087 =?= B ▼
# We know that the second argument `b` has type `B` and
# that the first argument has the same type
# --> assign `?m.263087 := B`
[] ?m.263087 [assignable] =?= B [nonassignable]
[assign] [198.000000] ✅️ ?m.263087 := B ▶
# Now we go back to `↑a`; it has expected type `B`, and we find the coercion.
[] [37967.000000] expected type: B, term
↑a ▼
[] [197.000000] expected type: <not-available>, term
a
[Meta.isDefEq] [83.000000] ❌️ A =?= B
[coe] [37451.000000] adding coercion for a : A =?= B ▶
# Finally, we check that the types agree, which is now trivial, as they are the same.
[Meta.isDefEq] [544.000000] ✅️ dontUnfoldMe (↑a) b 14 [] = 0 =?= dontUnfoldMe (↑a) b 14 [] = 0 ▼
[] [14.000000] ✅️ dontUnfoldMe (↑a) b 14 [] =?= dontUnfoldMe (↑a) b 14 []
[] [10.000000] ✅️ 0 =?= 0
[] [7.000000] ✅️ Nat =?= Nat
[Meta.isDefEq] [430.000000] ✅️ dontUnfoldMe (↑a) b 14 [] = 0 =?= dontUnfoldMe (↑a) b 14 [] = 0 ▶
with that for dontUnfoldMe_eq (↑a) b 14 []:
# Elaborate the definition with expected type `dontUnfoldMe (↑a) b 14 [] = 0`
[step] [54948707.000000] expected type: dontUnfoldMe (↑a) b 14 [] = 0, term
dontUnfoldMe_eq (↑a) b 14 [] ▼
[] [216.000000] expected type: ?m.749, term
(↑a) ▼
[] [90.000000] expected type: ?m.749, term
↑a
# At this point, we don't know yet which type `↑a` is supposed to have
# --> replace by `?m.750 : ?m.749` where `?m.750` is unassignable
[Meta.isDefEq] [7.000000] ✅️ ?m.749 =?= ?m.749
[] [568.000000] expected type: B, term
b ▼
[Meta.isDefEq] [286.000000] ✅️ ?m.749 =?= B ▼
# We know that the second argument `b` has type `B` and
# that the first argument has the same type
# (*) --> assign `?m.749 := B`
[] ?m.749 [assignable] =?= B [nonassignable]
[assign] [197.000000] ✅️ ?m.749 := B ▶
# Now check if the type of the example can be the same as the type of its definition
[Meta.isDefEq] [54943086.000000] ✅️ dontUnfoldMe (↑a) b 14 [] = 0 =?= dontUnfoldMe ?m.750 b 14 [] = 0 ▼
# This unification only succeeds after unfolding `dontUnfoldMe` all the way down,
# taking exponential time.
# (In most similar "real-life" situations, the unification will fail because `?m.750`
# is unassignable and so, as long as it occurs in the term on the left, the problem
# has no solution. In particular, `?m.750` does not get assigned to `AtoB a`,
# which occurs at the corresponding position on the right.)
[] [54942546.000000] ✅️ dontUnfoldMe (↑a) b 14 [] =?= dontUnfoldMe ?m.750 b 14 [] ▼
[delta] [340.000000] ❌️ dontUnfoldMe (↑a) b 14 [] =?= dontUnfoldMe ?m.750 b 14 [] ▶
# Now we start unfolding...
[] [54940326.000000] ✅️ dontUnfoldMe (↑a) b 13 [true] +
dontUnfoldMe (↑a) b 13 [false] =?= dontUnfoldMe ?m.750 b 13 [true] + dontUnfoldMe ?m.750 b 13 [false] ▼
[delta] [54939810.000000] ✅️ dontUnfoldMe (↑a) b 13 [true] +
[] [27460189.000000] ✅️ dontUnfoldMe (↑a) b 13 [true] =?= dontUnfoldMe ?m.750 b 13 [true] ▶
[] [27478501.000000] ✅️ dontUnfoldMe (↑a) b 13 [false] =?= dontUnfoldMe ?m.750 b 13 [false] ▶
# After the unification has succeeded, there is another check that the types agree,
# this time with sides switched, which takes equally long.
[Meta.isDefEq] [52116086.000000] ✅️ dontUnfoldMe ?m.750 b 14 [] = 0 =?= dontUnfoldMe (↑a) b 14 [] = 0 ▶
# Only now do we go back and look at the term `↑a` in `dontUnfoldMe_eq (↑a) b 14 []` again.
# We now know that the expected type is `B`, so we can go and find the coercion `AtoB`.
# But note that we knew that already at point (*) above!
[step] [37987.000000] expected type: B, term
↑a ▼
[] [197.000000] expected type: <not-available>, term
a
[Meta.isDefEq] [83.000000] ❌️ A =?= B
[coe] [37471.000000] adding coercion for a : A =?= B ▶
Expected behavior: Elaboration speed should not heavily depend on the order of the arguments ↑a and b.
Actual behavior: Elaboration can be much slower when ↑a precedes b.
Versions
v4.20.0-rc5 on live.lean-lang.org.
v4.19.0 on my laptop (Linux Debian 6.1.115-1 (2024-11-01)).
Additional Information
Jovan Gerbscheid has a PR lean#8294 that improves caching of unification results (and failures, I assume). This mitigates the problem to some extent (the slow unification attempts caused by this seem to involve quite a bit of repetition in real-life cases), but does not fix its root cause.
Proposed Fix
Disclaimer: I do not know how the unification algorithm works in detail, and so I have no idea whether the following is practical.
The suggestion would be to keep track of when the type metavariable of the unassignable metavariable that stands for ↑x is assigned and at this point go back and re-elaborate ↑x with the now known expected type (so that hopefully the correct coercion can be found and the unassignable metavariable eliminated).
Impact
Add 👍 to issues you consider important. If others are impacted by this issue, please ask them to add 👍 to it.