Proof checker #764

oflatt · 2025-12-13T19:54:05Z

No description provided.

* use Arc for SourceExpr variable names * Revert "use Arc for SourceExpr variable names" This reverts commit 9269488. * working on proof api test * new proof api test * get matches * sourcevar pubcrate * working on get matches * working on query_and_explain test * working on get matches * move to egraph_operations.rs * prelude fix * add todos * improve docs * small fix * working on new way to implement query * worked on get_matches test * working get_matches * fix build and new api endpoint

There are still issues with unbound variables though, I believe due to how primitives work.

Revert "use Arc for SourceExpr variable names" This reverts commit 9269488. working on proof api test get matches working on get matches working on query_and_explain test working on get matches move to egraph_operations.rs prelude fix add todos improve docs small fix working on new way to implement query worked on get_matches test working get_matches fix build and new api endpoint renames fix nits add proof checker adaptation proof checker draft new proof not supported code testing global insertion fix up check proof move code around prim validation some fixes small fixes remove old file Add helper functions for collecting query columns from PR #741 Added to_surface() and collect_subexprs_preorder() methods to ResolvedExprExt and collect_query_columns() function. These helpers collect all subexpressions (not just variables) from resolved query facts, which is useful for operations that need to capture the full structure of matched terms. This is based on the approach from PR #741 for the fresh! sugar feature. Add allow(dead_code) to new helper functions These helpers are intended for future use when we need to collect all subexpressions from queries, not just variables. Move proof_checker tests to separate file Moved all proof_checker tests from src/proof_checker.rs to src/proof_checker/tests.rs to improve code organization and readability. The main proof_checker module is now ~450 lines down from ~1400 lines. All 39 tests still pass. prove-query prove query, remove bad tests in tests.rs working on real proof checker new proof checker test rename add new literal prim macro working on test cleanup testing some fixes better prim resolve refactor again try again try again on literal macro imports another try at testing literal convertible tests refactor another try at checker fix build erros small fix proof checker cleaner duplicate validator detection Update get_matches documentation to clarify all variables are captured - Updated documentation to emphasize that ALL variables from both sides of equality constraints are captured in matches - Made it clear that variables like 'x' and 'y' from expressions within equality patterns are accessible in the results - Added explicit comments explaining where each variable comes from - Formatted code to pass make nits Use facts! macro in tests and remove ALL emphasis in docs Add comprehensive proof API tests and fix formatting Make proof format accessible in egglog API and fix formatting another cleanup current diff small changes small fix working on todos getting closer on prims small cleanup another improvement many improvements oops new tests test changes small changes test deletion with duplicate tests

Copilot

Pull request overview

This PR implements a comprehensive proof checking system for egglog, enabling validation of proofs generated during e-graph operations. The implementation adds infrastructure for registering validators for primitive operations, checking proofs at various levels (term proofs, equality proofs, rule applications), and provides a new prove-query command for generating and verifying proofs.

Key changes include:

New proof checker with support for validating different proof types (PRule, PCong, PFiat, PRefl, PSym, PTrans)
Automatic validator generation via the add_literal_prim! macro for common primitive operations
Type inference system for proof validation
Support for proof checking mode via EGraph::with_proof_checking()

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 12 comments.

Show a summary per file

File	Description
tests/test_validators.rs	Comprehensive tests for primitive validators and the add_literal_prim macro
tests/prove_query_test.rs	Tests for the new prove_query functionality with various query patterns
tests/proof_checker_tests.rs	End-to-end tests for proof checker validation of rules, congruence, and primitives
tests/proof_api/tests/manual_proofs.rs	Tests for manual proof construction and validation
tests/proof_api/tests/query_and_explain.rs	Updated to reflect new behavior where equality constraint variables are included in matches
tests/proof_api/tests/query_proof.rs	Deleted file (functionality moved to prove_query_test.rs)
tests/files.rs	Refactored proof testing infrastructure with separate with_proofs and proof_checking modes
src/typechecking.rs	Extended PrimitiveWithId to support validators and output types; added validator registration methods
src/proof_checker.rs	Core proof checker implementation with proposition checking and rule validation
src/proof_check.rs	Standalone proof checking API that doesn't require running programs
src/proof_support.rs	Utilities for checking if programs support proof mode
src/proof_type_inference.rs	Type inference context for validating terms in proofs
src/literal_convertible.rs	Trait for converting between Rust types and egglog Literals
src/sort/i64.rs	Updated integer primitives to use add_literal_prim for automatic validator generation
src/sort/f64.rs	Updated float primitives to use add_literal_prim for automatic validator generation
src/sort/bool.rs	Updated boolean primitives to use add_literal_prim for automatic validator generation
src/sort/add_primitive/src/lib.rs	New macros: add_primitive_with_validator and add_literal_prim for validator support
src/lib.rs	Added proof_checking_enabled flag, ProveQuery command support, and new error types
src/egraph_operations.rs	Renamed get_proof to prove_query and updated documentation
src/core.rs	Updated SpecializedPrimitive field access and improved error messages for ambiguous resolution
src/prelude.rs	Added exports for LiteralConvertible, ProofStore, and proof types
src/serialize.rs	Removed unnecessary dead_code annotation
src/ast/*.rs	Added ProveQuery command throughout AST infrastructure
src/constraint.rs	Updated to use new PrimitiveWithId structure
egglog-bridge/src/termdag.rs	Added helper methods and explicit Default implementation
egglog-bridge/src/proof_format.rs	Extended ProofStore with accessor methods and added PFiat equality proof
egglog-bridge/src/proof_spec.rs	Minor formatting improvement
core-relations/src/query.rs	Removed unnecessary dead_code annotation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-12-13T19:59:18Z

src/core.rs

+        if resolved_call.is_empty() {
+            panic!(
+                "No resolution found for '{}' with types: {:?}",
+                head,
+                types.iter().map(|s| s.name()).collect::<Vec<_>>()
+            );
+        }
+
+        if resolved_call.len() > 1 {
+            let mut msg = format!(
+                "Ambiguous resolution for '{}' with types: {:?}\n",
+                head,
+                types.iter().map(|s| s.name()).collect::<Vec<_>>()
+            );
+            msg.push_str("Found multiple matching primitives/functions:\n");
+            for rc in &resolved_call {
+                match rc {
+                    ResolvedCall::Func(f) => {
+                        msg.push_str(&format!("  - Function: {}\n", f.name));
+                    }
+                    ResolvedCall::Primitive(p) => {
+                        msg.push_str(&format!(
+                            "  - Primitive: {} (inputs: {:?}, output: {})\n",
+                            p.primitive.primitive.name(),
+                            p.input.iter().map(|s| s.name()).collect::<Vec<_>>(),
+                            p.output.name()
+                        ));
+                    }
+                }
+            }
+            panic!("{}", msg);
+        }


The panic messages in this section (lines 129-159) are more detailed than typical panic messages, which is good. However, panicking on ambiguous resolution might be too aggressive - consider whether returning an error would be more appropriate for library code, allowing callers to handle the ambiguity gracefully.

src/sort/add_primitive/src/lib.rs

Copilot · 2025-12-13T19:59:18Z

src/egraph_operations.rs

        for (target_var, target_sort) in query_vars {
            let constructor_name = self.parser.symbol_gen.fresh("get_proof_ctor");
            let ruleset_name = self.parser.symbol_gen.fresh("get_proof_ruleset");
            let rule_name = self.parser.symbol_gen.fresh("get_proof_rule");


The function name get_proof has been renamed to prove_query, but the docstring still refers to the old name "get_proof_rule" in the function implementation. The variable rule_name is generated using get_proof_rule which is inconsistent with the new function name.

Copilot · 2025-12-13T19:59:19Z

src/proof_checker.rs

+    /// An external proof checker would have trouble with this step, so we should consider adding type annotations
+    /// to the egglog


The comment "An external proof checker would have trouble with this step" on line 805 suggests a limitation but doesn't explain why or what could be done about it. Consider expanding this comment to explain the issue more clearly, or remove it if it's not actionable.

Suggested change

/// An external proof checker would have trouble with this step, so we should consider adding type annotations

/// to the egglog

/// Note: An external proof checker would have trouble with this step because type information for variables

/// is inferred from the rule body, which may not be available or easily reconstructed outside of this context.

/// To make external proof checking feasible, consider adding explicit type annotations to the egglog proofs

/// or including variable type information in the proof format.

Copilot · 2025-12-13T19:59:19Z

src/egraph_operations.rs

+        // TODO currently using Push and Pop to avoid touching the e-graph here.
+        // we should either make push and pop fast or rewrite to clean up the new table and rule.


The comment states "TODO currently using Push and Pop to avoid touching the e-graph here." This suggests the implementation may be temporary or could be optimized. Consider filing an issue to track this work or implementing the optimization if feasible.

Copilot · 2025-12-13T19:59:20Z

src/proof_checker.rs

+            GenericAction::Let(_, name, expr) => {
+                // Let creates a term
+                let _ = name; // Suppress unused warning


The panic message on line 909 uses an underscore prefix for the name variable to suppress an unused warning, but then includes a comment explaining it. Consider either using the variable in a debug assertion or removing both the variable and comment for clarity.

Suggested change

GenericAction::Let(_, name, expr) => {

// Let creates a term

let _ = name; // Suppress unused warning

GenericAction::Let(_, _, expr) => {

// Let creates a term

Copilot · 2025-12-13T19:59:20Z

src/lib.rs

+                        .map_err(|e| Error::BackendError(e.to_string()))?;
+                    log::info!("Proof:\n{}", String::from_utf8_lossy(&buf));
+                } else {
+                    return Err(Error::BackendError("No proof found for query".to_string()));


The error message format is inconsistent. Some error constructors use format! while others just use string literals. For example, line 1157 uses a raw string while line 1184 uses to_string(). Consider standardizing the error message format.

Suggested change

return Err(Error::BackendError("No proof found for query".to_string()));

return Err(Error::BackendError(format!("No proof found for query")));

Copilot · 2025-12-13T19:59:20Z

src/typechecking.rs

+    /// Register a validation function for a primitive.
+    /// This associates the validator with the LAST primitive added with the given name.
+    /// For overloaded primitives, use add_primitive_validator_typed instead to specify
+    /// the exact primitive by its input and output types.
+    /// Validators are used in proof-checking mode to verify primitive computations.
+    pub fn add_primitive_validator(&mut self, name: &str, validator: PrimitiveValidator) {
+        if let Some(primitives) = self.type_info.primitives.get_mut(name) {
+            if let Some(last) = primitives.last_mut() {
+                if last.validator.is_some() {
+                    panic!(
+                        "Duplicate validator for primitive '{}'. A validator is already registered.",
+                        name
+                    );
+                }
+                last.validator = Some(validator);
+            }
+        }
+    }


The documentation for add_primitive_validator states it associates the validator with the "LAST primitive added" but doesn't explain what happens if called multiple times or what the expected usage pattern is. Consider adding examples or clarifying when this should be used versus add_primitive_validator_typed.

Copilot · 2025-12-13T19:59:21Z

src/proof_checker.rs

+    /// Apply a substitution to an expression, replacing variables with terms
+    /// Unified expression evaluation that handles both variables and globals
+    /// The environment can contain both substituted variables and global definitions
+    /// This is used by queries, actions, and global evaluation  


The comment "This is used by queries, actions, and global evaluation" on line 167 is somewhat vague. Consider being more specific about what scenarios this unified function handles and why it's important that it's unified.

Suggested change

/// This is used by queries, actions, and global evaluation

/// This function is invoked during the evaluation of queries (e.g., when checking if a query holds),

/// actions (such as let-bindings and unions in the proof script), and when processing global definitions.

/// Unifying expression evaluation in this function ensures that all parts of the proof checker interpret

/// expressions in a consistent manner, preventing subtle bugs that could arise from divergent evaluation logic.

/// If this logic were duplicated or fragmented, inconsistencies could lead to incorrect proof validation.

Copilot · 2025-12-13T19:59:21Z

src/typechecking.rs

+                if last.validator.is_some() {
+                    panic!(
+                        "Duplicate validator for primitive '{}'. A validator is already registered.",
+                        name
+                    );
+                }


The panic on line 210 says "Duplicate validator" but the condition checks if a validator already exists. This will fire even on the first attempt to add a validator to a primitive that already has one (perhaps from a previous call). Consider whether this should be a panic or return an error, and clarify the error message to indicate this is about attempting to register a second validator.

codspeed-hq · 2025-12-13T20:02:43Z

CodSpeed Performance Report

Merging #764 will degrade performances by 5.2%

_{Comparing oflatt-proofs-7 (ddf79c1) with ezr-oflatt-refactor-proofs-4 (9f11c5f)}

Summary

❌ 2 regressions
✅ 18 untouched
⏩ 192 skipped¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

	Mode	Benchmark	`BASE`	`HEAD`	Change
❌	WallTime	`tests[herbie-tutorial]`	11.8 ms	12.4 ms	-5.08%
❌	Simulation	`tests[extract-vec-bench]`	102.6 ms	108.3 ms	-5.2%

192 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

saulshanabrook · 2025-12-13T20:21:32Z

It says you requested a review from me but you dont want one yet right?

oflatt · 2025-12-13T21:56:23Z

Yeah it's a draft, no need to review yet (sorry for confusion)

oflatt and others added 6 commits December 13, 2025 11:41

some progress

56b1c26

Initial plumbing -- mid debugging

94755c9

correct SourceSyntax translation.

9c8beb2

There are still issues with unbound variables though, I believe due to how primitives work.

fix build

bb47c6d

Copilot AI review requested due to automatic review settings December 13, 2025 19:54

oflatt requested a review from a team as a code owner December 13, 2025 19:54

oflatt requested review from saulshanabrook and removed request for a team December 13, 2025 19:54

oflatt marked this pull request as draft December 13, 2025 19:54

Copilot started reviewing on behalf of oflatt December 13, 2025 19:54 View session

Copilot AI reviewed Dec 13, 2025

View reviewed changes

working on moving to resolved commands

67175a3

oflatt added 13 commits December 13, 2025 21:32

lots of fixes with resolved commands

7414111

remove proof type checker

3d18b39

debug another time

405420d

working on fix prove query

2755dd6

working on proving queries again

5b6927e

small refactor

33609aa

don't push

b41c72e

need to debug desugared commands not present

b1b8ca2

fix more stuff

f0bf261

print out proofs

f98592c

try to do proofs for functions

5d42b21

one test working

89ad26d

working on refactor

9532b52

oflatt added 8 commits December 17, 2025 10:55

small fix

a1bfbd4

found problem

f95028a

try to fix arity issues

921e565

trying to add output var

f5ddeaf

fix up get_syntax_val

0ed11b2

smmal compile error

680bd7f

fix for rebvuild proofs

a9e551f

better error msg

ddf79c1

		/// An external proof checker would have trouble with this step, so we should consider adding type annotations
		/// to the egglog

-    /// An external proof checker would have trouble with this step, so we should consider adding type annotations
-    /// to the egglog
+    /// Note: An external proof checker would have trouble with this step because type information for variables
+    /// is inferred from the rule body, which may not be available or easily reconstructed outside of this context.
+    /// To make external proof checking feasible, consider adding explicit type annotations to the egglog proofs
+    /// or including variable type information in the proof format.

		// TODO currently using Push and Pop to avoid touching the e-graph here.
		// we should either make push and pop fast or rewrite to clean up the new table and rule.

	return Err(Error::BackendError("No proof found for query".to_string()));
	return Err(Error::BackendError(format!("No proof found for query")));

-    /// This is used by queries, actions, and global evaluation
+    /// This function is invoked during the evaluation of queries (e.g., when checking if a query holds),
+    /// actions (such as let-bindings and unions in the proof script), and when processing global definitions.
+    /// Unifying expression evaluation in this function ensures that all parts of the proof checker interpret
+    /// expressions in a consistent manner, preventing subtle bugs that could arise from divergent evaluation logic.
+    /// If this logic were duplicated or fragmented, inconsistencies could lead to incorrect proof validation.

Proof checker #764

Are you sure you want to change the base?

Proof checker #764

Uh oh!

Conversation

oflatt commented Dec 13, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

codspeed-hq bot commented Dec 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed Performance Report

Merging #764 will degrade performances by 5.2%

Summary

Benchmarks breakdown

Footnotes

Uh oh!

saulshanabrook commented Dec 13, 2025

Uh oh!

oflatt commented Dec 13, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codspeed-hq bot commented Dec 13, 2025 •

edited

Loading