Skip to content

Add content-based monotype interning#9273

Draft
lukewilliamboswell wants to merge 8 commits intomainfrom
intern-monotype
Draft

Add content-based monotype interning#9273
lukewilliamboswell wants to merge 8 commits intomainfrom
intern-monotype

Conversation

@lukewilliamboswell
Copy link
Copy Markdown
Collaborator

@lukewilliamboswell lukewilliamboswell commented Mar 18, 2026

Summary

Replaces the append-only monotype store with content-based interning, ensuring structurally identical types get the same TypeId. This eliminates ~200 lines of structural equality workarounds and fixes downstream cache inconsistencies caused by duplicate monotype indices.

Key changes

  • Content-based interning: internFunc, internRecord, internTagUnion, etc. deduplicate by structural content using per-kind AutoHashMapUnmanaged intern maps
  • TypeId + resolve() API: Replace getMonotype() tagged-union pattern matching with resolve() + typed accessors (funcArgs, recordFields, tupleElems, etc.)
  • Structural name pool: Tag/field names use StructuralNameId from a shared pool instead of per-module Ident.Idx, eliminating cross-module ident remapping
  • Nominal instance cache: Deduplicates nominal type instantiations by (origin, type_name, args), replacing the old nominal_cycle_breakers map
  • O(1) list interning: Replace O(n) linear scans in internIdxList/internFieldList/internTagList with fingerprint-keyed hash maps with overflow chaining
  • Cross-module cleanup: remapMonotypeBetweenModules simplified to just .rec placeholder resolution (documented with regression tests)
  • Bug fixes: Double-resolve caching, tag comparison via identMatchesText() instead of fragile null-guard pattern

Files changed

File What
src/mir/Monotype.zig Core interning rewrite: TypeId, resolve(), intern maps, name pool, list hash maps
src/mir/Lower.zig Adapt to TypeId API, simplify cross-module remapping, nominal cache
src/lir/MirToLir.zig Adapt to resolve() + typed accessors, cache double-resolves
src/layout/mir_monotype_resolver.zig Adapt to new TypeId representation
src/mir/MIR.zig Store type alias update
src/mir/LambdaSet.zig Adapt to TypeId API
src/cli/main.zig Deinit signature update
src/eval/dev_evaluator.zig Deinit signature update
src/layout/store_test.zig Test adaptation
src/mir/test/lower_test.zig Test adaptation to new intern APIs
src/check/test/cross_module_mono_test.zig New cross-module recursive nominal tests
test/fx/cross_module_recursive_nominal.roc New test fixture

Test plan

  • zig build test-eval --summary all — 1227/1227 passed
  • Full CI

lukewilliamboswell and others added 8 commits March 19, 2026 08:23
…tical types

The monotype store was append-only with no deduplication, causing structurally
identical monotypes to get different Idx values from repeated type var resolution,
cross-module remapping, and independent unification paths. This led to downstream
inconsistencies in caches keyed by monotype index, papered over by seven separate
structural equality bandaids.

Add a ContentContext-based intern map to Monotype.Store.addMonotype that hashes
and compares by dereferenced span content. Remove ~200 lines of structural
equality workarounds across Lower.zig, MirToLir.zig, and mir_monotype_resolver.zig.
Fix remapMonotypeBetweenModulesRec to intern remapped monotypes instead of only
writing to placeholder slots.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ypeId API

Replace the old getMonotype() tagged-union API with a resolve()-based TypeId
approach across the compiler pipeline. Monotypes are now interned by structural
content (internFunc, internRecord, internTagUnion, etc.) and accessed through
typed accessor methods (funcArgs, funcRet, recordFields, tupleElems, etc.)
instead of pattern-matching on a tagged union.

Key changes:
- MonotypeStore uses content-based interning for deduplication
- Tag/field names use StructuralNameId from a shared name pool instead of
  per-module Ident.Idx, eliminating cross-module ident remapping
- Callers use resolve() + kind-based switching instead of getMonotype()
- @intFromEnum(mono_idx) replaced with @bitcast for new TypeId representation
- Cross-module remapMonotypeBetweenModules greatly simplified since structural
  names are already canonical
- Tests updated to use new intern* construction APIs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the var-based nominal_cycle_breakers with a canonical
NominalInstanceKey cache that deduplicates by (origin, type_name, args).
This avoids identity bugs from multiple type vars referring to the same
nominal instantiation, and skips unnecessary .rec indirection for
non-recursive nominals. Fix ident resolution to fall back to
scratches.ident_store when module_env is unavailable (layout tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s nominal dedup

The separate cycle-breaking map for recursive nominal types is no longer
needed — the nominal instance cache introduced in the prior commit
already prevents infinite recursion by caching based on content identity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The `remapMonotypeBetweenModules` function was investigated for removal
but proven necessary: it deeply resolves `.rec` placeholder indirections
in TypeId children that arise from recursive nominal types. Without this,
cross-module TypeId comparisons fail when one module holds `List(rec(0))`
while another holds the equivalent `List(tag_union(1))`.

Added doc comments explaining the actual purpose (`.rec` resolution, not
name/index remapping), regression test comments, and two new cross-module
mono tests covering recursive nominal types through import chains.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…bug, dead param

- Cache resolve() results in locals where called twice on the same index
  (MirToLir.zig: 3 sites, Lower.zig: 1 site)
- Replace fragile identText() + null-guard + manual eql pattern with
  identMatchesText() in runtimeTagLayoutFromExpr, eliminating a latent
  null path that would silently skip all tags
- Remove unused allocator parameter from TypeInterner.deinit()
- Expand Store alias doc comment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
internIdxList, internFieldList, and internTagList were doing O(n) linear
scans over all meta entries. Replace with AutoHashMapUnmanaged(u64, ListId)
keyed by fingerprint, with overflow chaining via a `next` field on the meta
structs to handle fingerprint collisions. This brings list interning to O(1)
amortized, matching the existing per-kind type intern maps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e tags

The function only searched the current module's ident store, so cross-module
tag names would silently fall through to Ident.Idx.NONE. Now searches all
module envs, matching what identMatchesText in MirToLir already does. Also
deduplicate the redundant findByString calls into a shared findIdentInEnv helper.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lukewilliamboswell lukewilliamboswell marked this pull request as ready for review March 19, 2026 07:25
@lukewilliamboswell
Copy link
Copy Markdown
Collaborator Author

From what I can tell this CI failure is unrelated... I'm not sure what has caused it recently.

Running command: zig build -Dfuzz -Dsystem-afl=false -Doptimize=ReleaseFast -Dcpu=x86_64_v3 
/Users/runner/.cache/zig/p/AFLplusplus-4.21.0-aA1y4UtxAABpnSIF7ARSYDMRyqNcI-2Rwa5UeSsuw70v/build.zig:27:20: error: root source file struct 'std' has no member named 'BoundedArray'
    var flags = std.BoundedArray([]const u8, 16){};
                ~~~^~~~~~~~~~~~~
/Users/runner/work/_temp/34017541-0bc7-4652-ab5f-9887d3dde558/zig-x86_64-macos-0.15.2/lib/std/std.zig:1:1: note: struct declared here
pub const ArrayHashMap = array_hash_map.ArrayHashMap;
^~~
referenced by:
    runBuild__anon_81076: /Users/runner/work/_temp/34017541-0bc7-4652-ab5f-9887d3dde558/zig-x86_64-macos-0.15.2/lib/std/Build.zig:2215:44
    dependencyInner__anon_75499: /Users/runner/work/_temp/34017541-0bc7-4652-ab5f-9887d3dde558/zig-x86_64-macos-0.15.2/lib/std/Build.zig:2195:29
    10 reference(s) hidden; use '-freference-trace=12' to see all references
Command failed with exit code 2.
Error: Process completed with exit code 2.

@lukewilliamboswell lukewilliamboswell marked this pull request as draft March 19, 2026 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant