abtree

ABTree: Compact ordered map for Motoko, built on an (a,b)-tree variant with linked leaves. Compatible with enhanced orthogonal persistence (EOP).

Ordered-map API designed to feel close to core/Map. Smaller retained heap than Map (up to 1.3× with Nat32 keys), and consistently fewer instructions and GC allocation on every benchmarked operation except remove: up to 5× on ordered scans, 2–2.5× on add, builds, and lookups.

When to use ABTree

  • Read-heavy or scan-heavy workloads: ordered iteration is up to 5× faster, lookups 2× faster on misses
  • Insert-heavy workloads: add is 2–2.5× faster across all tested volumes
  • Memory-constrained canisters: smaller retained heap and significantly less GC pressure
  • Bulk-loading sorted data: fromSorted builds the tree in O(n) with minimal GC allocation

When Map is the better choice: delete-heavy workloads. remove is 1.4–1.6× faster in Map. If your workload deletes a large fraction of entries frequently, prefer core/Map or use B=32 to narrow the gap.
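If you stay on ABTree for a delete-heavy workload, the branching factor can be lowered at construction time. A minimal sketch using the emptyWith constructor documented below (key and value types here are illustrative):

```motoko
import ABTree "mo:abtree";

persistent actor {
  // B=32 narrows the remove gap: smaller leaf arrays mean cheaper
  // merges and redistributions when a delete causes node underflow.
  let sessions = ABTree.emptyWith<Nat, Text>(32);
};
```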

Install

mops add abtree

Quick Start

import ABTree "mo:abtree";
import Nat "mo:core/Nat"; // needed for implicit compare resolution

persistent actor {
  let users = ABTree.empty<Nat, Text>();

  public func addUser(id : Nat, name : Text) : async () {
    users.add(id, name);
  };

  public query func getUser(id : Nat) : async ?Text {
    users.get(id);
  };
};

API Notes

ABTree follows the core/Map surface closely for the core ordered-map API, including:

  • entries, entriesFrom, reverseEntries, reverseEntriesFrom
  • minEntry, maxEntry, foldLeft, foldRight
  • filterMap, all, any, toText, compare, and iterator toMap

It also adds ABTree-specific constructor variants for branching-factor control: emptyWith, fromArrayWith, fromIterWith, fromVarArrayWith, and fromSortedWith.
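As a hedged illustration of that surface (method-style calls are assumed to match the Quick Start above; this README lists entries and minEntry but does not show their exact signatures):

```motoko
import ABTree "mo:abtree";
import Nat "mo:core/Nat"; // implicit compare resolution, as in Quick Start

persistent actor {
  let scores = ABTree.empty<Nat, Text>();

  public func demo() : async ?(Nat, Text) {
    scores.add(2, "two");
    scores.add(1, "one");
    // Ordered scan: entries() walks the linked leaves in ascending key order.
    for ((k, v) in scores.entries()) { ignore (k, v) };
    // minEntry returns the entry with the smallest key, here ?(1, "one").
    scores.minEntry();
  };
};
```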

Construction

Use the default constructors when the default branching factor B=64 is fine:

let fromArray = ABTree.fromArray<K, V>(entries, compare);
let fromIter = ABTree.fromIter<K, V>(iter, compare);
let fromVarArray = ABTree.fromVarArray<K, V>(buffer, compare);
let fromSorted = ABTree.fromSorted<K, V>(sorted);

Use the With variants when you want to tune branching factor explicitly:

let fromArray = ABTree.fromArrayWith<K, V>(entries, 16, compare);
let fromIter = ABTree.fromIterWith<K, V>(iter, 32, compare);
let fromVarArray = ABTree.fromVarArrayWith<K, V>(buffer, 32, compare);
let fromSorted = ABTree.fromSortedWith<K, V>(sorted, 64);

fromIter, fromArray, and fromVarArray use the default branching factor. The With variants use the provided branching factor and clamp values below 4. fromSorted and fromSortedWith expect ascending input, keep the last value for adjacent duplicate keys, and trap on descending input.
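A sketch of that duplicate-key rule (array-of-pairs input is assumed, matching the constructor snippets above):

```motoko
import ABTree "mo:abtree";
import Nat "mo:core/Nat"; // in case implicit compare resolution is needed, as in Quick Start

// Ascending input with adjacent duplicates on key 2: the last value wins,
// so the tree ends up holding (2, "c"). Descending input would trap.
let t = ABTree.fromSorted<Nat, Text>([(1, "a"), (2, "b"), (2, "c"), (3, "d")]);
```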

Benchmarks

All benchmarks below were measured on an ICP replica via mops bench with 50,000 entries unless noted otherwise.

Summary: ABTree wins on reads, scans, builds, and inserts. Map wins on deletes. For the common read/write mix, ABTree uses fewer cycles and produces far less GC pressure.

ABTree vs Map β€” Instructions

Target-only microbenchmarks: setup is prebuilt outside the measured region. entriesFrom / reverseEntriesFrom scan 10 results from the midpoint. Lower is better.

| Operation | ABTree | Map | Result |
| --- | --- | --- | --- |
| fromSorted | 35.2M | — | O(n) bulk load |
| fromArray | 183.5M | 470.6M | ABTree 2.57× |
| get (hit) | 425.2M | 429.0M | Near parity |
| get (miss) | 208.5M | 518.5M | ABTree 2.49× |
| entries | 8.90M | 45.8M | ABTree 5.15× |
| entriesFrom | 13.2K | 23.2K | ABTree 1.76× |
| reverseEntries | 12.9M | 46.0M | ABTree 3.56× |
| reverseEntriesFrom | 14.7K | 23.6K | ABTree 1.61× |
| toArray | 14.3M | 65.3M | ABTree 4.56× |

ABTree vs Map β€” GC Allocation

Same benchmark shapes as above. Lower is better.

| Operation | ABTree | Map | ABTree savings |
| --- | --- | --- | --- |
| fromSorted | 464.9 KiB | — | Low-GC bulk load |
| fromArray | 953.2 KiB | 5.93 MiB | 6.22× |
| get (hit) | 296 B | 2.25 MiB | Near-zero alloc |
| get (miss) | 3.08 MiB | 5.37 MiB | 1.74× |
| entries | 348 B | 2.26 MiB | Near-zero alloc |
| entriesFrom | 348 B | 1.03 KiB | 2.95× |
| reverseEntries | 348 B | 2.26 MiB | Near-zero alloc |
| reverseEntriesFrom | 348 B | 1.04 KiB | 2.99× |
| toArray | 195.7 KiB | 3.78 MiB | 19.8× |

ABTree vs Map β€” Mutations

Target-only with prebuilt fresh fixtures per cell. Lower is better.

| Operation | ABTree | Map | Result |
| --- | --- | --- | --- |
| add | 187.5M | 464.1M | ABTree 2.48× |
| add (half keys) | 91.1M | 216.5M | ABTree 2.38× |
| add (10% keys) | 16.1M | 35.9M | ABTree 2.23× |
| remove (all keys) | 1,145M | 739M | Map 1.55× |
| remove (half keys) | 455.0M | 319.0M | Map 1.43× |
| remove (10% keys) | 105.7M | 76.6M | Map 1.38× |

GC allocation for the same mutation suite:

| Operation | ABTree | Map | Result |
| --- | --- | --- | --- |
| add | 2.84 MiB | 4.59 MiB | ABTree 1.62× |
| add (half keys) | 1.40 MiB | 2.18 MiB | ABTree 1.56× |
| add (10% keys) | 245.4 KiB | 361.3 KiB | ABTree 1.47× |
| remove (all keys) | 6.92 MiB | 10.7 MiB | ABTree 1.55× |
| remove (half keys) | 3.51 MiB | 5.93 MiB | ABTree 1.69× |
| remove (10% keys) | 923.8 KiB | 1.49 MiB | ABTree 1.61× |

Note: even though remove costs more instructions, ABTree still allocates less GC memory during deletes.

Scenario Workloads

Full-workflow benchmarks including setup. Useful for deciding whether ABTree matches a specific usage pattern.

| Workload | ABTree | Map | ABTree speedup |
| --- | --- | --- | --- |
| Overwrite existing keys (Text) | 1,120M | 1,469M | 1.31× |
| Add (Nat64 keys) | 305M | 489M | 1.60× |
| Get (Nat64 keys) | 537M | 757M | 1.41× |
| Add+get small maps (10) | 296M | 305M | 1.03× |
| Add+get small maps (50) | 298M | 396M | 1.33× |
| Mixed read/write (Text) | 709M | 1,122M | 1.58× |
| Mixed read/write (Nat64) | 417M | 617M | 1.48× |

Retained Heap Size

Actual retained heap bytes measured via rts_heap_size with forced GC (bench/heapRetained.bench.mo). Each structure is kept alive in a unique slot so GC does not reclaim it. Lower is better.

Nat32 keys (50K entries):

| Structure | Retained heap | vs Map |
| --- | --- | --- |
| ABTree B=64 (default) | 1.02 MiB | 1.32× smaller |
| ABTree B=32 | 1.08 MiB | 1.25× smaller |
| ABTree B=16 | 1.22 MiB | 1.11× smaller |
| Map | 1.35 MiB | baseline |

Text keys (50K entries):

| Structure | Retained heap | vs Map |
| --- | --- | --- |
| ABTree B=64 (default) | 1.78 MiB | 1.19× smaller |
| ABTree B=32 | 1.85 MiB | 1.14× smaller |
| ABTree B=16 | 1.98 MiB | 1.07× smaller |
| Map | 2.11 MiB | baseline |

The heap advantage comes from packing many entries per node (~64 per leaf) instead of one heap object per entry. The ratio narrows with larger key/value types because the per-entry payload dominates the per-node overhead.

How It Works

ABTree is an (a,b)-tree with a B+ tree-style leaf layer: internal nodes store routing keys only, all values live in the leaves, and leaves are doubly linked for efficient ordered scans in either direction.

Like core/Map (a B-tree with order 32), ABTree stores multiple key-value pairs per node in flat arrays. ABTree uses a wider default fanout (B=64) and adds linked leaves for fast sequential scans.

         [Internal node]
       keys: [K₁, K₂, ...]        ← routing keys only
    children: [→, →, →, ...]
      /         |         \
   [Leaf]  ⇄  [Leaf]  ⇄  [Leaf]  ⇄  ...
    data: [(K,V), (K,V), ...]     ← up to 64 entries per leaf

  • Leaf nodes store ?(K, V) tuples in a single flat array: one Option unwrap per access
  • Internal nodes store only routing keys ?K, with no value duplication
  • Bidirectionally linked leaves enable O(1)-per-entry sequential iteration in ascending and descending order
  • Node-local search uses binary search, with a linear fast path for very small leaves and a last-key short-circuit for common boundary cases
  • Prim.Array_init is used for direct mutable array allocation (no intermediate copies)

With B=64 and 50K entries: ~780 leaf nodes reachable in 2 hops from the root.

Why scans are fast

Linked leaves mean iteration never touches internal nodes after the initial seek. Each next() call reads the next slot in the current leaf's flat array; at the leaf boundary it follows a single pointer to the next leaf. Map's stack-based cursor is also amortized O(1) per element, but each step involves stack manipulation, variant pattern matching on #leaf/#internal, and periodic subtree traversals. ABTree's leaf-only path avoids all of that, and walking a flat array sequentially gives better spatial locality than chasing tree pointers.
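The cursor described above can be pictured with a simplified sketch. This is illustrative only, with hypothetical type and field names, and is not the library's actual internals:

```motoko
// Hypothetical leaf shape: a flat slot array plus a forward link.
// Assumes used slots are a prefix of the array (null slots are unused tail).
type Leaf<K, V> = {
  data : [?(K, V)];   // flat slot array
  next : ?Leaf<K, V>; // forward link to the neighboring leaf
};

// A forward iterator that never revisits internal nodes after the seek.
func entriesSketch<K, V>(start : ?Leaf<K, V>) : () -> ?(K, V) {
  var leaf = start;
  var i = 0;
  func next() : ?(K, V) {
    loop {
      switch (leaf) {
        case null { return null };
        case (?l) {
          if (i < l.data.size()) {
            let slot = l.data[i];
            i += 1;
            switch (slot) {
              case (?kv) { return ?kv };            // one Option unwrap per entry
              case null { leaf := l.next; i := 0 }; // unused tail: hop to next leaf
            };
          } else {
            leaf := l.next; // single pointer hop at the leaf boundary
            i := 0;
          };
        };
      };
    };
  };
  next;
};
```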

Why remove is slower

Deleting from a wide-fanout node can trigger underflow, requiring a merge or redistribution with a sibling. With B=64, leaf arrays hold up to 129 entries, so each merge or shift may copy ~64 elements. Map's B-tree (order 32) has leaf arrays of at most 31 entries, roughly 4× fewer elements per operation. Although ABTree's wider nodes trigger rebalancing less frequently (more headroom before hitting the minimum), the per-operation cost when rebalancing occurs is higher, and at B=64 this outweighs the frequency advantage. ABTree's B+ tree structure does help in one way: deletes always target leaves directly, with no in-order predecessor replacement needed. Reducing B (e.g., B=32) shrinks the per-node merge cost at the expense of a slightly taller tree.

Branching Factor

B=128 is the fastest option in the engineering bench for add, get, fromSorted, and entries, but it regresses remove significantly (see table below). B=64 is the default because it keeps most of the read/bulk-load upside without the delete-path cliff, while B=32 is the better pick for delete-heavy workloads.

Use emptyWith for incremental builds, or the With bulk constructors when loading existing data:

let compact = ABTree.emptyWith<K, V>(16);  // less memory per node
let wide = ABTree.emptyWith<K, V>(64);     // shallower tree (default)

let fromArray = ABTree.fromArrayWith<K, V>(entries, 16, compare);
let fromIter = ABTree.fromIterWith<K, V>(iter, 32, compare);
let fromVarArray = ABTree.fromVarArrayWith<K, V>(buffer, 32, compare);
let fromSorted = ABTree.fromSortedWith<K, V>(sorted, 64);

Engineering benchmark (bench/branchingFactor.bench.mo, 50K entries). Unlike the headline comparison suite, these numbers include the full workflow (setup + target) in the measured region:

| Operation | B=8 | B=16 | B=32 | B=64 | B=128 | Best |
| --- | --- | --- | --- | --- | --- | --- |
| add | 850M | 600M | 533M | 480M | 470M | B=128 |
| get | 1,358M | 1,089M | 980M | 910M | 898M | B=128 |
| fromSorted | 315M | 310M | 308M | 307M | 306M | B=128 |
| entries | 861M | 609M | 543M | 489M | 479M | B=128 |
| remove | 1,858M | 1,552M | 1,537M | 1,696M | 2,258M | B=32 |

Minimum B is clamped to 4.

License

This project is licensed under either the Apache License, Version 2.0 or the MIT License, at your option.
