CascadeMind is the SemEval-2026 Task 4 system paper codebase for Narrative Story Similarity. Given an anchor story and two candidate stories, the system predicts which candidate is more narratively similar to the anchor across abstract theme, course of action, and outcome.
The public release is intentionally conservative: it exposes the cleaned Gemini runner, paper source, historical non-Claude experiment scripts, and reproducibility tooling, while keeping raw shared-task data, generated logs, review packages, and submission bundles out of git until redistribution rights are confirmed.
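To make the task format concrete, here is a toy scorer over one illustrative triple. The field names (`anchor`, `candidate_1`, `candidate_2`) and the token-overlap heuristic are assumptions for illustration only; they are neither the system's method nor the official schema.

```python
def predict(anchor: str, cand1: str, cand2: str) -> int:
    """Toy stand-in for the real model: pick the candidate that shares
    more whitespace-delimited tokens with the anchor (returns 1 or 2)."""
    anchor_tokens = set(anchor.lower().split())
    overlap1 = len(anchor_tokens & set(cand1.lower().split()))
    overlap2 = len(anchor_tokens & set(cand2.lower().split()))
    return 1 if overlap1 >= overlap2 else 2

# Hypothetical triple in the shape described above.
row = {
    "anchor": "A knight leaves home, fails, and returns wiser.",
    "candidate_1": "A knight departs, stumbles, and comes home changed.",
    "candidate_2": "A chef opens a restaurant and wins an award.",
}
choice = predict(row["anchor"], row["candidate_1"], row["candidate_2"])  # 1
```

The real system replaces the overlap heuristic with the Gemini-based cascade, but the input/output contract (one anchor, two candidates, a binary choice) is the same.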
- Canonical source: paper/latest-paperfeb12026/semeval2026_final.tex
- Build notes: paper/README.md
- Public code URL for the paper: https://github.com/epoch-learn/CascadeMind
The camera-ready paper should treat the official shared-task submission as the main result:
| Result type | Split | Accuracy | Notes |
|---|---|---|---|
| Official submission | Track A test | 72.75% | Listed as rank 10 in the task overview table |
| Archived local post-hoc file | Released Track A test labels | 73.0% | Diagnostic only, not a change to the official standing |
| Paper-era cascade diagnostics | Development subset | 81.0% | Useful for routing analysis; denominator differs from full dev baselines |
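For clarity about what those percentages measure: accuracy here is the fraction of anchor/candidate triples where the system's choice matches gold, which is why figures over different denominators (full test set vs. a development subset) are not directly comparable. A minimal sketch:

```python
def pairwise_accuracy(preds: list[int], golds: list[int]) -> float:
    """preds/golds: per-triple choices (1 or 2). Returns fraction correct."""
    assert len(preds) == len(golds) and preds, "need aligned, non-empty lists"
    correct = sum(p == g for p, g in zip(preds, golds))
    return correct / len(preds)

# 3 of 4 triples correct -> 0.75
acc = pairwise_accuracy([1, 2, 1, 2], [1, 2, 2, 2])
```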
Use a fresh, rotated API key. Do not reuse any key that has appeared in chat, logs, git history, or screenshots.
```sh
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Fill in only the keys you need, then load them into your shell.
set -a
source .env
set +a
```

Primary entrypoints:
```sh
python best.py
python baseline.py
python train_ensemble.py
```

Camera-ready experiment runner:
```sh
python scripts/run_camera_ready_experiments.py \
    --data data/dev_track_a.jsonl \
    --suite balanced \
    --max-examples 5
```

The scripts read credentials from environment variables directly; they do not auto-load `.env`.
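Because the scripts do not auto-load `.env`, each run should fail fast when an expected variable is missing rather than erroring mid-run. A minimal sketch of that pattern; the variable name `GEMINI_API_KEY` is an assumption, so check the script you are actually running:

```python
import os
import sys

def require_env(name: str) -> str:
    """Return an environment variable's value, or exit with a clear error."""
    value = os.environ.get(name)
    if not value:
        sys.exit(
            f"Missing required environment variable: {name}. "
            "Did you run `set -a; source .env; set +a` first?"
        )
    return value

# Hypothetical usage at the top of a runner script:
# api_key = require_env("GEMINI_API_KEY")
```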
The SemEval task data and archived submissions are not tracked in this repository. See data/README.md for expected filenames, row counts, and checksums from the previous LFS pointers.
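Once you have obtained the task data independently, one quick way to confirm it matches the expected checksums is to hash each file and compare against data/README.md. The filename in the commented example is a placeholder; the real names and digests live in that README:

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large JSONL files stay memory-safe."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical check against the digests recorded in data/README.md:
# expected = {"data/dev_track_a.jsonl": "<digest from data/README.md>"}
# for name, want in expected.items():
#     assert sha256_of(Path(name)) == want, f"checksum mismatch: {name}"
```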
Local generated outputs should live under ignored artifacts/runs/ directories. If you run experiments for the camera-ready paper, keep the generated manifest with the model IDs, parameters, token/call counts, row counts, and git commit.
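One lightweight way to keep the manifest described above is to write a JSON file next to each run's outputs. The field names here are illustrative, not a fixed schema:

```python
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path

def write_manifest(run_dir: Path, model_id: str, params: dict,
                   n_rows: int, n_calls: int, n_tokens: int) -> Path:
    """Record the run settings the camera-ready paper needs to cite."""
    try:
        commit = subprocess.run(["git", "rev-parse", "HEAD"],
                                capture_output=True, text=True).stdout.strip()
    except OSError:
        commit = ""
    manifest = {
        "model_id": model_id,
        "parameters": params,
        "row_count": n_rows,
        "api_calls": n_calls,
        "token_count": n_tokens,
        "git_commit": commit or "unknown",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    run_dir.mkdir(parents=True, exist_ok=True)
    out = run_dir / "manifest.json"
    out.write_text(json.dumps(manifest, indent=2))
    return out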
| Path | Purpose |
|---|---|
| `best.py` | Cleaned Gemini bidirectional evaluator |
| `baseline.py` | Minimal Gemini structured-output baseline |
| `train_ensemble.py` | Multi-signal symbolic ensemble trainer |
| `experiments/` | Historical variants and ablations; not all are canonical entrypoints |
| `scripts/run_camera_ready_experiments.py` | Manifested Gemini rerun harness |
| `scripts/check_release.py` | Public-release and camera-ready sanity checks |
| `paper/` | System paper source and style files |
This repository previously contained committed API credentials. Those literals have been removed from the working tree, but any exposed credential should be considered compromised and rotated. The Gemini key pasted during planning is also exposed and should not be used for experiments.
Before publishing, run:

```sh
python scripts/check_release.py
```

This public repository is intended to have clean reachable history. Historical secret-bearing branches should remain unreachable from the public remote.
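The actual checks in scripts/check_release.py are not reproduced here, but as an illustration of the kind of sanity check involved, a scan for Google-style API key literals (which start with `AIza` and run 39 characters) might look like this sketch:

```python
import re
from pathlib import Path

# Google API keys start with "AIza" followed by 35 URL-safe characters.
KEY_PATTERN = re.compile(r"AIza[0-9A-Za-z_\-]{35}")

def scan_tree(root: Path) -> list[str]:
    """Return paths of Python files that appear to contain an API key literal."""
    hits = []
    for path in root.rglob("*.py"):
        text = path.read_text(errors="ignore")
        if KEY_PATTERN.search(text):
            hits.append(str(path))
    return hits
```

A real release check would cover more file types and secret formats; this only shows the shape of the scan.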
```bibtex
@inproceedings{kawada-holyoak-2026-cascademind,
  title     = {CascadeMind at SemEval-2026 Task 4: A Hybrid Neuro-Symbolic Cascade for Narrative Similarity},
  author    = {Kawada, Sebastien and Holyoak, Dylan},
  booktitle = {Proceedings of the 20th International Workshop on Semantic Evaluation (SemEval-2026)},
  year      = {2026},
  address   = {San Diego, CA, USA}
}
```