LRIM: A Physics-Based Benchmark for Provably Evaluating Long-Range Capabilities in Graph Learning
Accurately modeling long-range dependencies in graph-structured data is critical for many real-world applications. However, incorporating long-range interactions beyond a node's immediate neighborhood in a scalable manner remains an open challenge for graph machine learning models. Existing benchmarks for evaluating long-range capabilities either cannot guarantee that their tasks actually depend on long-range information or are rather limited in scope. Claims of long-range modeling improvements based on performance on such benchmarks therefore remain questionable. We introduce the Long-Range Ising Model Graph Benchmark (LRIM), a physics-based benchmark built on the well-studied Ising model, whose ground truth provably depends on long-range interactions. Our benchmark consists of ten datasets that scale from 256 to 65k nodes per graph and provides controllable long-range dependencies through tunable parameters, allowing precise control over hardness and "long-rangedness". We provide model-agnostic evidence that local information is insufficient, further validating our benchmark's design choices. Through experiments on classical message-passing architectures and graph transformers, we show that both perform far from the optimum, especially those with scalable complexity. We hope our benchmark will foster the development of scalable methodologies that effectively model long-range interactions in graphs.
Visit our project website: https://lrim-graphbenchmark.com for information on how to get started, or explore the datasets and oracle baselines interactively.
lrim_graph_benchmark/
├── data-generation/ # Generate LRIM datasets from scratch, but we recommend downloading the official datasets
│
├── example-setup/ # Minimal training example
│ ├── setup.sh # Quick start training script
│ ├── train.py # Basic GCN, MLP baseline
│ └── README.md # Example documentation
│
└── model-training/ # Full training pipeline
├── setup.sh # Build container and run sample
├── run_training.sh # Train with custom configs
├── run_inference.sh # Run inference on checkpoints
├── verify_checkpoints.sh # Verify against reference models
├── lrim_configs/ # Model configurations
├── grit/ # Model implementations
└── README.md # Documentation
Train a simple baseline model in minutes:
cd example-setup
./setup.sh

This runs a minimal GCN/MLP baseline on one of the LRIM datasets. Note: These are demonstration models, not competitive baselines.
Train entire models:
cd model-training
./setup.sh

This builds the container, trains an example configuration, and runs inference.
Create custom LRIM datasets (note: for comparisons, we strongly recommend using the provided official datasets rather than regenerating them):
cd data-generation
./setup.sh
./lrim_gen.sh 16 0.6 100000 # 16×16 matrices, σ=0.6, 100k samples

Download datasets directly from HuggingFace:
- Dataset Repository: jmathys/lrim_graph_benchmark
- Available sizes: 16, 32, 64, 128, 256
- Difficulty levels: σ=0.6 (hard), σ=1.5 (easy)
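As a minimal sketch of fetching the datasets programmatically (assuming the `huggingface_hub` Python package is available; the repository id is the one listed above, and the local directory is an arbitrary example path):

```python
from huggingface_hub import snapshot_download


def download_lrim(local_dir: str = "data/lrim") -> str:
    """Download the LRIM benchmark datasets from the HuggingFace Hub.

    Returns the local path containing the downloaded files.
    'data/lrim' is a placeholder target directory.
    """
    return snapshot_download(
        repo_id="jmathys/lrim_graph_benchmark",
        repo_type="dataset",
        local_dir=local_dir,
    )
```

Alternatively, you can browse and download the files manually from the HuggingFace repository page.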
- Apptainer/Singularity: For containerized execution (strongly recommended) download here
- CUDA-capable GPU: For training (CPU mode available but slower)
All Python dependencies are included in the provided Apptainer containers. Alternatively, you can recreate the environment with conda or venv, but we strongly recommend using Apptainer instead.
💡 Tip for Compute Clusters: If you encounter issues building Apptainer containers on your compute cluster, we recommend building the .sif file locally on your machine and then transferring it to the cluster. You do not need a GPU locally for container building, only for training.
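Concretely, the local build-and-transfer workflow might look like the following (the definition file name, user, host, and destination path are placeholders; substitute your own):

```shell
# Build the container image locally; no GPU is needed for this step.
# 'lrim.def' is a placeholder for the project's Apptainer definition file.
apptainer build lrim.sif lrim.def

# Transfer the built image to the cluster (placeholder host and path).
scp lrim.sif user@cluster.example.org:/path/to/project/
```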
If you use this benchmark in your research, please cite:
@misc{lrim2025,
title = {LRIM: A Physics-Based Benchmark for Provably Evaluating Long-Range Capabilities in Graph Learning},
author = {Jo{\"e}l Mathys and Henrik Christiansen and Federico Errica and Takashi Maruyama and Francesco Alesiani},
year = {2025},
}

[TODO: Add license information]
- Website: Official Project Website
- Datasets: HuggingFace
For detailed documentation, see the README files in each subdirectory (data-generation, example-setup, model-training).