This document details how the benchmark suite integrates with NIM and OpenFold APIs, including payload formats, endpoint specifications, and configuration options.
The NIM (NVIDIA Inference Microservice) runs as a Docker container with the following configuration:
Required:
- `NGC_API_KEY`: NVIDIA NGC API key for container authentication
  - Obtain from: https://ngc.nvidia.com/setup/api-key
  - Set in environment: `export NGC_API_KEY=your_key_here`

Optional:
- `NIM_OPENFOLD_BACKEND`: Backend selection (`"tensorrt"` or `"torch"`)
  - Default: `"tensorrt"`
  - Override in config: `nim.backend`
```bash
docker run \
  --gpus '"device=0"' \                  # GPU device specification
  --shm-size=2g \                        # Shared memory size
  -e NGC_API_KEY=$NGC_API_KEY \          # NGC authentication
  -e NIM_OPENFOLD_BACKEND=tensorrt \     # Backend selection
  -v /path/to/cache:/opt/nim/.cache \    # Model cache mount
  -p 8000:8000 \                         # Port mapping
  nvcr.io/nim/openfold/openfold2:1.0     # Container image
```

Key Parameters:
- `--gpus`: GPU device specification
  - Single GPU: `'"device=0"'`
  - Multiple GPUs: `'"device=0,1,2,3"'`
  - All GPUs: `'all'`
- `--shm-size`: Shared memory size (minimum 2GB recommended)
- `-v`: Volume mount for the model cache (persistent across restarts)
- `-p`: Port mapping (default: 8000 for the HTTP API)
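To make the flag wiring concrete, here is a minimal Python sketch of how a launcher might assemble this command programmatically. The helper name and its defaults are illustrative assumptions, not part of the benchmark suite:

```python
import os

def build_nim_docker_cmd(gpus: str = '"device=0"',
                         port: int = 8000,
                         cache_dir: str = "/path/to/cache",
                         backend: str = "tensorrt") -> list[str]:
    """Assemble the `docker run` argv for the NIM container.

    Mirrors the flags documented above; names are illustrative.
    """
    return [
        "docker", "run",
        "--gpus", gpus,                        # GPU device specification
        "--shm-size=2g",                       # minimum recommended shared memory
        "-e", f"NGC_API_KEY={os.environ.get('NGC_API_KEY', '')}",
        "-e", f"NIM_OPENFOLD_BACKEND={backend}",
        "-v", f"{cache_dir}:/opt/nim/.cache",  # persistent model cache
        "-p", f"{port}:8000",                  # host:container port mapping
        "nvcr.io/nim/openfold/openfold2:1.0",
    ]

# Example: launch on all GPUs, mapped to host port 8001
print(" ".join(build_nim_docker_cmd(gpus="'all'", port=8001)))
```

Passing the list directly to `subprocess.run` avoids shell-quoting pitfalls with the `--gpus` value.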
1. Pull Image:
   `docker pull nvcr.io/nim/openfold/openfold2:1.0`
2. Start Container:
   - Benchmark suite automatically starts the container if `nim.base_url` is `null`
   - Startup time: ~10-60 seconds depending on cache state
3. Health Check:
   - Endpoint: `GET /health`
   - Expected response: `{"status": "ready"}`
   - Benchmark suite polls until ready
4. Metadata Retrieval:
   - Endpoint: `GET /v1/metadata`
   - Returns version, backend, and model information
5. Shutdown:
   - Automatic cleanup after benchmark completion
   - Manual: `docker stop <container_name>`
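The health-check polling step above can be sketched in Python with only the standard library. `wait_for_health` and its defaults are assumptions for illustration, not the suite's actual implementation:

```python
import json
import time
import urllib.error
import urllib.request

def is_ready(body: str) -> bool:
    """Return True if a /health response body reports readiness."""
    try:
        return json.loads(body).get("status") == "ready"
    except json.JSONDecodeError:
        return False

def wait_for_health(base_url: str = "http://localhost:8000",
                    timeout_s: float = 120.0,
                    poll_interval_s: float = 2.0) -> bool:
    """Poll GET /health until {"status": "ready"} or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if is_ready(resp.read().decode()):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # container still starting; connection refused is expected
        time.sleep(poll_interval_s)
    return False
```

The timeout budget should cover the documented 10-60 second startup window with headroom for a cold cache.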
- Local container: `http://localhost:8000`
- Remote service: Configured via `nim.base_url` in config
`GET /health`

Response:

```json
{
  "status": "ready"
}
```

`GET /v1/metadata`

Response:

```json
{
  "version": "1.0.0",
  "backend": "tensorrt",
  "models_available": [1, 2, 3, 4, 5],
  "container_digest": "sha256:abc123...",
  "build_date": "2024-01-15"
}
```

`POST /v1/predict`
Content-Type: application/json
The prediction request payload follows this schema:
```json
{
  "sequence": "MKTAYIAKQRQISFVKSHFSRQLE...",
  "msa": {
    "format": "a3m",
    "content": ">query\nMKTAYIAKQRQISFVKSHFSRQLEMKTAYI...\n>homolog_1\nMKAAYIAKQRQISFVKSHFSRQLEMKTAYI...\n"
  },
  "models": [3],
  "return_representations": false,
  "return_timings": false
}
```

`sequence` (string, required)
- Amino acid sequence using the standard 20-letter code
- Valid characters: `ACDEFGHIKLMNPQRSTVWY`
- Length: Typically 50-2700 amino acids
- Example: `"MKTAYIAKQRQISFVKSHFSRQLE"`
`msa` (object, required)
- `format`: MSA format (`"a3m"` only for NIM)
- `content`: MSA content as a string
  - A3M format: FASTA-like with `>header\nsequence` entries
  - First entry must be the query sequence
  - Subsequent entries are homologs
`models` (array of integers, required)
- AlphaFold parameter sets to use
- Valid values: `[1]`, `[2]`, `[3]`, `[4]`, `[5]`, or combinations
- Common configurations:
  - Single model: `[3]` (fastest, good accuracy)
  - Ensemble: `[1, 2, 3, 4, 5]` (highest accuracy, 5x slower)
- NIM uses AlphaFold v2.3.0 weights
`return_representations` (boolean, optional)
- If `true`, returns intermediate model representations
- Default: `false`
- Note: Increases response size and latency
`return_timings` (boolean, optional)
- If `true`, returns an internal timing breakdown
- Default: `false`
- Useful for profiling NIM internals
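Putting the field rules together, a client-side payload builder might look like the following sketch. The helper and its validation are illustrative assumptions; the server performs its own validation and returns structured errors:

```python
import json

# Standard 20-letter amino acid alphabet accepted by the API
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")

def build_predict_payload(sequence, a3m_content, models=(3,),
                          return_representations=False,
                          return_timings=False):
    """Build and locally validate a POST /v1/predict payload."""
    bad = set(sequence) - VALID_AA
    if bad:
        raise ValueError(f"Sequence contains invalid characters: {sorted(bad)}")
    if not set(models) <= {1, 2, 3, 4, 5}:
        raise ValueError("models must be drawn from 1-5")
    if not a3m_content.startswith(">"):
        raise ValueError("A3M content must begin with a FASTA-style header")
    return {
        "sequence": sequence,
        "msa": {"format": "a3m", "content": a3m_content},
        "models": list(models),
        "return_representations": return_representations,
        "return_timings": return_timings,
    }

payload = build_predict_payload("MKTAYIAKQRQISFVKSHFSRQLE",
                                ">query\nMKTAYIAKQRQISFVKSHFSRQLE\n")
print(json.dumps(payload, indent=2))
```

Validating locally before the request avoids a round trip for malformed inputs such as non-standard residues.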
Success Response (200 OK):

```json
{
  "structure": {
    "format": "pdb",
    "content": "ATOM      1  N   MET A   1      10.123  20.456  30.789...\n"
  },
  "confidence": {
    "plddt": [89.2, 91.5, 88.3, ...],
    "mean_plddt": 89.7
  },
  "metadata": {
    "models_used": [3],
    "msa_depth": 128,
    "sequence_length": 234,
    "backend": "tensorrt"
  }
}
```

`structure`
- `format`: `"pdb"`
- `content`: PDB-format structure as a string
  - Contains ATOM records for all atoms
  - B-factor column contains per-residue pLDDT scores

`confidence`
- `plddt`: Array of per-residue pLDDT scores (0-100)
- `mean_plddt`: Mean pLDDT across all residues

`metadata`
- `models_used`: Models that were run
- `msa_depth`: Number of MSA sequences processed
- `sequence_length`: Length of the input sequence
- `backend`: Backend used (`"tensorrt"` or `"torch"`)
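As a sanity check on a parsed response, one might recompute the mean pLDDT from the per-residue array and cross-check it against the reported field. This helper is an illustrative sketch, not part of the suite:

```python
def check_confidence(response, tol=0.1):
    """Recompute mean pLDDT and compare to the reported mean_plddt.

    `tol` absorbs rounding in the reported value. Returns the
    recomputed mean; raises on empty data or a mismatch.
    """
    plddt = response["confidence"]["plddt"]
    if not plddt:
        raise ValueError("empty pLDDT array")
    mean = sum(plddt) / len(plddt)
    reported = response["confidence"]["mean_plddt"]
    if abs(mean - reported) > tol:
        raise ValueError(f"mean_plddt mismatch: reported {reported}, "
                         f"computed {mean:.2f}")
    return mean
```

A check like this catches truncated responses where the pLDDT array does not cover the full sequence.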
Error Response (4xx/5xx):

```json
{
  "error": {
    "code": "INVALID_SEQUENCE",
    "message": "Sequence contains invalid character: 'X'"
  }
}
```

NIM supports two inference backends:
Configuration:

```yaml
nim:
  backend: "tensorrt"
```

Environment Variable:

```bash
export NIM_OPENFOLD_BACKEND=tensorrt
```

Characteristics:
- Performance: Fastest (3-5x speedup vs Torch)
- Precision: FP16 (mixed precision)
- Compilation: First run compiles TensorRT engines (cached)
- GPU Support: Requires NVIDIA GPU with TensorRT support
- Use Case: Production deployments, benchmarking
First-Run Behavior:
- Compiles optimized engines for specific GPU architecture
- Compilation time: ~5-10 minutes (one-time, cached)
- Subsequent runs use cached engines
Configuration:

```yaml
nim:
  backend: "torch"
```

Environment Variable:

```bash
export NIM_OPENFOLD_BACKEND=torch
```

Characteristics:
- Performance: Slower than TensorRT (baseline)
- Precision: BF16 (configurable)
- Compilation: No ahead-of-time compilation
- GPU Support: Any GPU with PyTorch support
- Use Case: Debugging, algorithm development, CPU inference
Comparison:
| Aspect | TensorRT | Torch |
|---|---|---|
| Speed | 3-5x faster | Baseline |
| First Run | Slow (compilation) | Normal |
| Subsequent Runs | Fast (cached) | Normal |
| Accuracy | Identical | Identical |
| Precision | FP16 | BF16 |
OpenFold is invoked via a command-line interface:
```bash
python /path/to/openfold/run_pretrained_openfold.py \
  /path/to/fasta \                                    # Input FASTA file
  /path/to/template_mmcif \                           # Template database (unused)
  --output_dir /path/to/output \                      # Output directory
  --model_device cuda:0 \                             # GPU device
  --config_preset model_3_ptm \                       # Model preset
  --use_precomputed_alignments /path/to/alignments \  # Alignment dir
  --bf16 \                                            # BF16 precision
  --save_outputs                                      # Save PDB output
```

1. Input FASTA: Single-sequence FASTA file
   ```
   >target_id
   MKTAYIAKQRQISFVKSHFSRQLE...
   ```
2. Template database path: Required but unused (templates disabled)
   - Can point to an empty directory
   - Not used when `--use_precomputed_alignments` is set
3. Output directory: Where the PDB and confidence scores are written
`--config_preset`
- Model configuration preset
- Valid values: See the Model Presets section
- Default: `model_1_ptm`
`--model_device`
- GPU device specification
- Format: `cuda:0` (single GPU) or `cuda:0,1,2,3` (multi-GPU)
`--use_precomputed_alignments`
- Path to a directory containing precomputed alignments
- If set, skips the MSA search
- See Alignment Directory Structure
`--bf16`
- Enable BF16 (bfloat16) mixed precision
- Reduces memory usage with minimal accuracy impact
- Recommended for benchmarking to match NIM precision
`--save_outputs`
- Save the PDB structure and confidence scores
- Required for accuracy evaluation
When using `--use_precomputed_alignments`, OpenFold expects this directory structure:

```
alignment_dir/
  bfd_uniclust_hits.a3m   # Main MSA (BFD + UniClust)
  uniref90_hits.sto       # UniRef90 hits (Stockholm format)
  mgnify_hits.sto         # MGnify environmental hits
  pdb70_hits.hhr          # Template search results (HHsearch format)
```
`bfd_uniclust_hits.a3m` (required)
- Primary MSA in A3M format
- Combines BFD and UniClust30 database hits
- Most important for prediction accuracy
- Format: Same as the NIM MSA input

`uniref90_hits.sto` (required)
- UniRef90 database hits in Stockholm format
- Used for MSA clustering and sampling
- Format: Stockholm 1.0

`mgnify_hits.sto` (optional)
- Environmental sequences from the MGnify database
- Adds sequence diversity
- Format: Stockholm 1.0

`pdb70_hits.hhr` (required)
- Template search results from HHsearch against PDB70
- For no-template mode, use an empty/minimal HHR file:

```
Query         query
Match_columns 0
No_of_seqs    1
Done!
```
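A small Python sketch for materializing this layout in no-template mode. The helper name is hypothetical, and the placeholder HHR content simply mirrors the minimal file shown above:

```python
from pathlib import Path

# Minimal HHR placeholder for no-template mode (content taken from the
# document above; real HHR files carry full HHsearch output).
EMPTY_HHR = "Query         query\nMatch_columns 0\nNo_of_seqs    1\nDone!\n"

def write_alignment_dir(root, a3m, uniref_sto, mgnify_sto=None):
    """Create the alignment directory layout OpenFold expects.

    `root` is the per-target alignment directory; `a3m` and the *_sto
    arguments are file contents as strings. mgnify is optional.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    (root / "bfd_uniclust_hits.a3m").write_text(a3m)
    (root / "uniref90_hits.sto").write_text(uniref_sto)
    if mgnify_sto is not None:
        (root / "mgnify_hits.sto").write_text(mgnify_sto)
    (root / "pdb70_hits.hhr").write_text(EMPTY_HHR)
    return root
```

Writing the placeholder HHR unconditionally keeps the directory valid even when no template search was run.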
A3M Format:

```
>query
MKTAYIAKQRQISFVKSHFSRQLE
>homolog_1
MKTAAIIAKQRQISFVKSHFSRQLE
>homolog_2
MKTAYIAKQRQIAFVKSHFSRQLE
```
Stockholm Format:

```
# STOCKHOLM 1.0
query      MKTAYIAKQRQISFVKSHFSRQLE
homolog_1  MKTAAIIAKQRQISFVKSHFSRQLE
homolog_2  MKTAYIAKQRQIAFVKSHFSRQLE
//
```
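For illustration, a minimal A3M parser that splits the content into (header, sequence) pairs, so the first-entry-is-query rule can be checked programmatically:

```python
def parse_a3m(text):
    """Parse A3M/FASTA-style text into (header, sequence) pairs.

    Multi-line sequences are joined; A3M lowercase insertion
    characters are kept as-is.
    """
    entries, header, seq = [], None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if header is not None:
                entries.append((header, "".join(seq)))
            header, seq = line[1:].strip(), []
        elif line.strip():
            seq.append(line.strip())
    if header is not None:
        entries.append((header, "".join(seq)))
    return entries

msa = parse_a3m(">query\nMKTAYIAKQRQISFVKSHFSRQLE\n"
                ">homolog_1\nMKTAAIIAKQRQISFVKSHFSRQLE\n")
assert msa[0][0] == "query"  # first entry must be the query sequence
```

The same pairs can be counted to report MSA depth before submitting a request.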
OpenFold supports multiple model configurations via `--config_preset`:

`model_1_ptm`, `model_2_ptm`, `model_3_ptm`, `model_4_ptm`, `model_5_ptm`
- Single AlphaFold parameter set with a pTM (predicted TM-score) head
- Fastest inference
- Recommended: `model_3_ptm` (good balance of speed and accuracy)
`model_1`, `model_2`, `model_3`, `model_4`, `model_5`
- Single parameter set without the pTM head
- Slightly faster than the pTM variants
- No inter-domain confidence estimates
`finetuning_ptm`, `finetuning_no_ptm`
- Fine-tuned models for specific use cases
- Not recommended for benchmarking
| Preset | pTM Head | Speed | Use Case |
|---|---|---|---|
| `model_3_ptm` | Yes | Fast | Recommended for benchmarking |
| `model_3` | No | Fastest | Speed-critical applications |
| All 5 models | - | 5x slower | Maximum accuracy ensemble |
Control the model weight source for an apples-to-apples comparison with NIM:

```yaml
openfold:
  weights_source: "alphafold_official"   # or "openfold"
  weights_path: /custom/path/to/weights  # optional override
```

`"openfold"` (default)
- OpenFold-trained weights
- May differ slightly from the official AlphaFold weights
- Use for OpenFold-specific evaluation
"alphafold_official"
- AlphaFold v2.3.0 official weights from DeepMind
- Recommended for NIM comparison (NIM uses these weights)
- Ensures model parameter parity
Custom Path:
- Set `weights_path` to use a custom checkpoint
- Must be compatible with the OpenFold architecture
- Use for fine-tuned models or custom training
```bash
# Download AlphaFold v2.3.0 parameters
wget https://storage.googleapis.com/alphafold/alphafold_params_2022-12-06.tar
tar -xf alphafold_params_2022-12-06.tar -C /path/to/weights
```

By default, NIM and OpenFold generate MSAs independently, so the benchmark measures MSA pipeline differences rather than pure inference performance. Precomputed MSAs solve this:
Benefits:
- Identical inputs for both systems
- Reproducible across runs
- Faster benchmarking (skip MSA generation)
- Isolates inference performance
```yaml
suites:
  - name: inference_only
    precomputed_msa_dir: "data/precomputed/casp15_msa128"
    inference_only_mode: true
```

```
precomputed_msa_dir/
  {target_id}/
    manifest.json              # Metadata and hashes
    nim_msa.a3m                # NIM MSA input
    openfold_alignments/       # OpenFold alignment directory
      bfd_uniclust_hits.a3m    # (identical to nim_msa.a3m)
      uniref90_hits.sto
      mgnify_hits.sto
      pdb70_hits.hhr
```
```json
{
  "target_id": "T1234",
  "sequence": "MKTAYIAKQRQISFVKSHFSRQLE...",
  "msa_depth": 128,
  "msa_hash": "a3f2e8b4c9d7f1a2b5e6d8c4...",
  "template_hash": null,
  "nim_a3m_path": "T1234/nim_msa.a3m",
  "openfold_alignment_dir": "T1234/openfold_alignments",
  "template_dir": null,
  "created_at": "2026-02-10T14:30:22Z",
  "source": "synthetic"
}
```

The benchmark suite automatically verifies MSA integrity:
- Checks manifest exists
- Verifies all files exist
- Computes SHA256 hash of MSA content
- Compares to manifest hash
- Fails if mismatch detected
This ensures data integrity and reproducibility.
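The verification steps above might be sketched like this. It assumes `msa_hash` is the SHA256 hex digest of the NIM A3M file and that manifest paths are relative to the precomputed root; the suite's actual conventions may differ:

```python
import hashlib
import json
from pathlib import Path

def verify_msa_dir(target_dir):
    """Verify a precomputed-MSA target directory against its manifest.

    Checks: manifest exists, referenced MSA file exists, and the
    SHA256 digest of the MSA content matches the manifest hash.
    """
    target_dir = Path(target_dir)
    manifest_path = target_dir / "manifest.json"
    if not manifest_path.exists():
        raise FileNotFoundError(f"missing manifest: {manifest_path}")
    manifest = json.loads(manifest_path.read_text())
    # Manifest paths are assumed relative to the precomputed_msa_dir root
    a3m_path = target_dir.parent / manifest["nim_a3m_path"]
    if not a3m_path.exists():
        raise FileNotFoundError(f"missing MSA file: {a3m_path}")
    digest = hashlib.sha256(a3m_path.read_bytes()).hexdigest()
    if digest != manifest["msa_hash"]:
        raise ValueError(f"MSA hash mismatch for {manifest['target_id']}")
```

Hashing raw bytes rather than parsed content makes the check sensitive to any edit, including whitespace.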
Use the provided CLI tool:

```bash
python scripts/precompute_msas.py \
  --targets bench/dataset/casp15_targets.yaml \
  --output data/precomputed/casp15_msa128 \
  --msa-depth 128
```

NIM:
- `NGC_API_KEY` not set: Set the environment variable
- Container failed to start: Check GPU availability and shared memory
- Timeout waiting for `/health`: Container startup issue; check the logs
- Invalid sequence: Check for non-standard amino acids

OpenFold:
- CUDA out of memory: Reduce the batch size or use a smaller model
- Alignment directory not found: Check the `--use_precomputed_alignments` path
- Template database required: Provide a path (even if unused)
NIM Container Logs:

```bash
docker logs <container_name>
```

OpenFold Verbose Output:

```bash
# Add to the OpenFold command
--verbose
```

- Use the TensorRT backend for NIM (default)
- Enable BF16 for OpenFold (`--bf16` flag)
- Precompute MSAs for faster iteration
- Pin versions with digest hashes
- Use `model_3_ptm` for single-model benchmarks
- Warm up before measurement passes
| Component | Version | Notes |
|---|---|---|
| NIM Container | 1.0+ | Pin with digest |
| OpenFold | Latest main | Specify commit hash |
| AlphaFold Weights | v2.3.0 | Used by NIM |
| Docker | 20.10+ | Required for NIM |
| CUDA | 11.8+ | GPU driver requirement |
- NIM Documentation: https://docs.nvidia.com/nim/
- OpenFold GitHub: https://github.com/aqlaboratory/openfold
- AlphaFold Paper: Jumper et al. (2021). Nature, 596:583-589
- AlphaFold Weights: https://github.com/deepmind/alphafold