49 changes: 49 additions & 0 deletions docker/Dockerfile.base
@@ -0,0 +1,49 @@
# OpenJudge Dockerfile
# Base image for running OpenJudge evaluation tasks
# For training Judge models with verl/sglang/vllm, use Dockerfile.train instead

FROM dsw-registry.cn-wulanchabu.cr.aliyuncs.com/pai/pytorch:2.8.0-gpu-py312-cu128-ubuntu24.04-3995b779-1764359181

# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV DEBIAN_FRONTEND=noninteractive

# Set working directory
WORKDIR /workspace

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
git \
curl \
build-essential \
&& rm -rf /var/lib/apt/lists/*
Comment on lines +15 to +19
Contributor

medium

To reduce image size and the number of layers, it is good practice to chain related RUN commands. This apt-get installation can be combined with the subsequent pip install and pip cache purge commands in a single RUN instruction, which creates one layer, makes the image more compact, and may speed up builds.
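
One way to check whether merging pays off is to compare per-layer sizes before and after the change; the saving tends to be largest when a later step deletes files that an earlier step wrote. A minimal sketch, assuming the image is tagged `openjudge:latest` as in the README; `layer_report` is a hypothetical helper name, not part of this PR:

```bash
# Sketch: list each layer's size next to the instruction that created it.
# `layer_report` is a hypothetical helper name.
layer_report() {
    # Refuse to run without an image name.
    if [ -z "${1:-}" ]; then
        echo "usage: layer_report IMAGE" >&2
        return 2
    fi
    docker history --no-trunc --format '{{.Size}}: {{.CreatedBy}}' "$1"
}

# Usage, before and after merging the RUN steps:
#   layer_report openjudge:latest
```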


# Install OpenJudge with all dependencies
RUN pip install --no-cache-dir \
"pandas>=2.2.3,<3.0.0" \
"loguru>=0.7.3,<0.8.0" \
"json_repair>=0.54.0,<1.0.0" \
"pydantic>=2.11.5,<3.0.0" \
"openai>=1.85.0,<2.0.0" \
"tenacity>=9.1.0,<10.0.0" \
"math-verify>=0.7.0,<0.8.0" \
"tqdm>=4.66.0,<5.0.0" \
"fire" \
"numpy>=1.22.0,<2.0.0" \
"dashscope>=1.19.0" \
"tiktoken>=0.7.0" \
"nltk>=3.8.1" \
"jieba>=0.42.1" \
"sacrebleu>=2.0.0" \
"rouge-score>=0.1.2" \
"python-Levenshtein>=0.20.0" \
"scikit-learn>=1.0.0"

# Install OpenJudge from GitHub
RUN pip install --no-cache-dir git+https://github.com/agentscope-ai/OpenJudge.git
Contributor

high

Installing from a Git URL without pinning a commit hash or tag makes builds non-reproducible: each build pulls whatever the default branch points to at that moment, which can introduce breaking changes unexpectedly. For stable and predictable builds, please pin the installation to a specific commit hash or tag.

For example:

RUN pip install --no-cache-dir git+https://github.com/agentscope-ai/OpenJudge.git@your-commit-hash
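
To find a hash to pin, the current HEAD of the repository can be resolved once on the host and pasted into the Dockerfile. A minimal sketch, assuming `git` is available and the host has network access; `resolve_head` and `pinned_requirement` are hypothetical helper names:

```bash
# Sketch: resolve the remote HEAD once, then build a pinned pip requirement.
# `resolve_head` and `pinned_requirement` are hypothetical helper names.

# Print the commit SHA that HEAD points to on the remote (needs network access).
resolve_head() {
    git ls-remote "$1" HEAD | cut -f1
}

# Print a pip VCS requirement pinned to the given ref (commit SHA or tag).
pinned_requirement() {
    printf 'git+%s@%s\n' "$1" "$2"
}

# Usage (run on the host, then paste the result into the Dockerfile):
#   sha=$(resolve_head https://github.com/agentscope-ai/OpenJudge.git)
#   pinned_requirement https://github.com/agentscope-ai/OpenJudge.git "$sha"
```

Pinning a full commit SHA rather than a tag keeps the build fixed even if a tag is later moved.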


# Clean up
RUN pip cache purge

# Set default command
CMD ["/bin/bash"]
123 changes: 123 additions & 0 deletions docker/Dockerfile.train
@@ -0,0 +1,123 @@
# =============================================================================
# OpenJudge Training Dockerfile
# =============================================================================
# Training image for Judge model SFT/RL with:
# - PyTorch + CUDA support
# - Inference frameworks: sglang, vllm
# - Training frameworks: verl, transformers, accelerate
# - FlashAttention, FlashInfer
# - OpenJudge
#
# For basic installation (evaluation only), use Dockerfile.base instead
# =============================================================================

# Base image: PAI PyTorch
FROM dsw-registry.cn-wulanchabu.cr.aliyuncs.com/pai/pytorch:2.8.0-gpu-py312-cu128-ubuntu24.04-3995b779-1764359181

# Set environment variables
ENV USE_MEGATRON=0
ENV USE_SGLANG=1
ENV MAX_JOBS=32
ENV DEBIAN_FRONTEND=noninteractive

# Set working directory
WORKDIR /workspace

# =============================================================================
# 1. Inference Frameworks (sglang, vllm)
# =============================================================================
RUN pip install --no-cache-dir "sglang[all]==0.5.2" && \
pip install --no-cache-dir torch-memory-saver && \
pip install --no-cache-dir "vllm==0.11.0"
Comment on lines +29 to +31
Contributor

medium

These packages can be installed with a single `pip install` invocation instead of three chained ones. This improves readability and lets pip resolve all dependencies in one pass.

RUN pip install --no-cache-dir "sglang[all]==0.5.2" torch-memory-saver "vllm==0.11.0"


# =============================================================================
# 2. Training & ML Packages
# =============================================================================
RUN pip install --no-cache-dir \
"transformers[hf_xet]>=4.51.0" \
accelerate \
datasets \
peft \
hf-transfer \
"numpy<2.0.0" \
"pyarrow>=15.0.0" \
pandas \
"tensordict>=0.8.0,<=0.10.0,!=0.9.0" \
torchdata \
"ray[default]" \
codetiming \
hydra-core \
pylatexenc \
qwen-vl-utils \
wandb \
swanlab \
dill \
pybind11 \
liger-kernel \
mathruler \
pytest \
py-spy \
pre-commit \
ruff \
tensorboard

# =============================================================================
# 3. Additional Dependencies
# =============================================================================
RUN pip install --no-cache-dir \
"nvidia-ml-py>=12.560.30" \
"fastapi[standard]>=0.115.0" \
"optree>=0.13.0" \
"pydantic>=2.9" \
"grpcio>=1.62.1"

# =============================================================================
# 4. FlashAttention & FlashInfer (Python 3.12 + CUDA 12)
# =============================================================================
RUN curl -L -O "https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.1/flash_attn-2.8.1+cu12torch2.8cxx11abiFALSE-cp312-cp312-linux_x86_64.whl" && \
pip install --no-cache-dir flash_attn-2.8.1+cu12torch2.8cxx11abiFALSE-cp312-cp312-linux_x86_64.whl && \
rm -f flash_attn-2.8.1+cu12torch2.8cxx11abiFALSE-cp312-cp312-linux_x86_64.whl

RUN pip install --no-cache-dir flashinfer-python==0.3.1

# =============================================================================
# 5. OpenCV Fix
# =============================================================================
RUN pip install --no-cache-dir opencv-python opencv-fixer && \
python -c "from opencv_fixer import AutoFix; AutoFix()"

# =============================================================================
# 6. verl (RL Training Framework)
# =============================================================================
RUN pip install --no-cache-dir --no-deps git+https://github.com/volcengine/verl.git@v0.6.1

# =============================================================================
# 7. OpenJudge Dependencies
# =============================================================================
RUN pip install --no-cache-dir \
"loguru>=0.7.3,<0.8.0" \
"json_repair>=0.54.0,<1.0.0" \
"openai>=1.85.0,<2.0.0" \
"tenacity>=9.1.0,<10.0.0" \
"math-verify>=0.7.0,<0.8.0" \
"tqdm>=4.66.0,<5.0.0" \
fire \
"dashscope>=1.19.0" \
"tiktoken>=0.7.0" \
"nltk>=3.8.1" \
"jieba>=0.42.1" \
"sacrebleu>=2.0.0" \
"rouge-score>=0.1.2" \
"python-Levenshtein>=0.20.0" \
"scikit-learn>=1.0.0"

# =============================================================================
# 8. OpenJudge
# =============================================================================
RUN pip install --no-cache-dir --no-deps git+https://github.com/agentscope-ai/OpenJudge.git
Contributor

high

Installing from a Git URL without pinning a commit hash or tag makes builds non-reproducible: each build pulls whatever the default branch points to at that moment, which can introduce breaking changes unexpectedly. For stable and predictable builds, please pin the installation to a specific commit hash or tag.

For example:

RUN pip install --no-cache-dir --no-deps git+https://github.com/agentscope-ai/OpenJudge.git@your-commit-hash


# Clean up cache
RUN pip cache purge

# Set default command
CMD ["/bin/bash"]
84 changes: 84 additions & 0 deletions docker/README.md
@@ -0,0 +1,84 @@
# OpenJudge Docker Guide

This document describes how to deploy OpenJudge using Docker. We provide two images:

| Image | Purpose | Dockerfile |
|-------|---------|------------|
| **Base Image** | Running evaluation tasks (API calls) | `Dockerfile.base` |
| **Training Image** | Judge model SFT/RL training | `Dockerfile.train` |

---

## Option 1: Base Installation (Evaluation)

Use this option when you only need OpenJudge for evaluation, calling LLMs via API.

### 1.1 Build Image

```bash
cd OpenJudge
docker build -f docker/Dockerfile.base -t openjudge:latest .
```

### 1.2 Start Container

```bash
docker run -it \
-v $(pwd):/workspace/OpenJudge \
-e OPENAI_API_KEY=your_api_key \
--name openjudge \
openjudge:latest
```
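
If the key is already exported in your shell, it does not need to be typed on the command line: `-e NAME` without a value forwards `NAME` from the host environment. A minimal sketch under that assumption; `start_openjudge` is a hypothetical helper name, not part of OpenJudge:

```bash
# Sketch: start the base container without writing the key on the command line.
# `start_openjudge` is a hypothetical helper name.
start_openjudge() {
    # Refuse to start if the key is missing or empty on the host.
    if [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "OPENAI_API_KEY is not set" >&2
        return 1
    fi
    # `-e OPENAI_API_KEY` with no value copies the exported host variable.
    docker run -it \
        -v "$(pwd)":/workspace/OpenJudge \
        -e OPENAI_API_KEY \
        --name openjudge \
        openjudge:latest
}

# Usage:
#   export OPENAI_API_KEY=your_api_key
#   start_openjudge
```

This keeps the key out of the typed command, and the guard fails early instead of starting a container that cannot reach the API.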

---

## Option 2: Training Environment

Use this option to train Judge models (SFT/RL) with the [verl](https://github.com/volcengine/verl) framework.

### Environment Details

- **Base Image**: PAI PyTorch 2.8.0 + CUDA 12.8 + Python 3.12
- **Training Framework**: verl v0.6.1 (FSDP distributed training)
- **Inference Frameworks**: vLLM 0.11.0, SGLang 0.5.2

### 2.1 Build Image

```bash
cd OpenJudge
docker build -f docker/Dockerfile.train -t openjudge-train:latest .
```

### 2.2 Start Container

```bash
docker run --gpus all -it \
--shm-size=64g \
-v $(pwd):/workspace/OpenJudge \
-v /path/to/your/models:/models \
-v /path/to/your/data:/data \
--name openjudge-train \
openjudge-train:latest
```

**Parameter Description:**

| Parameter | Description |
|-----------|-------------|
| `--gpus all` | Expose all host GPUs to the container |
| `--shm-size=64g` | Set shared memory (`/dev/shm`) to 64 GB (required for training) |
| `-v $(pwd):/workspace/OpenJudge` | Mount the current directory at `/workspace/OpenJudge` |
| `-v /path/to/your/models:/models` | Mount your model directory at `/models` (adjust the host path) |
| `-v /path/to/your/data:/data` | Mount your data directory at `/data` (adjust the host path) |
| `--name` | Name of the container |
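
Once inside the container, it can be worth confirming that the shared-memory and GPU flags took effect before starting a long run. A minimal sketch, assuming a standard Linux userland with `df` and `awk`; `mount_at_least_gib` is a hypothetical helper name:

```bash
# Sketch: check that a mount has at least N GiB in total.
# `mount_at_least_gib` is a hypothetical helper name.
mount_at_least_gib() {
    want_gib=$1
    path=${2:-/dev/shm}
    # `df -Pk` prints POSIX-format output in 1024-byte blocks; column 2 is the total size.
    total_kib=$(df -Pk "$path" | awk 'NR==2 {print $2}')
    [ "$total_kib" -ge $((want_gib * 1024 * 1024)) ]
}

# Usage inside the training container:
#   mount_at_least_gib 64 && echo "shared memory OK"
#   nvidia-smi -L    # lists the GPUs exposed by --gpus all
```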

### 2.3 Run Training

After entering the container:

```bash
cd /workspace/OpenJudge/cookbooks/training_judge_model/sft
bash run_sft_rm.sh
```

---
48 changes: 44 additions & 4 deletions docs/building_graders/training_judge_models.md
@@ -16,11 +16,51 @@ OpenJudge provides training pipelines for building custom judge models. Each met
| **Bradley-Terry** | Scalar score | Preference pairs | ❌ No | RLHF judge modeling, ranking |
| **GRPO** | Generative (text) | Labeled responses | ✅ Yes | Interpretable evaluation with reasoning |

-**Common Requirements:**
**Environment Setup:**

-```bash
-pip install verl==0.6.1
-```
=== "Docker (Recommended)"

    Use the pre-configured training Docker image:

    ```bash
    cd OpenJudge

    # Build the training image
    docker build -f docker/Dockerfile.train -t openjudge-train:latest .

    # Run the container
    docker run --gpus all -it \
        --shm-size=64g \
        -v $(pwd):/workspace/OpenJudge \
        -v /path/to/your/models:/models \
        -v /path/to/your/data:/data \
        --name openjudge-train \
        openjudge-train:latest
    ```

=== "Manual Installation"

    For custom environments, follow the [verl installation guide](https://verl.readthedocs.io/en/latest/start/install.html):

    ```bash
    # 1. Create conda environment
    conda create -n openjudge_train python==3.12
    conda activate openjudge_train

    # 2. Install dependencies (vLLM, SGLang, etc.)
    git clone https://github.com/volcengine/verl && cd verl
    USE_MEGATRON=0 bash scripts/install_vllm_sglang_mcore.sh

    # 3. Install verl
    pip install --no-deps -e .

    # 4. Install OpenJudge (verl dependencies already installed in step 2)
    cd /path/to/OpenJudge
    pip install -e .
    ```

!!! warning "Important"
    Simply running `pip install verl` is **not sufficient**. verl requires specific versions of vLLM, SGLang, FlashAttention, and other dependencies. Please follow the full installation guide or use the Docker image.
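
To see which versions actually ended up in an environment, the installed distributions can be queried directly. A minimal sketch, assuming `python3` is on the `PATH`; `pkg_version` is a hypothetical helper name, and the distribution names in the usage comment are assumptions to adjust for your setup:

```bash
# Sketch: print the installed version of a Python distribution, or "not installed".
# `pkg_version` is a hypothetical helper name.
pkg_version() {
    python3 - "$1" <<'EOF'
import sys
from importlib.metadata import PackageNotFoundError, version

try:
    print(version(sys.argv[1]))
except PackageNotFoundError:
    print("not installed")
EOF
}

# Usage (distribution names are assumptions):
#   for pkg in verl vllm sglang flash-attn flashinfer-python; do
#       printf '%s: %s\n' "$pkg" "$(pkg_version "$pkg")"
#   done
```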


## Datasets
16 changes: 16 additions & 0 deletions docs/get_started/quickstart.md
@@ -26,6 +26,22 @@ Get started with OpenJudge in 5 minutes. This guide walks you through installati
    pip install -e .[verl]  # With VerL option for training scenarios
    ```

=== "Docker"

    ```bash
    git clone https://github.com/agentscope-ai/OpenJudge.git
    cd OpenJudge

    # Build the Docker image
    docker build -f docker/Dockerfile.base -t openjudge:latest .

    # Run the container
    docker run -it \
        -v $(pwd):/workspace/OpenJudge \
        --name openjudge \
        openjudge:latest
    ```

> **Tips:**
> OpenJudge requires Python version >=3.10 and <3.13. For best compatibility, we recommend using Python 3.10 or 3.11.
