CodeSandBox is a FastAPI-based judging node for running programming submissions against test cases. It supports local problems stored on disk and ad-hoc problems submitted with test data in the request.
Important: this project is designed for Unix/Linux environments. The runtime checks for uname, python3, gcc, g++, java, and javac before starting.
Must use Docker: run this project inside the provided Docker image or Docker Compose service. Direct host execution is only useful for development on a compatible Linux machine and is not the supported runtime path.
Security warning: this service compiles and runs user-submitted code. Treat every submission as untrusted and potentially malicious. Do not run the judge directly on a host machine; use an isolated environment such as Docker, a locked-down container runtime, or a dedicated sandbox/VM boundary.
- FastAPI HTTP API for submission judging and node status.
- Language handlers for Python, C, C++, and Java.
- Disk-backed local problem storage under problems/.
- Inline testcase payload support for remote/ad-hoc judging.
- Time and memory limits per problem.
- Automatic problem reloads when files under problems/ change.
- Docker and Docker Compose setup for reproducible Linux execution.
| Requirement | Purpose |
|---|---|
| Python 3.12+ | Runs the API and Python submissions |
| GCC | Compiles C submissions |
| G++ | Compiles C++ submissions |
| Java/Javac | Runs and compiles Java submissions |
| Unix/Linux shell tools | Required by environment checks and process guards |
Install system packages on Debian/Ubuntu:
sudo apt update
sudo apt install -y python3.12 python3-venv gcc g++ openjdk-21-jdk
CodeSandBox/
|-- Module/             # Language handlers
|   |-- abstract.py
|   |-- cpp.py
|   |-- gcc.py
|   |-- java.py
|   |-- python.py
|   `-- register.py
|-- Route/              # FastAPI routers
|   |-- info.py
|   `-- submit.py
|-- models/             # Pydantic models
|-- problems/           # Local problem definitions and testcases
|-- utils/              # Storage, logging, guards, safe import helpers
|-- checker.py          # Startup environment validation
|-- main.py             # Application entry point
|-- Dockerfile
|-- docker-compose.yml
|-- pyproject.toml
`-- requirements.txt
For development on a compatible Linux machine, create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Start the API for development:
python3 main.py
By default the server listens on:
http://127.0.0.1:8000
Build and run the image:
docker build -t problem-judge .
docker run -p 8000:8000 problem-judge
Or use Docker Compose:
docker compose up -d
docker compose logs -f
docker compose down
The Docker image is based on Ubuntu 24.04, installs GCC, G++, OpenJDK 21, and Python, and starts the service with:
python3 main.py
For production, deploy CodeSandBox only as an isolated judge worker. The API accepts source files and executes them, so the process must not share a trust boundary with application servers, databases, secrets, or other sensitive workloads.
Minimum production expectations:
- Run the judge in Docker, a hardened container runtime, or a dedicated VM.
- Do not mount host directories that contain secrets or unrelated application data.
- Limit CPU, memory, process count, and disk usage at the container or orchestration layer.
- Keep the service on a private network unless an upstream API gateway or controller validates and rate-limits requests.
- Run as an unprivileged user and avoid privileged containers.
- Rotate containers regularly and treat runtime directories such as run/ as disposable.
Example Docker Compose production override:
services:
  judge:
    restart: unless-stopped
    read_only: false
    user: "1000:1000"
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    pids_limit: 256
    mem_limit: 512m
    cpus: 1.0
The exact limits depend on the problems you run, but the isolation requirement does not change.
The language names accepted by the module form field are registered in Module/register.py:
python3
java
c++98
c++11
c++17
c++
c11
c13
c17
Language support is implemented through handler classes in Module/. A module handler is responsible for writing the submitted file into its run directory, compiling it if needed, executing it against each testcase, and setting the submission status.
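The handler contract described above can be illustrated with a minimal sketch of a judging loop. It is not the project's actual driver code: `judge_all` and `echo_test` are hypothetical names, and stopping at the first failed testcase and reporting the peak memory value are assumptions made for the example.

```python
from typing import Callable, Optional


def judge_all(
    testcases: list[tuple[str, str]],
    run_test: Callable[[str, str], Optional[int]],
) -> Optional[int]:
    """Run every testcase; return peak memory, or None on the first failure."""
    peak_memory = 0
    for stdin_data, expected in testcases:
        memory = run_test(stdin_data, expected)
        if memory is None:
            # In this sketch the handler has already recorded a failure
            # status, so the loop stops early (assumed behavior).
            return None
        peak_memory = max(peak_memory, memory)
    return peak_memory


def echo_test(stdin_data: str, expected: str) -> Optional[int]:
    """Stand-in for a handler's test(): passes when input equals expected."""
    return 1024 if stdin_data == expected else None


print(judge_all([("a", "a"), ("b", "b")], echo_test))  # 1024
print(judge_all([("a", "a"), ("b", "x")], echo_test))  # None
```

The stand-in mirrors the documented return convention of test(): a memory figure for a passed testcase, or None once a failure has been recorded.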
To add a new language or runtime:
- Create a new handler file in Module/, for example Module/ruby.py.
- Subclass Module.abstract.Modules.
- Implement compile(self) -> None.
- Implement test(self, input: str, output: str) -> Optional[int].
- Register the handler name in Module/register.py.
- Add the required compiler/runtime packages to Dockerfile.
- Rebuild the Docker image and verify GET /api/info/languages includes the new module name.
Minimal handler skeleton:
import os
from typing import Optional

from .abstract import Modules
from utils.data.enums import Status
from utils.runGuard import RunGuard
from utils.tokencompare import token_compare


class Ruby(Modules):
    def compile(self) -> None:
        # Nothing to do if an earlier step already changed the status.
        if self.submission.status != Status.Running:
            return
        # Ruby is interpreted, so "compiling" only writes the source file.
        with open(self.workdir / "main.rb", "wb") as f:
            f.write(self.submission.file_content)

    def test(self, input: str, output: str) -> Optional[int]:
        runguard = RunGuard(
            self.problem.time_limit_sec,
            self.problem.memory_limit_mb,
        )
        stdout, stderr = runguard.run(f"ruby {self.workdir / 'main.rb'}", input)

        # The process exited on its own (not killed by a signal).
        if os.WIFEXITED(runguard.status):
            exit_code = os.WEXITSTATUS(runguard.status)
            if exit_code == 0:
                if token_compare(stdout, output):
                    self.submission.running_time = runguard.execution_time
                    return runguard.get_memory_usage()
                self.submission.status = Status.WrongAnswer
                self.submission.message = (
                    f"Expected: {output[:60]}, GOT: {stdout[:60]}"
                )
                return None
            self.submission.status = Status.RUNTIME_ERROR
            self.submission.message = stderr or None
            return None

        # The process was terminated: classify by the limit that was hit.
        if runguard.is_tle():
            self.submission.status = Status.TIME_LIMIT_EXCEEDED
        elif runguard.is_mle():
            self.submission.status = Status.MEMORY_LIMIT_EXCEEDED
        else:
            self.submission.status = Status.RUNTIME_ERROR
        self.submission.message = stderr or runguard.message or None
        return None
Register the module in Module/register.py:
import Module.ruby as ruby

def init(self):
    self.register_module(
        "ruby",
        lambda submission, problem: ruby.Ruby(submission, problem),
    )
If a language has versions or standards, register each accepted request name separately and pass the selected variant to the handler:
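for module in ["ruby3", "ruby"]:
    self.register_module(
        module,
        # m=module binds the current name when the lambda is defined;
        # a plain closure would see only the last value of the loop.
        lambda submission, problem, m=module: ruby.Ruby(
            submission,
            problem,
            variant=m,
        ),
    )
Handler rules: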
- compile() should leave submission.status as Status.Running when setup succeeds.
- Set Status.COMPILE_ERROR when compilation fails.
- Set Status.DENIED_SUBMISSION when a source safety check rejects the file.
- test() should return memory usage for a passed testcase, or None after setting a failure status.
- Use RunGuard for execution so time and memory limits are enforced consistently.
- Use token_compare(stdout, output) unless the language needs a different output comparison strategy.
- Keep all generated files inside self.workdir; it is removed by cleanup() after the request.
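To make the idea of token-based output comparison concrete, here is a minimal sketch. It is an assumption about what such a comparison typically does, not the source of utils.tokencompare.token_compare; the name `token_compare_sketch` is hypothetical, and the real helper may treat case, numeric precision, or blank lines differently.

```python
def token_compare_sketch(actual: str, expected: str) -> bool:
    """Compare two outputs token by token, ignoring whitespace differences."""
    # str.split() with no arguments splits on any run of whitespace and drops
    # leading and trailing whitespace, so trailing newlines do not matter.
    return actual.split() == expected.split()


print(token_compare_sketch("1 2 3\n", "1  2 3"))  # True
print(token_compare_sketch("1 2 3", "1 2 4"))     # False
```

A comparison of this kind accepts submissions that differ from the expected output only in spacing or trailing newlines, which is why a language with stricter output requirements might need a different strategy.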
Local problems live in problems/<problem_id>/.
Each problem needs a config.json file:
{
  "test_cases": 2,
  "time_limit": 1,
  "storage_limit_mb": 50
}
Testcases live in problems/<problem_id>/testcases/ and are numbered from 1:
problems/print_input/
|-- config.json
`-- testcases/
    |-- 1.in
    |-- 1.out
    |-- 2.in
    `-- 2.out
test_cases controls how many numbered .in and .out pairs are loaded.
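The layout above can be read with a few lines of Python. The following is a sketch under stated assumptions, not the project's storage code: `load_problem` and `make_demo_problem` are hypothetical helpers, and the demo builds a throwaway problem directory in a temporary folder so that the example runs anywhere.

```python
import json
import tempfile
from pathlib import Path


def load_problem(problem_dir: Path) -> tuple[dict, list[tuple[str, str]]]:
    """Read config.json and the numbered .in/.out pairs it declares."""
    config = json.loads((problem_dir / "config.json").read_text(encoding="utf-8"))
    testcases = []
    # Testcases are numbered from 1 up to the test_cases value.
    for index in range(1, config["test_cases"] + 1):
        case_dir = problem_dir / "testcases"
        case_in = (case_dir / f"{index}.in").read_text(encoding="utf-8")
        case_out = (case_dir / f"{index}.out").read_text(encoding="utf-8")
        testcases.append((case_in, case_out))
    return config, testcases


def make_demo_problem(root: Path) -> Path:
    """Create a throwaway problem directory that follows the layout above."""
    problem_dir = root / "print_input"
    (problem_dir / "testcases").mkdir(parents=True)
    (problem_dir / "config.json").write_text(
        json.dumps({"test_cases": 2, "time_limit": 1, "storage_limit_mb": 50}),
        encoding="utf-8",
    )
    for index, text in enumerate(["hello\n", "world\n"], start=1):
        (problem_dir / "testcases" / f"{index}.in").write_text(text, encoding="utf-8")
        (problem_dir / "testcases" / f"{index}.out").write_text(text, encoding="utf-8")
    return problem_dir


with tempfile.TemporaryDirectory() as tmp:
    config, cases = load_problem(make_demo_problem(Path(tmp)))
    print(config["test_cases"], len(cases))  # 2 2
```

In this sketch a missing numbered file raises FileNotFoundError, which makes a mismatch between test_cases and the files on disk visible immediately.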
Returns the API version.
Example response:
{
  "version": "1.0.0"
}
Returns current judge node capacity.
Example response:
{
  "status": "IDLE",
  "available_slot": 16,
  "max_concurrency": 16
}
Returns problem IDs loaded from the problems/ directory.
Example response:
[
  "print_helloworld",
  "print_input"
]
Returns available submission modules.
Example response:
[
  "python3",
  "java",
  "c++98",
  "c++11",
  "c++17",
  "c++",
  "c11",
  "c13",
  "c17"
]
Submits a solution against a local problem from problems/.
Form fields:
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Submission ID |
| problem_id | string | Yes | Local problem ID |
| module | string | Yes | Language module, such as python3 or c++17 |
| file | file | Yes | Source file |
Example:
curl -X POST http://127.0.0.1:8000/problem/submit_problem_local \
-F "id=submission-001" \
-F "problem_id=print_input" \
-F "module=python3" \
-F "file=@solution.py"
Submits a solution with testcases included in the request.
Form fields:
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Submission ID |
| data_test | string | Yes | JSON testcase payload |
| module | string | Yes | Language module, such as python3 or c++17 |
| file | file | Yes | Source file |
| time_limit_sec | int | Yes | Time limit in seconds |
| memory_limit_mb | int | Yes | Memory limit in MB |
data_test format:
{
  "testcases": [
    {
      "input": "hello world",
      "output": "hello world"
    },
    {
      "input": "Line1\nLine2",
      "output": "Line1\nLine2"
    }
  ]
}
Example:
curl -X POST http://127.0.0.1:8000/problem/submit_problem \
-F "id=submission-002" \
-F "module=python3" \
-F "time_limit_sec=1" \
-F "memory_limit_mb=128" \
-F 'data_test={"testcases":[{"input":"hello","output":"hello"}]}' \
-F "file=@solution.py"
Successful requests return a JSON result with timing and memory information.
Example accepted response:
{
  "id": "submission-001",
  "problem_name": "print_input",
  "module_used": "python3",
  "status": "Accepted",
  "message": "All testcases passed",
  "running_time": 0.15,
  "usage_memory": 34.8
}
Possible status values include:
| Status | Meaning |
|---|---|
| Accepted | All testcases passed |
| WrongAnswer | Output did not match expected output |
| RUNTIME_ERROR | Program failed at runtime |
| TIME_LIMIT_EXCEEDED | Program exceeded the time limit |
| MEMORY_LIMIT_EXCEEDED | Program exceeded the memory limit |
| COMPILE_ERROR | Compilation failed |
| DENIED_SUBMISSION | Source code was blocked by safety checks |
| PROBLEM_NOT_AVAILABLE | Problem could not be loaded or created |
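A client that consumes these results might parse the response body into a small typed object. The sketch below is illustrative only: `JudgeResult` and `parse_result` are hypothetical names, the field names come from the example accepted response, and reading message, running_time, and usage_memory as optional is a defensive assumption, since the fields returned for failed submissions are not documented here.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class JudgeResult:
    id: str
    problem_name: str
    module_used: str
    status: str
    message: Optional[str]
    running_time: Optional[float]
    usage_memory: Optional[float]

    @property
    def accepted(self) -> bool:
        # "Accepted" is the only status that means every testcase passed.
        return self.status == "Accepted"


def parse_result(body: str) -> JudgeResult:
    """Parse a JSON response body into a JudgeResult."""
    data = json.loads(body)
    return JudgeResult(
        id=data["id"],
        problem_name=data["problem_name"],
        module_used=data["module_used"],
        status=data["status"],
        # Assumption: these fields may be missing or null on failures.
        message=data.get("message"),
        running_time=data.get("running_time"),
        usage_memory=data.get("usage_memory"),
    )


body = json.dumps({
    "id": "submission-001",
    "problem_name": "print_input",
    "module_used": "python3",
    "status": "Accepted",
    "message": "All testcases passed",
    "running_time": 0.15,
    "usage_memory": 34.8,
})
result = parse_result(body)
print(result.accepted, result.running_time)  # True 0.15
```

Comparing against the status string keeps the client independent of the service's internal enum, at the cost of needing an update if the status names change.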
- The application entry point is main.py.
- API routes are mounted from Route/submit.py and Route/info.py.
- Language handlers are registered in Module/register.py.
- Local problems are scanned on startup and reloaded when problems/ changes.
- run/, runner/, and generated cache directories are runtime artifacts.