ShindouAris/CodeSandBox

CodeSandBox

CodeSandBox is a FastAPI-based judging node for running programming submissions against test cases. It supports local problems stored on disk and ad-hoc problems submitted with test data in the request.

Important: this project is designed for Unix/Linux environments. The runtime checks for uname, python3, gcc, g++, java, and javac before starting.

Docker is required: run this project inside the provided Docker image or Docker Compose service. Running directly on the host is useful only for development on a compatible Linux machine and is not a supported runtime path.

Security warning: this service compiles and runs user-submitted code. Treat every submission as untrusted and potentially malicious. Do not run the judge directly on a host machine; use an isolated environment such as Docker, a locked-down container runtime, or a dedicated sandbox/VM boundary.

Features

  • FastAPI HTTP API for submission judging and node status.
  • Language handlers for Python, C, C++, and Java.
  • Disk-backed local problem storage under problems/.
  • Inline testcase payload support for remote/ad-hoc judging.
  • Time and memory limits per problem.
  • Automatic problem reloads when files under problems/ change.
  • Docker and Docker Compose setup for reproducible Linux execution.

Requirements

Requirement              Purpose
Python 3.12+             Runs the API and Python submissions
GCC                      Compiles C submissions
G++                      Compiles C++ submissions
Java/Javac               Runs and compiles Java submissions
Unix/Linux shell tools   Required by environment checks and process guards

Install system packages on Debian/Ubuntu:

sudo apt update
sudo apt install -y python3.12 python3-venv gcc g++ openjdk-21-jdk
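
After installing the packages, a quick check can confirm that the tools named in the startup check are on PATH. The following is a minimal sketch only: the function name and tool list are illustrative, and the project's own checker.py may work differently.

```python
import shutil

# Tools the README says are checked at startup; the real checker.py may differ.
REQUIRED_TOOLS = ["uname", "python3", "gcc", "g++", "java", "javac"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```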

Project Structure

CodeSandBox/
|-- Module/              # Language handlers
|   |-- abstract.py
|   |-- cpp.py
|   |-- gcc.py
|   |-- java.py
|   |-- python.py
|   `-- register.py
|-- Route/               # FastAPI routers
|   |-- info.py
|   `-- submit.py
|-- models/              # Pydantic models
|-- problems/            # Local problem definitions and testcases
|-- utils/               # Storage, logging, guards, safe import helpers
|-- checker.py           # Startup environment validation
|-- main.py              # Application entry point
|-- Dockerfile
|-- docker-compose.yml
|-- pyproject.toml
`-- requirements.txt

Development Setup

For development on a compatible Linux machine, create and activate a virtual environment:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Start the API for development:

python3 main.py

By default the server listens on:

http://127.0.0.1:8000

Required Docker Setup

Build and run the image:

docker build -t problem-judge .
docker run -p 8000:8000 problem-judge

Or use Docker Compose:

docker compose up -d
docker compose logs -f
docker compose down

The Docker image is based on Ubuntu 24.04, installs GCC, G++, OpenJDK 21, and Python, and starts the service with:

python3 main.py

Production Use

For production, deploy CodeSandBox only as an isolated judge worker. The API accepts source files and executes them, so the process must not share a trust boundary with application servers, databases, secrets, or other sensitive workloads.

Minimum production expectations:

  • Run the judge in Docker, a hardened container runtime, or a dedicated VM.
  • Do not mount host directories that contain secrets or unrelated application data.
  • Limit CPU, memory, process count, and disk usage at the container or orchestration layer.
  • Keep the service on a private network unless an upstream API gateway or controller validates and rate-limits requests.
  • Run as an unprivileged user and avoid privileged containers.
  • Rotate containers regularly and treat runtime directories such as run/ as disposable.

Example Docker Compose production override:

services:
  judge:
    restart: unless-stopped
    read_only: false
    user: "1000:1000"
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    pids_limit: 256
    mem_limit: 512m
    cpus: 1.0

The exact limits depend on the problems you run, but the isolation requirement does not change.

Supported Languages

The language names accepted by the module form field are registered in Module/register.py:

python3
java
c++98
c++11
c++17
c++
c11
c13
c17

Adding New Modules

Language support is implemented through handler classes in Module/. A module handler is responsible for writing the submitted file into its run directory, compiling it if needed, executing it against each testcase, and setting the submission status.

To add a new language or runtime:

  1. Create a new handler file in Module/, for example Module/ruby.py.
  2. Subclass Module.abstract.Modules.
  3. Implement compile(self) -> None.
  4. Implement test(self, input: str, output: str) -> Optional[int].
  5. Register the handler name in Module/register.py.
  6. Add the required compiler/runtime packages to Dockerfile.
  7. Rebuild the Docker image and verify GET /api/info/languages includes the new module name.

Minimal handler skeleton:

import os
from typing import Optional

from .abstract import Modules
from utils.data.enums import Status
from utils.runGuard import RunGuard
from utils.tokencompare import token_compare


class Ruby(Modules):
    def compile(self) -> None:
        # Skip setup if the submission is no longer in the Running state.
        if self.submission.status != Status.Running:
            return

        # Ruby is interpreted, so "compiling" only writes the source file
        # into the run directory.
        with open(self.workdir / "main.rb", "wb") as f:
            f.write(self.submission.file_content)

    def test(self, input: str, output: str) -> Optional[int]:
        # Apply the problem's time and memory limits to this run.
        runguard = RunGuard(
            self.problem.time_limit_sec,
            self.problem.memory_limit_mb,
        )

        stdout, stderr = runguard.run(f"ruby {self.workdir / 'main.rb'}", input)

        # The process exited on its own rather than being killed.
        if os.WIFEXITED(runguard.status):
            exit_code = os.WEXITSTATUS(runguard.status)
            if exit_code == 0:
                if token_compare(stdout, output):
                    # Passed: record the running time and return memory usage.
                    self.submission.running_time = runguard.execution_time
                    return runguard.get_memory_usage()

                # Exit code 0, but the output did not match.
                self.submission.status = Status.WrongAnswer
                self.submission.message = (
                    f"Expected: {output[:60]}, GOT: {stdout[:60]}"
                )
                return None

            # Non-zero exit code.
            self.submission.status = Status.RUNTIME_ERROR
            self.submission.message = stderr or None
            return None

        # The process did not exit normally: classify the failure.
        if runguard.is_tle():
            self.submission.status = Status.TIME_LIMIT_EXCEEDED
        elif runguard.is_mle():
            self.submission.status = Status.MEMORY_LIMIT_EXCEEDED
        else:
            self.submission.status = Status.RUNTIME_ERROR
            self.submission.message = stderr or runguard.message or None

        return None
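
The skeleton branches on os.WIFEXITED(runguard.status), which suggests that runguard.status holds a raw Unix wait status. As a rough illustration of how such a value decodes, the sketch below uses os.system (which returns a wait status on Unix) as a stand-in for the project's RunGuard; describe_status is a hypothetical helper, not part of the project.

```python
import os


def describe_status(status: int) -> str:
    """Describe a raw Unix wait status, such as the value os.system() returns."""
    if os.WIFEXITED(status):
        # Normal termination: the exit code is recoverable.
        return f"exited with code {os.WEXITSTATUS(status)}"
    if os.WIFSIGNALED(status):
        # Killed by a signal, for example when a limit is enforced.
        return f"killed by signal {os.WTERMSIG(status)}"
    return "stopped or unknown"


if __name__ == "__main__":
    print(describe_status(os.system("exit 0")))
    print(describe_status(os.system("exit 3")))
```

A non-zero exit code and a signal-terminated process are distinct cases, which is why the skeleton checks WIFEXITED before asking RunGuard whether a time or memory limit was hit.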

Register the module in Module/register.py:

import Module.ruby as ruby


def init(self):
    self.register_module(
        "ruby",
        lambda submission, problem: ruby.Ruby(submission, problem),
    )

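If a language has versions or standards, register each accepted request name separately and pass the selected variant to the handler. This assumes the handler's constructor accepts a variant keyword argument, which the minimal skeleton above does not define: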

for module in ["ruby3", "ruby"]:
    self.register_module(
        module,
        lambda submission, problem, m=module: ruby.Ruby(
            submission,
            problem,
            variant=m,
        ),
    )
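
The m=module default argument in the lambda matters: Python closures look up loop variables when they are called, not when they are created, so without it every registered name would end up with the last variant. A small self-contained sketch of the difference, where build_factories is a hypothetical stand-in and not the project's register_module:

```python
def build_factories(names, bind_early):
    """Map each name to a zero-argument factory that returns a variant name."""
    factories = {}
    for module in names:
        if bind_early:
            # The default argument captures the current value of `module`.
            factories[module] = lambda m=module: m
        else:
            # The closure reads `module` when called, after the loop has ended.
            factories[module] = lambda: module
    return factories


if __name__ == "__main__":
    names = ["ruby3", "ruby"]
    late = build_factories(names, bind_early=False)
    early = build_factories(names, bind_early=True)
    print(late["ruby3"]())   # prints "ruby": the last loop value leaks in
    print(early["ruby3"]())  # prints "ruby3": the value bound at creation
```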

Handler rules:

  • compile() should leave submission.status as Status.Running when setup succeeds.
  • Set Status.COMPILE_ERROR when compilation fails.
  • Set Status.DENIED_SUBMISSION when a source safety check rejects the file.
  • test() should return memory usage for a passed testcase, or None after setting a failure status.
  • Use RunGuard for execution so time and memory limits are enforced consistently.
  • Use token_compare(stdout, output) unless the language needs a different output comparison strategy.
  • Keep all generated files inside self.workdir; it is removed by cleanup() after the request.
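
For a handler that needs its own comparison strategy, it helps to have a baseline to compare against. The sketch below shows one common token-based approach, which ignores differences in whitespace and trailing newlines. It is an assumption about what a token comparison does in general; the project's utils.tokencompare.token_compare may behave differently.

```python
def token_compare_sketch(actual: str, expected: str) -> bool:
    """Compare two outputs as sequences of whitespace-separated tokens."""
    # str.split() with no arguments splits on any run of whitespace and
    # drops leading and trailing whitespace.
    return actual.split() == expected.split()


if __name__ == "__main__":
    print(token_compare_sketch("1 2\n3\n", "1 2 3"))  # True
    print(token_compare_sketch("1 2", "1 3"))         # False
```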

Problem Format

Local problems live in problems/<problem_id>/.

Each problem needs a config.json file:

{
  "test_cases": 2,
  "time_limit": 1,
  "storage_limit_mb": 50
}

Testcases live in problems/<problem_id>/testcases/ and are numbered from 1:

problems/print_input/
|-- config.json
`-- testcases/
    |-- 1.in
    |-- 1.out
    |-- 2.in
    `-- 2.out

test_cases controls how many numbered .in and .out pairs are loaded.
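
To make the layout concrete, here is a minimal sketch of reading a problem directory in this format. load_problem is a hypothetical helper written for illustration; the project's own loader under utils/ is not shown here and may differ.

```python
import json
from pathlib import Path


def load_problem(problem_dir: Path) -> dict:
    """Read config.json and the numbered testcase pairs it declares."""
    config = json.loads((problem_dir / "config.json").read_text())
    testcases = []
    # Testcases are numbered from 1 up to config["test_cases"], inclusive.
    for index in range(1, config["test_cases"] + 1):
        case_in = problem_dir / "testcases" / f"{index}.in"
        case_out = problem_dir / "testcases" / f"{index}.out"
        testcases.append(
            {"input": case_in.read_text(), "output": case_out.read_text()}
        )
    return {"config": config, "testcases": testcases}
```

With this approach, a test_cases value larger than the number of files on disk raises FileNotFoundError, and extra files beyond test_cases are ignored.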

API

GET /version

Returns the API version.

Example response:

{
  "version": "1.0.0"
}

GET /problem/status

Returns the judge node's current capacity.

Example response:

{
  "status": "IDLE",
  "available_slot": 16,
  "max_concurrency": 16
}

GET /api/info/problems

Returns problem IDs loaded from the problems/ directory.

Example response:

[
  "print_helloworld",
  "print_input"
]

GET /api/info/languages

Returns available submission modules.

Example response:

[
  "python3",
  "java",
  "c++98",
  "c++11",
  "c++17",
  "c++",
  "c11",
  "c13",
  "c17"
]

POST /problem/submit_problem_local

Submits a solution against a local problem from problems/.

Form fields:

Field        Type     Required   Description
id           string   Yes        Submission ID
problem_id   string   Yes        Local problem ID
module       string   Yes        Language module, such as python3 or c++17
file         file     Yes        Source file

Example:

curl -X POST http://127.0.0.1:8000/problem/submit_problem_local \
  -F "id=submission-001" \
  -F "problem_id=print_input" \
  -F "module=python3" \
  -F "file=@solution.py"
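
The same request can be sent from Python. The sketch below uses only the standard library and builds the multipart/form-data body by hand; build_multipart and submit_local are hypothetical helper names, and the filename solution.py simply mirrors the curl example.

```python
import urllib.request
import uuid


def build_multipart(fields: dict, file_field: str, filename: str, content: bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode()
        + content
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def submit_local(base_url, submission_id, problem_id, module, source: bytes):
    """POST a submission to /problem/submit_problem_local; return the raw body."""
    body, content_type = build_multipart(
        {"id": submission_id, "problem_id": problem_id, "module": module},
        "file",
        "solution.py",
        source,
    )
    request = urllib.request.Request(
        f"{base_url}/problem/submit_problem_local",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

For example, submit_local("http://127.0.0.1:8000", "submission-001", "print_input", "python3", b"print(input())") would send the equivalent of the curl command above, assuming the server is running locally.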

POST /problem/submit_problem

Submits a solution with testcases included in the request.

Form fields:

Field             Type     Required   Description
id                string   Yes        Submission ID
data_test         string   Yes        JSON testcase payload
module            string   Yes        Language module, such as python3 or c++17
file              file     Yes        Source file
time_limit_sec    int      Yes        Time limit in seconds
memory_limit_mb   int      Yes        Memory limit in MB

data_test format:

{
  "testcases": [
    {
      "input": "hello world",
      "output": "hello world"
    },
    {
      "input": "Line1\nLine2",
      "output": "Line1\nLine2"
    }
  ]
}
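
Because data_test is sent as a string field, newlines and quotes inside testcases must be escaped correctly. Generating the string with a JSON serializer avoids hand-escaping mistakes. A minimal sketch, where build_data_test is a hypothetical helper name:

```python
import json


def build_data_test(pairs):
    """Serialize (input, output) pairs into the data_test JSON string."""
    return json.dumps(
        {"testcases": [{"input": i, "output": o} for i, o in pairs]}
    )


if __name__ == "__main__":
    payload = build_data_test(
        [("hello world", "hello world"), ("Line1\nLine2", "Line1\nLine2")]
    )
    print(payload)
```

The resulting string can be passed directly as the data_test form field.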

Example:

curl -X POST http://127.0.0.1:8000/problem/submit_problem \
  -F "id=submission-002" \
  -F "module=python3" \
  -F "time_limit_sec=1" \
  -F "memory_limit_mb=128" \
  -F 'data_test={"testcases":[{"input":"hello","output":"hello"}]}' \
  -F "file=@solution.py"

Submission Response

Successful requests return a JSON result with timing and memory information.

Example accepted response:

{
  "id": "submission-001",
  "problem_name": "print_input",
  "module_used": "python3",
  "status": "Accepted",
  "message": "All testcases passed",
  "running_time": 0.15,
  "usage_memory": 34.8
}

Possible status values include:

Status                  Meaning
Accepted                All testcases passed
WrongAnswer             Output did not match expected output
RUNTIME_ERROR           Program failed at runtime
TIME_LIMIT_EXCEEDED     Program exceeded the time limit
MEMORY_LIMIT_EXCEEDED   Program exceeded the memory limit
COMPILE_ERROR           Compilation failed
DENIED_SUBMISSION       Source code was blocked by safety checks
PROBLEM_NOT_AVAILABLE   Problem could not be loaded or created
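
A client typically branches on the status field of the response. The sketch below turns a decoded response into a one-line summary; summarize_result is a hypothetical helper, and since the units of running_time and usage_memory are not stated here, the values are printed as returned. It also assumes that failed verdicts carry the same id, status, and message fields as the accepted example.

```python
def summarize_result(result: dict) -> str:
    """Build a one-line summary from a decoded submission response."""
    status = result.get("status")
    if status == "Accepted":
        return (
            f"{result['id']}: accepted "
            f"(time={result['running_time']}, memory={result['usage_memory']})"
        )
    # Any other status is treated as a failed verdict.
    message = result.get("message") or "no message"
    return f"{result['id']}: {status} ({message})"


if __name__ == "__main__":
    accepted = {
        "id": "submission-001",
        "problem_name": "print_input",
        "module_used": "python3",
        "status": "Accepted",
        "message": "All testcases passed",
        "running_time": 0.15,
        "usage_memory": 34.8,
    }
    print(summarize_result(accepted))
```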

Development Notes

  • The application entry point is main.py.
  • API routes are mounted from Route/submit.py and Route/info.py.
  • Language handlers are registered in Module/register.py.
  • Local problems are scanned on startup and reloaded when problems/ changes.
  • run/, runner/, and generated cache directories are runtime artifacts.
