feat(jobs): add volume mounting support for buckets and repos #3936
Conversation
Add a `volumes` parameter to `run_job`, `create_scheduled_job`, `run_uv_job`, and `create_scheduled_uv_job` to mount HuggingFace Buckets and Repos (models, datasets, spaces) as volumes in job containers.

- Add `JobVolume` dataclass and `JobVolumeType` enum
- Add `volumes` field to `JobInfo` and `JobSpec` responses
- Add `-v`/`--volume` CLI option with Docker-like syntax (e.g. `-v models/gpt2:/data` or `-v buckets/org/bucket:/mnt:ro`)
- Serialize volumes to camelCase for the Hub API
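As a minimal sketch of the pieces described above (the real dataclass in `_jobs_api.py` may have extra fields such as `revision` and `path`; the `to_api_payload` helper name is hypothetical, illustrating only the camelCase serialization mentioned in the description):

```python
from dataclasses import dataclass
from enum import Enum


class JobVolumeType(str, Enum):
    # Repo types plus buckets, per the PR description
    MODEL = "model"
    DATASET = "dataset"
    SPACE = "space"
    BUCKET = "bucket"


@dataclass
class JobVolume:
    type: str
    source: str
    mount_path: str
    read_only: bool = False

    def to_api_payload(self) -> dict:
        # Serialize snake_case fields to camelCase for the Hub API
        return {
            "type": self.type,
            "source": self.source,
            "mountPath": self.mount_path,
            "readOnly": self.read_only,
        }


vol = JobVolume(type=JobVolumeType.DATASET.value, source="username/my-dataset", mount_path="/data")
print(vol.to_api_payload())
```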
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
- Remove dead isinstance check in `_create_job_spec` serialization
- Add `volumes` field to `JobInfo` docstring
- Preserve original input in `_parse_volumes` error messages
- Restructure tests: parametrize, merge into existing classes, top-level imports
```python
# Parse type from source_part (first segment before /)
slash_idx = source_part.find("/")
if slash_idx == -1:
    # No slash: bare source like "gpt2:/data" -> model type
    vol_type_str = JobVolumeType.MODEL.value
    source = source_part
else:
    vol_type_str = source_part[:slash_idx]
    source = source_part[slash_idx + 1 :]
    # If the first segment isn't a known type, treat the whole thing as a model source
    # e.g. "org/my-model:/data" -> type=model, source="org/my-model"
    if vol_type_str not in _VOLUME_TYPES:
        vol_type_str = JobVolumeType.MODEL.value
        source = source_part

result.append(
    JobVolume(
        type=vol_type_str,
        source=source,
        mount_path=mount_path,
        read_only=read_only,
    )
)
```
With this change you can support revisions (including special refs) and paths in repos/buckets:
```diff
-    # Parse type from source_part (first segment before /)
-    slash_idx = source_part.find("/")
-    if slash_idx == -1:
-        # No slash: bare source like "gpt2:/data" -> model type
-        vol_type_str = JobVolumeType.MODEL.value
-        source = source_part
-    else:
-        vol_type_str = source_part[:slash_idx]
-        source = source_part[slash_idx + 1 :]
-        # If the first segment isn't a known type, treat the whole thing as a model source
-        # e.g. "org/my-model:/data" -> type=model, source="org/my-model"
-        if vol_type_str not in _VOLUME_TYPES:
-            vol_type_str = JobVolumeType.MODEL.value
-            source = source_part
-    result.append(
-        JobVolume(
-            type=vol_type_str,
-            source=source,
-            mount_path=mount_path,
-            read_only=read_only,
-        )
-    )
+    resolved_path = hffs.resolve_path(source_part)
+    if isinstance(resolved_path, HfFileSystemResolvedRepositoryPath):
+        result.append(
+            JobVolume(
+                type=resolved_path.repo_type,
+                source=resolved_path.repo_id,
+                mount_path=mount_path,
+                revision=resolved_path.revision,
+                read_only=read_only,
+                path=resolved_path.path_in_repo,
+            )
+        )
+    else:
+        result.append(
+            JobVolume(
+                type=JobVolumeType.BUCKET.value,
+                source=resolved_path.bucket_id,
+                mount_path=mount_path,
+                read_only=read_only,
+                path=resolved_path.path,
+            )
+        )
```
For example, here are supported paths:
```python
# buckets
"hf://buckets/username/bucket"
"hf://buckets/username/bucket/path"
# repos
"hf://gpt2"
"hf://user/model"
"hf://datasets/user/dataset"
"hf://user/model/path/in/repo"
"hf://user/model@revision"
"hf://user/model@refs/pr/1"
```

(it works with and without the `hf://` prefix)
You will need these imports:

```python
from huggingface_hub import hffs
from huggingface_hub.hf_file_system import HfFileSystemResolvedBucketPath, HfFileSystemResolvedRepositoryPath
```

It will also raise an error if the repo / bucket doesn't exist.
Love it! Quick question for the CLI: should we require the `hf://` prefix for the source path? To make sure it doesn't look like a local path (and in case we want to support local paths at some point).
Think this makes sense IMO. For Jobs I have quite a lot of use cases in mind where you do something like `hf jobs uv run whisper-transcribe.py some-local-dir/audiofiles.mp3`.
Summary
Add support for mounting HuggingFace Buckets and Repos (models, datasets, spaces) as volumes in Job containers.
Python API
CLI
```shell
hf jobs run -v datasets/username/my-dataset:/data -v buckets/username/my-bucket:/output python:3.12 python script.py
```

Changes

- `_jobs_api.py`: new `JobVolume` dataclass and `JobVolumeType` enum; `volumes` field added to `JobInfo`/`JobSpec`/`_create_job_spec`
- `hf_api.py`: `volumes` parameter added to `run_job`, `run_uv_job`, `create_scheduled_job`, `create_scheduled_uv_job`
- `cli/jobs.py`: `--volume`/`-v` CLI option with Docker-like syntax (`TYPE/SOURCE:/MOUNT_PATH[:ro]`)
- `__init__.py`: export `JobVolume`, `JobVolumeType`