[v4] Improve download progress tracking (model cache registry and define which files will be loaded for pipelines)#1511

Merged
xenova merged 62 commits into main from v4-cache-handler on Mar 1, 2026
Conversation


@nico-martin nico-martin commented Feb 3, 2026

Improved Download Progress Tracking

Problem

Transformers.js couldn't reliably track total download progress because:

  • File lists weren't known before downloads started
  • File sizes were inconsistent (compressed vs uncompressed)
  • No cache awareness before initiating downloads

Solution

New Exported Functions

  • get_files(): Determines required files before downloading
  • get_model_files() / get_tokenizer_files() / get_processor_files(): Helper functions to identify files for each component
  • get_file_metadata(): Fetches file metadata using Range requests without downloading full content
    • Returns fromCache boolean to identify cached files
    • Ensures consistent uncompressed file sizes
  • is_cached(): Checks if all files for a model are already in the cache
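As a sketch of how `get_file_metadata()` can recover consistent uncompressed sizes without a full download: a Range request for the first byte (`Range: bytes=0-0`) makes the server report the file's full size in the `Content-Range` header. The helper below is illustrative, not the library's internal implementation; only the header format is standard.

```javascript
// Hypothetical helper: extract the total file size from a Content-Range
// header, e.g. "bytes 0-0/1048576" -> 1048576. Servers answering a
// "Range: bytes=0-0" request include the full (uncompressed) size after
// the slash, so the size is known without downloading the body.
function parseTotalSize(contentRange) {
  const match = /^bytes \d+-\d+\/(\d+)$/.exec(contentRange ?? "");
  return match ? Number(match[1]) : null;
}

// Example header as returned for a 1 MiB file:
console.log(parseTotalSize("bytes 0-0/1048576")); // 1048576
console.log(parseTotalSize(null)); // null (header missing)
```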

Enhanced Progress Tracking

  • readResponse() with expectedSize: falls back to the metadata-derived size when the Content-Length header is missing
  • total_progress callback: Provides aggregate progress across all files
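A minimal sketch of how aggregate progress can be derived from per-file progress events, assuming per-file expected sizes are known up front (e.g. via get_file_metadata). The function and event names here are illustrative, not the library's internals.

```javascript
// Illustrative aggregator: given expected byte sizes for every file,
// fold per-file progress events of the shape { file, loaded } into a
// single percentage across the whole download.
function createTotalProgress(expectedSizes) {
  const loadedPerFile = new Map();
  const grandTotal = Object.values(expectedSizes).reduce((a, b) => a + b, 0);
  return (event) => {
    // Latest loaded count wins for each file.
    loadedPerFile.set(event.file, event.loaded);
    const loaded = [...loadedPerFile.values()].reduce((a, b) => a + b, 0);
    return (100 * loaded) / grandTotal; // aggregate percentage
  };
}

const onProgress = createTotalProgress({ "model.onnx": 800, "tokenizer.json": 200 });
console.log(onProgress({ file: "model.onnx", loaded: 400 }));     // 40
console.log(onProgress({ file: "tokenizer.json", loaded: 200 })); // 60
```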

Review

One thing I am not super confident about is the get_model_files function. I tried to test it with different model architectures, but I may have missed some that load files not covered by that function. @xenova, could you smoke-test some models and write me the models that fail?

Easiest way to do that is:

```js
import {
  get_files,
  pipeline,
} from "@huggingface/transformers";

// Files the new API predicts will be needed
const expectedFiles = await get_files(
  "onnx-community/gemma-3-270m-it-ONNX",
  {
    dtype: "fp32",
    device: "webgpu",
  }
);

// Files actually requested while loading the pipeline
const loadedFiles = new Set();
const pipe = await pipeline(
  "text-generation",
  "onnx-community/gemma-3-270m-it-ONNX",
  {
    dtype: "fp32",
    device: "webgpu",
    progress_callback: (e) => {
      if (e.file) loadedFiles.add(e.file);
    },
  }
);

console.log(
  "SAME FILES:",
  expectedFiles.sort().join(",") === Array.from(loadedFiles).sort().join(",")
);
```

Closes #1345

@nico-martin nico-martin requested a review from xenova February 3, 2026 15:24
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


@xenova xenova left a comment


Very exciting PR! 🙌 Just a quick review from scanning the PR briefly.

@xenova xenova changed the base branch from v4 to main February 13, 2026 17:03
@xenova xenova self-requested a review February 18, 2026 17:08

@xenova xenova left a comment


Solid progress! Thanks 🔥

@xenova xenova changed the title V4 cache handler [v4] Improve download progress tracking (model cache registry and define which files will be loaded for pipelines) Feb 19, 2026

@xenova xenova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huge PR! 🔥 Thanks so much @nico-martin.

@xenova xenova merged commit 4811a61 into main Mar 1, 2026
4 checks passed
@xenova xenova deleted the v4-cache-handler branch March 1, 2026 00:27


Development

Successfully merging this pull request may close these issues.

Add a supported API

4 participants