audio_model.onnx: Protobuf parsing failed with onnxruntime-node 1.21.0 – causes OOM crash (exit 137) #93

@Bagoboga

Bug: audio_model.onnx fails with "Protobuf parsing failed" on onnxruntime-node 1.21.0

Environment

  • OS: Windows 11 with WSL2 + Docker Desktop
  • GPU: NVIDIA GeForce RTX 4070 (8GB)
  • CUDA: 12.8.0 / Driver: 591.74
  • Image: edit-mind-background-jobs:latest-gpu (v0.14.3)
  • onnxruntime-node: 1.21.0

Description
Every time background-jobs processes audio embeddings, it fails with:

Failed to process audio embedding: Error: Load model from /ml-models/embedding-models/Xenova/clap-htsat-unfused/onnx/audio_model.onnx failed: Protobuf parsing failed.

Verified

  • The file does not appear to be corrupt: 117,528,416 bytes (112.1 MiB), and it starts with a valid protobuf tag (first byte 0x08)
  • Deleting and re-downloading the model does not fix the issue
  • The error occurs on every scene, causing massive log spam and eventual OOM crash (exit 137)

File info

File size: 117528416 bytes
First bytes: 08061207
onnxruntime version: { common: '1.21.0', node: '1.21.0' }
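
For reference, the four leading bytes can be decoded by hand. This is a minimal sketch, assuming the field numbering of the public onnx.proto schema (ModelProto field 1 is ir_version, field 2 is producer_name). The helper names hexToBytes and decodeOnnxHeader are made up for illustration, and only single-byte varints are handled. It checks only the first four bytes, so it cannot rule out damage further into the file.

```typescript
interface OnnxHeader {
  irVersion: number;          // value of ModelProto field 1 (ir_version)
  producerNameLength: number; // byte length of ModelProto field 2 (producer_name)
}

// Convert a hex dump such as "08061207" into bytes.
function hexToBytes(hex: string): Uint8Array {
  const out = new Uint8Array(Math.floor(hex.length / 2));
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(hex.slice(2 * i, 2 * i + 2), 16);
  }
  return out;
}

// Decode the first two protobuf fields of a serialized ModelProto.
// Returns null when the bytes do not match the expected layout.
function decodeOnnxHeader(bytes: Uint8Array): OnnxHeader | null {
  if (bytes.length < 4) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const tag1 = view.getUint8(0);
  const irVersion = view.getUint8(1);
  const tag2 = view.getUint8(2);
  const nameLength = view.getUint8(3);

  // A protobuf tag byte is (fieldNumber << 3) | wireType.
  const isField1Varint = tag1 === ((1 << 3) | 0); // 0x08: field 1, varint
  const isField2Bytes = tag2 === ((2 << 3) | 2);  // 0x12: field 2, length-delimited

  // Simplification: values of 128 or more need multi-byte varints, not handled here.
  const singleByte = irVersion < 0x80 && nameLength < 0x80;

  if (!isField1Varint || !isField2Bytes || !singleByte) return null;
  return { irVersion, producerNameLength: nameLength };
}

const header = decodeOnnxHeader(hexToBytes("08061207"));
console.log(header);
```

Under these assumptions, 08 06 reads as ir_version = 6 and 12 07 as a 7-byte producer_name, so the file begins like a serialized ONNX model.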

Impact

  • Audio and visual embeddings fail completely
  • background-jobs crashes with OOM (exit 137) due to the model being loaded once per scene in parallel
  • Videos get stuck at "Embedding Visuals 100%" indefinitely
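
If the model really is loaded once per scene in parallel, one possible mitigation is to share a single in-flight load between all callers. This is only a sketch of that idea, not the project's actual code: memoizeLoad and the load callback are hypothetical names.

```typescript
// Wrap an async loader so that concurrent calls for the same path
// share one promise instead of each starting a separate load.
function memoizeLoad<T>(
  load: (path: string) => Promise<T>
): (path: string) => Promise<T> {
  const cache = new Map<string, Promise<T>>();
  return (path: string): Promise<T> => {
    let pending = cache.get(path);
    if (pending === undefined) {
      // First request for this path: start the load and remember the promise.
      pending = load(path);
      cache.set(path, pending);
    }
    // Later requests reuse the same promise, whether it is still pending,
    // resolved, or rejected (a failed load is not retried).
    return pending;
  };
}
```

With onnxruntime-node, the load argument could be something like `(p) => InferenceSession.create(p)`. A failing model would then be attempted once rather than once per scene, which should reduce both memory pressure and log spam.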

Question
Is there an environment variable or config option to disable audio embeddings as a workaround?
