QVAC-20424 feat[api]: img2vid for POST /v1/videos by lauripiisang · Pull Request #2481 · tetherto/qvac

lauripiisang · 2026-06-08T12:22:45Z

🎯 What problem does this PR solve?

POST /v1/videos only supported text-to-video; callers had no way to animate a still image via the CLI OpenAI server.

📝 How does it solve it?

Accepts multipart/form-data on POST /v1/videos alongside the existing JSON body. This is not a breaking change: JSON txt2vid requests continue to work unchanged.
Mode is inferred from the presence of an init_image file field: provided → img2vid, absent → txt2vid.
strength (0–1) controls denoise intensity for img2vid; it is coerced from a string because multipart form fields arrive as text.
Invalid strength values return 400 invalid_strength.
Updated packages/cli/docs/serve-openai.md and docs/website/content/docs/ai-capabilities/video-generation.mdx to document both modes and the I2V model family.
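The mode inference and strength handling described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `VideoMode`, `inferMode`, `parseStrength`, and `InvalidStrengthError` are hypothetical names, and treating the 0–1 range as inclusive at both ends is an assumption.

```typescript
// Hypothetical names throughout; none of these identifiers come from the PR's source.
type VideoMode = 'txt2vid' | 'img2vid'

// Carries the 400 invalid_strength error named in the description.
class InvalidStrengthError extends Error {
  readonly status = 400
  readonly code = 'invalid_strength'
}

// Mode follows from whether an init_image file part was uploaded.
function inferMode(initImage: Uint8Array | undefined): VideoMode {
  return initImage !== undefined ? 'img2vid' : 'txt2vid'
}

// Multipart fields arrive as strings, so numeric strings are coerced first.
// Assumption: both 0 and 1 are accepted.
function parseStrength(raw: unknown): number | undefined {
  if (raw === undefined) return undefined
  const value = typeof raw === 'string' && raw.trim() !== '' ? Number(raw) : raw
  if (typeof value !== 'number' || !Number.isFinite(value) || value < 0 || value > 1) {
    throw new InvalidStrengthError(`strength must be a number between 0 and 1, got ${String(raw)}`)
  }
  return value
}
```

A handler built along these lines would call `parseStrength` on the raw field before passing parameters on, and map `InvalidStrengthError` to the 400 response.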

🧪 How was it tested?

Unit tests cover schema acceptance/rejection of init_image and strength, extractVideoCreateParams mode selection, strength coercion and range validation (371/371 pass).
TypeScript compilation was verified against the dev SDK build from the merge run of PR #2436 (QVAC-19845 feat[bc|api]: add img2vid (image-to-video) support to video generation in SDK). The dev build was installed locally as a package alias (not committed) with npm install @qvac/sdk@npm:@tetherto/sdk-mono@0.12.2-tmp.runid-27142936775, and npx tsc --noEmit passed cleanly. package.json was restored before committing.
E2E / functional tests are not included: actual img2vid execution requires a model loaded with clipVisionModelSrc, which needs hardware. These should be added as a follow-up.

🔌 API Changes

POST /v1/videos — new optional fields (multipart only for init_image):

// txt2vid (unchanged):
fetch('/v1/videos', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ model: 'wan', prompt: 'a cat surfing' })
})

// img2vid (new):
// imageBlob is a Blob or File holding the source image.
const form = new FormData()
form.append('model', 'wan')
form.append('prompt', 'the subject slowly turns and smiles')
form.append('init_image', imageBlob, 'frame.png')
form.append('strength', '0.85')
// No Content-Type header here: fetch sets the multipart boundary itself.
fetch('/v1/videos', { method: 'POST', body: form })
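For callers that want a single entry point for both modes, the two requests above could be wrapped as below. This is only a sketch: `buildVideoRequest`, `VideoRequestInput`, and `VideoRequestInit` are hypothetical and not part of the PR. The route and the field names (model, prompt, init_image, strength) are the only parts taken from the description.

```typescript
// Hypothetical client-side helper; not part of the PR.
interface VideoRequestInput {
  model: string
  prompt: string
  initImage?: Blob
  strength?: number
}

interface VideoRequestInit {
  method: 'POST'
  headers?: Record<string, string>
  body: string | FormData
}

function buildVideoRequest(input: VideoRequestInput): VideoRequestInit {
  if (input.initImage === undefined) {
    // txt2vid: plain JSON body, same shape as before the PR.
    return {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model: input.model, prompt: input.prompt })
    }
  }
  // img2vid: multipart body. Content-Type is left unset so that fetch
  // can add the multipart boundary itself.
  const form = new FormData()
  form.append('model', input.model)
  form.append('prompt', input.prompt)
  form.append('init_image', input.initImage, 'frame.png')
  if (input.strength !== undefined) form.append('strength', String(input.strength))
  return { method: 'POST', body: form }
}
```

The result can be passed straight to `fetch('/v1/videos', buildVideoRequest({ ... }))`; the presence of `initImage` alone decides which body type is sent, mirroring how the server infers the mode.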

⚠️ Merge blocker

Do not merge until @qvac/sdk is published with img2vid support (from PR #2436). The CLI routes VideoClientParams through the SDK at runtime, so without the released types and execution pipeline, img2vid requests will fail at the SDK call site. Once a public SDK release including those changes is available, update @qvac/sdk in packages/cli/package.json to that version and re-run bun run build before merging.

…iption, discovery endpoint smoke tests
- Correct audio/x-pcm → audio/L16; rate=<sr>; channels=1 in OpenAPI description
- Update serve-openai.md: response_format table, headers table, error table, route index, add /v1/audio/voices and /v1/audio/models sections
- Remove stale "wav + pcm only" caveat from README.md
- Add explicit BATS smoke tests for GET /v1/audio/models and GET /v1/audio/voices
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

lauripiisang and others added 9 commits June 4, 2026 19:29

QVAC-18807 feat[api]: ffmpeg-backed mp3/opus/aac/flac encoding for POST /v1/audio/speech

c6984f2

Adds GET /v1/audio/voices and /v1/audio/models discovery endpoints (QVAC-17706 Open WebUI gap) alongside the encoding feature. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

QVAC-20424 feat[api]: img2vid for POST /v1/videos

d08ef85

QVAC-20424 test: img2vid unit tests for video schema

e821c42

QVAC-20424 doc: document img2vid video generation

68bb24b

QVAC-20424 fix: type direct params as Record to avoid union inference with exactOptionalPropertyTypes

a9689d3

QVAC-20424 fix: add clipVisionModelSrc to sdcpp-video constant resolution

f858065

QVAC-20424 fix: coerce numeric video body fields for multipart compatibility

842f241

QVAC-20424 fix: require video size to be multiples of 16 to match diffusion-cpp

9ae8fe2
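The last fix in the list requires the video size to be multiples of 16. A check of that kind might look like the sketch below. It assumes the size arrives as a "WIDTHxHEIGHT" string, which the commit title does not confirm, and `isValidVideoSize` is a hypothetical name rather than the function used in the PR.

```typescript
// Hypothetical validator; the "WIDTHxHEIGHT" string format is an assumption.
function isValidVideoSize(size: string): boolean {
  const match = /^(\d+)x(\d+)$/.exec(size)
  if (match === null) return false
  const width = Number(match[1])
  const height = Number(match[2])
  // Both dimensions must be positive multiples of 16.
  return width > 0 && height > 0 && width % 16 === 0 && height % 16 === 0
}
```

Under these assumptions `'832x480'` passes, while `'500x500'` is rejected because 500 is not divisible by 16.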

lauripiisang wants to merge 9 commits into main from worktree-qvac-20424
