Open-source CLI that turns one short topic into a photoreal 6–10 slide carousel for TikTok / Instagram / LinkedIn. Optionally stitches the slides into a vertical 9:16 MP4.
cd into this folder, launch your coding agent, say "build me a carousel about X" — and walk away. The workflow lives in AGENTS.md (read by most agents) and CLAUDE.md (Claude Code). Claude Code users also get an auto-triggering skill at .claude/skills/carousel/SKILL.md.
No API keys. Uses OAuth-authed local CLIs against subscriptions you already pay for.
| 📱 Candid iPhone aesthetic | Looks like a human grabbed a phone and shot a quick frame. No glossy retouch. No Shutterstock vibe. |
| ✍️ Hand-typed captions | When text appears, it looks like someone typed it on TikTok — not like an AI overlay. No rounded white pill boxes. |
| 🔑 No API keys | Image gen runs through Codex CLI (OAuth against your ChatGPT account) or Higgsfield CLI (your existing sub). |
| 🤖 Agent-agnostic | Drop into Claude Code, Codex CLI, Cursor, Aider, or Gemini CLI. The workflow lives in AGENTS.md. |
| 🪄 Open source | Clone, run, ship. Or fork the prompts and presets. |
brew install --cask codex # GPT Image 2 — free, OAuth
brew install ffmpeg # video assembly
brew install imagemagick # captions
codex login # browser OAuth, no API key

Optional (only if you want Nano Banana 2 / FLUX.2 / Soul):

npm i -g @higgsfield/cli
higgsfield auth login

Then any coding agent works:
- Claude Code: reads `CLAUDE.md` + auto-fires the skill
- Codex CLI: reads `AGENTS.md`
- Cursor, Aider, Gemini CLI, or any agent that supports `AGENTS.md`
cd carousel-factory
claude # or: codex, cursor-agent, aider, gemini, etc.

Then in the chat:

build me a carousel about the new Claude Haiku 4.5 launch
Your agent reads AGENTS.md (or CLAUDE.md + .claude/skills/carousel/SKILL.md for Claude Code), then:
- 🔎 Researches the topic with its built-in web tool.
- 🧭 Plans an 8-slide narrative (hook → friction → proof → payoff).
- 🖼️ Generates each slide via Codex CLI into `output/<slug>/`.
- ✍️ Captions any slides that need text (hand-typed look, no boxes).
- 📝 Writes a post caption (`output/<slug>/caption.txt`).
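
The `<slug>` in those paths is a filesystem-safe version of your topic. The exact rule is up to your agent; the sketch below is one plausible way to derive it, and `slugify` is a hypothetical helper, not a script that ships in this repo.

```shell
#!/bin/sh
# Hypothetical helper: lowercase the topic, collapse every run of
# non-alphanumeric characters into one hyphen, trim stray hyphens.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

slug=$(slugify "the new Claude Haiku 4.5 launch")
echo "output/${slug}/"   # prints output/the-new-claude-haiku-4-5-launch/
```

If your agent picks a different slug, trust the folder it actually created.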
Say "and make a video" to also get a 9:16 MP4 (Ken Burns pan, crossfades, optional music).
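
For the curious, the crossfade timing comes down to simple arithmetic: each transition has to start one fade-length before the video built so far ends. The sketch below only prints an ffmpeg `xfade` filter graph and does not run ffmpeg. The function name and the slide and fade durations are assumptions for illustration; `scripts/assemble-video.sh` may build its graph differently, and the Ken Burns pan is left out here.

```shell
#!/bin/sh
# Sketch only: prints a filter graph, never invokes ffmpeg.
# Usage: xfade_graph <slide_count> <seconds_per_slide> <fade_seconds>
xfade_graph() (
  n=$1; slide_secs=$2; fade_secs=$3
  i=1; prev="0:v"; graph=""
  while [ "$i" -lt "$n" ]; do
    # Transition i starts where the running output has fade_secs left.
    offset=$(( i * (slide_secs - fade_secs) ))
    graph="${graph}[${prev}][${i}:v]xfade=transition=fade:duration=${fade_secs}:offset=${offset}[v${i}];"
    prev="v${i}"
    i=$(( i + 1 ))
  done
  printf '%s\n' "${graph%;}"
)

# Three slides, 4 seconds each, 1 second crossfade (assumed values).
xfade_graph 3 4 1
# prints [0:v][1:v]xfade=transition=fade:duration=1:offset=3[v1];[v1][2:v]xfade=transition=fade:duration=1:offset=6[v2]
```

To try it by hand, load each still with `-loop 1 -t 4 -i slide-0N.png`, pass the printed string to `-filter_complex`, and map the last label (`[v2]` in this example) to the output.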
output/<slug>/
├── plan.md # slide-by-slide narrative
├── research.md # facts pulled from web research
├── prompts/
│ ├── slide-01.txt
│ ├── slide-02.txt
│ └── ...
├── slide-01.png
├── slide-02.png
├── ...
├── slide-01-captioned.png # only if caption was added
├── carousel.mp4 # only if you asked for video
└── caption.txt # post caption + hashtags
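
Because captioned slides sit next to their raw versions, anything that posts or stitches the carousel needs to pick one file per slide. A minimal sketch, assuming the file naming shown in the tree above; `final_slides` is a hypothetical helper, not part of the shipped scripts.

```shell
#!/bin/sh
# Sketch: print one file per slide, preferring the captioned version.
# Usage: final_slides <carousel_dir>
final_slides() (
  for raw in "$1"/slide-[0-9][0-9].png; do
    [ -e "$raw" ] || continue              # glob matched nothing
    captioned="${raw%.png}-captioned.png"
    if [ -e "$captioned" ]; then
      printf '%s\n' "$captioned"
    else
      printf '%s\n' "$raw"
    fi
  done
)
```

Call it as `final_slides "output/$slug"` to get the upload order, one path per line.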
| Path | What it does |
|---|---|
| `AGENTS.md` | Agent-agnostic workflow (Codex, Cursor, Aider, Gemini CLI, etc.) |
| `CLAUDE.md` | Claude Code mirror of AGENTS.md (auto-loaded) |
| `.claude/skills/carousel/SKILL.md` | Auto-triggering Claude Code skill |
| `prompts/realism-engine.md` | Photoreal camera, lighting, texture rules |
| `prompts/theme-builder.md` | How to compose a new theme |
| `prompts/slide-archetypes.md` | Narrative templates (how-to, mistakes, before-after, …) |
| `prompts/caption-rules.md` | When to add text and what it should look like |
| `presets/*.json` | Reusable theme presets (candid-iphone, night-desk, coffee-shop, golden-hour) |
| `scripts/gen-image.sh` | Codex CLI image gen wrapper |
| `scripts/gen-image-higgsfield.sh` | Optional Higgsfield wrapper (Nano Banana 2, Soul) |
| `scripts/caption-overlay.sh` | ImageMagick hand-typed caption |
| `scripts/assemble-video.sh` | ffmpeg 9:16 video stitcher with Ken Burns + xfade |
| `examples/` | Sample runs you can read for prompt patterns |
| `output/` | Where generated carousels land (gitignored) |
| `quickstart.html` | Single-page giveaway guide for humans |
- New theme → copy `presets/candid-iphone.json` to `presets/<your-name>.json`, edit, save. Reference it by name: "use the cozy-bedroom preset."
- New archetype → add a section to `prompts/slide-archetypes.md`.
- Different image model → tell your agent "use Higgsfield with Nano Banana 2" and it routes through `scripts/gen-image-higgsfield.sh`.
- No-text carousel → say "no text overlays" and only slide 1 hook gets text.
- Different coding agent → all workflow lives in `AGENTS.md`. Any agent that respects that file picks it up automatically.
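
The "new theme" step is just a file copy, so a small guard against overwriting an existing preset is the only thing worth scripting. A minimal sketch, meant to be run from the repo root; `new_preset` is a hypothetical helper, and it does not touch the JSON contents, which you still edit by hand.

```shell
#!/bin/sh
# Sketch: start a new theme from the shipped candid-iphone preset.
# Usage: new_preset <name>   (run from the repo root)
new_preset() {
  src="presets/candid-iphone.json"
  dest="presets/$1.json"
  [ -n "$1" ] || { echo "usage: new_preset <name>" >&2; return 2; }
  [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
  # Refuse to clobber a preset you already edited.
  [ ! -e "$dest" ] || { echo "$dest already exists" >&2; return 1; }
  cp "$src" "$dest" && echo "created $dest"
}
```

After `new_preset cozy-bedroom`, edit `presets/cozy-bedroom.json` and ask your agent to "use the cozy-bedroom preset."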
The shipped templates contain no creator names, private paths, API keys, analytics, or remote dependencies. Everything runs locally. output/ is gitignored.
| Symptom | Fix |
|---|---|
| `codex: command not found` | `brew install --cask codex && codex login` |
| `ffmpeg: command not found` | `brew install ffmpeg` |
| `magick: command not found` | `brew install imagemagick` |
| Codex says "rate limit" | Wait, or `codex login` against a Plus/Pro account |
| Images look glossy / AI | Tell your agent "re-roll slide N with anti-AI rescue" and the realism-engine rescue prompt gets applied |
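
The three "command not found" rows can be checked in one pass before you start a run. A minimal preflight sketch; `missing_tools` is a hypothetical helper, and it only checks that each binary is on your `PATH`, not that you are logged in.

```shell
#!/bin/sh
# Sketch: print every named tool that is not on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools codex ffmpeg magick)
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "Not installed:" $missing
else
  echo "All required tools found."
fi
```

Add `higgsfield` to the list if you use the optional Higgsfield route.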
MIT. Fork it, remix it, ship your own version.
—
Made for creators who'd rather press one button than open Photoshop.

