noCode-Human/carousel-factory

🎞️ Carousel Factory

One topic in. Photoreal 9:16 carousel out. Built for any coding agent. Zero API keys.


Open-source CLI that turns one short topic into a photoreal 6–10 slide carousel for TikTok / Instagram / LinkedIn. Optionally stitches the slides into a vertical 9:16 MP4.

cd into this folder, launch your coding agent, say "build me a carousel about X" — and walk away. The workflow lives in AGENTS.md (read by most agents) and CLAUDE.md (Claude Code). Claude Code users also get an auto-triggering skill at .claude/skills/carousel/SKILL.md.

No API keys. Uses OAuth-authed local CLIs against subscriptions you already pay for.

✨ What makes it different

| | |
| --- | --- |
| 📱 Candid iPhone aesthetic | Looks like a human grabbed a phone and shot a quick frame. No glossy retouch. No Shutterstock vibe. |
| ✍️ Hand-typed captions | When text appears, it looks like someone typed it on TikTok, not like an AI overlay. No rounded white pill boxes. |
| 🔑 No API keys | Image gen runs through Codex CLI (OAuth against your ChatGPT account) or Higgsfield CLI (your existing sub). |
| 🤖 Agent-agnostic | Drop into Claude Code, Codex CLI, Cursor, Aider, or Gemini CLI. The workflow lives in AGENTS.md. |
| 🪄 Open source | Clone, run, ship. Or fork the prompts and presets. |

🛠️ Install (one-time)

brew install --cask codex      # GPT Image 2 — free, OAuth
brew install ffmpeg            # video assembly
brew install imagemagick       # captions
codex login                    # browser OAuth, no API key
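
If you want to confirm the install before launching an agent, a small preflight check can help. This is a minimal sketch, not a script shipped in the repo: the function name `check_tools` is hypothetical, and it only checks that the binaries are on your PATH, not that `codex login` succeeded.

```shell
#!/usr/bin/env bash
# Preflight sketch (hypothetical helper, not part of the repo):
# report which of the given CLIs are missing from PATH.

check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok       $tool"
    else
      echo "missing  $tool"
      missing=1
    fi
  done
  # Exit status is 0 only when every tool was found.
  return "$missing"
}

# Usage, mirroring the install commands above:
#   check_tools codex ffmpeg magick
```

A non-zero exit status means at least one tool still needs installing; the Troubleshooting section lists the matching brew command for each.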

Optional (only if you want Nano Banana 2 / FLUX.2 / Soul):

npm i -g @higgsfield/cli
higgsfield auth login

After that, any coding agent works.

🚀 Use it

cd carousel-factory
claude        # or: codex, cursor-agent, aider, gemini, etc.

Then in the chat:

build me a carousel about the new Claude Haiku 4.5 launch

Your agent reads AGENTS.md (or CLAUDE.md + .claude/skills/carousel/SKILL.md for Claude Code), then:

  1. 🔎 Researches the topic with its built-in web tool.
  2. 🧭 Plans an 8-slide narrative (hook → friction → proof → payoff).
  3. 🖼️ Generates each slide via Codex CLI into output/<slug>/.
  4. ✍️ Captions any slides that need text (hand-typed look, no boxes).
  5. 📝 Writes a post caption (output/<slug>/caption.txt).

Say "and make a video" to also get a 9:16 MP4 (Ken Burns pan, crossfades, optional music).
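
For the curious, the Ken Burns pan and crossfades map onto ffmpeg's `zoompan` and `xfade` filters. The sketch below is an assumption about how such a filter graph could be built, not the contents of scripts/assemble-video.sh. The function name `build_filter`, the zoom rate, and the 1080x1920 output size are illustrative choices, and durations must be whole seconds because the shell arithmetic is integer-only.

```shell
#!/usr/bin/env bash
# Sketch: build an ffmpeg -filter_complex string for N still images.
# Hypothetical helper; the real assemble-video.sh may work differently.

build_filter() {
  # $1 = slide count, $2 = seconds per slide, $3 = crossfade seconds, $4 = fps
  local n=$1 dur=$2 fade=$3 fps=$4
  local frames=$((dur * fps)) graph="" prev i

  # Ken Burns: turn each single-frame input into a slow zoom of `frames` frames.
  i=0
  while [ "$i" -lt "$n" ]; do
    graph="${graph}[${i}:v]zoompan=z='min(zoom+0.0005,1.1)':d=${frames}:s=1080x1920:fps=${fps}[v${i}];"
    i=$((i + 1))
  done

  # Crossfades: the k-th transition starts at k * (dur - fade) seconds,
  # because every earlier transition overlaps two clips by `fade` seconds.
  prev="v0"
  i=1
  while [ "$i" -lt "$n" ]; do
    graph="${graph}[${prev}][v${i}]xfade=transition=fade:duration=${fade}:offset=$((i * (dur - fade)))[x${i}];"
    prev="x${i}"
    i=$((i + 1))
  done

  # Final label is always [out], in a pixel format most players accept.
  printf '%s[%s]format=yuv420p[out]\n' "$graph" "$prev"
}

# Usage with three slides, 4 s each, 1 s crossfades, 25 fps:
#   ffmpeg -i slide-01.png -i slide-02.png -i slide-03.png \
#     -filter_complex "$(build_filter 3 4 1 25)" -map "[out]" carousel.mp4
```

Printing the graph instead of running ffmpeg directly makes it easy to inspect the offsets before committing to a render.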

📦 What you get

output/<slug>/
├── plan.md                   # slide-by-slide narrative
├── research.md               # facts pulled from web research
├── prompts/
│   ├── slide-01.txt
│   ├── slide-02.txt
│   └── ...
├── slide-01.png
├── slide-02.png
├── ...
├── slide-01-captioned.png    # only if caption was added
├── carousel.mp4              # only if you asked for video
└── caption.txt               # post caption + hashtags
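
Because captioned files only exist for some slides, anything that post-processes a run has to pick the right file per slide. Here is a minimal sketch of that selection, assuming the two-digit `slide-NN.png` naming shown in the tree above. The function name `pick_slides` is hypothetical and not part of the repo.

```shell
#!/usr/bin/env bash
# Sketch: list the final image for each slide in a run folder,
# preferring slide-NN-captioned.png when it exists.

pick_slides() {
  local dir=$1 slide captioned
  # The glob matches slide-01.png but not slide-01-captioned.png.
  for slide in "$dir"/slide-[0-9][0-9].png; do
    [ -e "$slide" ] || continue   # glob matched nothing
    captioned="${slide%.png}-captioned.png"
    if [ -e "$captioned" ]; then
      echo "$captioned"
    else
      echo "$slide"
    fi
  done
}

# Usage:
#   pick_slides output/<slug>
```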

🗂️ Folder map

| Path | What it does |
| --- | --- |
| AGENTS.md | Agent-agnostic workflow (Codex, Cursor, Aider, Gemini CLI, etc.) |
| CLAUDE.md | Claude Code mirror of AGENTS.md (auto-loaded) |
| .claude/skills/carousel/SKILL.md | Auto-triggering Claude Code skill |
| prompts/realism-engine.md | Photoreal camera, lighting, texture rules |
| prompts/theme-builder.md | How to compose a new theme |
| prompts/slide-archetypes.md | Narrative templates (how-to, mistakes, before-after, …) |
| prompts/caption-rules.md | When to add text and what it should look like |
| presets/*.json | Reusable theme presets (candid-iphone, night-desk, coffee-shop, golden-hour) |
| scripts/gen-image.sh | Codex CLI image gen wrapper |
| scripts/gen-image-higgsfield.sh | Optional Higgsfield wrapper (Nano Banana 2, Soul) |
| scripts/caption-overlay.sh | ImageMagick hand-typed caption |
| scripts/assemble-video.sh | ffmpeg 9:16 video stitcher with Ken Burns + xfade |
| examples/ | Sample runs you can read for prompt patterns |
| output/ | Where generated carousels land (gitignored) |
| quickstart.html | Single-page giveaway guide for humans |

🎨 Customize

  • New theme → copy presets/candid-iphone.json to presets/<your-name>.json, edit, save. Reference it by name: "use the cozy-bedroom preset."
  • New archetype → add a section to prompts/slide-archetypes.md.
  • Different image model → tell your agent "use Higgsfield with Nano Banana 2" and it routes through scripts/gen-image-higgsfield.sh.
  • No-text carousel → say "no text overlays" and only the slide 1 hook gets text.
  • Different coding agent → the whole workflow lives in AGENTS.md. Any agent that respects that file picks it up automatically.
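
The "new theme" step above is a plain file copy, so it can be scripted if you make presets often. This is a sketch under assumptions: `new_preset` is a hypothetical helper, it must be run from the repo root, and it says nothing about the preset's JSON fields, which you still edit by hand.

```shell
#!/usr/bin/env bash
# Sketch: start a new theme preset from the candid-iphone one
# without overwriting an existing file. Hypothetical helper.

new_preset() {
  local src="presets/candid-iphone.json" dst="presets/$1.json"
  if [ ! -f "$src" ]; then
    echo "missing $src (run this from the repo root)" >&2
    return 1
  fi
  if [ -e "$dst" ]; then
    echo "$dst already exists" >&2
    return 1
  fi
  cp "$src" "$dst"
  echo "$dst"
}

# Usage:
#   new_preset cozy-bedroom   # then edit presets/cozy-bedroom.json
```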

🔒 Privacy

The shipped templates contain no creator names, private paths, API keys, analytics, or remote dependencies. Everything runs locally. output/ is gitignored.

🩹 Troubleshooting

| Symptom | Fix |
| --- | --- |
| codex: command not found | brew install --cask codex && codex login |
| ffmpeg: command not found | brew install ffmpeg |
| magick: command not found | brew install imagemagick |
| Codex says "rate limit" | Wait, or run codex login against a Plus/Pro account |
| Images look glossy / AI | Tell your agent "re-roll slide N with anti-AI rescue" and the realism-engine rescue prompt is applied |

📄 License

MIT. Fork it, remix it, ship your own version.

Made for creators who'd rather press one button than open Photoshop.
