A Next.js web app for live presentation coaching with three memory layers:
- Long-term memory: project corpus (`memory/long` Markdown files).
- Short-term memory: demo deck/project context (`memory/short` Markdown files) + active slide context.
- Live memory: real-time transcript window for coaching cues.
- Rehearsal workspace with a glassmorphic assistant bubble.
- Expandable assistant panel with fast navigation between `Chat`, `Memory`, and `Deck`.
- Dual deck ingestion:
  - Google Slides URL import (Google Slides API when configured, fallback-safe).
  - PDF upload import (text extraction per page -> deck slide model).
- Live Speechmatics realtime STT wired from browser microphone.
- Real-time cue generation API with Backboard LLM (with heuristic fallback).
- ElevenLabs TTS (preferred) with browser speech fallback for spoken Plant feedback.
- Memory reference retrieval API (local memory-first, Backboard optional, fallback-safe).
- Integration status endpoint for Speechmatics, Backboard, and Google.
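The Google Slides URL import listed above has to turn a pasted link into a presentation ID before it can call the Slides API. A minimal sketch of that step follows; `extractPresentationId` is a hypothetical name, and the project's actual parser is not shown in this README. It assumes the common `https://docs.google.com/presentation/d/<id>/...` link shape.

```typescript
// Hypothetical helper (not taken from the project): pulls the presentation ID
// out of a Google Slides link such as
//   https://docs.google.com/presentation/d/<id>/edit#slide=id.p
function extractPresentationId(input: string): string | null {
  const match = input
    .trim()
    .match(/^https:\/\/docs\.google\.com\/presentation\/d\/([A-Za-z0-9_-]+)/);
  const id = match?.[1];
  // "Publish to the web" links use /d/e/<published-id>/pub; the "e" segment is
  // not a presentation ID, so this sketch rejects those links.
  if (!id || id === "e") return null;
  return id;
}
```

Returning `null` instead of throwing keeps the caller free to fall back to another import path (for example PDF upload) when the link is not recognized.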
- Next.js 16 (App Router)
- TypeScript
- Tailwind CSS v4
- Framer Motion
- Zod
- Install dependencies: `npm install`
- Install Poppler tools (needed for PDF import): `brew install poppler`
- Configure environment variables: `cp .env.example .env.local`
- Start the app: `npm run dev`

Environment variables:
- `BACKBOARD_API_KEY`: your Backboard API key (already provided in your local `.env.local`).
- `BACKBOARD_API_URL`: Backboard base URL used for memory retrieval.
- `BACKBOARD_ASSISTANT_ID`: optional existing assistant ID to reuse (if unset, one is created automatically).
- `BACKBOARD_LLM_PROVIDER`: provider used by Backboard for cue generation (default `openai`).
- `BACKBOARD_LLM_MODEL`: model used by Backboard for cue generation (default `gpt-5.2`).
- `BACKBOARD_MEMORY_MODE`: Backboard memory mode for cue runs (default `Readonly`).
- `PROJECT_MEMORY_ROOT`: local project memory root folder (default `memory`).
- `PROJECT_MEMORY_ACTIVE_PROJECT`: optional active project id override for local memory selection.
- `SPEECHMATICS_API_KEY`: Speechmatics API key for live transcription connectivity.
- `SPEECH_API_KEY`: backward-compatible alias for the Speechmatics key.
- `SPEECHMATICS_REALTIME_URL`: Speechmatics realtime websocket URL.
- `SPEECHMATICS_REGION`: Speechmatics region for realtime JWT minting (`eu`, `usa`, `au`).
- `SPEECHMATICS_TTS_URL`: Speechmatics TTS base URL (default preview endpoint).
- `SPEECHMATICS_TTS_VOICE`: Speechmatics TTS voice id (default `sarah`).
- `ELEVENLABS_API_KEY`: optional preferred TTS provider key.
- `ELEVENLABS_VOICE_ID`: ElevenLabs voice id (default `pNInz6obpgDQGcFmaJgB`).
- `ELEVENLABS_MODEL`: ElevenLabs model id (default `eleven_turbo_v2_5`).
- `ELEVENLABS_WARM_SPEED`: ElevenLabs native synthesis speed (default `0.9`, lower is slower).
- `GOOGLE_API_KEY`: Google API key for importing public Slides.
- `GOOGLE_OAUTH_ACCESS_TOKEN`: OAuth token for importing private Slides.
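To illustrate how the documented defaults and the `SPEECH_API_KEY` alias could be applied in one place, here is a small sketch. `CoachConfig` and `loadConfig` are hypothetical names and only a subset of the variables is covered; the project's real config code may look different (for example, it may validate with Zod).

```typescript
// Sketch only: CoachConfig and loadConfig are illustrative names, not project code.
type EnvSource = Record<string, string | undefined>;

interface CoachConfig {
  backboardApiKey?: string;
  backboardLlmProvider: string;
  backboardLlmModel: string;
  backboardMemoryMode: string;
  projectMemoryRoot: string;
  speechmaticsApiKey?: string;
  elevenLabsVoiceId: string;
  elevenLabsModel: string;
  elevenLabsWarmSpeed: number;
}

function loadConfig(env: EnvSource): CoachConfig {
  // Treat unset and blank values the same way.
  const read = (key: string): string | undefined => {
    const value = env[key]?.trim();
    return value ? value : undefined;
  };
  const speed = Number(read("ELEVENLABS_WARM_SPEED") ?? "0.9");

  return {
    backboardApiKey: read("BACKBOARD_API_KEY"),
    backboardLlmProvider: read("BACKBOARD_LLM_PROVIDER") ?? "openai",
    backboardLlmModel: read("BACKBOARD_LLM_MODEL") ?? "gpt-5.2",
    backboardMemoryMode: read("BACKBOARD_MEMORY_MODE") ?? "Readonly",
    projectMemoryRoot: read("PROJECT_MEMORY_ROOT") ?? "memory",
    // SPEECH_API_KEY is the backward-compatible alias for the Speechmatics key.
    speechmaticsApiKey: read("SPEECHMATICS_API_KEY") ?? read("SPEECH_API_KEY"),
    elevenLabsVoiceId: read("ELEVENLABS_VOICE_ID") ?? "pNInz6obpgDQGcFmaJgB",
    elevenLabsModel: read("ELEVENLABS_MODEL") ?? "eleven_turbo_v2_5",
    // Fall back to the documented default if the value is not a number.
    elevenLabsWarmSpeed: Number.isFinite(speed) ? speed : 0.9,
  };
}
```

On the server this would typically be called once as `loadConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.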
- `POST /api/session/start`
  - Starts a rehearsal session, returns session ID and a Backboard thread ID when available.
- `POST /api/coach/cue`
  - Generates one coaching cue from transcript + active slide + memory depth.
- `GET /api/speechmatics/token`
  - Mints a short-lived Speechmatics realtime token for the browser client.
- `POST /api/speechmatics/tts`
  - Synthesizes feedback text with ElevenLabs (client-side browser fallback can be enabled if playback fails).
- `POST /api/memory/references`
  - Retrieves memory references from the active project's local memory (`projects/<project>/short` in shallow mode, plus `projects/<project>/long` in deep mode), then Backboard, then fallback.
- `POST /api/deck/google/import`
  - Parses Google Slides URL and imports deck context via Google Slides API.
- `POST /api/deck/pdf/import`
  - Accepts `multipart/form-data` with `file` field and imports PDF pages as deck slides.
- `GET /api/integrations/status`
  - Returns connection status for Speechmatics, Backboard, and Google.
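The shallow/deep behaviour of `POST /api/memory/references` comes down to which local folders are searched. The sketch below shows one way to express that rule; `memoryDirsFor` is a hypothetical helper, the `memory` default mirrors `PROJECT_MEMORY_ROOT`, and the legacy `memory/short` / `memory/long` layout is not handled here.

```typescript
type MemoryDepth = "shallow" | "deep";

// Hypothetical helper (not project code): lists the local folders searched
// for a given depth, short memory first.
function memoryDirsFor(
  project: string,
  depth: MemoryDepth,
  root: string = "memory",
): string[] {
  const base = `${root}/projects/${project}`;
  const dirs = [`${base}/short`];
  if (depth === "deep") {
    // Deep mode adds long memory on top of short memory.
    dirs.push(`${base}/long`);
  }
  return dirs;
}
```

Keeping short memory first in the returned list means a caller that stops at the first hit naturally prefers deck-level context over the wider project corpus.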
- Google Slides import needs `GOOGLE_API_KEY` (public deck) or `GOOGLE_OAUTH_ACCESS_TOKEN` (private deck).
- PDF import extracts text content; image-only scanned PDFs may produce weak slide text.
- Local memory retrieval is markdown-driven from the active project under `memory/projects/<project>/short/*.md` and `memory/projects/<project>/long/*.md` (legacy `memory/short` and `memory/long` are still supported).
- `memoryDepth: "shallow"` uses short memory only (demo mode); `memoryDepth: "deep"` expands retrieval to short + long memory.
- Rehearsal uses the active project's short memory and prefers colocated pairs like `short/<deck>.pdf` + `short/<deck>.pdf.md`.
- Plant supports three live modes: `off`, `listens`, `talks` (plus `Plant think` as a deep-memory action).
- App boots in wake-listening mode (mic stream active) and supports voice commands (`Plant listen`, `Plant talk`, `Plant think`, `Plant remember`, `Plant off`) even while Plant is `off`.
- `Plant think` runs deep retrieval (short + long memory) with a visible loading step and a 15-second cap before fast fallback.
- `Plant remember` updates `short/project-advices-memory.md` for the active project (add/remove recurring advice that survives reloads).
- The `/memory` page lets you edit markdown files directly, including `project-advices-memory.md`.
- Delivery signals include fillers, disfluency tags, and silence events (end-of-utterance).
- Cue generation uses Backboard LLM when configured; deterministic heuristics remain as fallback.
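As an illustration of the voice commands described in these notes, the sketch below spots a `Plant <command>` phrase inside a transcript fragment. `parsePlantCommand` is a hypothetical name and the matching rule (the wake word `plant` immediately followed by a known command word) is an assumption; the app's real detection logic is not documented here.

```typescript
type PlantCommand = "listen" | "talk" | "think" | "remember" | "off";

const PLANT_COMMANDS: readonly PlantCommand[] = [
  "listen",
  "talk",
  "think",
  "remember",
  "off",
];

// Hypothetical helper: returns the first "plant <command>" pair found in the
// transcript, or null when no command is present.
function parsePlantCommand(transcript: string): PlantCommand | null {
  // Lowercase and replace punctuation with spaces so "Plant think." still matches.
  const words = transcript
    .toLowerCase()
    .replace(/[^a-z\s]/g, " ")
    .split(/\s+/)
    .filter(Boolean);

  for (let i = 0; i < words.length - 1; i++) {
    if (words[i] !== "plant") continue;
    const next = words[i + 1];
    const hit = PLANT_COMMANDS.find((command) => command === next);
    if (hit) return hit;
  }
  return null;
}
```

Requiring the command word directly after the wake word keeps ordinary speech such as "plant the seeds" from triggering a mode change, at the cost of missing looser phrasings.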
- Complete Google OAuth refresh flow and token management for private Slides.
- Replace the temporary ScriptProcessor capture path with AudioWorklet-based PCM capture.
- Tune Backboard model/provider defaults and prompt strategy for latency vs coaching depth.
- Add persistent storage for sessions, transcripts, and reports.
- Product requirements: docs/PRD.md