A thumb-sized push-to-talk device for sending voice notes to your AI agent (e.g. @openclaw).

Flux Buddy

A tiny push-to-talk voice recorder that sends your voice messages straight to Discord (via a relay deployed on Cloudflare). Built on the M5Stack Atom Echo — a thumb-sized ESP32 device with a built-in microphone, speaker, and LED.

M5Stack Atom Echo with TailBat battery pack

Hold the button, speak, release. Your recording shows up in Discord within seconds. I use this to send voice notes to my OpenClaw personal assistant. On the go, my mobile device acts as a WiFi hotspot.

How It Works

Hold button ─── [Beep] "I'm awake"
       |
       |  LED Blue ── Connecting to WiFi ...
       |
       |  [Beep] ──── "Start speaking"
       |
       |  LED Yellow ─ Recording + streaming
       |
Release button
       |
       |  [Beep] ──── Success or failure tone
       |  LED ─────── Green (done) or red (error)
       |
     Deep sleep ───── Waiting for next press

The audio is streamed in real-time to a small Cloudflare Worker (the "relay"), which converts the raw audio to a WAV file and posts it to Discord via a webhook — all within seconds.

What You Need

Hardware

Accounts & Tools

  • WiFi network (2.4 GHz, WPA2); a personal hotspot on Android or iOS also works
  • Discord server where you can create a webhook
  • Cloudflare account (the free tier is sufficient)
  • PlatformIO for building and flashing the firmware (no account required; available as VS Code extension or CLI)
  • Node.js 22+ and pnpm for the relay worker

Setup Guide

Step 1: Clone the Repository

git clone https://github.com/akoenig/flux-buddy.git
cd flux-buddy

Step 2: Create a Discord Webhook

  1. Open your Discord server
  2. Go to Server Settings > Integrations > Webhooks
  3. Click New Webhook
  4. Choose the channel where voice messages should appear
  5. Copy the Webhook URL — you'll need it in the next step

Step 3: Deploy the Relay Worker

The relay is a small Cloudflare Worker that sits between the device and Discord. It receives raw audio from the Atom Echo, wraps it in a WAV file, and forwards it to your Discord webhook.

cd relay
pnpm install

Generate a strong API key. This shared secret authenticates requests from your device so nobody else can use your relay:

openssl rand -base64 32

Save that key somewhere — you'll need it for the firmware config too.

Now set the two required secrets:

pnpx wrangler secret put API_KEY
# Paste the API key you just generated

pnpx wrangler secret put DISCORD_WEBHOOK_URL
# Paste your Discord webhook URL from Step 2

Deploy:

pnpm run deploy

Wrangler will print the URL of your deployed worker, e.g.:

https://flux-buddy-relay.<your-subdomain>.workers.dev

Note this URL down — the firmware needs it.

cd ..

Step 4: Configure the Firmware

cp src/config.h.example src/config.h

Open src/config.h in your editor and fill in your credentials:

static const WiFiCredential WIFI_CREDENTIALS[] = {
  { "your-wifi-ssid", "your-wifi-password" },
};

You can add multiple WiFi networks. The device tries them in order and falls back to the next one if a connection fails:

static const WiFiCredential WIFI_CREDENTIALS[] = {
  { "home-wifi",   "home-password" },
  { "office-wifi", "office-password" },
  { "phone-hotspot", "hotspot-password" },
};

Then set the relay URL and API key:

#define UPLOAD_URL "https://flux-buddy-relay.<your-subdomain>.workers.dev/upload"
#define API_KEY    "same-key-you-set-in-wrangler"

src/config.h is gitignored and will never be committed. Your credentials stay on your machine.

Step 5: Build and Flash

Connect the Atom Echo via USB-C, then:

pio run -t upload

That's it. The firmware is now on the device.

Step 6: Test

  1. Press and hold the button on the Atom Echo
  2. Wait for two beeps (boot beep, then a higher-pitched ready beep)
  3. Speak your message
  4. Release the button
  5. Listen for the success tone (ascending chirp) and watch for a green LED
  6. Check your Discord channel — the WAV file should appear within seconds

Optional: Serial Monitor

For debugging or verifying that everything works, you can watch the device logs:

pio device monitor

All log lines are prefixed with [FB] and show WiFi connection, HTTPS handshake, recording stats, and upload progress.

LED Reference

| Color | Meaning |
| --- | --- |
| Blue (solid) | Connecting to WiFi |
| Red (brief flash) | WiFi network failed, trying next one |
| Yellow | Recording and streaming |
| Green | Upload succeeded |
| Red | Error (all WiFi networks failed, HTTPS failure, or upload error) |
| Blue (blinking) | Recording paused (ring buffer full, draining) |

Audio Feedback

| Tone | Meaning |
| --- | --- |
| Short beep (1200 Hz) | Button press registered, waking up |
| Short beep (1500 Hz) | Connected, start speaking now |
| Ascending chirp | Upload succeeded |
| Descending tone | Something went wrong |

Project Structure

flux-buddy/
  platformio.ini            Build configuration (board, libs, flags)
  src/
    config.h.example        Template — copy to config.h
    config.h                Your credentials and tuning (gitignored)
    main.cpp                Boot flow, deep sleep
    audio.h / audio.cpp     Microphone, speaker, ring buffer, audio processing
    network.h / network.cpp WiFi, HTTPS streaming upload
    led.h / led.cpp         LED control (solid, blink, flash)
  relay/
    src/index.ts            Worker: auth, multipart parsing, orchestration
    src/wav.ts              PCM-to-WAV conversion (44-byte RIFF header)
    src/discord.ts          Discord webhook file upload
    wrangler.toml           Worker configuration
    package.json            Dependencies

How It Works (In Depth)

Streaming Architecture

The Atom Echo has only 320 KB of RAM — far too little to buffer a long recording and then upload it. Instead, the firmware streams audio to the server in real-time as it's being recorded.

A 64 KB lock-free ring buffer bridges two tasks running on the ESP32's dual cores:

  • Core 0 runs the recording task: reads samples from the PDM microphone, applies a high-pass filter and gain, writes to the ring buffer
  • Core 1 runs the upload task: reads from the ring buffer, sends the data over HTTPS using chunked transfer encoding
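The hand-off between the two tasks can be sketched as a single-producer, single-consumer ring buffer built on two atomic counters. This is a minimal illustration, not the firmware's actual code: the class name `SpscRingBuffer` and its methods are hypothetical, and the real implementation in audio.cpp may differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical single-producer / single-consumer byte ring buffer.
// Only one task may call write() and only one task may call read().
template <size_t Capacity>
class SpscRingBuffer {
  static_assert((Capacity & (Capacity - 1)) == 0,
                "Capacity must be a power of two");

 public:
  // Producer side (recording task). Returns how many bytes were accepted;
  // fewer than len means the buffer is full and the caller must wait.
  size_t write(const uint8_t* data, size_t len) {
    assert(data != nullptr);
    const size_t head = head_.load(std::memory_order_relaxed);
    const size_t tail = tail_.load(std::memory_order_acquire);
    const size_t freeSpace = Capacity - (head - tail);
    const size_t n = len < freeSpace ? len : freeSpace;
    for (size_t i = 0; i < n; ++i) {
      buf_[(head + i) & (Capacity - 1)] = data[i];
    }
    // Publish the new bytes only after they have been written.
    head_.store(head + n, std::memory_order_release);
    return n;
  }

  // Consumer side (upload task). Returns how many bytes were copied out.
  size_t read(uint8_t* out, size_t maxLen) {
    assert(out != nullptr);
    const size_t tail = tail_.load(std::memory_order_relaxed);
    const size_t head = head_.load(std::memory_order_acquire);
    const size_t available = head - tail;
    const size_t n = maxLen < available ? maxLen : available;
    for (size_t i = 0; i < n; ++i) {
      out[i] = buf_[(tail + i) & (Capacity - 1)];
    }
    // Free the slots only after they have been copied out.
    tail_.store(tail + n, std::memory_order_release);
    return n;
  }

  // Number of bytes currently buffered.
  size_t size() const {
    return head_.load(std::memory_order_acquire) -
           tail_.load(std::memory_order_acquire);
  }

 private:
  uint8_t buf_[Capacity];
  std::atomic<size_t> head_{0};  // total bytes ever written
  std::atomic<size_t> tail_{0};  // total bytes ever read
};
```

With `SpscRingBuffer<65536>`, a short `write()` tells the recording task that the buffer is full, which corresponds to the "recording paused" state in the LED reference.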

The HTTPS connection (including the TLS handshake) is established before recording begins, so audio starts flowing to the server immediately — no data is lost during connection setup.
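Chunked transfer encoding lets the upload task send audio without knowing the total length in advance. The sketch below shows only the HTTP/1.1 framing of a single chunk; the helper names `frameChunk` and `finalChunk` are hypothetical, and the firmware presumably writes to its TLS client directly rather than building strings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Frame one payload as an HTTP/1.1 chunk: "<size in hex>\r\n<data>\r\n".
// len must be non-zero, because a zero-length chunk ends the request body.
std::string frameChunk(const uint8_t* data, size_t len) {
  assert(data != nullptr && len > 0);
  char header[20];  // up to 16 hex digits + "\r\n" + NUL
  std::snprintf(header, sizeof(header), "%zx\r\n", len);
  std::string out(header);
  out.append(reinterpret_cast<const char*>(data), len);
  out.append("\r\n");
  return out;
}

// The terminating chunk, sent once after the button is released.
std::string finalChunk() { return "0\r\n\r\n"; }
```

Each block drained from the ring buffer would be framed this way and written to the open connection, followed by `finalChunk()` when recording stops.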

Audio Processing

Every audio chunk goes through three stages:

  1. High-pass filter: removes DC offset and low-frequency rumble (~40 Hz cutoff)
  2. Gain: amplifies the quiet PDM microphone output (10x / ~20 dB by default)
  3. Soft clamp: limits each amplified sample to the int16 range, so loud peaks saturate instead of overflowing and wrapping around
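The three stages can be sketched as one pass over a buffer of 16-bit samples. This is an illustration under assumptions: the class name `AudioProcessor` is hypothetical, and a first-order (one-pole) high-pass filter is used here because it is the simplest filter with a ~40 Hz cutoff; the firmware's actual filter design is not stated.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical processor: high-pass filter, gain, then clamp to int16.
class AudioProcessor {
 public:
  AudioProcessor(float sampleRateHz, float cutoffHz, float gain) : gain_(gain) {
    assert(sampleRateHz > 0.0f && cutoffHz > 0.0f);
    // Standard RC high-pass coefficient: alpha = RC / (RC + dt).
    const float rc = 1.0f / (2.0f * 3.14159265f * cutoffHz);
    const float dt = 1.0f / sampleRateHz;
    alpha_ = rc / (rc + dt);
  }

  // Processes samples in place; filter state carries over between calls.
  void process(int16_t* samples, size_t count) {
    for (size_t i = 0; i < count; ++i) {
      const float in = static_cast<float>(samples[i]);
      // Stage 1: y[n] = alpha * (y[n-1] + x[n] - x[n-1])
      const float filtered = alpha_ * (prevOut_ + in - prevIn_);
      prevIn_ = in;
      prevOut_ = filtered;
      // Stage 2: apply gain.
      float amplified = filtered * gain_;
      // Stage 3: saturate at the int16 limits instead of wrapping.
      if (amplified > 32767.0f) amplified = 32767.0f;
      if (amplified < -32768.0f) amplified = -32768.0f;
      samples[i] = static_cast<int16_t>(std::lround(amplified));
    }
  }

 private:
  float alpha_ = 0.0f;
  float gain_;
  float prevIn_ = 0.0f;
  float prevOut_ = 0.0f;
};
```

With the documented defaults this would be constructed as `AudioProcessor(16000.0f, 40.0f, 10.0f)`. A constant (DC) input decays to zero at the output, and peaks beyond the int16 range are held at the limit.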

WiFi Fallback

The device tries each configured WiFi network in order. If a network fails to connect within the 3-second timeout (WIFI_TIMEOUT_MS), the LED briefly flashes red and the device tries the next network. This is useful if you move between locations (home, office, mobile hotspot).
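The fallback logic amounts to a loop over the credential list. In the sketch below the connection attempt and the failure signal are passed in as callbacks so the logic can run off-device. The function `connectWithFallback`, the callback types, and the field names `ssid` and `password` are assumptions for illustration; on the device the first callback would wrap the real WiFi connect call and the second would flash the LED red.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Assumed field names; config.h only shows two string fields per entry.
struct WiFiCredential {
  const char* ssid;
  const char* password;
};

// Hypothetical hooks: attempt one network, and signal one failed attempt.
using ConnectFn = std::function<bool(const WiFiCredential&, uint32_t)>;
using FailureFn = std::function<void()>;

// Tries each network in order. Returns the index of the first network that
// connects within timeoutMs, or -1 if every network fails.
int connectWithFallback(const WiFiCredential* creds, size_t count,
                        uint32_t timeoutMs, const ConnectFn& tryConnect,
                        const FailureFn& onFailure) {
  assert(tryConnect);
  for (size_t i = 0; i < count; ++i) {
    if (tryConnect(creds[i], timeoutMs)) {
      return static_cast<int>(i);
    }
    if (onFailure) {
      onFailure();  // e.g. brief red LED flash before the next attempt
    }
  }
  return -1;
}
```

A return value of -1 corresponds to the solid red error state in the LED reference.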

Deep Sleep

Between recordings, the device enters deep sleep and draws minimal current. A button press on GPIO39 wakes it up. Bluetooth memory (~30 KB) is released at boot since it's never used.

The Relay Worker

The Cloudflare Worker receives a multipart POST from the device containing:

  • Audio metadata (sample rate, bit depth, channels)
  • Raw PCM audio data

It validates the API key, prepends a 44-byte WAV header to the PCM data, and uploads the resulting WAV file to Discord via the configured webhook.
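The 44-byte header is the canonical RIFF/WAVE layout for uncompressed PCM. The relay itself is written in TypeScript (relay/src/wav.ts); the sketch below uses C++ only to stay consistent with the firmware examples and shows the byte layout, not the relay's code. The function name `makeWavHeader` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Little-endian writers and a 4-character tag writer.
static void putLE16(uint8_t* p, uint16_t v) {
  p[0] = static_cast<uint8_t>(v & 0xFF);
  p[1] = static_cast<uint8_t>(v >> 8);
}
static void putLE32(uint8_t* p, uint32_t v) {
  for (int i = 0; i < 4; ++i) p[i] = static_cast<uint8_t>((v >> (8 * i)) & 0xFF);
}
static void putTag(uint8_t* p, const char* tag) {
  for (int i = 0; i < 4; ++i) p[i] = static_cast<uint8_t>(tag[i]);
}

// Builds the 44-byte header that precedes dataBytes of raw PCM audio.
std::array<uint8_t, 44> makeWavHeader(uint32_t sampleRate, uint16_t bitsPerSample,
                                      uint16_t channels, uint32_t dataBytes) {
  assert(bitsPerSample % 8 == 0 && channels > 0);
  const uint16_t blockAlign = static_cast<uint16_t>(channels * (bitsPerSample / 8));
  const uint32_t byteRate = sampleRate * blockAlign;

  std::array<uint8_t, 44> h{};
  putTag(&h[0], "RIFF");
  putLE32(&h[4], 36 + dataBytes);  // file size minus the first 8 bytes
  putTag(&h[8], "WAVE");
  putTag(&h[12], "fmt ");
  putLE32(&h[16], 16);             // size of the fmt chunk for PCM
  putLE16(&h[20], 1);              // audio format 1 = uncompressed PCM
  putLE16(&h[22], channels);
  putLE32(&h[24], sampleRate);
  putLE32(&h[28], byteRate);
  putLE16(&h[32], blockAlign);
  putLE16(&h[34], bitsPerSample);
  putTag(&h[36], "data");
  putLE32(&h[40], dataBytes);      // size of the PCM payload that follows
  return h;
}
```

For one second of 16 kHz, 16-bit mono audio, `makeWavHeader(16000, 16, 1, 32000)` yields a header whose data size field is 32000 and whose RIFF size field is 32036.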

Configuration Reference

All tuning parameters live in src/config.h. The defaults work well out of the box — you typically only need to set the WiFi credentials, relay URL, and API key.

| Parameter | Default | Description |
| --- | --- | --- |
| WIFI_TIMEOUT_MS | 3000 | Per-network connection timeout (ms) |
| SAMPLE_RATE | 16000 | Audio sample rate in Hz |
| AUDIO_GAIN | 10 | Microphone amplification (10x = ~20 dB) |
| RING_BUFFER_SIZE | 65536 | Ring buffer size in bytes (64 KB = ~2s of audio) |
| RECORD_TASK_PRIORITY | 19 | FreeRTOS priority for the recording task |
| RECORD_TASK_CORE | 0 | CPU core for recording (0 = with WiFi) |
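The "~2s of audio" figure for the ring buffer follows from the sample rate and sample width. The check below assumes 16-bit mono samples (2 bytes per sample), which matches the int16 clamp described earlier but is not stated explicitly for the channel count. The helper `bufferMillis` is hypothetical.

```cpp
#include <cstdint>

// Milliseconds of audio that fit in a buffer of bufferBytes.
constexpr uint32_t bufferMillis(uint32_t bufferBytes, uint32_t sampleRateHz,
                                uint32_t bytesPerSample) {
  return static_cast<uint32_t>(static_cast<uint64_t>(bufferBytes) * 1000 /
                               (sampleRateHz * bytesPerSample));
}

// 65536 bytes / (16000 samples/s * 2 bytes) = 2.048 s
static_assert(bufferMillis(65536, 16000, 2) == 2048,
              "64 KB holds about 2 s of 16 kHz 16-bit mono audio");
```

If you change SAMPLE_RATE or RING_BUFFER_SIZE, the same arithmetic tells you how long an upload stall the buffer can absorb before recording pauses.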

Troubleshooting

No beep when pressing the button

Make sure the device is properly flashed and the USB cable supports data (not just charging). Try pio device monitor to see if any logs appear.

LED turns red immediately

All configured WiFi networks failed. Verify your SSID and password in src/config.h. Make sure you're within range of a 2.4 GHz network (the ESP32 doesn't support 5 GHz).

Success tone but nothing in Discord

The device thinks the upload worked, but the relay may have failed to forward to Discord. Check your Cloudflare Worker logs via pnpx wrangler tail in the relay/ directory. Verify that DISCORD_WEBHOOK_URL is set correctly.

Audio is too quiet or too loud

Adjust AUDIO_GAIN in src/config.h. The default of 10 (20 dB) works well for speech at arm's length. Increase for quieter environments, decrease if you hear clipping.

Recording cuts off unexpectedly

A debounce mechanism guards against this, so if it still happens, check the serial monitor for details. The ESP32's GPIO39 can produce spurious readings during WiFi activity, so the firmware requires 3 consecutive button-release readings 30 ms apart before it treats a release as genuine.
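The debounce rule can be sketched as follows. The function `confirmRelease` and its callback parameters are hypothetical; on the device the first callback would read GPIO39 and the second would call a delay function, while here both are injected so the logic can be exercised on a host machine.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Returns true only if requiredReadings consecutive reads, spaced
// intervalMs apart, all report the button as released. Any read that
// still shows the button held aborts the check immediately.
bool confirmRelease(const std::function<bool()>& isReleased,
                    const std::function<void(uint32_t)>& delayMs,
                    int requiredReadings = 3, uint32_t intervalMs = 30) {
  assert(isReleased && delayMs && requiredReadings > 0);
  for (int i = 0; i < requiredReadings; ++i) {
    if (!isReleased()) {
      return false;  // spurious reading: keep recording
    }
    if (i + 1 < requiredReadings) {
      delayMs(intervalMs);  // wait before sampling the pin again
    }
  }
  return true;
}
```

With the defaults, a genuine release is confirmed after three reads and 60 ms of waiting, while a single glitch is rejected on the following read.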

License

MIT
