A tiny push-to-talk voice recorder that sends your voice messages straight to Discord (via a relay deployed on Cloudflare). Built on the M5Stack Atom Echo — a thumb-sized ESP32 device with a built-in microphone, speaker, and LED.
Hold the button, speak, release. Your recording shows up in Discord within seconds. I use this to send voice notes to my OpenClaw personal assistant. On the go, my mobile device acts as a WiFi hotspot.
Hold button ─── [Beep] "I'm awake"
|
| LED Blue ── Connecting to WiFi ...
|
| [Beep] ──── "Start speaking"
|
| LED Yellow ─ Recording + streaming
|
Release button
|
| [Beep] ──── Success or failure tone
| LED ─────── Green (done) or red (error)
|
Deep sleep ───── Waiting for next press
The audio is streamed in real-time to a small Cloudflare Worker (the "relay"), which converts the raw audio to a WAV file and posts it to Discord via a webhook — all within seconds.
- M5Stack Atom Echo (~$13)
- ATOM TailBat - Battery Pack for mobile usage (~$9.50)
- USB-C cable for flashing
- WiFi network (2.4 GHz, WPA2); a personal hotspot on Android or iOS also works
- Discord server where you can create a webhook
- Cloudflare account (the free tier is sufficient)
- PlatformIO for building and flashing the firmware (no account required; available as VS Code extension or CLI)
- Node.js 22+ and pnpm for the relay worker
git clone https://github.com/akoenig/flux-buddy.git
cd flux-
- Open your Discord server
- Go to Server Settings > Integrations > Webhooks
- Click New Webhook
- Choose the channel where voice messages should appear
- Copy the Webhook URL — you'll need it in the next step
The relay is a small Cloudflare Worker that sits between the device and Discord. It receives raw audio from the Atom Echo, wraps it in a WAV file, and forwards it to your Discord webhook.
cd relay
pnpm install
Generate a strong API key. This shared secret authenticates requests from your device so nobody else can use your relay:
openssl rand -base64 32
Save that key somewhere; you'll need it for the firmware config too.
Now set the two required secrets:
pnpx wrangler secret put API_KEY
# Paste the API key you just generated
pnpx wrangler secret put DISCORD_WEBHOOK_URL
# Paste your Discord webhook URL from Step 2
Deploy:
pnpm run deploy
Wrangler will print the URL of your deployed worker, e.g.:
https://flux-buddy-relay.<your-subdomain>.workers.dev
Note this URL down — the firmware needs it.
cd ..
cp src/config.h.example src/config.h
Open src/config.h in your editor and fill in your credentials:
static const WiFiCredential WIFI_CREDENTIALS[] = {
{ "your-wifi-ssid", "your-wifi-password" },
};
You can add multiple WiFi networks. The device tries them in order and falls back to the next one if a connection fails:
static const WiFiCredential WIFI_CREDENTIALS[] = {
{ "home-wifi", "home-password" },
{ "office-wifi", "office-password" },
{ "phone-hotspot", "hotspot-password" },
};
Then set the relay URL and API key:
#define UPLOAD_URL "https://flux-buddy-relay.<your-subdomain>.workers.dev/upload"
#define API_KEY "same-key-you-set-in-wrangler"
src/config.h is gitignored and will never be committed, so your credentials stay on your machine.
Connect the Atom Echo via USB-C, then:
pio run -t upload
That's it. The firmware is now on the device.
- Press and hold the button on the Atom Echo
- Wait for two beeps (boot beep, then a higher-pitched ready beep)
- Speak your message
- Release the button
- Listen for the success tone (ascending chirp) and watch for a green LED
- Check your Discord channel — the WAV file should appear within seconds
For debugging or verifying that everything works, you can watch the device logs:
pio device monitor
All log lines are prefixed with [FB] and show WiFi connection, HTTPS handshake, recording stats, and upload progress.
| Color | Meaning |
|---|---|
| Blue (solid) | Connecting to WiFi |
| Red (brief flash) | WiFi network failed, trying next one |
| Yellow | Recording and streaming |
| Green | Upload succeeded |
| Red | Error (all WiFi networks failed, HTTPS failure, or upload error) |
| Blue (blinking) | Recording paused (ring buffer full, draining) |
| Tone | Meaning |
|---|---|
| Short beep (1200 Hz) | Button press registered, waking up |
| Short beep (1500 Hz) | Connected, start speaking now |
| Ascending chirp | Upload succeeded |
| Descending tone | Something went wrong |
flux/
platformio.ini Build configuration (board, libs, flags)
src/
config.h.example Template — copy to config.h
config.h Your credentials and tuning (gitignored)
main.cpp Boot flow, deep sleep
audio.h / audio.cpp Microphone, speaker, ring buffer, audio processing
network.h / network.cpp WiFi, HTTPS streaming upload
led.h / led.cpp LED control (solid, blink, flash)
relay/
src/index.ts Worker: auth, multipart parsing, orchestration
src/wav.ts PCM-to-WAV conversion (44-byte RIFF header)
src/discord.ts Discord webhook file upload
wrangler.toml Worker configuration
package.json Dependencies
The Atom Echo has only 320 KB of RAM — far too little to buffer a long recording and then upload it. Instead, the firmware streams audio to the server in real-time as it's being recorded.
A 64 KB lock-free ring buffer bridges two tasks running on the ESP32's dual cores:
- Core 0 runs the recording task: reads samples from the PDM microphone, applies a high-pass filter and gain, writes to the ring buffer
- Core 1 runs the upload task: reads from the ring buffer, sends the data over HTTPS using chunked transfer encoding
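The hand-off between the two tasks can be sketched as a single-producer, single-consumer ring buffer. This is a minimal illustration, not the code in audio.cpp: the class name `SpscRing`, the power-of-two size requirement, and the use of `std::atomic` counters are all assumptions made for the example.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical single-producer / single-consumer byte ring buffer.
// One task may call write(), one other task may call read(); no locks needed.
template <size_t N>
class SpscRing {
    static_assert((N & (N - 1)) == 0, "N must be a power of two");

public:
    // Producer side: copies up to len bytes in and returns how many fit.
    size_t write(const uint8_t* src, size_t len) {
        const size_t head = head_.load(std::memory_order_relaxed);
        const size_t tail = tail_.load(std::memory_order_acquire);
        const size_t space = N - (head - tail);
        if (len > space) len = space;
        for (size_t i = 0; i < len; ++i) buf_[(head + i) & (N - 1)] = src[i];
        head_.store(head + len, std::memory_order_release);  // publish the data
        return len;
    }

    // Consumer side: copies up to len bytes out and returns how many were read.
    size_t read(uint8_t* dst, size_t len) {
        const size_t tail = tail_.load(std::memory_order_relaxed);
        const size_t head = head_.load(std::memory_order_acquire);
        const size_t avail = head - tail;
        if (len > avail) len = avail;
        for (size_t i = 0; i < len; ++i) dst[i] = buf_[(tail + i) & (N - 1)];
        tail_.store(tail + len, std::memory_order_release);  // free the space
        return len;
    }

private:
    uint8_t buf_[N];
    std::atomic<size_t> head_{0};  // total bytes ever written
    std::atomic<size_t> tail_{0};  // total bytes ever read
};
```

When `write()` returns fewer bytes than requested, the buffer is full. That is the situation the blinking blue LED signals.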
The HTTPS connection (including the TLS handshake) is established before recording begins, so audio starts flowing to the server immediately — no data is lost during connection setup.
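Chunked transfer encoding is what lets the device start sending before it knows how long the recording will be. Each chunk is framed as its size in hexadecimal, a CRLF, the payload, and another CRLF; a zero-size chunk ends the body. The helper names `frameChunk` and `lastChunk` below are hypothetical and only illustrate the wire format defined by HTTP/1.1, not the firmware's actual upload code.

```cpp
#include <cstdio>
#include <string>

// Frame one payload as an HTTP/1.1 chunk: "<hex size>\r\n<payload>\r\n".
// Callers should skip empty payloads, because a zero-size chunk ends the body.
std::string frameChunk(const std::string& payload) {
    char size[17];
    std::snprintf(size, sizeof size, "%zx", payload.size());
    return std::string(size) + "\r\n" + payload + "\r\n";
}

// The terminating chunk, sent once after the button is released.
std::string lastChunk() {
    return "0\r\n\r\n";
}
```

For example, `frameChunk("hello")` yields `5\r\nhello\r\n`.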
Every audio chunk goes through three stages:
- High-pass filter — Removes DC offset and low-frequency rumble (~40 Hz cutoff)
- Gain — Amplifies the quiet PDM microphone output (10x / ~20 dB by default)
- Soft clamp — Prevents digital distortion by clamping to the int16 range
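The three stages above could look roughly like this. The sketch assumes a first-order (one-pole) high-pass filter and a plain hard clamp; the real filter topology and clamp curve in audio.cpp may differ, and the `AudioChain` name is made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical per-sample processing chain: high-pass, gain, clamp.
struct AudioChain {
    float alpha;           // filter coefficient derived from the cutoff
    float gain;            // linear gain, e.g. 10 for ~20 dB
    float prevIn = 0.0f;   // previous input sample
    float prevOut = 0.0f;  // previous filter output

    AudioChain(float sampleRate, float cutoffHz, float linearGain)
        : gain(linearGain) {
        const float rc = 1.0f / (2.0f * 3.14159265f * cutoffHz);
        const float dt = 1.0f / sampleRate;
        alpha = rc / (rc + dt);
    }

    // Processes n 16-bit samples in place.
    void process(int16_t* samples, size_t n) {
        for (size_t i = 0; i < n; ++i) {
            const float x = samples[i];
            // One-pole high-pass: removes DC offset and low-frequency rumble.
            const float y = alpha * (prevOut + (x - prevIn));
            prevIn = x;
            prevOut = y;
            // Amplify, then clamp so the result never wraps around in int16.
            const float amplified = y * gain;
            const float clamped = std::max(-32768.0f, std::min(32767.0f, amplified));
            samples[i] = static_cast<int16_t>(clamped);
        }
    }
};
```

With the defaults from this README, the chain would be constructed as `AudioChain chain(16000.0f, 40.0f, 10.0f);` and called once per chunk read from the microphone.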
The device tries each configured WiFi network in order. If a network does not connect within the 3-second timeout (WIFI_TIMEOUT_MS), the LED briefly flashes red and the device moves on to the next network. This is useful if you move between locations (home, office, mobile hotspot).
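The fallback logic boils down to a loop over the credential list. In this sketch the `WiFiCredential` layout is inferred from the initializers in config.h, and `connectFirstAvailable`, `tryConnect`, and `onFailure` are hypothetical names; the actual connection call and the red LED flash are passed in as callbacks so the logic stands alone.

```cpp
#include <cstddef>
#include <functional>

// Assumed layout, matching the { "ssid", "password" } initializers in config.h.
struct WiFiCredential {
    const char* ssid;
    const char* password;
};

// Tries each network in order. Returns the index of the first one that
// connects, or -1 if all of them fail.
int connectFirstAvailable(
    const WiFiCredential* creds, size_t count, unsigned timeoutMs,
    const std::function<bool(const WiFiCredential&, unsigned)>& tryConnect,
    const std::function<void()>& onFailure) {
    for (size_t i = 0; i < count; ++i) {
        if (tryConnect(creds[i], timeoutMs)) {
            return static_cast<int>(i);
        }
        onFailure();  // e.g. flash the LED red before the next attempt
    }
    return -1;  // caller would show the solid red error state
}
```

Because the list is walked in order, it makes sense to put the network you use most often first.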
Between recordings, the device enters deep sleep and draws minimal current. A button press on GPIO39 wakes it up. Bluetooth memory (~30 KB) is released at boot since it's never used.
The Cloudflare Worker receives a multipart POST from the device containing:
- Audio metadata (sample rate, bit depth, channels)
- Raw PCM audio data
It validates the API key, prepends a 44-byte WAV header to the PCM data, and uploads the resulting WAV file to Discord via the configured webhook.
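To show what the 44-byte header contains, here is a sketch of the canonical PCM WAV (RIFF) header layout. The relay itself does this in TypeScript in src/wav.ts; the C++ below is only an illustration of the same layout, and `makeWavHeader` and `putLE` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Write the low `bytes` bytes of v at offset `off`, little-endian.
static void putLE(std::array<uint8_t, 44>& h, size_t off, uint32_t v, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        h[off + i] = static_cast<uint8_t>(v >> (8 * i));
    }
}

// Builds the canonical 44-byte header for uncompressed PCM audio.
std::array<uint8_t, 44> makeWavHeader(uint32_t pcmBytes, uint32_t sampleRate,
                                      uint16_t channels, uint16_t bitsPerSample) {
    std::array<uint8_t, 44> h{};
    const uint16_t blockAlign = static_cast<uint16_t>(channels * bitsPerSample / 8);
    const uint32_t byteRate = sampleRate * blockAlign;

    std::memcpy(&h[0], "RIFF", 4);
    putLE(h, 4, 36 + pcmBytes, 4);   // file size minus the first 8 bytes
    std::memcpy(&h[8], "WAVE", 4);
    std::memcpy(&h[12], "fmt ", 4);
    putLE(h, 16, 16, 4);             // size of the fmt chunk
    putLE(h, 20, 1, 2);              // audio format 1 = PCM
    putLE(h, 22, channels, 2);
    putLE(h, 24, sampleRate, 4);
    putLE(h, 28, byteRate, 4);
    putLE(h, 32, blockAlign, 2);
    putLE(h, 34, bitsPerSample, 2);
    std::memcpy(&h[36], "data", 4);
    putLE(h, 40, pcmBytes, 4);       // size of the PCM payload that follows
    return h;
}
```

The header needs the total PCM size up front, which is why it can only be prepended once the whole stream has arrived.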
All tuning parameters live in src/config.h. The defaults work well out of the box — you typically only need to set the WiFi credentials, relay URL, and API key.
| Parameter | Default | Description |
|---|---|---|
| WIFI_TIMEOUT_MS | 3000 | Per-network connection timeout (ms) |
| SAMPLE_RATE | 16000 | Audio sample rate in Hz |
| AUDIO_GAIN | 10 | Microphone amplification (10x = ~20 dB) |
| RING_BUFFER_SIZE | 65536 | Ring buffer size in bytes (64 KB = ~2s of audio) |
| RECORD_TASK_PRIORITY | 19 | FreeRTOS priority for the recording task |
| RECORD_TASK_CORE | 0 | CPU core for recording (0 = with WiFi) |
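If you change SAMPLE_RATE or RING_BUFFER_SIZE, the "~2s of audio" figure changes with them. The small helper below shows the arithmetic. It assumes 16-bit mono samples (2 bytes per sample, 1 channel), which is consistent with the int16 range mentioned earlier but is not stated explicitly for the channel count; `bufferSeconds` is a hypothetical name.

```cpp
#include <cstdint>

// Seconds of audio that fit in a buffer of the given size.
constexpr double bufferSeconds(uint32_t bufferBytes, uint32_t sampleRate,
                               uint32_t bytesPerSample, uint32_t channels) {
    return static_cast<double>(bufferBytes) /
           (static_cast<double>(sampleRate) * bytesPerSample * channels);
}

// Defaults: 65536 bytes / (16000 Hz * 2 bytes * 1 channel) = 2.048 s
static_assert(bufferSeconds(65536, 16000, 2, 1) > 2.04 &&
              bufferSeconds(65536, 16000, 2, 1) < 2.05,
              "default ring buffer holds about two seconds");
```

That is the amount of upload stall the buffer can absorb before recording has to pause.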
Make sure the device is properly flashed and the USB cable supports data (not just charging). Try pio device monitor to see if any logs appear.
All configured WiFi networks failed. Verify your SSID and password in src/config.h. Make sure you're within range of a 2.4 GHz network (the ESP32 doesn't support 5 GHz).
The device thinks the upload worked, but the relay may have failed to forward to Discord. Check your Cloudflare Worker logs via pnpx wrangler tail in the relay/ directory. Verify that DISCORD_WEBHOOK_URL is set correctly.
Adjust AUDIO_GAIN in src/config.h. The default of 10 (20 dB) works well for speech at arm's length. Increase for quieter environments, decrease if you hear clipping.
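When picking a new value, it can help to think in decibels. For amplitude gain the conversion is 20 * log10(gain). The helper names below are hypothetical and not part of the firmware.

```cpp
#include <cmath>

// Convert a linear amplitude gain to decibels and back.
double gainToDb(double linearGain) {
    return 20.0 * std::log10(linearGain);
}

double dbToGain(double db) {
    return std::pow(10.0, db / 20.0);
}
```

A gain of 10 corresponds to 20 dB, and each additional 6 dB roughly doubles the amplitude, so an AUDIO_GAIN of 20 would be about 26 dB.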
A debounce mechanism guards against this, but if it still happens, check the serial monitor for details. The ESP32's GPIO39 can produce spurious readings during WiFi activity, so the firmware requires 3 consecutive button-release readings, taken 30 ms apart, before it accepts a release as genuine.
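The release check can be sketched as follows. The pin read and the delay are passed in as callbacks so the logic is independent of the hardware; `confirmRelease`, `isReleased`, and `delayMs` are hypothetical names, and the real firmware may structure this differently.

```cpp
#include <functional>

// Returns true only if `required` consecutive readings, taken `intervalMs`
// apart, all report the button as released. Any "pressed" reading in between
// is treated as proof that the earlier release reading was a glitch.
bool confirmRelease(const std::function<bool()>& isReleased,
                    const std::function<void(unsigned)>& delayMs,
                    int required = 3, unsigned intervalMs = 30) {
    for (int i = 0; i < required; ++i) {
        if (!isReleased()) {
            return false;  // still held: keep recording
        }
        if (i + 1 < required) {
            delayMs(intervalMs);  // wait before sampling the pin again
        }
    }
    return true;
}
```

With the defaults, a genuine release is confirmed about 60 ms after the first release reading, which is short enough not to be noticeable at the end of a recording.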
MIT
