clawbar
GitHub: stevenvo780/clawbar
What it is
clawbar is a voice + status integration for Wayland desktops running Hyprland + waybar. It wraps an OpenClaw AI agent running in Docker into a hands-free desktop assistant:
- Push-to-talk or fully hands-free (VAD): press `SUPER+SHIFT+V` or left-click the waybar module; clawbar records your mic, auto-stops on silence (~1.5 s of trailing silence), transcribes with Whisper, optionally grabs a screenshot, sends the query to the agent, and speaks the reply back with Kokoro TTS.
- Live waybar module: shows the current assistant phase with icon, colour, and animation, and updates instantly on every phase change via `SIGRTMIN+8`.
- One-click actions: left-click to talk, right-click to open the dashboard, middle-click to drop into the agent’s TUI terminal.
- Intentionally thin: never modifies the agent container. Talks to the agent with `docker exec` and to the audio sidecar over plain HTTP. The agent stays fully isolated.
Used daily on CachyOS (Arch-based) with Hyprland.
How the VAD works
`bin/clawbar-vad.py` reads raw 16 kHz mono PCM from `ffmpeg -f pulse` and computes RMS energy in dBFS per 30 ms frame:
- Waits for speech to begin and never cuts on initial silence (no false trigger on breath/ambient noise).
- Once `VAD_MIN_SPEECH_MS` (default 300 ms) of speech has accumulated, arms the silence watchdog.
- After `VAD_SILENCE_MS` (default 1500 ms) of continuous silence, exits `0`; `claw-talk` then ends the turn and sends it.
- Exits `2` if speech never starts within `VAD_START_GRACE_MS`; exits `1` on EOF.
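The steps above can be sketched as a small state machine. This is a minimal illustration and not the code in `bin/clawbar-vad.py`: it assumes signed 16-bit little-endian samples, treats any frame at or above `VAD_THRESH_DB` as speech, and counts "speech never starts" as zero speech frames within the grace period. The names `frame_dbfs`, `run_vad`, and `tone_frame` are hypothetical.

```python
import math
import struct

FRAME_MS = 30
SAMPLE_RATE = 16_000
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000  # 480 samples per 30 ms


def frame_dbfs(frame: bytes) -> float:
    """RMS level of one s16le mono frame, in dB relative to full scale."""
    n = len(frame) // 2
    if n == 0:
        return -math.inf
    samples = struct.unpack(f"<{n}h", frame[: n * 2])
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return 20 * math.log10(rms / 32768) if rms > 0 else -math.inf


def run_vad(frames, thresh_db=-38.0, min_speech_ms=300,
            silence_ms=1500, start_grace_ms=5000) -> int:
    """Return 0 at end of turn, 2 if speech never starts, 1 on EOF."""
    speech_ms = 0    # total speech seen so far
    silence_run = 0  # continuous silence, counted only once armed
    elapsed = 0
    for frame in frames:
        elapsed += FRAME_MS
        if frame_dbfs(frame) >= thresh_db:
            speech_ms += FRAME_MS
            silence_run = 0
        elif speech_ms >= min_speech_ms:  # watchdog is armed
            silence_run += FRAME_MS
            if silence_run >= silence_ms:
                return 0
        if speech_ms == 0 and elapsed >= start_grace_ms:
            return 2
    return 1  # input ended before the turn was closed


def tone_frame(amplitude: int) -> bytes:
    """Synthetic square-wave frame whose RMS equals `amplitude`."""
    samples = [amplitude if i % 2 == 0 else -amplitude
               for i in range(SAMPLES_PER_FRAME)]
    return struct.pack(f"<{SAMPLES_PER_FRAME}h", *samples)
```

Feeding it 600 ms of `tone_frame(8000)` followed by 1500 ms of `tone_frame(0)` ends the turn with code `0`, while silence alone runs into the grace timeout and yields `2`.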
A MAX_REC_SECS hard cap and the manual toggle remain as fallbacks. Run the self-test without a mic:

    python3 bin/clawbar-vad.py --selftest        # synthetic speech + silence sequence
    python3 bin/clawbar-vad.py --wav some.wav    # test against a 16 kHz mono PCM WAV

Phase states
| State | Icon | waybar class | CSS animation |
|---|---|---|---|
| idle | 🐾 | idle | none |
| listening | 🎙 | listening | pulse |
| transcribing | 📝 | transcribing | none |
| looking | 👁 | looking | none |
| thinking | 🤔 | thinking | blink |
| speaking | 🗣 | speaking | pulse |
| error | ⚠ | error | blink |
Style each phase via `#custom-claw.<class>` in your waybar CSS. Default styles ship in `waybar/style.css` using the Catppuccin Mocha palette.
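To show how a phase could reach the bar, here is a rough sketch of the JSON a waybar custom module accepts (`text`, `class`, `tooltip`) built from the phase table above, plus the real-time signal number that triggers a refresh. The function names, the tooltip wording, and the fallback to `error` for unknown phases are assumptions, not taken from `clawbar-status`.

```python
import json
import signal

# Icon per phase, copied from the phase table; the class equals the state name.
PHASE_ICONS = {
    "idle": "🐾",
    "listening": "🎙",
    "transcribing": "📝",
    "looking": "👁",
    "thinking": "🤔",
    "speaking": "🗣",
    "error": "⚠",
}


def waybar_payload(phase: str) -> str:
    """One line of JSON for a waybar custom module with return-type json."""
    if phase not in PHASE_ICONS:
        phase = "error"  # assumption: unknown phases are shown as errors
    return json.dumps(
        {"text": PHASE_ICONS[phase], "class": phase, "tooltip": f"claw: {phase}"},
        ensure_ascii=False,
    )


def refresh_signal(offset: int = 8) -> int:
    """Signal number for SIGRTMIN+offset (Linux only)."""
    return int(signal.SIGRTMIN) + offset
```

With `"signal": 8` in the module definition, sending the number returned by `refresh_signal(8)` to the waybar process makes it re-run the module and pick up the new class.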
Requirements
- Linux with PipeWire (`pw-record` + `pw-play`/`paplay`)
- waybar + Hyprland (the keybind and bar module; `claw-talk` itself works on any session)
- ffmpeg, python3, jq, curl, notify-send
- Docker, with the OpenClaw agent container and the STT/TTS audio sidecar running
- Optional: `spectacle` or `grim` + ImageMagick (`magick`/`convert`) for the screen-vision feature
Installation

    git clone https://github.com/stevenvo780/clawbar.git
    cd clawbar
    ./install.sh

The installer:
- Copies `bin/claw-talk`, `bin/clawbar-vad.py`, `bin/clawbar-status` to `~/.local/bin/` (backs up any existing version with a timestamp).
- Creates `~/.config/clawbar/clawbar.env` from the example (only if absent, so re-running is safe).
- Merges the `custom/claw` waybar module into your config + modules file, then validates that waybar still parses and reverts if not.
- Appends styles to your waybar `style.css`.
- Adds the Hyprland keybind (`SUPER+SHIFT+V`) only if no `claw-talk` bind exists yet.
- Reloads waybar (`SIGUSR2`) and runs `hyprctl reload`.
Everything is backed up as `*.bak-clawbar-<timestamp>`. Re-running is safe.
Uninstall

    ./uninstall.sh

Restores the newest `*.bak-clawbar-*` backup for each file and removes the installed scripts. Your `clawbar.env` configuration is kept.
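The backup-and-restore convention can be illustrated with a short sketch. The real installer and uninstaller are shell scripts; the timestamp format below (`YYYYMMDD-HHMMSS`) and the function names are assumptions, chosen so that sorting backup names alphabetically also sorts them by age.

```python
import shutil
import time
from pathlib import Path
from typing import Optional

SUFFIX = ".bak-clawbar-"


def backup(path: Path) -> Path:
    """Copy `path` to `<name>.bak-clawbar-<timestamp>` next to it."""
    stamp = time.strftime("%Y%m%d-%H%M%S")  # assumed timestamp format
    dest = path.with_name(f"{path.name}{SUFFIX}{stamp}")
    shutil.copy2(path, dest)
    return dest


def newest_backup(path: Path) -> Optional[Path]:
    """Return the most recent backup of `path`, or None if there is none."""
    prefix = path.name + SUFFIX
    candidates = sorted(
        p for p in path.parent.iterdir() if p.name.startswith(prefix)
    )
    return candidates[-1] if candidates else None


def restore_newest(path: Path) -> bool:
    """Overwrite `path` with its newest backup; False if none exists."""
    src = newest_backup(path)
    if src is None:
        return False
    shutil.copy2(src, path)
    return True
```

Keeping every backup and restoring only the newest means repeated installs never overwrite an earlier safety copy.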
Configuration
All knobs live in `clawbar.env` (copy from `clawbar.env.example`). It holds no secrets: the agent’s auth lives in the container or your secret store.

    CLAW_CTR=claw              # name of the OpenClaw agent container
    AUDIO_CTR=claw-audio       # name of the STT/TTS audio sidecar

    VOICE=ef_dora              # Kokoro voice ID
    LANG_STT=es                # Whisper language hint

    CLAWBAR_VAD=on             # enable hands-free auto-stop
    VAD_SILENCE_MS=1500        # trailing silence that ends the turn (ms)
    VAD_THRESH_DB=-38          # speech detection threshold (dBFS)
    VAD_MIN_SPEECH_MS=300      # minimum speech before cutoff arms (ms)
    VAD_START_GRACE_MS=5000    # timeout if speech never starts (ms)

    MAX_REC_SECS=120           # hard cap on recording length (seconds)

    CLAWBAR_WAYBAR_SIGNAL=8    # must match "signal" in the waybar module JSON

`claw-talk` searches for `clawbar.env` at: `$CLAWBAR_ENV` → next to the script → `~/.config/clawbar/clawbar.env` → legacy locations.
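The search order can be pictured with the sketch below. It is only an illustration: `claw-talk` is a Bash script and how it actually loads the file is not shown here, the legacy locations are left out because they are not listed, and the helper names and the simple `KEY=value # comment` parser are assumptions.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_env(script_dir: Path,
             environ: Optional[Mapping[str, str]] = None,
             home: Optional[Path] = None) -> Optional[Path]:
    """Return the first existing clawbar.env in the documented order."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else home
    candidates = []
    if environ.get("CLAWBAR_ENV"):
        candidates.append(Path(environ["CLAWBAR_ENV"]))  # explicit override
    candidates.append(script_dir / "clawbar.env")        # next to the script
    candidates.append(home / ".config" / "clawbar" / "clawbar.env")
    # Legacy locations would follow here; they are not named in the text.
    return next((p for p in candidates if p.is_file()), None)


def parse_env(text: str) -> dict:
    """Parse KEY=value lines, dropping '#' comments (no quoting support)."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings
```

Setting `CLAWBAR_ENV` is the simplest way to try an alternative configuration without touching the installed file.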
| Action | How |
|---|---|
| Talk (toggle / hands-free) | SUPER+SHIFT+V, left-click the bar icon, or claw-talk toggle |
| Force-send while recording | Press the keybind again (overrides VAD, sends immediately) |
| Open dashboard | Right-click the bar icon |
| Agent terminal (TUI) | Middle-click the bar icon |
| Speak arbitrary text | claw-talk say "mensaje aquí" |
| Self-check | claw-talk test |
| Read current phase | claw-talk state |
Say “mirá la pantalla y decime qué ves” (“look at the screen and tell me what you see”) to trigger the screen-vision path: clawbar grabs a screenshot, downscales it, and hands it to the agent. The trigger is a configurable regex that works in Spanish and English.
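As an illustration of how a bilingual trigger of this kind might be matched, here is a small sketch. The pattern is hypothetical and is not clawbar's default regex, which is not shown in this document; `wants_screenshot` is likewise an invented name.

```python
import re

# Hypothetical trigger: a "look" verb followed later by "screen"/"pantalla".
VISION_RE = re.compile(
    r"\b(?:mir[aá]|look\s+at)\b.*\b(?:pantalla|screen)\b",
    re.IGNORECASE,
)


def wants_screenshot(transcript: str) -> bool:
    """True if the transcribed request should include a screenshot."""
    return VISION_RE.search(transcript) is not None
```

Matching on the transcript keeps the vision path opt-in, so ordinary questions never cause a screenshot to be taken.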
| Layer | Technology |
|---|---|
| Core scripts | Bash |
| VAD module | Python 3 (clawbar-vad.py, RMS dBFS per 30 ms frame) |
| Audio capture | PipeWire + ffmpeg |
| STT | Whisper (via speaches sidecar, OpenAI-compatible API) |
| TTS | Kokoro (via speaches sidecar) |
| Agent bridge | docker exec |
| Bar integration | waybar JSON protocol + SIGRTMIN+8 |
| Compositor | Hyprland (keybind + hyprctl reload) |
| Vision (optional) | spectacle / grim + ImageMagick |
| Color scheme | Catppuccin Mocha |
| License | MIT |