Say "Hey Jarvis." A glowing arc-reactor HUD built in raylib, a streaming Claude brain with working hands — Gmail, GitHub, Mac control, timers — conversation memory that survives relaunches, and a dry, sarcastic British voice synthesized locally. He starts answering out loud in about a second, and you can talk over him.
Always-listening wake word (local openWakeWord, no cloud). Speech is transcribed on-device with faster-whisper. After each reply the mic stays hot for five seconds — just keep talking, like a real conversation.
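The five-second follow-up window can be pictured as a small gate that decides whether the next utterance still needs the wake word. This is a minimal sketch under assumptions: `ListenGate` and its method names are hypothetical, and only the five-second figure comes from the text above.

```python
import time

FOLLOW_UP_SECONDS = 5.0  # from the text: the mic stays hot for five seconds


class ListenGate:
    """Hypothetical helper: is the wake word required right now?"""

    def __init__(self, window=FOLLOW_UP_SECONDS, clock=time.monotonic):
        self.window = window
        self.clock = clock      # injectable so the gate can be tested
        self.hot_until = 0.0    # mic is "hot" while clock() < hot_until

    def reply_finished(self):
        # Open the follow-up window when the assistant stops speaking.
        self.hot_until = self.clock() + self.window

    def needs_wake_word(self):
        # Outside the window, speech is ignored until "Hey Jarvis".
        return self.clock() >= self.hot_until
```

Using a monotonic clock keeps the window immune to wall-clock adjustments while the app runs all day.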
The brain streams sentence-by-sentence into a pipelined TTS queue: sentence one is already playing while sentence three is still being generated. First words in ~1.5s. Barge in any time — saying "Hey Jarvis" cuts him off mid-word.
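The pipelining described above might look roughly like this: a splitter turns streamed text chunks into whole sentences, a background thread synthesizes them, and the main thread plays whatever is ready. It is a sketch, not the project's code. `split_sentences`, `speak_pipelined`, and the `synthesize` and `play` callables are hypothetical names, and barge-in is left out.

```python
import queue
import re
import threading

# A sentence ends at ., ! or ? followed by whitespace (a deliberately naive rule).
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(chunks):
    """Yield complete sentences from an iterable of streamed text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = SENTENCE_END.split(buffer)
        # Everything except the last part is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        buffer = parts[-1]
    if buffer.strip():
        yield buffer.strip()  # flush whatever is left when the stream ends


def speak_pipelined(chunks, synthesize, play):
    """Synthesize upcoming sentences while the current one is playing."""
    ready = queue.Queue(maxsize=2)  # bounded, so synthesis stays just ahead

    def producer():
        for sentence in split_sentences(chunks):
            ready.put(synthesize(sentence))
        ready.put(None)  # sentinel: no more audio

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()
    while (audio := ready.get()) is not None:
        play(audio)  # blocks for the clip's duration; the producer keeps working
    worker.join()
```

The bounded queue is a design choice in this sketch: it keeps synthesis one or two sentences ahead of playback, so an interruption wastes little work.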
Native Claude tool use: he reads your Gmail, checks your GitHub notifications and PRs, opens apps, drives Music/Spotify, sets the volume, flips dark mode, and runs countdown timers, which are announced aloud and drawn as a gold ring on the HUD.
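A tool loop of this kind usually comes down to two things: a list of tool definitions sent with each request, and a dispatcher that turns the model's `tool_use` blocks into `tool_result` blocks. The sketch below assumes the Anthropic Messages API block shapes (`name`, `description`, `input_schema`, `tool_use_id`). The tools themselves, `set_volume` and `start_timer`, are hypothetical stand-ins that only return strings.

```python
# Hypothetical handlers; real ones would drive macOS or a timer thread.
def set_volume(level):
    level = max(0, min(100, int(level)))
    return f"volume set to {level}"


def start_timer(seconds, label="timer"):
    return f"{label} started for {int(seconds)}s"


TOOLS = {
    "set_volume": {
        "handler": set_volume,
        "description": "Set the system output volume (0-100).",
        "input_schema": {
            "type": "object",
            "properties": {"level": {"type": "integer"}},
            "required": ["level"],
        },
    },
    "start_timer": {
        "handler": start_timer,
        "description": "Start a countdown timer.",
        "input_schema": {
            "type": "object",
            "properties": {
                "seconds": {"type": "integer"},
                "label": {"type": "string"},
            },
            "required": ["seconds"],
        },
    },
}


def tool_definitions():
    """Tool list in the shape passed as `tools` with each request."""
    return [
        {"name": name, "description": t["description"], "input_schema": t["input_schema"]}
        for name, t in TOOLS.items()
    ]


def run_tool_calls(content_blocks):
    """Execute every tool_use block and build the matching tool_result blocks."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # plain text blocks are spoken, not executed
        try:
            tool = TOOLS.get(block["name"])
            if tool is None:
                raise ValueError(f"unknown tool {block['name']}")
            output, is_error = tool["handler"](**block["input"]), False
        except Exception as exc:  # report failures back to the model
            output, is_error = f"error: {exc}", True
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": output,
            "is_error": is_error,
        })
    return results
```

The results go back to the model as the next user message, and the loop repeats until a reply contains no further `tool_use` blocks.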
A rolling history rides along with every request and persists across launches. He remembers what you asked yesterday. "Forget our conversation" wipes the slate.
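One plausible shape for that memory is a capped list of messages mirrored to a JSON file. The file location, the cap of 40 messages, and the `History` class are all assumptions made for illustration, not details taken from the project.

```python
import json
from pathlib import Path


class History:
    """Hypothetical rolling conversation log persisted as JSON."""

    def __init__(self, path, max_messages=40):
        self.path = Path(path)
        self.max_messages = max_messages  # assumed cap, not from the source
        self.messages = self._load()

    def _load(self):
        try:
            return json.loads(self.path.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            return []  # first launch or a corrupt file: start clean

    def _save(self):
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.messages), encoding="utf-8")

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})
        # Keep only the most recent messages...
        self.messages = self.messages[-self.max_messages:]
        # ...and make sure the window still opens on a user turn.
        while self.messages and self.messages[0]["role"] != "user":
            self.messages.pop(0)
        self._save()

    def forget(self):
        """What "Forget our conversation" would call."""
        self.messages = []
        self._save()
```

Saving after every message costs a small write per turn, and in exchange a crash or force-quit loses nothing.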
The RMS envelope of every synthesized sentence drives the 72-bar visualizer and the reactor's glow — the rings genuinely pulse with his voice, and the transcript reveals typewriter-style in sync with the audio.
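As a rough illustration of how an RMS envelope can drive the visuals: compute one level per audio frame, look up the level for the current playback position, and shape it across the 72 bars. The sketch uses only the standard library to stay dependency-free. The frame size, the centre-weighted bar shaping, and all function names are assumptions.

```python
import math

BARS = 72  # from the text: 72-bar visualizer


def rms_envelope(samples, frame_size):
    """RMS level of each consecutive frame of float samples in [-1, 1]."""
    envelope = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        envelope.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return envelope


def level_at(envelope, seconds, frame_seconds):
    """Envelope level for a playback position, clamped to the last frame."""
    if not envelope:
        return 0.0
    index = min(int(seconds / frame_seconds), len(envelope) - 1)
    return envelope[index]


def bar_heights(level, bars=BARS):
    """Spread one level across the bars, tallest in the middle (assumed shape)."""
    mid = (bars - 1) / 2
    return [level * (1.0 - abs(i - mid) / (mid + 1)) for i in range(bars)]
```

Because the envelope is computed from the synthesized clip before playback, the HUD only needs the elapsed playback time per frame, and the same clock could pace the transcript typewriter.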
Piper neural TTS (en_GB-alan) runs entirely offline. It falls back to the macOS Daniel voice, and plugs into ElevenLabs for the full Bettany. The wit is film-grade: deadpan, devoted, faintly exasperated.
⌘M shrinks him to a 230px always-on-top arc reactor you can park in a corner all day. Click it — or say "Hey Jarvis" — and the full HUD snaps back.
Synthesized sci-fi one-shots — boot power-up sweep, readout ticks, wake chirp, sonar ping while thinking, triple-ding timers. Generated with numpy at first launch; no audio assets shipped.
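Generating a one-shot at first launch can be as small as a frequency sweep with a fade, written out as a WAV file. The project does this with numpy. The sketch below uses only the standard library so it runs anywhere, and the sample rate, frequencies, and function names are assumptions.

```python
import math
import struct
import wave

SAMPLE_RATE = 22050  # assumed; any common rate works


def sweep(f_start, f_end, seconds, rate=SAMPLE_RATE):
    """Linear frequency sweep with a linear fade-out, as floats in [-1, 1]."""
    n = int(seconds * rate)
    samples, phase = [], 0.0
    for i in range(n):
        progress = i / n
        freq = f_start + (f_end - f_start) * progress
        phase += 2 * math.pi * freq / rate  # accumulate phase to avoid clicks
        samples.append(math.sin(phase) * (1.0 - progress))
    return samples


def write_wav(path, samples, rate=SAMPLE_RATE):
    """Save float samples as 16-bit mono PCM."""
    pcm = (int(max(-1.0, min(1.0, s)) * 32767) for s in samples)
    frames = struct.pack(f"<{len(samples)}h", *pcm)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(frames)
```

A rising sweep such as `sweep(400, 1600, 0.15)` gives a chirp-like cue, and a longer, lower one reads as a power-up.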
The signed .app checks GitHub releases at launch and quietly swaps newer builds in place — then tells you, out loud, that he's upgraded himself. Naturally.
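The launch-time check could be as simple as comparing the running version with the tag of the latest GitHub release. This sketch assumes dotted numeric tags such as `v1.4.2` and uses GitHub's public "latest release" REST endpoint. Whether the app itself uses this endpoint, and how it swaps the bundle in place, is not stated in the source and is not shown here.

```python
import json
import re
import urllib.request

# GitHub REST endpoint for a repository's latest release.
LATEST_URL = "https://api.github.com/repos/binRick/jarvis/releases/latest"


def parse_version(tag):
    """'v1.4.2' -> (1, 4, 2); assumes dotted numeric tags."""
    return tuple(int(n) for n in re.findall(r"\d+", tag))


def is_newer(latest_tag, current_tag):
    """True when the published tag is strictly ahead of the running one."""
    return parse_version(latest_tag) > parse_version(current_tag)


def latest_release_tag(url=LATEST_URL, timeout=5):
    """Fetch the newest release tag; raises on network or HTTP errors."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["tag_name"]
```

Comparing integer tuples rather than strings matters here, because as plain text `v1.10.0` would sort before `v1.9.3`.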
mic
 ─▶ openWakeWord ("hey jarvis", 80ms frames, local onnx)
 ─▶ capture until trailing silence
 ─▶ faster-whisper (local, int8)
 ─▶ Claude: streaming + tool loop ⚒ gmail · github · mac · timers
 ─▶ sentence splitter
 ─▶ Piper synth ──┐   (next sentence synthesizes
 ─▶ afplay ◀──────┘    while this one plays)
 ─▶ RMS envelope
 ─▶ visualizer bars + reactor glow + transcript typewriter
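The "capture until trailing silence" stage can be sketched as a simple energy-based endpointer: keep collecting frames until enough consecutive quiet frames follow speech. The threshold, the 800 ms silence length, and the function names are assumptions. The 80 ms frame length is borrowed from the wake-word stage in the diagram, and the real capture stage may use a different size.

```python
import math

FRAME_MS = 80  # borrowed from the diagram's wake-word frames (an assumption here)


def frame_rms(frame):
    """RMS level of one frame of float samples in [-1, 1]."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def capture_until_silence(frames, threshold=0.02, silence_ms=800, frame_ms=FRAME_MS):
    """Collect frames until `silence_ms` of consecutive quiet follows speech."""
    needed = math.ceil(silence_ms / frame_ms)  # quiet frames that end the capture
    captured, quiet, heard_speech = [], 0, False
    for frame in frames:
        captured.append(frame)
        if frame_rms(frame) >= threshold:
            heard_speech, quiet = True, 0  # speech resets the silence counter
        elif heard_speech:
            quiet += 1
            if quiet >= needed:
                break
    if not heard_speech:
        return []  # nothing but silence: nothing to transcribe
    # Drop the trailing quiet frames; leading silence is kept for simplicity.
    return captured[:len(captured) - quiet]
```

Short pauses inside a sentence reset the counter, so only a sustained gap ends the utterance and hands it to the transcriber.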
# the easy way: signed app, self-updates
# download and unzip JARVIS-macos.zip, then:
open JARVIS.app

# from source
git clone https://github.com/binRick/jarvis && cd jarvis
./jarvis    # first run builds the venv + signed .app

# give him a faster brain (optional; the claude CLI works too)
echo sk-ant-... > ~/.jarvis/anthropic_key.txt