Claudia — Build Guide (2026.05.19)

A single-path, checkpoint-driven build guide for a Raspberry Pi Zero 2 W + PiSugar Whisplay HAT voice assistant powered by the Claude API. Say "Hey Jarvis" (or train your own "Claudia" wake word — Appendix A), ask anything, and Claude talks back.

github.com/mindattic/Claudia

What's new in 2026.05.19 (vs. 2026.05.18) #

Wake word is now the default activation method. Out-of-the-box you say "Hey Jarvis" — that's the closest pre-trained model openWakeWord ships. Want it to actually answer to "Claudia"? Appendix A walks through training a custom model.
Part 7.3 installs the wake-word engine (openWakeWord in a Python 3.11 pyenv). Adds ~3 minutes to first install; no extra cloud API keys.
Mic decision flipped. Wake word needs reasonable pickup, so the budget USB mic is now the recommended floor, not the on-board mics.
Button is still wired up — fall back to it by setting WAKE_WORD_ENABLED=false in .env.

What's new in 2026.05.18 (vs. previous) #

One path, not two. The PiSugar whisplay-ai-chatbot repo is purpose-built for this exact hardware and is the right choice for a Pi Zero 2 W.
Verified install commands against the live upstream repos.
Checkpoint tests at the end of each Part.
First-run healthcheck.sh script (Part 8).
systemd unit dropped — the repo's startup.sh sets up chatbot.service properly.
Mic decision moved to the front as a 3-question flowchart.
Cost tiers cut from 4 to 2. Desktop or portable.

Pick your build before you buy #

Two questions decide everything.

Q1 — Where will this device sit?

On my desk, within arm's reach → desktop build (cheaper, simpler)
Roaming around the house / outside / unplugged → portable build (add the PiSugar 3 battery)

Q2 — How far away will you be when you talk to it?

Because wake-word activation is on by default, the mic has to reliably hear "Hey Jarvis" without you leaning into it. The on-board Whisplay mics technically work but you'll be repeating yourself.

Within 3 feet, quiet room, OK with occasional misses → on-board Whisplay mics (no extra purchase). Acceptable only if you'll mostly use the button.
4–10 feet, normal room (recommended floor) → add the SunFounder USB mini mic (~$13) + an OTG adapter
Across the room, possibly noisy, "Alexa-class" pickup → add the reSpeaker XVF3800 mic array (~$60) + an OTG adapter — its on-board AEC + beamforming dramatically improves wake-word reliability.

That's it. Now build the cart.

Part 1: Shopping list #

Core build (required) #

Item	Why	~Price	Notes
Raspberry Pi Zero 2 WH	The brain. The "H" matters — it has the GPIO header pre-soldered, which the Whisplay HAT needs. Don't buy the bare "Pi Zero 2 W" or you'll need to solder 40 pins yourself.	~$22	Search Amazon or Adafruit for "Raspberry Pi Zero 2 WH"
PiSugar Whisplay HAT	LCD, speaker, dual mic, RGB LED, buttons — everything except the CPU.	~$36	Direct from PiSugar's site or Amazon
microSD card, 32 GB Class 10 (SanDisk Ultra or equivalent)	The "hard drive." 16 GB works but 32 GB gives you headroom.	~$9	Any reputable brand
Official Raspberry Pi 12.5W micro-USB power supply (5V/2.5A)	The Pi Zero 2 W is rated for 5V/2.5A. Phone chargers will seem to work and then cause random crashes. Buy the official one or a known-good 2.5A+ supply.	~$10	Pi Hut, Adafruit, official resellers

Pick-your-mic (optional) #

Item	When to buy	~Price
SunFounder USB Mini Mic	If you need 4–10 ft pickup.	~$13
Seeed reSpeaker XVF3800 USB Mic Array	If you want across-the-room pickup with built-in echo cancellation, beamforming, and noise suppression.	~$60
micro-USB OTG adapter	Required if you bought either USB mic above — the Pi Zero only has a micro-USB data port.	~$7

Portable (optional) #

Item	When to buy	~Price
PiSugar 3 1200 mAh battery	If you want to unplug and carry it around. Snaps onto the Pi via magnetic pogo pins — no soldering.	~$40

Totals #

Build	What you get	Total
Desktop	Core + Whisplay's mics + wall power	~$77
Desktop + better mic (budget)	+ SunFounder + OTG	~$97
Desktop + better mic (premium)	+ reSpeaker XVF3800 + OTG	~$144
Portable + premium mic	Desktop + premium mic build + PiSugar 3 battery	~$184

Prices fluctuate. Verify before checkout. Where I've named vendors, search by product name rather than trusting a link.
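
The totals above are plain sums of the per-item prices in the shopping tables. A minimal sketch for re-totaling a cart after you re-check prices; `cart_total` is a made-up helper name, and the numbers are this guide's approximate prices, not live quotes.

```bash
# cart_total: add up whole-dollar prices passed as arguments.
# Hypothetical helper, not part of any repo referenced in this guide.
cart_total() {
  total=0
  for price in "$@"; do
    total=$((total + price))
  done
  echo "$total"
}

# Approximate prices from the tables above (USD).
CORE="22 36 9 10"      # Pi Zero 2 WH, Whisplay HAT, microSD, power supply
BUDGET_MIC="13 7"      # SunFounder mic + OTG adapter
PREMIUM_MIC="60 7"     # reSpeaker XVF3800 + OTG adapter
BATTERY="40"           # PiSugar 3

# The variables are left unquoted on purpose so each price becomes one argument.
echo "Desktop:                \$$(cart_total $CORE)"
echo "Desktop + budget mic:   \$$(cart_total $CORE $BUDGET_MIC)"
echo "Desktop + premium mic:  \$$(cart_total $CORE $PREMIUM_MIC)"
echo "Portable + premium mic: \$$(cart_total $CORE $PREMIUM_MIC $BATTERY)"
```

With the prices listed here this prints $77, $97, $144, and $184, matching the table. Edit the four variables when a price changes.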

What you do not need #

A separate speaker (the Whisplay has one)
A monitor or keyboard (we do the whole setup over SSH from your PC)
A USB hub (one USB mic on the OTG port is fine)

Part 2: Assemble the hardware #

Total time: ~5 minutes. No soldering.

Do not insert the microSD yet. Flash it first in Part 3.
Align the Whisplay HAT's 40-pin socket with the Pi Zero 2 WH's GPIO header pins. The Whisplay's buttons should be on the same side as the Pi's USB ports.
Press the HAT down firmly and evenly until fully seated. Hold the PCB by its edges and keep your fingers off the glass LCD, which can crack under pressure.
Peel off the LCD protective film.
(Optional, portable build) Snap the PiSugar 3 battery onto the underside of the Pi using its magnetic pogo pins.

Final stack (top → bottom): Whisplay HAT → Pi Zero 2 WH → PiSugar 3 (optional)

✅ Checkpoint: The stack feels solid, the LCD is exposed and undamaged, nothing wobbles.

Part 3: Flash the microSD card #

3.1 Install Raspberry Pi Imager #

Download from raspberrypi.com/software (Windows, macOS, Linux).

3.2 Flash #

Open Raspberry Pi Imager.
Choose Device → Raspberry Pi Zero 2 W.
Choose OS → Raspberry Pi OS (other) → Raspberry Pi OS (64-bit) (the full version, not Lite).
- The chatbot repo's install script expects packages from the full image. Lite will work but you'll need extra apt installs and may hit surprises.
Choose Storage → your microSD card.
Click the gear icon (⚙) for Edit Settings and configure:
- Hostname: claudia
- Username: pi
- Password: something secure
- Enable SSH: ✅ password auth
- Wireless LAN: SSID + password for your home Wi-Fi
- Locale: your time zone (for example America/Chicago), keyboard layout us
Save, then Write. Takes 2–5 minutes.

3.3 First boot #

Insert the microSD into the Pi.
Plug the official power supply into the PWR IN micro-USB port (the one nearest the corner, labeled PWR IN on the silkscreen). Not the middle port labeled USB.
Wait 60–90 seconds.
From your PC:
```
ssh pi@claudia.local
```
If claudia.local doesn't resolve, find the Pi's IP in your router's admin page and use ssh pi@192.168.x.x.

✅ Checkpoint: You see the pi@claudia:~ $ prompt. Run cat /etc/os-release and confirm it says Debian/Raspberry Pi OS. Run free -h — you should see ~430 MB of Mem: (the Pi Zero 2 W has 512 MB total).

Part 4: System setup #

Run these from the SSH session. One at a time. Wait for each to finish.

4.1 Update #

sudo apt update && sudo apt full-upgrade -y

This takes 5–15 minutes on a Pi Zero. Be patient.

4.2 Free up RAM (Pi Zero only has 512 MB) #

The Pi Zero 2 W is RAM-constrained. Disable services you don't need:

# Disable Bluetooth (not used by this build)
sudo systemctl disable hciuart bluetooth

# Disable triggerhappy (gamepad daemon, not needed)
sudo systemctl disable triggerhappy

4.3 Install build dependencies #

sudo apt install -y git curl build-essential python3-pip python3-venv \
  portaudio19-dev libsndfile1 ffmpeg alsa-utils libatlas-base-dev

4.4 Install the Whisplay HAT driver (LCD + audio + buttons + LEDs) #

This is the official PiSugar driver. The install script also enables the I2C, SPI, and I2S buses automatically.

cd ~
git clone https://github.com/PiSugar/Whisplay.git --depth 1
cd Whisplay/Driver
sudo bash install_wm8960_drive.sh
sudo reboot

Wait ~60 seconds, then SSH back in:

ssh pi@claudia.local

✅ Checkpoint 1 — driver loaded:

aplay -l

You should see a card whose name contains wm8960. If not, the driver didn't load — re-run the install script and re-check.

✅ Checkpoint 2 — speaker works:

speaker-test -t sine -f 440 -l 1 -D plughw:CARD=wm8960soundcard

You should hear a one-second 440 Hz beep from the Whisplay's speaker. If you hear nothing, run alsamixer, press F6, pick the wm8960 card, and turn up the playback levels.

✅ Checkpoint 3 — on-board mic works:

arecord -d 5 -f cd /tmp/mic_test.wav && aplay /tmp/mic_test.wav

Record 5 seconds, then play it back. You should hear your voice. If it's silent or garbled, raise the "Capture" levels in alsamixer (F4 switches to the Capture view, F3 switches back to Playback).

If all three checkpoints pass and you're using the on-board mic, skip to Part 5. Otherwise, do Part 4.5 first.

Part 4.5: Configure a USB microphone (skip if using on-board mics) #

Plug it in #

Power down: sudo shutdown -h now and unplug power.
Connect the OTG adapter to the Pi's data micro-USB port — that's the middle one labeled USB, not PWR IN.
Plug the USB mic into the OTG adapter.
Power back on, SSH in.

Verify it's detected #

arecord -l

You should now see two capture cards:

card 0: wm8960soundcard ...
card 1: ArrayUAC10 / Mini Mic ...  (your USB mic — name varies)

If card 1 is missing: try a different OTG cable (some are charge-only), or make sure you're in the data port not the power one. Run lsusb and confirm the device shows up.
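
Because the USB mic's name varies, it can help to look up its card number from the `arecord -l` output rather than assume it is 1. A small sketch of that lookup; `card_index` is a hypothetical helper, and it assumes the listing uses the `card N: name ...` layout shown above.

```bash
# card_index: read `arecord -l` (or `aplay -l`) style text on stdin and print
# the ALSA card number of the first line containing the given text
# (case-insensitive). Prints nothing and returns 1 if there is no match.
card_index() {
  awk -v want="$1" '
    BEGIN { want = tolower(want); found = 0 }
    /^card [0-9]+:/ && index(tolower($0), want) {
      sub(/:$/, "", $2)    # field 2 looks like "1:", keep only the number
      print $2
      found = 1
      exit
    }
    END { if (!found) exit 1 }
  '
}

# Example:
#   arecord -l | card_index "usb"       # card number of the USB mic, if present
#   arecord -l | card_index "wm8960"    # card number of the Whisplay codec
```

If the lookup prints nothing, the mic was not enumerated, which points back at the OTG cable or port.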

Test recording from the USB mic #

arecord -D plughw:1,0 -d 5 -f cd /tmp/usb_test.wav
aplay /tmp/usb_test.wav

If it's quiet, run alsamixer, press F6 to switch to the USB card, F4 for Capture controls, and raise it to ~80%.

Make the USB mic the system default capture device #

This way the chatbot picks it up automatically — speaker stays on the Whisplay.

nano ~/.asoundrc

Paste:

# Playback on Whisplay speaker (card 0), capture on USB mic (card 1)
pcm.!default {
    type asym
    playback.pcm {
        type plug
        slave.pcm "hw:0,0"
    }
    capture.pcm {
        type plug
        slave.pcm "hw:1,0"
    }
}

ctl.!default {
    type hw
    card 1
}

Save with Ctrl+X, Y, Enter.
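
If your cards enumerate with different numbers, the same file can be generated instead of hand-edited. A sketch under that assumption; `make_asoundrc` is a hypothetical helper that reproduces the config above with the two card numbers as parameters, and it assumes device 0 on each card.

```bash
# make_asoundrc: print an asym ALSA config that plays on one card and
# captures from another. Arguments: <playback card> <capture card>.
make_asoundrc() {
  play_card=$1
  cap_card=$2
  cat <<EOF
# Playback on card ${play_card}, capture on card ${cap_card}
pcm.!default {
    type asym
    playback.pcm {
        type plug
        slave.pcm "hw:${play_card},0"
    }
    capture.pcm {
        type plug
        slave.pcm "hw:${cap_card},0"
    }
}

ctl.!default {
    type hw
    card ${cap_card}
}
EOF
}

# Example (nothing is written until you redirect the output yourself):
#   make_asoundrc 0 1 > ~/.asoundrc
```

Calling it with `0 1` yields the same settings as the pasted file, so it is only worth using when your numbering differs.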

Pin the card numbering across reboots #

Linux's USB card numbers can swap on reboot. Lock the USB mic to card 1:

echo "options snd_usb_audio index=1" | sudo tee /etc/modprobe.d/alsa-base.conf

Reboot:

sudo reboot

Verify the default config works end-to-end #

ssh pi@claudia.local
arecord -d 5 -f cd /tmp/default_test.wav && aplay /tmp/default_test.wav

This should record from the USB mic and play through the Whisplay speaker — without specifying any device flags.

✅ Checkpoint: Recording and playback both use the right devices via the system default.

reSpeaker XVF3800 note: The on-board AEC, beamforming, and noise suppression run automatically. No extra software needed. For raw multi-channel access or firmware tweaks, see the Seeed reSpeaker XVF3800 wiki.

Part 5: Install the chatbot software #

This is the PiSugar whisplay-ai-chatbot repo. It's purpose-built for the Pi Zero 2 W + Whisplay combo. Say "Hey Jarvis" → ask your question → Claude answers out loud. The on-board button still works as a fallback.

cd ~
git clone https://github.com/PiSugar/whisplay-ai-chatbot.git
cd whisplay-ai-chatbot
bash install_dependencies.sh
source ~/.bashrc

The dependency install pulls Node.js, Python packages, and audio libraries. This takes 15–25 minutes on a Pi Zero 2 W. Let it finish.

The source ~/.bashrc line is important — the installer sets PATH entries you need in your current shell session.

✅ Checkpoint: install_dependencies.sh finishes without errors. Test that Node is on PATH:

node --version

You should see v20.x or similar.

Part 6: Get a Claude API key #

Go to console.anthropic.com and sign in (or create an account).
Add a payment method and put a small amount of credit on the account (e.g., $5 — that lasts a long time on Haiku).
Navigate to API Keys → Create Key.
Name it claudia. Copy the key now — you can't see it again later.
Treat the key like a password.

Approximate cost: Casual personal use on claude-haiku-4-5-20251001 typically runs a few dollars per month at most. Check current pricing at anthropic.com/pricing.
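
To check the "few dollars per month" figure against current pricing, the arithmetic is: queries per month times (input tokens times input rate plus output tokens times output rate). A rough sketch; `estimate_monthly_usd` is a hypothetical helper, the rates you pass in must come from the pricing page, and the numbers in the example comment are placeholders, not real prices.

```bash
# estimate_monthly_usd: rough monthly API cost in USD.
# Arguments:
#   $1 queries per day
#   $2 average input tokens per query (include the system prompt and any history)
#   $3 average output tokens per query
#   $4 input price, USD per million tokens  (look this up; not hard-coded)
#   $5 output price, USD per million tokens (look this up; not hard-coded)
# Assumes a 30-day month.
estimate_monthly_usd() {
  awk -v q="$1" -v tin="$2" -v tout="$3" -v pin="$4" -v pout="$5" 'BEGIN {
    per_query = (tin * pin + tout * pout) / 1000000
    printf "%.2f\n", q * 30 * per_query
  }'
}

# Example with placeholder rates (NOT real prices):
#   estimate_monthly_usd 20 200 100 1 5
```

Short spoken answers keep the output-token count low, which is why the system prompt in Part 7 asks for 1–3 sentences.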

Which model to pick #

Model ID	Speed	Quality	When to use
`claude-haiku-4-5-20251001`	Fastest	Good	Default for this device. Latency matters more than essay-grade prose for a voice assistant.
`claude-sonnet-4-6`	Medium	Excellent	If you want richer answers and don't mind a slightly slower response.
`claude-opus-4-7`	Slowest	Best	Overkill for spoken Q&A. Use for hard reasoning tasks only.

Model IDs change over time. The current list lives at docs.claude.com.

Part 7: Configure the chatbot #

7.1 Create your `.env` #

cd ~/whisplay-ai-chatbot
cp .env.template .env
nano .env

The template ships with many fields for different ASR/LLM/TTS providers. For a Claude-based build, you need the LLM section set to Anthropic and the wake-word section enabled. Find and set:

# === LLM (the AI brain) ===
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-YOUR-KEY-HERE
ANTHROPIC_MODEL=claude-haiku-4-5-20251001

# === System prompt — shapes the assistant's voice ===
SYSTEM_PROMPT=You are a concise, friendly voice assistant. Answer in plain spoken English — no markdown, no bullet lists, no headings. Keep responses to 1–3 sentences unless the user explicitly asks for more.

# === Wake word ===
# Default is "Hey Jarvis" — the closest pre-trained openWakeWord model. To
# answer to "Claudia" instead, train a custom model (Appendix A), drop the
# .tflite onto the Pi, and set WAKE_WORDS=claudia + WAKE_WORD_MODEL_PATHS=...
# To disable wake word entirely and fall back to the button, set
# WAKE_WORD_ENABLED=false.
WAKE_WORD_ENABLED=true
WAKE_WORDS=hey_jarvis
WAKE_WORD_PYTHON_PATH=/home/pi/.pyenv/versions/python311/bin/python
WAKE_WORD_THRESHOLD=0.5
WAKE_WORD_IDLE_TIMEOUT_SEC=60
WAKE_WORD_END_KEYWORDS=byebye,goodbye,stop

For speech-to-text (ASR) and text-to-speech (TTS), the template defaults usually work. If you want fully local (no extra API keys), pick whisper for ASR and piper for TTS. If you want higher-quality cloud STT/TTS, see the wiki for OpenAI / Google / Volcengine options — each needs its own API key.

The .env.template evolves. If your file looks different from this guide, the live template at github.com/PiSugar/whisplay-ai-chatbot/blob/master/.env.template is the source of truth.

Save: Ctrl+X, Y, Enter.
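
A missing or placeholder key is the most common reason the service fails later, so a quick check of `.env` can save a debugging round. A minimal sketch; `env_get` and `check_env` are hypothetical helpers, and the key names are the ones shown in this guide, so adjust the list if the live template differs.

```bash
# env_get: print the value of KEY from a KEY=value file (last match wins).
# Reads the file as text rather than sourcing it, because values with spaces
# (such as SYSTEM_PROMPT) are not valid shell assignments.
env_get() {
  grep -E "^$2=" "$1" | tail -n 1 | cut -d= -f2-
}

# check_env: verify the keys this guide sets are present and non-empty.
# Prints each problem and returns non-zero if any are found.
check_env() {
  file=$1
  missing=0
  for key in LLM_PROVIDER ANTHROPIC_API_KEY ANTHROPIC_MODEL WAKE_WORD_ENABLED; do
    if [ -z "$(env_get "$file" "$key")" ]; then
      echo "missing or empty: $key"
      missing=1
    fi
  done
  # The placeholder from the snippet above means the key was never pasted in.
  if [ "$(env_get "$file" ANTHROPIC_API_KEY)" = "sk-ant-YOUR-KEY-HERE" ]; then
    echo "ANTHROPIC_API_KEY is still the placeholder"
    missing=1
  fi
  return "$missing"
}

# Example:
#   check_env ~/whisplay-ai-chatbot/.env && echo ".env looks complete"
```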

7.2 Build the project #

bash build.sh

This compiles the TypeScript and prepares assets. ~5–10 minutes on a Pi Zero 2 W.

✅ Checkpoint: build.sh exits cleanly with no errors.

7.3 Install the wake-word engine (openWakeWord) #

The chatbot's wake-word frontend is openWakeWord. Raspberry Pi OS images based on Debian 13 ship Python 3.13 as the system Python (check yours with python3 --version), but openWakeWord needs 3.11 for its prebuilt wheels, so we install Python 3.11 in a pyenv sandbox and leave the system Python untouched.

# 1. Build prerequisites + pyenv installer
sudo apt install -y build-essential make gcc \
  libssl-dev zlib1g-dev libbz2-dev libreadline-dev \
  libsqlite3-dev curl llvm libncursesw5-dev xz-utils \
  tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev git
curl https://pyenv.run | bash

# 2. Wire pyenv into your shell (one-time)
cat >> ~/.bashrc <<'EOF'

# pyenv (for openWakeWord)
export PATH="$HOME/.pyenv/bin:$PATH"
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
EOF
source ~/.bashrc

# 3. Build Python 3.11 (~10–15 min on a Pi Zero — go get a coffee)
pyenv install 3.11.0
pyenv virtualenv 3.11.0 python311
pyenv activate python311

# 4. Install openWakeWord + download the model files
pip install openwakeword "numpy<2"
python -c "import openwakeword.utils as u; u.download_models()"
pyenv deactivate

The .env line WAKE_WORD_PYTHON_PATH=/home/pi/.pyenv/versions/python311/bin/python (set in 7.1 above) is what the chatbot uses to spawn the wake-word listener in this sandbox.

✅ Checkpoint: the model files exist on disk.

ls ~/.pyenv/versions/python311/lib/python3.11/site-packages/openwakeword/resources/models/ | head

You should see filenames like hey_jarvis_v0.1.tflite, alexa_v0.1.tflite, etc.
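
The `WAKE_WORDS=hey_jarvis` setting is only useful if a matching model file is on disk. A small sketch that checks for one; `find_wake_model` is a hypothetical helper, and it assumes model files are named with the wake word followed by a version suffix, as in the listing above.

```bash
# find_wake_model: print the first model file in DIR whose name starts with
# WORD (for example "hey_jarvis"). Returns 1 if none is found.
find_wake_model() {
  dir=$1
  word=$2
  for f in "$dir"/"$word"*.tflite "$dir"/"$word"*.onnx; do
    # An unmatched glob stays literal, so test that the path really exists.
    if [ -f "$f" ]; then
      echo "$f"
      return 0
    fi
  done
  return 1
}

# Example, using the model directory from the checkpoint above:
#   MODELS=~/.pyenv/versions/python311/lib/python3.11/site-packages/openwakeword/resources/models
#   find_wake_model "$MODELS" hey_jarvis || echo "no hey_jarvis model found"
```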

Part 8: First-run sanity check #

Before launching the full chatbot, run a 90-second healthcheck that verifies every layer end-to-end: speaker, mic, network, Claude API.

Create the script:

nano ~/healthcheck.sh

Paste:

#!/bin/bash
# claudia healthcheck — quick end-to-end smoke test
# Usage: bash ~/healthcheck.sh

set -u
ENV_FILE="$HOME/whisplay-ai-chatbot/.env"
PASS="\033[0;32m✓\033[0m"
FAIL="\033[0;31m✗\033[0m"
exit_code=0

step() { printf "\n%s\n" "── $1 ──"; }
ok()   { printf "  $PASS %s\n" "$1"; }
bad()  { printf "  $FAIL %s\n" "$1"; exit_code=1; }

step "1. Audio devices"
aplay -l | grep -q wm8960 && ok "wm8960 playback card detected" || bad "wm8960 NOT detected (driver issue?)"
arecord -l | grep -q card && ok "at least one capture card detected" || bad "no capture card detected"

step "2. Speaker test (1s beep)"
speaker-test -t sine -f 440 -l 1 -s 1 >/dev/null 2>&1 \
  && ok "speaker-test completed (did you hear a beep?)" \
  || bad "speaker-test failed"

step "3. Mic test (3s record-and-replay)"
echo "  (speak for 3 seconds now…)"
arecord -d 3 -f cd /tmp/hc_mic.wav >/dev/null 2>&1
[ -s /tmp/hc_mic.wav ] && ok "captured audio file written" || bad "no audio captured"
aplay /tmp/hc_mic.wav >/dev/null 2>&1 && ok "playback OK (did you hear yourself?)" || bad "playback failed"

step "4. Network reachability"
ping -c 1 -W 3 api.anthropic.com >/dev/null 2>&1 \
  && ok "api.anthropic.com is reachable" \
  || bad "cannot reach api.anthropic.com (Wi-Fi or DNS issue)"

step "5. Claude API call"
if [ ! -f "$ENV_FILE" ]; then
  bad "$ENV_FILE not found — finish Part 7 first"
else
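  # Read only the keys we need. Don't source .env: values with spaces
  # (e.g. SYSTEM_PROMPT) are not valid shell assignments.
  read_key() { grep -E "^$1=" "$ENV_FILE" | tail -n1 | cut -d= -f2- | tr -d "\"'\r"; }
  ANTHROPIC_API_KEY=$(read_key ANTHROPIC_API_KEY)
  ANTHROPIC_MODEL=$(read_key ANTHROPIC_MODEL)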
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    bad "ANTHROPIC_API_KEY is empty in .env"
  else
    response=$(curl -s -w "\n%{http_code}" https://api.anthropic.com/v1/messages \
      -H "x-api-key: $ANTHROPIC_API_KEY" \
      -H "anthropic-version: 2023-06-01" \
      -H "content-type: application/json" \
      -d "{\"model\":\"${ANTHROPIC_MODEL:-claude-haiku-4-5-20251001}\",\"max_tokens\":50,\"messages\":[{\"role\":\"user\",\"content\":\"Say hello in exactly 5 words.\"}]}")
    http_code=$(echo "$response" | tail -n1)
    body=$(echo "$response" | sed '$d')
    if [ "$http_code" = "200" ]; then
      ok "Claude API responded HTTP 200"
      echo "  Reply: $(echo "$body" | grep -o '"text":"[^"]*"' | head -1 | sed 's/"text":"//;s/"$//')"
    else
      bad "Claude API returned HTTP $http_code"
      echo "  $body" | head -3
    fi
  fi
fi

echo
if [ $exit_code -eq 0 ]; then
  printf "$PASS All checks passed. You're ready for Part 9.\n"
else
  printf "$FAIL One or more checks failed. Fix above before running the chatbot.\n"
fi
exit $exit_code

Run it:

chmod +x ~/healthcheck.sh
bash ~/healthcheck.sh

✅ Checkpoint: All five sections print green check marks. If anything fails, fix that piece before moving on — running the full chatbot before this passes just makes debugging harder.

Part 9: Run the chatbot #

Manual launch (foreground, for testing) #

cd ~/whisplay-ai-chatbot
bash run_chatbot.sh

The LCD lights up with status. Say "Hey Jarvis" — Claudia plays a chime to indicate she's listening, you ask your question, Claude answers out loud. The session ends automatically after 60 seconds of silence (WAKE_WORD_IDLE_TIMEOUT_SEC) or when you say a stop word — by default byebye, goodbye, or stop.

You can also press the on-board button to wake her without the keyword — useful when there's background noise the wake-word detector is missing.

Stop the foreground process with Ctrl+C.

Set it to start on boot #

The repo provides an opinionated startup installer that registers a chatbot.service systemd unit and sets the system to multi-user (headless) mode. Use it:

cd ~/whisplay-ai-chatbot
bash startup.sh

After this, the chatbot starts automatically on every boot. Verify:

sudo systemctl status chatbot.service

You should see Active: active (running).

Live logs #

tail -f ~/whisplay-ai-chatbot/chatbot.log
# or
journalctl -u chatbot.service -f

Tuning wake-word reliability #

Too many false wakes (TV, conversations) → raise WAKE_WORD_THRESHOLD in .env from 0.5 toward 0.7.
Missing real wakes (you have to say it twice) → drop the threshold toward 0.3, or move closer / get a better mic.
Chime too aggressive → tweak WAKE_WORD_COOLDOWN_SEC.
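
Each of these adjustments is a one-line change in `.env` followed by a service restart. A sketch of a helper that makes the change repeatable while you experiment; `set_env_key` is a hypothetical name, and it relies on GNU `sed -i`, which Raspberry Pi OS provides.

```bash
# set_env_key: set KEY=VALUE in a KEY=value file. Replaces the existing line
# if there is one, otherwise appends a new line.
# Values containing "|" or "&" would need escaping for sed; numeric
# thresholds and true/false flags are fine.
set_env_key() {
  file=$1
  key=$2
  value=$3
  if grep -q "^${key}=" "$file"; then
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example: make the detector stricter, then restart so the change is picked up.
#   set_env_key ~/whisplay-ai-chatbot/.env WAKE_WORD_THRESHOLD 0.6
#   sudo systemctl restart chatbot.service
```

Change the threshold in small steps (0.05 to 0.1 at a time) and test for a while between changes, so you can tell which value actually helped.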

Falling back to button-only #

If wake word is too unreliable for your room or you'd rather not have a CPU listener running 24/7:

cd ~/whisplay-ai-chatbot
sed -i 's/^WAKE_WORD_ENABLED=.*/WAKE_WORD_ENABLED=false/' .env
sudo systemctl restart chatbot.service

The button activation continues to work either way.

Part 10: Optional case (3D-printed) #

PiSugar publishes free STL files for case shells:

No printer? Upload the STL to a print service like JLC3DP or Craftcloud — a few dollars shipped.

Part 11: Troubleshooting #

Nothing plays through the speaker #

aplay -l should list wm8960. If not, re-run the driver install in Part 4.4.
Run alsamixer → F6 → wm8960 card → confirm Speaker is unmuted (no MM label) and above 0%.

Mic captures silence or garbage #

Run arecord -l, confirm the card you expect is listed.
Run alsamixer → F6 → mic card → F4 (Capture) → raise to ~80%.
For a USB mic: re-check Part 4.5; the OTG cable being charge-only is a common gotcha.

USB mic disappears after reboot #

Linux can renumber cards. The /etc/modprobe.d/alsa-base.conf line in Part 4.5 pins it. If you skipped that, do it now.

Build fails out of memory #

The Pi Zero 2 W only has 512 MB. Add swap if build.sh gets OOM-killed:

sudo dphys-swapfile swapoff
sudo sed -i 's/^CONF_SWAPSIZE=.*/CONF_SWAPSIZE=1024/' /etc/dphys-swapfile
sudo dphys-swapfile setup
sudo dphys-swapfile swapon

Service won't start #

sudo systemctl status chatbot.service --no-pager
journalctl -u chatbot.service -n 60 --no-pager

Look for the first ERROR line — usually a missing .env key or a wrong path.

Claude API returns 401 #

API key is invalid or expired. Re-copy from console.anthropic.com → API Keys.

Claude API returns 429 #

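You're rate-limited. Wait a minute and retry. If it keeps happening, your account's usage tier may be too low for your request rate; adding credit at console.anthropic.com → Billing can raise it.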
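
A 429 is often transient, so a client that waits and retries with growing delays frequently gets through. A generic sketch of that pattern for your own test scripts; `retry_with_backoff` and the `BACKOFF_BASE_SEC` variable are made up for this example and are not settings the chatbot reads.

```bash
# retry_with_backoff: run a command up to MAX times, doubling the wait
# between attempts. Usage: retry_with_backoff MAX command [args...]
# BACKOFF_BASE_SEC (default 2) is the first delay in whole seconds.
retry_with_backoff() {
  max=$1
  shift
  delay=${BACKOFF_BASE_SEC:-2}
  attempt=1
  while :; do
    "$@" && return 0                 # success: stop retrying
    if [ "$attempt" -ge "$max" ]; then
      return 1                       # out of attempts
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example: wrap any command that exits non-zero on failure.
#   retry_with_backoff 4 your_api_call_command
```

With the defaults and four attempts, the waits are 2, 4, and 8 seconds.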

Wake word never triggers #

Confirm WAKE_WORD_ENABLED=true in .env and the service was restarted after the change.
Confirm the Python sandbox exists and the path matches WAKE_WORD_PYTHON_PATH:
```
ls /home/pi/.pyenv/versions/python311/bin/python
```

Check the model file is downloaded:

ls ~/.pyenv/versions/python311/lib/python3.11/site-packages/openwakeword/resources/models/ | grep -i jarvis

Lower WAKE_WORD_THRESHOLD toward 0.3 in .env. On a noisy mic the default 0.5 can be too strict.
Watch the live log while saying the wake word — if you see no detection events at all, the listener isn't running. Look for Python tracebacks at service start: journalctl -u chatbot.service -n 100.

Wake word triggers on TV / unrelated speech #

Raise WAKE_WORD_THRESHOLD toward 0.7.
The hey_jarvis model is liberal by design. A custom-trained "Claudia" model (Appendix A) usually has fewer false positives because you only train against your own voice samples.

Responses feel slow #

Use claude-haiku-4-5-20251001 (Part 6 — it's the recommended default for this reason).
The Pi Zero 2 W's Wi-Fi antenna is weak. Move it closer to the router.
Local Whisper STT is the slowest step on a Pi Zero. If you have a cloud STT key (OpenAI, Google), switching to one of those in .env cuts perceived latency dramatically.

Need to re-run the healthcheck #

bash ~/healthcheck.sh

SD card filling up #

df -h
sudo apt clean
# clear chatbot recordings:
rm -f ~/whisplay-ai-chatbot/data/recordings/*.wav 2>/dev/null
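
To decide when the cleanup is worth running, you can read the usage percentage out of `df` instead of eyeballing it. A minimal sketch; `disk_used_pct` and `disk_is_full` are hypothetical helpers, and the 90% default threshold is an arbitrary choice.

```bash
# disk_used_pct: print how full the filesystem holding PATH is, as an integer
# percentage. Uses POSIX `df -P` so each filesystem is reported on one line.
disk_used_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_is_full: succeed if usage is at or above a threshold (default 90).
disk_is_full() {
  [ "$(disk_used_pct "${1:-/}")" -ge "${2:-90}" ]
}

# Example:
#   disk_is_full / 90 && sudo apt clean
```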

Reference #

Whisplay HAT driver: https://github.com/PiSugar/Whisplay
Chatbot repo: https://github.com/PiSugar/whisplay-ai-chatbot
Chatbot wiki (wake word, image gen, battery display): https://github.com/PiSugar/whisplay-ai-chatbot/wiki
Pre-built SD card images: https://github.com/PiSugar/whisplay-ai-chatbot/wiki (skips most of Parts 4–7)
Claude API docs: https://docs.claude.com
Claude model catalog: https://docs.claude.com/en/docs/about-claude/models/overview
Pricing: https://anthropic.com/pricing

Summary stack #

Layer	What it is
Hardware	Pi Zero 2 WH + PiSugar Whisplay HAT (+ optional PiSugar 3 battery, USB mic)
OS	Raspberry Pi OS 64-bit
Audio driver	WM8960 (Whisplay)
Activation	Wake word (`hey_jarvis` by default; trained `claudia` via Appendix A) + button fallback
Wake-word engine	openWakeWord in a Python 3.11 pyenv
Speech → text	Local Whisper, or cloud STT if configured
LLM	Claude API (Anthropic)
Text → speech	Piper (local) or cloud TTS if configured
Service manager	systemd (`chatbot.service`, set up by `startup.sh`)

Only Claude runs in the cloud. Everything else can run on-device if you want it to.

Appendix A: Train your own "Claudia" wake word #

openWakeWord ships with pre-trained models for hey_jarvis, hey_mycroft, alexa, and hey_rhasspy. There is no pre-trained "Claudia" model — the only way to get the device to actually answer to its name is to train one yourself. This is a one-time chore: ~30 minutes of recording, ~15–30 minutes of training on a free Colab GPU, then a one-line .env swap on the Pi.

What you need #

A laptop or desktop with a working microphone (recording the positive samples).
A free Google Colab account (the training runs on a free T4 GPU — no local GPU required).
~45 minutes total.

A.1 Collect ~100 positive samples ("Claudia") #

The model learns from your voice. Record yourself saying "Claudia" naturally — same room and mic conditions you'll use the device in, ideally the same person who'll use it most.

Use any recorder that emits 16-bit, 16 kHz mono WAV. Audacity works fine: Tracks → Add New → Mono Track, set the project rate to 16000 Hz, record 1 second of "Claudia", export as WAV, repeat.

Aim for ~100 clips, each ~1 second long, with natural variation:

Different distances from the mic (close, normal, across-the-room)
Different intonations (statement, question, sleepy, energetic)
Some with light background noise (TV at low volume, fan)
A couple of "almost-claudia" pronunciations (drawing out the a, swallowing the i) to make the model robust

Save them all into a folder like claudia_samples/.
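
Before uploading, it is worth confirming that every clip really is 16-bit, 16 kHz mono, since a recorder left at 44.1 kHz is an easy mistake. A sketch that reads the format fields straight from each WAV header; `wav_format` and `check_samples` are hypothetical helpers, and they assume the canonical 44-byte PCM header and a little-endian machine (a typical PC or the Pi).

```bash
# wav_format: print "<channels> <sample rate> <bits per sample>" for a WAV file.
# Assumes the "fmt " chunk immediately follows "RIFF....WAVE"; files with
# extra chunks before it will give wrong numbers.
wav_format() {
  channels=$(od -An -tu2 -j22 -N2 "$1" | tr -d ' \n')   # bytes 22-23
  rate=$(od -An -tu4 -j24 -N4 "$1" | tr -d ' \n')       # bytes 24-27
  bits=$(od -An -tu2 -j34 -N2 "$1" | tr -d ' \n')       # bytes 34-35
  echo "$channels $rate $bits"
}

# check_samples: list every .wav in DIR that is not mono / 16000 Hz / 16-bit.
# Returns non-zero if any file is off-spec.
check_samples() {
  bad=0
  for f in "$1"/*.wav; do
    [ -f "$f" ] || continue
    fmt=$(wav_format "$f")
    if [ "$fmt" != "1 16000 16" ]; then
      echo "off-spec: $f ($fmt)"
      bad=1
    fi
  done
  return "$bad"
}

# Example:
#   check_samples claudia_samples && echo "all clips are 16-bit 16 kHz mono"
```

Off-spec clips can be re-exported from Audacity with the project rate set to 16000 Hz as described above.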

A.2 Run the openWakeWord training notebook #

Open the official training Colab linked from the openWakeWord repo: https://github.com/dscripka/openWakeWord → "Training new models" section. The notebook is named openwakeword_training_tutorial.ipynb.

In the notebook:

Runtime → Change runtime type → GPU.
Upload your claudia_samples/ folder into the Colab session (Files panel → upload).
Set the wake-word name to claudia and point the positive-samples path at your uploaded folder.
Keep the synthetic-data generator enabled — it'll fabricate thousands of additional "claudia" pronunciations using a TTS model, which is what actually makes a hundred real samples enough.
Run all cells. Training takes 15–30 minutes on the free T4.
When it's done, the notebook writes claudia.tflite (or claudia.onnx, depending on the notebook revision). Download it.

A.3 Drop the model onto the Pi #

From your PC, create the target directory on the Pi, then copy the model into it:

ssh pi@claudia.local mkdir -p /home/pi/wakeword
scp claudia.tflite pi@claudia.local:/home/pi/wakeword/

A.4 Switch `.env` to the new model #

SSH in and edit ~/whisplay-ai-chatbot/.env:

WAKE_WORDS=claudia
WAKE_WORD_MODEL_PATHS=/home/pi/wakeword/claudia.tflite

Or do it from your Windows PC via the Console:

.\Claudia.Console.bat set-wakeword claudia
# (then SSH in once to set WAKE_WORD_MODEL_PATHS — the console doesn't
#  write that key since it depends on where you put the .tflite)

Restart the service:

sudo systemctl restart chatbot.service

A.5 Sanity check #

journalctl -u chatbot.service -f

Say "Claudia" — you should see a detection event in the log and hear the chime. If false positives are too frequent in the first day of use, retrain with more "hard negative" samples (your voice saying close-but-not-claudia words) and bump WAKE_WORD_THRESHOLD toward 0.6–0.7.

CPU note: Custom single-keyword models are often less CPU-hungry than the multi-keyword default, so wake-word listening typically gets cheaper after this switch — useful on a Pi Zero 2 W.

Built for MindAttic LLC — 2026.05.19

Parts gallery

smarthome — Optional locally-controllable smart plugs. Lets Claudia run commands like 'Claudia turn on the living room' without any cloud round-trip beyond Claude itself.

TP-Link Kasa HS103 / KP125M smart plug (local control via python-kasa)

Works locally over the LAN with the python-kasa library — no cloud round-trip. Tell Claudia 'turn on the living room' and a one-line shell call flips it.

Shelly Plug US (local HTTP / MQTT, no cloud required)

REST endpoint at http://<ip>/relay/0?turn=on — Claudia can flip it with a single curl. No vendor account needed.

Sonoff S31 (re-flashable with Tasmota for local MQTT control)

Out-of-the-box uses the eWeLink cloud; flash with Tasmota to get fully local MQTT/HTTP control.
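
The Shelly endpoint quoted above is simple enough to wrap in a couple of shell functions. A sketch only: `shelly_relay_url` and `shelly_switch` are hypothetical helpers, the IP in the example is a placeholder, and the code assumes the plug is reachable on your LAN with no HTTP authentication configured.

```bash
# shelly_relay_url: build the relay URL for a plug at IP.
# STATE must be "on" or "off"; anything else is rejected.
shelly_relay_url() {
  case "$2" in
    on|off) echo "http://$1/relay/0?turn=$2" ;;
    *)      echo "state must be on or off" >&2; return 1 ;;
  esac
}

# shelly_switch: send the request, giving up after 5 seconds.
shelly_switch() {
  url=$(shelly_relay_url "$1" "$2") || return 1
  curl -s --max-time 5 "$url"
}

# Example (192.168.1.50 is a placeholder address):
#   shelly_switch 192.168.1.50 on
```

Validating the state before building the URL keeps a misheard command from sending an arbitrary string to the plug.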

wakeword — Optional dedicated low-power wake-word frontends. They run the keyword detector themselves so the Pi can sleep until you say the trigger. Not required if you're happy with the on-board button or the wiki's software wake-word approach.

Hiwonder WonderEcho offline AI voice module (I2C, wake-word + canned TTS)

Different architecture: it runs its own on-device ASR/TTS over I2C — useful as a wake-word frontend so the Pi can sleep, or as a fully offline fallback if Wi-Fi is down. NOT a drop-in replacement for the USB mic + Claude pipeline.

Generated by Claudia build-html.js from Claudia.md