Claudia — Build Guide (2026.05.19)

A single-path, checkpoint-driven build guide for a Raspberry Pi Zero 2 W + PiSugar Whisplay HAT voice assistant powered by the Claude API. Say "Hey Jarvis" (or train your own "Claudia" wake word — Appendix A), ask anything, and Claude talks back.

github.com/mindattic/Claudia


What's new in 2026.05.19 (vs. 2026.05.18) #

What's new in 2026.05.18 (vs. previous) #


Pick your build before you buy #

Two questions decide everything.

Q1 — Where will this device sit?

Q2 — How far away will you be when you talk to it?

Because wake-word activation is on by default, the mic has to reliably hear "Hey Jarvis" without you leaning into it. The on-board Whisplay mics technically work but you'll be repeating yourself.

That's it. Now build the cart.


Part 1: Shopping list #

Core build (required) #

Item | Why | ~Price | Notes
Raspberry Pi Zero 2 WH | The brain. The "H" matters: it has the GPIO header pre-soldered, which the Whisplay HAT needs. Don't buy the bare "Pi Zero 2 W" or you'll need to solder 40 pins yourself. | ~$22 | Search Amazon or Adafruit for "Raspberry Pi Zero 2 WH"
PiSugar Whisplay HAT | LCD, speaker, dual mic, RGB LED, buttons: everything except the CPU. | ~$36 | Direct from PiSugar's site or Amazon
microSD card, 32 GB Class 10 (SanDisk Ultra or equivalent) | The "hard drive." 16 GB works but 32 GB gives you headroom. | ~$9 | Any reputable brand
Official Raspberry Pi 12.5W micro-USB power supply (5V/2.5A) | The Pi Zero 2 W is rated for 5V/2.5A. Phone chargers will seem to work and then cause random crashes. Buy the official one or a known-good 2.5A+ supply. | ~$10 | Pi Hut, Adafruit, official resellers

Pick-your-mic (optional) #

Item | When to buy | ~Price
SunFounder USB Mini Mic | If you need 4–10 ft pickup. | ~$13
Seeed reSpeaker XVF3800 USB Mic Array | If you want across-the-room pickup with built-in echo cancellation, beamforming, and noise suppression. | ~$60
micro-USB OTG adapter | Required if you bought either USB mic above, because the Pi Zero only has a micro-USB data port. | ~$7

Portable (optional) #

Item | When to buy | ~Price
PiSugar 3 1200 mAh battery | If you want to unplug and carry it around. Snaps onto the Pi via magnetic pogo pins; no soldering. | ~$40

Totals #

Build | What you get | Total
Desktop | Core + Whisplay's mics + wall power | ~$77
Desktop + better mic (budget) | + SunFounder + OTG | ~$97
Desktop + better mic (premium) | + reSpeaker XVF3800 + OTG | ~$144
Portable + premium mic | All of the above + PiSugar 3 battery | ~$184

Prices fluctuate. Verify before checkout. Where I've named vendors, search by product name rather than trusting a link.
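The totals above are plain sums of the approximate prices in the tables. If prices have moved, a minimal shell sketch like the one below re-adds them. The helper name cart_total is hypothetical, and the numbers are this guide's estimates rather than live quotes.

```shell
# cart_total: sum whole-dollar prices passed as arguments.
# Hypothetical helper for illustration; prices are this guide's estimates.
cart_total() {
  total=0
  for price in "$@"; do
    total=$((total + price))
  done
  echo "$total"
}

# Core build: Pi Zero 2 WH + Whisplay HAT + microSD + power supply
core=$(cart_total 22 36 9 10)
# Portable + premium mic: core + reSpeaker + OTG adapter + PiSugar 3
portable=$(cart_total "$core" 60 7 40)
echo "core=\$$core portable=\$$portable"   # prints: core=$77 portable=$184
```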

What you do not need #


Part 2: Assemble the hardware #

Total time: ~5 minutes. No soldering.

  1. Do not insert the microSD yet. Flash it first in Part 3.
  2. Align the Whisplay HAT's 40-pin socket with the Pi Zero 2 WH's GPIO header pins. The Whisplay's buttons should be on the same side as the Pi's USB ports.
  3. Press the HAT down firmly and evenly until fully seated. Hold the PCB edges and do not press on the glass LCD; pressure on the glass can crack it.
  4. Peel off the LCD protective film.
  5. (Optional, portable build) Snap the PiSugar 3 battery onto the underside of the Pi using its magnetic pogo pins.

Final stack (top → bottom): Whisplay HAT → Pi Zero 2 WH → PiSugar 3 (optional)

Checkpoint: The stack feels solid, the LCD is exposed and undamaged, nothing wobbles.


Part 3: Flash the microSD card #

3.1 Install Raspberry Pi Imager #

Download from raspberrypi.com/software (Windows, macOS, Linux).

3.2 Flash #

  1. Open Raspberry Pi Imager.
  2. Choose Device → Raspberry Pi Zero 2 W.
  3. Choose OS → Raspberry Pi OS (other) → Raspberry Pi OS (64-bit) (the full version, not Lite).
    • The chatbot repo's install script expects packages from the full image. Lite will work but you'll need extra apt installs and may hit surprises.
  4. Choose Storage → your microSD card.
  5. Click the gear icon (⚙) for Edit Settings and configure:
    • Hostname: claudia
    • Username: pi
    • Password: something secure
    • Enable SSH: ✅ password auth
    • Wireless LAN: SSID + password for your home Wi-Fi
    • Locale: your time zone (e.g., America/Chicago) and keyboard layout (e.g., us)
  6. Save, then Write. Takes 2–5 minutes.

3.3 First boot #

  1. Insert the microSD into the Pi.

  2. Plug the official power supply into the PWR IN micro-USB port (the one nearest the corner, labeled PWR IN on the silkscreen). Not the middle port labeled USB.

  3. Wait 60–90 seconds.

  4. From your PC:

    ssh pi@claudia.local
    

    If claudia.local doesn't resolve, find the Pi's IP in your router's admin page and use ssh pi@192.168.x.x.

Checkpoint: You see the pi@claudia:~ $ prompt. Run cat /etc/os-release and confirm it says Debian/Raspberry Pi OS. Run free -h — you should see ~430 MB of Mem: (the Pi Zero 2 W has 512 MB total).
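The memory part of this checkpoint can also be checked by script. The sketch below parses the output of free -m (megabytes instead of the human-readable -h form). The helper names and the 350–512 MB range are assumptions chosen for illustration, not official figures.

```shell
# mem_total_mb: read `free -m` output on stdin, print the Mem: total in MB.
mem_total_mb() {
  awk '/^Mem:/ { print $2 }'
}

# ram_looks_right: succeed if the total is plausible for a 512 MB board.
# The 350-512 bounds are an assumed range, not an official figure.
ram_looks_right() {
  [ "$1" -ge 350 ] && [ "$1" -le 512 ]
}

# Usage on the Pi:
#   ram_looks_right "$(free -m | mem_total_mb)" && echo "RAM OK"
```

A total far above 512 MB usually means you are still in a shell on your PC rather than on the Pi.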


Part 4: System setup #

Run these from the SSH session. One at a time. Wait for each to finish.

4.1 Update #

sudo apt update && sudo apt full-upgrade -y

This takes 5–15 minutes on a Pi Zero. Be patient.

4.2 Free up RAM (Pi Zero only has 512 MB) #

The Pi Zero 2 W is RAM-constrained. Disable services you don't need:

# Disable Bluetooth (not used by this build)
sudo systemctl disable hciuart bluetooth

# Disable triggerhappy (global hotkey daemon, not needed)
sudo systemctl disable triggerhappy

4.3 Install build dependencies #

sudo apt install -y git curl build-essential python3-pip python3-venv \
  portaudio19-dev libsndfile1 ffmpeg alsa-utils libatlas-base-dev

4.4 Install the Whisplay HAT driver (LCD + audio + buttons + LEDs) #

This is the official PiSugar driver. The install script also enables the I2C, SPI, and I2S buses automatically.

cd ~
git clone https://github.com/PiSugar/Whisplay.git --depth 1
cd Whisplay/Driver
sudo bash install_wm8960_drive.sh
sudo reboot

Wait ~60 seconds, then SSH back in:

ssh pi@claudia.local

Checkpoint 1 — driver loaded:

aplay -l

You should see a card whose name contains wm8960. If not, the driver didn't load — re-run the install script and re-check.

Checkpoint 2 — speaker works:

speaker-test -t sine -f 440 -l 1 -D plughw:CARD=wm8960soundcard

You should hear a one-second 440 Hz beep from the Whisplay's speaker. If you hear nothing, run alsamixer, press F6, pick the wm8960 card, and turn up the playback levels.

Checkpoint 3 — on-board mic works:

arecord -d 5 -f cd /tmp/mic_test.wav && aplay /tmp/mic_test.wav

Record 5 seconds, then play it back. You should hear your voice. If it's silent or garbled, raise the "Capture" levels in alsamixer (F4 shows the Capture controls, F3 returns to Playback).

If all three checkpoints pass and you're using the on-board mic, skip to Part 5. Otherwise, do Part 4.5 first.


Part 4.5: Configure a USB microphone (skip if using on-board mics) #

Plug it in #

  1. Power down: sudo shutdown -h now and unplug power.
  2. Connect the OTG adapter to the Pi's data micro-USB port — that's the middle one labeled USB, not PWR IN.
  3. Plug the USB mic into the OTG adapter.
  4. Power back on, SSH in.

Verify it's detected #

arecord -l

You should now see two capture cards:

card 0: wm8960soundcard ...
card 1: ArrayUAC10 / Mini Mic ...  (your USB mic — name varies)

If card 1 is missing: try a different OTG cable (some are charge-only), or make sure you're in the data port not the power one. Run lsusb and confirm the device shows up.
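Because the card number can vary, it may help to look it up by name instead of assuming card 1. The sketch below parses arecord -l output. The helper name card_index is hypothetical, and "Mini Mic" is a placeholder for whatever name your mic reports.

```shell
# card_index: read `arecord -l` output on stdin and print the card number
# of the first capture card whose line matches the pattern (case-insensitive).
card_index() {
  grep -i '^card [0-9]*:' \
    | grep -i -- "$1" \
    | head -n 1 \
    | sed 's/^card \([0-9]*\):.*/\1/'
}

# Usage on the Pi ("Mini Mic" is a placeholder name):
#   idx=$(arecord -l | card_index "Mini Mic")
#   arecord -D "plughw:${idx},0" -d 5 -f cd /tmp/usb_test.wav
```

If the function prints nothing, no capture card matched the pattern; run arecord -l by hand and adjust the name.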

Test recording from the USB mic #

arecord -D plughw:1,0 -d 5 -f cd /tmp/usb_test.wav
aplay /tmp/usb_test.wav

If it's quiet, run alsamixer, press F6 to switch to the USB card, F4 for Capture controls, and raise it to ~80%.

Make the USB mic the system default capture device #

This way the chatbot picks it up automatically — speaker stays on the Whisplay.

nano ~/.asoundrc

Paste:

# Playback on Whisplay speaker (card 0), capture on USB mic (card 1)
pcm.!default {
    type asym
    playback.pcm {
        type plug
        slave.pcm "hw:0,0"
    }
    capture.pcm {
        type plug
        slave.pcm "hw:1,0"
    }
}

ctl.!default {
    type hw
    card 1
}

Save with Ctrl+X, Y, Enter.

Pin the card numbering across reboots #

Linux's USB card numbers can swap on reboot. Lock the USB mic to card 1:

echo "options snd_usb_audio index=1" | sudo tee /etc/modprobe.d/alsa-base.conf

Reboot:

sudo reboot

Verify the default config works end-to-end #

ssh pi@claudia.local
arecord -d 5 -f cd /tmp/default_test.wav && aplay /tmp/default_test.wav

This should record from the USB mic and play through the Whisplay speaker — without specifying any device flags.

Checkpoint: Recording and playback both use the right devices via the system default.

reSpeaker XVF3800 note: The on-board AEC, beamforming, and noise suppression run automatically. No extra software needed. For raw multi-channel access or firmware tweaks, see the Seeed reSpeaker XVF3800 wiki.


Part 5: Install the chatbot software #

This is the PiSugar whisplay-ai-chatbot repo. It's purpose-built for the Pi Zero 2 W + Whisplay combo. Say "Hey Jarvis" → ask your question → Claude answers out loud. The on-board button still works as a fallback.

cd ~
git clone https://github.com/PiSugar/whisplay-ai-chatbot.git
cd whisplay-ai-chatbot
bash install_dependencies.sh
source ~/.bashrc

The dependency install pulls Node.js, Python packages, and audio libraries. This takes 15–25 minutes on a Pi Zero 2 W. Let it finish.

The source ~/.bashrc line is important — the installer sets PATH entries you need in your current shell session.

Checkpoint: install_dependencies.sh finishes without errors. Test that Node is on PATH:

node --version

You should see v20.x or similar.
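To turn the version check into a pass/fail test, the major version can be extracted with shell parameter expansion. This is a minimal sketch; the helper name node_major is hypothetical, and the threshold of 20 simply mirrors the "v20.x" expectation above rather than a requirement stated by the repo.

```shell
# node_major: print the major version from a string like "v20.11.1".
node_major() {
  ver=${1#v}          # drop the leading "v"
  echo "${ver%%.*}"   # keep everything before the first dot
}

# Usage on the Pi (20 is an assumed threshold):
#   [ "$(node_major "$(node --version)")" -ge 20 ] && echo "Node OK"
```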


Part 6: Get a Claude API key #

  1. Go to console.anthropic.com and sign in (or create an account).
  2. Add a payment method and put a small amount of credit on the account (e.g., $5 — that lasts a long time on Haiku).
  3. Navigate to API Keys → Create Key.
  4. Name it claudia. Copy the key now — you can't see it again later.
  5. Treat the key like a password.

Approximate cost: Casual personal use on claude-haiku-4-5-20251001 typically runs a few dollars per month at most. Check current pricing at anthropic.com/pricing.

Which model to pick #

Model ID | Speed | Quality | When to use
claude-haiku-4-5-20251001 | Fastest | Good | Default for this device. Latency matters more than essay-grade prose for a voice assistant.
claude-sonnet-4-6 | Medium | Excellent | If you want richer answers and don't mind a slightly slower response.
claude-opus-4-7 | Slowest | Best | Overkill for spoken Q&A. Use for hard reasoning tasks only.

Model IDs change over time. The current list lives at docs.claude.com.


Part 7: Configure the chatbot #

7.1 Create your .env #

cd ~/whisplay-ai-chatbot
cp .env.template .env
nano .env

The template ships with many fields for different ASR/LLM/TTS providers. For a Claude-based build, you need the LLM section set to Anthropic and the wake-word section enabled. Find and set:

# === LLM (the AI brain) ===
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-YOUR-KEY-HERE
ANTHROPIC_MODEL=claude-haiku-4-5-20251001

# === System prompt — shapes the assistant's voice ===
SYSTEM_PROMPT=You are a concise, friendly voice assistant. Answer in plain spoken English — no markdown, no bullet lists, no headings. Keep responses to 1–3 sentences unless the user explicitly asks for more.

# === Wake word ===
# Default is "Hey Jarvis" — the closest pre-trained openWakeWord model. To
# answer to "Claudia" instead, train a custom model (Appendix A), drop the
# .tflite onto the Pi, and set WAKE_WORDS=claudia + WAKE_WORD_MODEL_PATHS=...
# To disable wake word entirely and fall back to the button, set
# WAKE_WORD_ENABLED=false.
WAKE_WORD_ENABLED=true
WAKE_WORDS=hey_jarvis
WAKE_WORD_PYTHON_PATH=/home/pi/.pyenv/versions/python311/bin/python
WAKE_WORD_THRESHOLD=0.5
WAKE_WORD_IDLE_TIMEOUT_SEC=60
WAKE_WORD_END_KEYWORDS=byebye,goodbye,stop

For speech-to-text (ASR) and text-to-speech (TTS), the template defaults usually work. If you want fully local (no extra API keys), pick whisper for ASR and piper for TTS. If you want higher-quality cloud STT/TTS, see the wiki for OpenAI / Google / Volcengine options — each needs its own API key.

The .env.template evolves. If your file looks different from this guide, the live template at github.com/PiSugar/whisplay-ai-chatbot/blob/master/.env.template is the source of truth.

Save: Ctrl+X, Y, Enter.
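After saving, a quick scripted check can catch a missing or empty key before the build step. The sketch below only looks at the keys this guide sets. The helper names env_get and check_env are hypothetical, and the chatbot itself may require other keys.

```shell
# env_get: print the value of KEY from a KEY=value file.
# Commented-out lines are skipped because the match is anchored at line start.
env_get() {
  _key=$1
  _file=$2
  grep "^${_key}=" "$_file" | tail -n 1 | cut -d= -f2-
}

# check_env: report any of this guide's keys that are missing or empty.
# Returns non-zero if at least one is.
check_env() {
  file=$1
  missing=0
  for key in LLM_PROVIDER ANTHROPIC_API_KEY ANTHROPIC_MODEL WAKE_WORD_ENABLED; do
    if [ -z "$(env_get "$key" "$file")" ]; then
      echo "missing or empty: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on the Pi:
#   check_env ~/whisplay-ai-chatbot/.env && echo ".env looks complete"
```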

7.2 Build the project #

bash build.sh

This compiles the TypeScript and prepares assets. ~5–10 minutes on a Pi Zero 2 W.

Checkpoint: build.sh exits cleanly with no errors.

7.3 Install the wake-word engine (openWakeWord) #

The chatbot's wake-word frontend is openWakeWord, which needs Python 3.11 for its prebuilt wheels. Raspberry Pi OS may ship a different system Python (run python3 --version to check), so we install Python 3.11 in a pyenv sandbox to avoid touching the system Python.

# 1. Build prerequisites + pyenv installer
sudo apt install -y build-essential make gcc \
  libssl-dev zlib1g-dev libbz2-dev libreadline-dev \
  libsqlite3-dev curl llvm libncursesw5-dev xz-utils \
  tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev git
curl https://pyenv.run | bash

# 2. Wire pyenv into your shell (one-time)
cat >> ~/.bashrc <<'EOF'

# pyenv (for openWakeWord)
export PATH="$HOME/.pyenv/bin:$PATH"
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
EOF
source ~/.bashrc

# 3. Build Python 3.11 (~10–15 min on a Pi Zero — go get a coffee)
pyenv install 3.11.0
pyenv virtualenv 3.11.0 python311
pyenv activate python311

# 4. Install openWakeWord + download the model files
pip install openwakeword "numpy<2"
python -c "import openwakeword.utils as u; u.download_models()"
pyenv deactivate

The .env line WAKE_WORD_PYTHON_PATH=/home/pi/.pyenv/versions/python311/bin/python (set in 7.1 above) is what the chatbot uses to spawn the wake-word listener in this sandbox.

Checkpoint: the model files exist on disk.

ls ~/.pyenv/versions/python311/lib/python3.11/site-packages/openwakeword/resources/models/ | head

You should see filenames like hey_jarvis_v0.1.tflite, alexa_v0.1.tflite, etc.
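If you would rather script this checkpoint than read the listing by eye, a small helper can test for the model named in WAKE_WORDS. This is a sketch under the assumption that model files are named after the wake word and end in .tflite or .onnx; model_present is a hypothetical helper name.

```shell
# model_present: succeed if DIR holds a model file whose name starts with NAME.
# Assumes the file naming seen in the listing above (e.g. hey_jarvis_v0.1.tflite).
model_present() {
  dir=$1
  name=$2
  for f in "$dir"/"$name"*.tflite "$dir"/"$name"*.onnx; do
    [ -e "$f" ] && return 0
  done
  return 1
}

# Usage on the Pi:
#   models=~/.pyenv/versions/python311/lib/python3.11/site-packages/openwakeword/resources/models
#   model_present "$models" hey_jarvis && echo "hey_jarvis model found"
```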


Part 8: First-run sanity check #

Before launching the full chatbot, run a 90-second healthcheck that verifies every layer end-to-end: speaker, mic, network, Claude API.

Create the script:

nano ~/healthcheck.sh

Paste:

#!/bin/bash
# claudia healthcheck — quick end-to-end smoke test
# Usage: bash ~/healthcheck.sh

set -u
ENV_FILE="$HOME/whisplay-ai-chatbot/.env"
PASS="\033[0;32m✓\033[0m"
FAIL="\033[0;31m✗\033[0m"
exit_code=0

step() { printf "\n%s\n" "── $1 ──"; }
ok()   { printf "  $PASS %s\n" "$1"; }
bad()  { printf "  $FAIL %s\n" "$1"; exit_code=1; }

step "1. Audio devices"
aplay -l | grep -q wm8960 && ok "wm8960 playback card detected" || bad "wm8960 NOT detected (driver issue?)"
arecord -l | grep -q card && ok "at least one capture card detected" || bad "no capture card detected"

step "2. Speaker test (1s beep)"
speaker-test -t sine -f 440 -l 1 -s 1 >/dev/null 2>&1 \
  && ok "speaker-test completed (did you hear a beep?)" \
  || bad "speaker-test failed"

step "3. Mic test (3s record-and-replay)"
echo "  (speak for 3 seconds now…)"
arecord -d 3 -f cd /tmp/hc_mic.wav >/dev/null 2>&1
[ -s /tmp/hc_mic.wav ] && ok "captured audio file written" || bad "no audio captured"
aplay /tmp/hc_mic.wav >/dev/null 2>&1 && ok "playback OK (did you hear yourself?)" || bad "playback failed"

step "4. Network reachability"
ping -c 1 -W 3 api.anthropic.com >/dev/null 2>&1 \
  && ok "api.anthropic.com is reachable" \
  || bad "cannot reach api.anthropic.com (Wi-Fi or DNS issue)"

step "5. Claude API call"
if [ ! -f "$ENV_FILE" ]; then
  bad "$ENV_FILE not found — finish Part 7 first"
else
  # shellcheck disable=SC1090
  set -a; source "$ENV_FILE"; set +a
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    bad "ANTHROPIC_API_KEY is empty in .env"
  else
    response=$(curl -s -w "\n%{http_code}" https://api.anthropic.com/v1/messages \
      -H "x-api-key: $ANTHROPIC_API_KEY" \
      -H "anthropic-version: 2023-06-01" \
      -H "content-type: application/json" \
      -d "{\"model\":\"${ANTHROPIC_MODEL:-claude-haiku-4-5-20251001}\",\"max_tokens\":50,\"messages\":[{\"role\":\"user\",\"content\":\"Say hello in exactly 5 words.\"}]}")
    http_code=$(echo "$response" | tail -n1)
    body=$(echo "$response" | sed '$d')
    if [ "$http_code" = "200" ]; then
      ok "Claude API responded HTTP 200"
      echo "  Reply: $(echo "$body" | grep -o '"text":"[^"]*"' | head -1 | sed 's/"text":"//;s/"$//')"
    else
      bad "Claude API returned HTTP $http_code"
      echo "  $body" | head -3
    fi
  fi
fi

echo
if [ $exit_code -eq 0 ]; then
  printf "$PASS All checks passed. You're ready for Part 9.\n"
else
  printf "$FAIL One or more checks failed. Fix above before running the chatbot.\n"
fi
exit $exit_code

Run it:

chmod +x ~/healthcheck.sh
bash ~/healthcheck.sh

Checkpoint: All five sections print green check marks. If anything fails, fix that piece before moving on — running the full chatbot before this passes just makes debugging harder.


Part 9: Run the chatbot #

Manual launch (foreground, for testing) #

cd ~/whisplay-ai-chatbot
bash run_chatbot.sh

The LCD lights up with status. Say "Hey Jarvis" — Claudia plays a chime to indicate she's listening, you ask your question, Claude answers out loud. The session ends automatically after 60 seconds of silence (WAKE_WORD_IDLE_TIMEOUT_SEC) or when you say a stop word — by default byebye, goodbye, or stop.

You can also press the on-board button to wake her without the keyword — useful when there's background noise the wake-word detector is missing.

Stop the foreground process with Ctrl+C.

Set it to start on boot #

The repo provides an opinionated startup installer that registers a chatbot.service systemd unit and sets the system to multi-user (headless) mode. Use it:

cd ~/whisplay-ai-chatbot
bash startup.sh

After this, the chatbot starts automatically on every boot. Verify:

sudo systemctl status chatbot.service

You should see Active: active (running).

Live logs #

tail -f ~/whisplay-ai-chatbot/chatbot.log
# or
journalctl -u chatbot.service -f

Tuning wake-word reliability #

Falling back to button-only #

If wake word is too unreliable for your room or you'd rather not have a CPU listener running 24/7:

cd ~/whisplay-ai-chatbot
sed -i 's/^WAKE_WORD_ENABLED=.*/WAKE_WORD_ENABLED=false/' .env
sudo systemctl restart chatbot.service

The button activation continues to work either way.
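The sed one-liner above only works when the key already exists in .env. A slightly more general sketch, which appends the key when it is absent, could look like this. The helper name set_env_key is hypothetical, it relies on GNU sed's -i flag (present on Raspberry Pi OS), and values containing "|" or "&" are not handled.

```shell
# set_env_key: set KEY=VALUE in FILE, replacing an existing line or appending one.
# Uses '|' as the sed delimiter so values containing '/' (paths) are safe.
# Not safe for values containing '|' or '&'.
set_env_key() {
  file=$1
  key=$2
  value=$3
  if grep -q "^${key}=" "$file"; then
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Usage on the Pi:
#   set_env_key ~/whisplay-ai-chatbot/.env WAKE_WORD_ENABLED false
#   sudo systemctl restart chatbot.service
```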


Part 10: Optional case (3D-printed) #

PiSugar publishes free STL files for case shells:

No printer? Upload the STL to a print service like JLC3DP or Craftcloud — a few dollars shipped.


Part 11: Troubleshooting #

Nothing plays through the speaker #

Mic captures silence or garbage #

USB mic disappears after reboot #

Build fails out of memory #

Service won't start #

sudo systemctl status chatbot.service --no-pager
journalctl -u chatbot.service -n 60 --no-pager

Look for the first ERROR line — usually a missing .env key or a wrong path.

Claude API returns 401 #

Claude API returns 429 #

Wake word never triggers #

Wake word triggers on TV / unrelated speech #

Responses feel slow #

Need to re-run the healthcheck #

bash ~/healthcheck.sh

SD card filling up #

df -h
sudo apt clean
# clear chatbot recordings:
rm -f ~/whisplay-ai-chatbot/data/recordings/*.wav 2>/dev/null
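To decide whether a cleanup is needed at all, the usage percentage can be pulled out of df. This is a minimal sketch using the portable -P output format; the helper name used_pct and the 90% threshold are arbitrary choices for illustration.

```shell
# used_pct: read `df -P <path>` output on stdin and print the
# Capacity column of the first filesystem line as a bare number.
used_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage on the Pi (90 is an arbitrary warning threshold):
#   [ "$(df -P / | used_pct)" -ge 90 ] && echo "SD card nearly full"
```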

Reference #


Summary stack #

Layer | What it is
Hardware | Pi Zero 2 WH + PiSugar Whisplay HAT (+ optional PiSugar 3 battery, USB mic)
OS | Raspberry Pi OS 64-bit
Audio driver | WM8960 (Whisplay)
Activation | Wake word (hey_jarvis by default; trained claudia via Appendix A) + button fallback
Wake-word engine | openWakeWord in a Python 3.11 pyenv
Speech → text | Local Whisper, or cloud STT if configured
LLM | Claude API (Anthropic)
Text → speech | Piper (local) or cloud TTS if configured
Service manager | systemd (chatbot.service, set up by startup.sh)

Only Claude runs in the cloud. Everything else can run on-device if you want it to.


Appendix A: Train your own "Claudia" wake word #

openWakeWord ships with pre-trained models for hey_jarvis, hey_mycroft, alexa, and hey_rhasspy. There is no pre-trained "Claudia" model — the only way to get the device to actually answer to its name is to train one yourself. This is a one-time chore: ~30 minutes of recording, ~15–30 minutes of training on a free Colab GPU, then a one-line .env swap on the Pi.

What you need #

A.1 Collect ~100 positive samples ("Claudia") #

The model learns from your voice. Record yourself saying "Claudia" naturally — same room and mic conditions you'll use the device in, ideally the same person who'll use it most.

Use any recorder that emits 16-bit, 16 kHz mono WAV. Audacity works fine: Tracks → Add New → Mono Track, set the project rate to 16000 Hz, record 1 second of "Claudia", export as WAV, repeat.

Aim for ~100 clips, each ~1 second long, with natural variation:

Save them all into a folder like claudia_samples/.
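Before uploading, it can be worth confirming that every clip really is 16-bit, 16 kHz mono. The sketch below reads those three fields straight from the WAV header with od. It assumes a canonical PCM header (the "fmt " chunk directly after "RIFF....WAVE") and a little-endian machine, which covers typical PCs and the Pi. The helper names are hypothetical.

```shell
# wav_info: print "channels sample_rate bits_per_sample" for a WAV file.
# Assumes the canonical header layout: channels at byte 22, sample rate
# at byte 24, bits per sample at byte 34, and a little-endian host.
wav_info() {
  f=$1
  ch=$(od -An -tu2 -j22 -N2 "$f" | tr -d ' \n')
  rate=$(od -An -tu4 -j24 -N4 "$f" | tr -d ' \n')
  bits=$(od -An -tu2 -j34 -N2 "$f" | tr -d ' \n')
  printf '%s %s %s\n' "$ch" "$rate" "$bits"
}

# is_training_ready: succeed only for mono, 16000 Hz, 16-bit clips.
is_training_ready() {
  [ "$(wav_info "$1")" = "1 16000 16" ]
}

# Usage, from the folder that holds claudia_samples/:
#   for f in claudia_samples/*.wav; do
#     is_training_ready "$f" || echo "needs conversion: $f ($(wav_info "$f"))"
#   done
# One way to convert a clip, if ffmpeg is installed:
#   ffmpeg -i in.wav -ar 16000 -ac 1 -c:a pcm_s16le out.wav
```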

A.2 Run the openWakeWord training notebook #

Open the official training Colab linked from the openWakeWord repo: https://github.com/dscripka/openWakeWord → "Training new models" section. The notebook is named openwakeword_training_tutorial.ipynb.

In the notebook:

  1. Runtime → Change runtime type → GPU.
  2. Upload your claudia_samples/ folder into the Colab session (Files panel → upload).
  3. Set the wake-word name to claudia and point the positive-samples path at your uploaded folder.
  4. Keep the synthetic-data generator enabled — it'll fabricate thousands of additional "claudia" pronunciations using a TTS model, which is what actually makes a hundred real samples enough.
  5. Run all cells. Training takes 15–30 minutes on the free T4.
  6. When it's done, the notebook writes claudia.tflite (or claudia.onnx, depending on the notebook revision). Download it.

A.3 Drop the model onto the Pi #

From your PC, create the target directory first (harmless if it already exists), then copy the model:

ssh pi@claudia.local mkdir -p /home/pi/wakeword
scp claudia.tflite pi@claudia.local:/home/pi/wakeword/

A.4 Switch .env to the new model #

SSH in and edit ~/whisplay-ai-chatbot/.env:

WAKE_WORDS=claudia
WAKE_WORD_MODEL_PATHS=/home/pi/wakeword/claudia.tflite

Or do it from your Windows PC via the Console:

.\Claudia.Console.bat set-wakeword claudia
# (then SSH in once to set WAKE_WORD_MODEL_PATHS — the console doesn't
#  write that key since it depends on where you put the .tflite)

Restart the service:

sudo systemctl restart chatbot.service

A.5 Sanity check #

journalctl -u chatbot.service -f

Say "Claudia" — you should see a detection event in the log and hear the chime. If false positives are too frequent in the first day of use, retrain with more "hard negative" samples (your voice saying close-but-not-claudia words) and bump WAKE_WORD_THRESHOLD toward 0.6–0.7.

CPU note: Custom single-keyword models are often less CPU-hungry than the multi-keyword default, so wake-word listening typically gets cheaper after this switch — useful on a Pi Zero 2 W.


Built for MindAttic LLC — 2026.05.19