Voice Commands — Lattices Docs

Voice commands let you control Lattices by speaking. Press Hyper+D to open the voice command window, hold Option to speak, and release it to stop recording. Lattices transcribes your speech via Vox, matches it to an intent, and executes that intent.

Quick start

1. Install Vox (provides microphone access and transcription)
2. Connect an Assistant provider in Settings > AI
3. Press Hyper+D to open the voice command window
4. Hold Option and speak a command
5. Release Option; Lattices transcribes the speech and executes the command

Keyboard shortcuts

Key	Action
Hyper+D	Open/close voice command window
⌥ (hold)	Push-to-talk — hold to record, release to stop
Tab	Arm/disarm the mic
Escape	Cancel recording or dismiss window

Built-in commands

Search

Find windows by app name, title, content, or category.

"Find all vox windows"
"Find terminals"           → expands to iTerm, Terminal, Warp, etc.
"Show me all browsers"     → expands to Safari, Chrome, Firefox, Arc, etc.
"Where is my editor?"      → expands to VS Code, Cursor, Xcode, etc.

Category synonyms are built in — saying "terminals", "browsers", "editors", "chat", "music", "mail", or "notes" automatically expands to search for the actual app names.
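
The expansion step can be pictured as a lookup table. The following is a minimal sketch, not the Lattices implementation: `CATEGORY_APPS` and `expandCategory` are hypothetical names, the table lists only the apps named in the examples above, and the singular fallback ("editor" to "editors") is an assumption.

```typescript
// Hypothetical category table. Only the apps named in the docs are listed;
// the real lists are longer ("etc.").
const CATEGORY_APPS = new Map<string, string[]>([
  ["terminals", ["iTerm", "Terminal", "Warp"]],
  ["browsers", ["Safari", "Chrome", "Firefox", "Arc"]],
  ["editors", ["VS Code", "Cursor", "Xcode"]],
]);

// Returns the app names to search for, or the original term when it is
// not a known category.
function expandCategory(term: string): string[] {
  const key = term.trim().toLowerCase();
  // Assumption: singular forms ("editor") map to the plural category.
  return CATEGORY_APPS.get(key) ?? CATEGORY_APPS.get(`${key}s`) ?? [term.trim()];
}
```

With this table, `expandCategory("terminals")` yields the three terminal apps, while a term such as "vox" passes through unchanged.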

Tile

Move windows to screen positions.

"Tile this left"
"Snap to the right half"
"Maximize the window"
"Put this in the top right corner"

Voice tiling should resolve to window.place, the same canonical daemon mutation used by other agent surfaces.
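
As a rough illustration, a tiling phrase could be reduced to a placement name and wrapped in a window.place request. Only the method name `window.place` comes from this page; the `Placement` values, the `params` shape, and both function names are assumptions made for the sketch.

```typescript
// Hypothetical placement names; the daemon's real vocabulary may differ.
type Placement =
  | "left" | "right" | "maximize"
  | "top-left" | "top-right" | "bottom-left" | "bottom-right";

// Hypothetical request shape. Only the method name is taken from the docs.
interface PlaceRequest {
  method: "window.place";
  params: { placement: Placement };
}

// Extracts a placement from a spoken tiling phrase, or null if none is found.
function parsePlacement(transcript: string): Placement | null {
  const t = transcript.toLowerCase();
  if (/\bmaximi[sz]e\b/.test(t)) return "maximize";
  const vertical = /\btop\b/.test(t) ? "top" : /\bbottom\b/.test(t) ? "bottom" : null;
  const horizontal = /\bleft\b/.test(t) ? "left" : /\bright\b/.test(t) ? "right" : null;
  if (vertical && horizontal) return `${vertical}-${horizontal}` as Placement;
  return horizontal; // a bare "top" or "bottom" is not handled in this sketch
}

// Builds the mutation request, or null when the phrase has no placement.
function toPlaceRequest(transcript: string): PlaceRequest | null {
  const placement = parsePlacement(transcript);
  return placement ? { method: "window.place", params: { placement } } : null;
}
```

Routing every surface through one mutation keeps placement behavior identical whether the request came from voice or from another agent.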

Focus

Bring a window or app to the front.

"Focus Safari"
"Switch to Slack"
"Go to the lattices window"

Open / Launch

Open applications or project workspaces.

"Open Spotify"
"Launch the vox project"

Kill

Close windows or quit applications.

"Kill this window"
"Close Safari"
"Quit Spotify"

Scan

Trigger an OCR scan of visible windows.

"Scan the screen"
"Read what's on screen"

Other

"List all windows"
"Show my sessions"
"Switch to layer 2"
"Help"

AI advisor

Every voice command can ask the selected Assistant provider for commentary and follow-up suggestions in the AI corner (bottom-right of the voice command window).

When local matching handles the command well, the AI corner shows "no AI needed" with an optional "ask AI" button. When the advisor has something useful, it shows a one-line comment and an actionable suggestion button.

How it works

1. You speak a command
2. Local intent matching runs immediately (fast, free)
3. The selected Assistant provider runs in parallel when connected
4. If the advisor suggests something, a button appears in the AI corner
5. Click the suggestion to execute it
6. If you engage with a suggestion that the local matcher missed, it's recorded in ~/.lattices/advisor-learning.jsonl for future improvement
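
The parallel flow above might look like the sketch below, in which the local result is available synchronously and the advisor's suggestion arrives later as a promise. All type and function names here are hypothetical, and the real matcher and provider interfaces are not documented on this page.

```typescript
interface Intent { name: string; slots: Record<string, string>; }
interface Suggestion { label: string; intent: Intent; }

type LocalMatcher = (transcript: string) => Intent | null;
type Advisor = (transcript: string) => Promise<Suggestion | null>;

interface CommandResult {
  local: Intent | null;                    // available immediately
  suggestion: Promise<Suggestion | null>;  // resolves when the advisor answers
}

function handleTranscript(
  transcript: string,
  matchLocal: LocalMatcher,
  advisor?: Advisor, // undefined when no Assistant provider is connected
): CommandResult {
  // Start the advisor first so its round trip overlaps local matching.
  // Assumption: an advisor failure is treated as "no suggestion".
  const suggestion: Promise<Suggestion | null> = advisor
    ? advisor(transcript).catch(() => null)
    : Promise.resolve(null);
  const local = matchLocal(transcript);
  return { local, suggestion };
}
```

A caller can act on `local` right away and show the suggestion button only if `suggestion` later resolves to a non-null value.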

Hands-off inference

Hands-off voice uses the shared inference wrapper in bin/infer.ts. By default it chooses the lowest-latency configured provider and, when Groq credentials are present, uses groq/llama-3.1-8b-instant.

Credentials are read in order from the process environment, .env.local, .env, ~/.lattices/inference.json, and finally ~/.config/speakeasy/settings.json. For Groq, either GROQ_API_KEY or the common typo GROK_API_KEY works when the key has Groq's gsk_ prefix.
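
The lookup order can be sketched as a first-match scan over the sources. This is an illustration under assumptions, not the code in bin/infer.ts: each source is assumed to be already parsed into a flat key/value map (the actual file formats are not described here), the gsk_ prefix check is applied to both variable names, and `resolveGroqKey` is a hypothetical name.

```typescript
type KeyValues = Record<string, string | undefined>;

interface CredentialSource {
  name: string;      // e.g. "process env" or ".env.local"
  values: KeyValues; // assumed to be pre-parsed into flat key/value pairs
}

// Both spellings are accepted, per the docs.
const GROQ_KEY_NAMES = ["GROQ_API_KEY", "GROK_API_KEY"];

// Returns the first Groq-looking key, scanning sources in the given order.
function resolveGroqKey(
  sources: CredentialSource[],
): { key: string; source: string } | null {
  for (const source of sources) {
    for (const name of GROQ_KEY_NAMES) {
      const value = source.values[name];
      // Assumption: a value without the gsk_ prefix is skipped, not rejected.
      if (value !== undefined && value.startsWith("gsk_")) {
        return { key: value, source: source.name };
      }
    }
  }
  return null;
}
```

Passing the sources in the documented order means a key in the process environment wins over one in ~/.lattices/inference.json.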

To override the voice engine, set these environment variables:

LATTICES_VOICE_PROVIDER=groq
LATTICES_VOICE_MODEL=llama-3.1-8b-instant
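
One way such overrides could be applied is to let each variable replace the corresponding part of the automatically chosen engine. The sketch below assumes that behavior; `VoiceEngine` and `resolveVoiceEngine` are hypothetical names, and treating an empty value as unset is also an assumption.

```typescript
interface VoiceEngine { provider: string; model: string; }

// Applies LATTICES_VOICE_* overrides on top of the auto-selected engine.
function resolveVoiceEngine(
  env: Record<string, string | undefined>,
  autoSelected: VoiceEngine, // whatever the default selection produced
): VoiceEngine {
  return {
    // "||" rather than "??" so an empty string falls back to the default.
    provider: env["LATTICES_VOICE_PROVIDER"] || autoSelected.provider,
    model: env["LATTICES_VOICE_MODEL"] || autoSelected.model,
  };
}
```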

Configuration

Open Settings > AI to configure:

Setting	Default	Description
Assistant provider	OpenAI Codex	Provider used by the in-app chat and provider-backed voice advisor. Supports OAuth providers such as GitHub Copilot and OpenAI Codex plus API-key providers.
Pi runtime	Auto-detected	Installs or refreshes the provider runtime used by the in-app assistant.
Provider credentials	Not authenticated	Sign in with OAuth or save an API key locally for the selected provider.

Layout

The voice command window has four sections:

Section	Position	Content
History	Left column	Past commands with expandable details
Voice Command	Center column	Current transcript, matched intent, results
Log	Top-right	Rolling diagnostic log (last 12 entries)
AI Corner	Bottom-right	Advisor commentary, suggestions, provider readiness

Search architecture

Voice search uses the same backend as lattices search:

Quick search — window titles, app names, session tags (instant)
Complete search — adds terminal cwd/processes + OCR content
Synonym expansion — category terms like "terminals" expand to actual app names before searching
Query cleanup — strips natural language qualifiers ("and sort by...", "please", "for me") before searching
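
Query cleanup could be approximated with a few regular expressions. The sketch below handles only the three qualifiers quoted above; the real list of stripped phrases is not documented here, and `cleanQuery` is a hypothetical name.

```typescript
// Strips conversational qualifiers so only the search term remains.
function cleanQuery(raw: string): string {
  return raw
    .replace(/\band sort by\b.*$/i, "")    // drop a trailing sort clause
    .replace(/\b(please|for me)\b/gi, "")  // drop politeness fillers
    .replace(/\s+/g, " ")                  // collapse leftover whitespace
    .trim();
}
```

For example, "terminals for me please" and "browsers and sort by title" reduce to "terminals" and "browsers" before the search runs.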

Processing resilience

15-second timeout — if processing doesn't complete, returns to idle
Cancellation on dismiss — closing the window cancels in-flight work
Double-execution prevention — streaming and stop callbacks can't both fire the intent
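
These three guarantees can be modeled with a small state holder: one flag decides whether an intent may still fire, and a timer returns the session to idle. This is a sketch of the idea rather than the actual implementation, and `ProcessingSession` with its methods is hypothetical.

```typescript
class ProcessingSession {
  private processing = false;
  private timer: ReturnType<typeof setTimeout> | undefined;
  private readonly timeoutMs: number;

  constructor(timeoutMs: number = 15000) {
    this.timeoutMs = timeoutMs;
  }

  // Begins processing and arms the timeout that returns to idle.
  start(): void {
    this.stop();
    this.processing = true;
    this.timer = setTimeout(() => this.stop(), this.timeoutMs);
  }

  // Called by both the streaming and the stop callback; only the first
  // call while processing runs the intent.
  fireIntent(run: () => void): boolean {
    if (!this.processing) return false;
    this.stop();
    run();
    return true;
  }

  // Called when the window is dismissed.
  cancel(): void {
    this.stop();
  }

  get isIdle(): boolean {
    return !this.processing;
  }

  private stop(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = undefined;
    this.processing = false;
  }
}
```

Because `fireIntent` flips the state before running the callback, a second caller sees the idle state and does nothing.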

Advisor learning

When the local matcher fails but the AI advisor suggests something that you engage with, the interaction is recorded:

~/.lattices/advisor-learning.jsonl

Each line is a JSON object:

{
  "timestamp": "2026-03-15T18:30:00.000Z",
  "transcript": "find all terminals",
  "localIntent": "search",
  "localSlots": {"query": "terminals"},
  "localResultCount": 0,
  "advisorIntent": "search",
  "advisorSlots": {"query": "iterm"},
  "advisorLabel": "Search iTerm"
}

This dataset captures where the local system falls short and what the right answer was. Future work can mine it for automatic synonym mappings and phrase pattern improvements.
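
As a starting point for that kind of mining, the log can be parsed line by line and reduced to counted query pairs. The interface below mirrors the example record; treating every slot value as a string and the function names `parseAdvisorLog` and `synonymCandidates` are assumptions of this sketch.

```typescript
interface AdvisorLearningEntry {
  timestamp: string;
  transcript: string;
  localIntent: string;
  localSlots: Record<string, string>;
  localResultCount: number;
  advisorIntent: string;
  advisorSlots: Record<string, string>;
  advisorLabel: string;
}

// Parses JSONL text (one JSON object per line), skipping blank lines.
function parseAdvisorLog(jsonl: string): AdvisorLearningEntry[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as AdvisorLearningEntry);
}

// Counts "local query -> advisor query" pairs for searches that found nothing.
function synonymCandidates(entries: AdvisorLearningEntry[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const entry of entries) {
    const from = entry.localSlots["query"];
    const to = entry.advisorSlots["query"];
    const bothSearch = entry.localIntent === "search" && entry.advisorIntent === "search";
    if (!bothSearch || entry.localResultCount !== 0 || !from || !to || from === to) continue;
    const pair = `${from} -> ${to}`;
    counts.set(pair, (counts.get(pair) ?? 0) + 1);
  }
  return counts;
}
```

Pairs that recur often, such as "terminals -> iterm", are the natural candidates for new built-in synonyms.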

Requirements

Vox — provides microphone access and speech-to-text transcription
Assistant provider — enables provider-backed AI suggestions from Settings > AI (optional, voice commands still work without it)
Accessibility permission — for window tiling and focus
Screen Recording permission — for window discovery