Claude Code voice mode: how it works and when to use it
Claude Code now has native voice input via /voice. Here's how push-to-talk works, which languages are supported, and when speaking beats typing.
Claude Code shipped native voice input. No third-party tools, no browser extensions — just /voice, the spacebar, and your microphone. Anthropic started the rollout on March 3, 2026, initially live for about 5% of users, expanding progressively since. If your account has access, you'll see a notice on the Claude Code welcome screen.
What voice mode is (and is not)
Voice mode is speech-to-text input. Claude does not talk back. There's no audio output, no conversational back-and-forth. You speak, the transcription lands in your input field, and Claude processes it like any other text prompt. Terminal output stays the same.
The change is entirely on the input side. You hold a key, speak naturally, release, and review the transcript before sending. You can mix voice and keyboard in the same message — paste a file path with your fingers while describing the context out loud.
Anthropic isn't turning the terminal into a voice assistant. They're removing the friction between thinking and typing. Different goals entirely.
How to enable it
Voice mode requires Claude Code v2.1.69 or later. Update first:
```shell
npm update -g @anthropic-ai/claude-code
claude --version
```
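If you want to check the minimum version (2.1.69) programmatically — in a dotfiles or setup script, say — `sort -V` can do the comparison. A sketch; it assumes `claude --version` prints a semver string somewhere in its output, so adjust the grep if yours differs:

```shell
# Compare the installed Claude Code version against the voice-mode
# minimum (2.1.69). Assumes `claude --version` emits a semver like 2.1.70.
min="2.1.69"
ver=$(claude --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1)
if [ -n "$ver" ] && [ "$(printf '%s\n' "$min" "$ver" | sort -V | head -n1)" = "$min" ]; then
  echo "voice-capable: $ver"
else
  echo "update needed (found: ${ver:-none})"
fi
```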
Then, inside any Claude Code session:
```
/voice
```
Claude Code will request microphone access from your OS. Grant it.
Voice mode only works when you authenticate via a Claude.ai account. It's not available with a direct Anthropic API key, or through Amazon Bedrock, Google Vertex AI, or Microsoft Foundry. If you're using one of those integrations, /voice returns an error.
Push-to-talk
The interaction model is push-to-talk:
- Hold the spacebar — recording starts, an indicator appears in your terminal
- Speak your prompt naturally
- Release — the transcription appears in your input field
- Review and send, or type additional context, or cancel and re-record
There is no always-on microphone. Claude Code is not listening to your conversations, your teammates, or your ambient environment. You control exactly when it records.
The push-to-talk key defaults to Space but is customisable via ~/.claude/keybindings.json:
```json
{
  "bindings": [
    {
      "context": "Chat",
      "bindings": {
        "meta+k": "voice:pushToTalk",
        "space": null
      }
    }
  ]
}
```
Setting "space": null removes the default binding. If you want both keys active, omit that line. Anthropic recommends modifier combinations like meta+k — they activate on the first keypress rather than requiring a brief hold for detection.
Avoid binding a bare letter key like v. During the brief hold-detection window, a single letter gets typed into your prompt buffer before recording starts. Stick with Space or modifier combinations.
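For example, to keep the default Space binding and add meta+k on top of it, bind only the new key and leave "space" out entirely (a sketch following the same schema as the example above):

```json
{
  "bindings": [
    {
      "context": "Chat",
      "bindings": {
        "meta+k": "voice:pushToTalk"
      }
    }
  ]
}
```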
Twenty languages as of March 2026
Voice mode launched with 10 languages and doubled in March 2026:
Since launch: English, Spanish, French, German, Italian, Portuguese, Japanese, Korean, Chinese, Hindi
Added March 2026: Russian, Polish, Turkish, Dutch, Ukrainian, Greek, Czech, Danish, Swedish, Norwegian
Transcription is optimised for technical terminology — repository names, library names, common developer vocabulary. Generic speech recognition stumbles on useState, tRPC, drizzle-orm, or kubectl. A model tuned for developer speech handles these better, though accuracy still varies by term and accent.
When voice actually helps
Voice mode is not universally better than typing. It's better in specific situations.
Speak when you're:
- Setting high-level context. "I want to refactor the auth module to use JWT instead of sessions — let's start by understanding what's currently in place." This kind of framing is exhausting to type and easy to say.
- Describing bugs. Narrating what you observed, what you expected, what the error says. Developers cut corners when typing bug descriptions. Speaking them tends to be more complete.
- Thinking through architecture. Tradeoffs, structure, approach. Spoken input is closer to how developers actually reason through design problems.
- Exploring. When you're not sure what you want yet and need to talk through the problem before committing to a specific instruction.
- Managing ergonomics. Developers dealing with RSI, fatigue, or physical constraints get genuine relief here. Hours of terminal work without keyboard strain is not a minor thing.
Type when you're:
- Writing precise technical strings. Exact filenames, function names, configuration values. Transcription errors on precise strings send Claude in the wrong direction.
- Pasting code. Speaking code is almost always less accurate than pasting it.
- In a noisy environment. Push-to-talk helps, but background noise still bleeds in.
- Sending short commands. Typing /test or /clear is faster than reaching for voice mode.
The most effective workflow combines both. Speak the context and intent, type or paste the precise details.
What happens under the hood
The voice pipeline runs in three stages:
Audio capture. When you hold the push-to-talk key, the terminal captures audio from your default system microphone at 16kHz mono. A recording indicator appears.
Transcription. A speech recognition model specialised for developer vocabulary processes the audio. The transcript appears in your terminal for review — you see it before Claude acts on it.
Prompt submission. Once you're satisfied, the transcript is submitted as a standard text prompt. Everything from that point behaves identically to typed input — file access, tool use, git operations, multi-agent workflows, all of it.
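This isn't part of Claude Code itself, but if you want to confirm your microphone can actually deliver that format, sox's `rec` tool records at the same 16 kHz mono settings. A quick sanity check, assuming sox is installed (the output path is arbitrary):

```shell
# Record 3 seconds at 16 kHz mono -- the same format voice mode captures --
# then play it back. Requires sox ('rec' and 'play' ship with it).
if command -v rec >/dev/null 2>&1; then
  have_sox=yes
  rec -r 16000 -c 1 /tmp/mic-check.wav trim 0 3 || echo "recording failed - check mic permissions"
  play /tmp/mic-check.wav 2>/dev/null || true
else
  have_sox=no
  echo "sox not installed (apt install sox / brew install sox)"
fi
```

If the playback is silent or the recording fails, the problem is OS-level permissions or device selection, not Claude Code.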
Claude Code handles microphone permissions at the OS level. On macOS, grant your terminal application (Terminal, iTerm2, Warp, etc.) microphone access in System Settings. On Linux, your terminal needs access through PulseAudio or PipeWire. Voice mode won't activate without the necessary permissions.
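On Linux specifically, you can confirm the sound server actually exposes a capture device before blaming Claude Code. `pactl` talks to both PulseAudio and PipeWire (via pipewire-pulse):

```shell
# List the capture sources the sound server knows about, and show which
# one is the default. Works for PulseAudio and PipeWire.
if command -v pactl >/dev/null 2>&1; then
  pactl list sources short                          # one line per capture source
  default=$(pactl get-default-source 2>/dev/null)   # needs a reasonably recent pactl
  echo "default source: ${default:-unknown}"
else
  echo "pactl not found - install pulseaudio-utils or pipewire-pulse"
fi
```

If the source list is empty, no terminal application will get microphone audio, voice mode included.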
Voice mode does not work in SSH sessions or Claude Code on the web. It requires local microphone access.
Plans and availability
Voice mode is included at no extra cost across Pro, Max, Team, and Enterprise plans. As of mid-March 2026, access is expanding through progressive rollout. There's no opt-in form or waitlist — when your account is enabled, the welcome screen tells you.
Why the terminal matters
GitHub Copilot's voice functionality lives inside VS Code. Cursor and Windsurf have partial voice support tied to their editors. Claude Code's voice mode works at the terminal level, independent of any editor or IDE. That means voice input is available wherever Claude Code runs, in whatever workflow you've built around it.
Some early 2026 numbers for context: Claude Code is generating $2.5 billion in annualised revenue, with weekly active users doubling since January. According to SemiAnalysis, Claude Code now authors roughly 4% of all public GitHub commits — a figure projected to hit 20% by end of 2026.
The developers who'll get the most out of voice mode are the ones who treat it as another input method. Reach for it when speaking is faster, return to the keyboard when precision matters. The friction that disappears is the translation layer between thinking and typing — a bottleneck that's easy to underestimate until you've gone a week without it.
Getting started
```shell
# Update Claude Code
npm update -g @anthropic-ai/claude-code

# Start a session and enable voice
claude
/voice
```
Hold Space. Speak. Release. Review the transcript. Send.
For keybinding customisation and a full settings reference, see the official documentation at code.claude.com/docs/en/voice-dictation.
If /voice isn't recognised yet, your account is still in the queue. Keep updating to the latest version.
If you're looking for a smoother dictation experience outside the terminal — drafting docs, writing emails, or narrating notes — Wispr Flow is worth a look. It's a system-wide voice-to-text tool at around $12/month, with a free month of Pro via that link.
You can paste this post's URL into Claude Code or any AI assistant for context if you hit issues setting up voice mode.
Where to run this
This post is brought to you by Hetzner, whose dedicated root servers give us the raw metal we actually run these benchmarks on, and by Tailscale, which keeps our node-to-node traffic encrypted without making us think about it. If you find this useful, check them out.
You need a machine with a local microphone, which rules out most remote VPS setups for voice mode specifically — but Claude Code itself runs everywhere. Hetzner gets you a CX23 at €4.85/month with €10 free credit, and it's where we run this blog. For development boxes, it's hard to beat.
If you'd rather skip managing Claude Code yourself entirely, xCloud offers managed OpenClaw hosting — point, deploy, done.
(Affiliate links — we get a small cut if you sign up, at no cost to you.)