Dictation modes and languages
InkSpoke's defaults are tuned to just work, but a few settings let you shape how it listens: whether you see words appear as you speak, how it handles very soft speech, which language you're dictating in, and how aggressively it cleans up your microphone. This page walks through each one.
You don't have to touch any of this. Out of the box InkSpoke uses Standard mode, expects English, auto-detects other languages, and auto-tunes your audio. The settings below are for when you want something different.
Two ways to transcribe: Standard vs Live Preview
The Dictation Mode setting decides when InkSpoke turns your speech into text.
| Mode | Best for | How it works |
|---|---|---|
| Standard (default) | Everyday dictation where accuracy matters most | InkSpoke records while you talk and transcribes everything in one pass after you stop. It sees the whole utterance at once, so it makes the highest-accuracy call. |
| Live Preview | Watching your words land in real time | InkSpoke streams a running preview roughly every 2 seconds as you speak, then confirms each sentence at a natural pause. The listening overlay shows finalized text plus a blinking draft of the chunk still in progress. |
Both modes end the same way — the final, cleaned-up text is injected wherever your cursor is. The difference is whether you get a live "draft" along the way.
Live Preview streams speculative previews and confirms them at sentence boundaries detected by InkSpoke's built-in voice-activity model (which ships with the app — no download needed). Two things to know:
- If that voice-activity model isn't available, InkSpoke quietly falls back to Standard.
- Live Preview runs only with on-device speech models. If you've switched to a cloud speech provider, InkSpoke also falls back to Standard so it isn't firing off an API request every couple of seconds. See on-device vs. cloud.
| Setting | Default | What it does |
|---|---|---|
| Dictation Mode (DictationMode) | Standard | Standard transcribes once after you stop. LivePreview streams a preview every ~2 s and finalizes at pauses (on-device models only). Found under Settings → Audio. |
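The fallback rules above can be sketched as a small decision function. This is only an illustration of the behavior described on this page, not InkSpoke's actual code: the function name `effective_mode` and its two boolean parameters are hypothetical.

```python
from enum import Enum


class DictationMode(Enum):
    STANDARD = "Standard"
    LIVE_PREVIEW = "LivePreview"


def effective_mode(requested: DictationMode,
                   vad_model_available: bool,
                   uses_cloud_provider: bool) -> DictationMode:
    """Return the mode that would actually run under the fallback rules.

    Hypothetical helper: the parameter names are assumptions made for
    this sketch, not real InkSpoke settings.
    """
    if requested is DictationMode.LIVE_PREVIEW:
        # Live Preview needs the voice-activity model and an on-device
        # speech model; otherwise it falls back to Standard.
        if not vad_model_available or uses_cloud_provider:
            return DictationMode.STANDARD
    return requested
```

Calling `effective_mode(DictationMode.LIVE_PREVIEW, True, True)` returns `DictationMode.STANDARD`, which mirrors the cloud-provider fallback.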
Dictating softly: quiet-speech mode
Working in an open office, a library, or a room with someone asleep? Quiet-speech mode (sometimes called "whisper mode") retunes InkSpoke so it can still hear you when you're barely making a sound.
When it's on, InkSpoke does three things: it lowers the threshold at which it decides you're speaking, boosts the gain on your microphone, and asks the speech model to work a little harder on low-energy audio. The result is far better recognition of soft, close-to-the-mic speech — at the cost of being more sensitive to background noise, which is why it's off by default.
| Setting | Default | What it does |
|---|---|---|
| Quiet-speech mode (WhisperModeEnabled) | Off | Tunes detection, gain, and the speech model for very soft dictation. |
Turning it on shifts three internal knobs:
- Voice-activity speech threshold drops from 0.45 to 0.30, so quieter sounds register as speech.
- Audio gain is boosted to 3.0×.
- Whisper's beam size increases from 1 to 5 for more careful decoding of low-energy audio.
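As a rough illustration, the three changes can be modeled as overrides on a small tuning record. The numbers come from this page; the `ListeningTuning` type, its field names, and the use of `None` to mean "gain is left to auto-calibration" are assumptions made for the sketch.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ListeningTuning:
    # Hypothetical record; the field names are not real InkSpoke settings.
    vad_speech_threshold: float = 0.45  # default stated on this page
    beam_size: int = 1                  # default stated on this page
    gain: Optional[float] = None        # assumption: None = auto-calibrated


def apply_quiet_speech(tuning: ListeningTuning, enabled: bool) -> ListeningTuning:
    """Return the tuning with the quiet-speech overrides applied when enabled."""
    if not enabled:
        return tuning
    # Values documented for quiet-speech mode.
    return replace(tuning, vad_speech_threshold=0.30, beam_size=5, gain=3.0)
```

Returning a new record instead of mutating the old one keeps the defaults intact, so switching the mode off again is just a matter of using the original value.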
Dictating in other languages
InkSpoke isn't English-only. You can set a default language, let it auto-detect, or pick a language per dictation right from the overlay.
Set a default and let auto-detect help
Out of the box, InkSpoke expects English and keeps auto-detect on, so it can identify the spoken language automatically when you switch.
| Setting | Default | What it does |
|---|---|---|
| Language (Language) | English (en) | The language InkSpoke assumes when auto-detect is off, or as a starting point. |
| Auto-detect language (AutoDetectLanguage) | On | Lets InkSpoke identify the language you're speaking rather than forcing a fixed one. |
Switch language on the overlay
Every dictation shows a language picker in the listening overlay. It lists Auto followed by your preferred languages, so a mid-session switch is one click:
┌────────────────────────────────────────────────┐
│ ● Listening… ⏱ 0:04 │
├────────────────────────────────────────────────┤
│ ▁▃▅▇▅▃▁▂▄▆▄▂ │
│ │
│ [ Workspace ▾ ] [ EN ▾ ] [ Send ] │
│ ├ Auto │
│ ├ English │
│ ├ Español │
│ └ Français │
└────────────────────────────────────────────────┘
You can also cycle languages with ↑ / ↓ while the overlay is open, without reaching for the mouse. A language you choose this way can stick as your new default.
Workspaces can carry their own preferred language. When a workspace is smart-matched to the app you're in, InkSpoke can pre-select that workspace's language for you — handy if you always write to one team in Spanish and another in English.
Getting clean audio automatically
Good transcription starts with a good signal. InkSpoke runs several audio steps for you, most of which need no configuration at all.
Auto-calibrated gain
Quiet or inconsistent microphones are amplified automatically. In the first moments of a session InkSpoke measures your ambient noise and peak level, works out how much to boost you so your speech lands at a healthy level, and applies that consistently for the whole session (so it never ducks your volume during a pause). If calibration finishes partway in, it even re-amplifies the audio it already buffered. There's nothing to set — it just happens.
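To make the idea concrete, here is a minimal sketch of a single, session-wide gain computed from a calibration window. It is a simplification under stated assumptions: it uses only the measured peak (the ambient-noise measurement is left out), and `target_peak`, `max_gain`, and both function names are hypothetical values chosen for illustration, not InkSpoke's real parameters.

```python
def calibrate_gain(samples, target_peak=0.5, max_gain=8.0):
    """Pick one gain factor from calibration samples in [-1.0, 1.0].

    target_peak and max_gain are illustrative assumptions.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return 1.0  # pure silence: nothing to measure, leave audio as-is
    gain = target_peak / peak
    # Only amplify (never reduce volume) and cap the boost.
    return min(max(gain, 1.0), max_gain)


def apply_gain(samples, gain):
    """Apply the same gain to every sample, clipping to the valid range."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]
```

Because the gain is one fixed number, the same `apply_gain` call can be run over audio that was buffered before calibration finished, which corresponds to the re-amplification step described above.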
Noise suppression
InkSpoke can strip out fans, keyboard clatter, and ambient chatter while preserving your voice, using a neural noise filter (DeepFilterNet) plus a low-pass filter that also acts as a high-frequency noise gate.
| Setting | Default | What it does |
|---|---|---|
| Noise suppression (AudioProcessing.DeepFilterEnabled) | On | Neural background-noise removal. Adds about 30 ms of latency and needs the noise-suppression model downloaded; if it isn't present, this step is simply skipped. |
| Suppression strength (AudioProcessing.DeepFilterStrength) | 0.75 | How aggressively noise is removed, on a 0.0–1.0 scale. Lower it if you find speech sounding over-processed. |
| Low-pass filter (AudioProcessing.LowpassFilterEnabled) | On | A 7.5 kHz low-pass applied before InkSpoke resamples to 16 kHz. It trims high-frequency hiss and prevents aliasing. |
Noise suppression is optional and depends on the DeepFilterNet model being installed. With it off (or not yet downloaded), auto-gain and the low-pass filter still run.
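Why a 7.5 kHz cutoff pairs with 16 kHz resampling: audio sampled at 16 kHz can only represent frequencies below half that rate (the Nyquist frequency, 8 kHz), so content above it has to be removed first or it folds back as aliasing. The short check below works through that arithmetic; the function names are made up for this example.

```python
TARGET_SAMPLE_RATE_HZ = 16_000  # resampling target stated on this page
LOWPASS_CUTOFF_HZ = 7_500       # low-pass cutoff stated on this page


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def cutoff_below_nyquist(cutoff_hz: float, sample_rate_hz: float) -> bool:
    """True when the filter cutoff sits under the Nyquist frequency."""
    return cutoff_hz < nyquist_hz(sample_rate_hz)
```

With the values from this page, the cutoff leaves 500 Hz of headroom under the 8 kHz limit.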
Voice-activity detection and silence timeouts
InkSpoke uses voice-activity detection to tell speech from silence: it trims silent gaps before transcription and drives the "speech detected" state you see on the overlay. Two safety timeouts also stop a session automatically so a forgotten recording doesn't run forever.
| Setting | Default | What it does |
|---|---|---|
| Silence timeout (SilenceTimeoutSeconds) | 30 s | Auto-cancels a session after this much continuous silence. Set to 0 to disable. |
| Max recording length (MaxRecordingDurationSeconds) | 300 s (5 min) | Hard cap on a single dictation. Set to 0 for no limit. |
If you dictate long passages, raise or disable Max recording length — otherwise InkSpoke stops and processes what it has at the 5-minute mark.
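The two timeouts can be pictured as a check that runs while a session is open. The sketch below follows the defaults and the "0 disables" rule from the table; the function name, the returned labels, and the choice to test the recording cap first when both limits are hit at once are assumptions for illustration.

```python
from typing import Optional


def session_stop_reason(continuous_silence_s: float,
                        elapsed_s: float,
                        silence_timeout_s: float = 30,
                        max_recording_s: float = 300) -> Optional[str]:
    """Return why a session should end, or None to keep recording.

    Hypothetical helper; the labels are not real InkSpoke states.
    """
    # A value of 0 disables the corresponding limit.
    if max_recording_s > 0 and elapsed_s >= max_recording_s:
        return "stop_and_process"  # the cap stops and transcribes what it has
    if silence_timeout_s > 0 and continuous_silence_s >= silence_timeout_s:
        return "cancel"            # the silence timeout cancels the session
    return None
```

For example, `session_stop_reason(0, 900, max_recording_s=0)` returns `None`, matching a long dictation with the cap disabled.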
Platform notes
These modes and settings behave the same on Windows, macOS, and Linux. Neural noise suppression relies on a downloadable native component and model, so it's the one feature that may be inactive until that model is installed; everything else works everywhere out of the box.
Next steps
- Audio and models settings — where the dictation mode, quiet-speech, and audio options live.
- The listening overlay — the language and workspace pickers in context.
- On-device vs. cloud and privacy — why Live Preview is on-device only.
- Models and providers — choosing the speech model behind your dictation.