On-device vs. cloud, and privacy
InkSpoke is offline-first: your words can be transcribed and polished entirely on your own computer, and nothing leaves it unless you choose a cloud model or turn on sync. This page explains what runs where, how you control it, and exactly which data stays private.
Two decisions, made per model
A single dictation goes through two AI steps, and each one is independent:
- Speech recognition (ASR) — turning your audio into text. Runs on-device with Whisper.net or Parakeet, or via a cloud provider.
- AI refinement (LLM) — cleaning up and reshaping that text. Runs on-device with a local model, or via the built-in InkSpoke Platform cloud model or your own BYOK provider.
You pick a model for each step (in AI Models settings), and the model you pick decides whether that step is local or cloud. So you can, for example, transcribe locally to keep your audio private, and still refine with a fast cloud model.
         Your speech
              │
┌─────────────┴──────────────┐
│ 1. Speech recognition      │
│   Local Whisper / Parakeet │ → audio stays on your device
│   Cloud provider API       │ → audio uploaded to provider
└─────────────┬──────────────┘
              │ transcribed text
┌─────────────┴──────────────┐
│ 2. AI refinement           │
│   Local on-device LLM      │ → text stays on your device
│   Cloud Platform / BYOK    │ → text sent to provider
└─────────────┬──────────────┘
              │
   Injected at your cursor
Out of the box, speech runs on-device (Whisper Small — free and offline) but refinement uses the InkSpoke Platform cloud model during your Pro trial. So by default your audio never leaves your machine, but your transcribed text is sent to be polished. To keep the entire loop local, switch refinement to an on-device model too (see below).
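The two independent choices can be modelled as a pair of flags, one per step. The sketch below is illustrative only: `Pipeline` and `leaves_device` are hypothetical names, not part of InkSpoke, and they simply restate the flow described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical model of one dictation: where each step runs."""
    speech_is_cloud: bool       # step 1: speech recognition (ASR)
    refinement_is_cloud: bool   # step 2: AI refinement (LLM)


def leaves_device(p: Pipeline) -> list:
    """Return which kinds of data are sent over the network."""
    sent = []
    if p.speech_is_cloud:
        sent.append("audio")             # audio is uploaded to the speech provider
    if p.refinement_is_cloud:
        sent.append("transcribed text")  # text is sent to the refinement provider
    return sent


# Out-of-the-box setup: local Whisper Small, cloud refinement.
default = Pipeline(speech_is_cloud=False, refinement_is_cloud=True)
# Both steps switched to on-device models.
fully_local = Pipeline(speech_is_cloud=False, refinement_is_cloud=False)

print(leaves_device(default))      # ['transcribed text']
print(leaves_device(fully_local))  # []
```

Because the flags are independent, changing the speech model never changes what the refinement step sends, and vice versa.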
On-device models
Local models download once and then run with no network. The speech side has two engines:
| Engine | What it is | Notes |
|---|---|---|
| Whisper.net | The default local speech recognizer. Default model is Whisper Small (244M parameters). | Small is included free; the other sizes (Tiny, Base, Medium, Large, Large-v3 Turbo variants) are Pro. |
| Parakeet | An alternative ONNX speech engine. | Selectable as a speech model when downloaded. |
On-device refinement uses a local GGUF language model. Local ASR beyond Whisper Small, and local LLMs, live on the AI Models → On-Device tab, which is a Pro feature.
GPU acceleration
On-device speech can use your GPU to run faster. This is controlled by UseGpuForDictation (on by default), and what it does depends on your OS:
| Platform | On-device speech acceleration |
|---|---|
| macOS | Metal (GPU) + Apple Neural Engine for Whisper |
| Windows | CUDA (NVIDIA GPUs) |
| Linux | CPU only — no GPU acceleration |
GPU acceleration currently applies to Whisper. The Parakeet engine runs on CPU on all platforms for now; CUDA (Windows) and CoreML (macOS) acceleration for Parakeet are planned but not yet enabled. If you rely on GPU speed, stay on a Whisper model.
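As a rough way to read the table and the Parakeet caveat together, the lookup below returns the backend that on-device speech would end up on. This is a sketch, not InkSpoke's actual selection logic: `speech_backend` and its parameters are hypothetical, and the OS names are the values returned by Python's `platform.system()`.

```python
import platform
from typing import Optional


def speech_backend(engine: str, use_gpu: bool = True,
                   os_name: Optional[str] = None) -> str:
    """Hypothetical helper: which backend on-device speech would use.

    engine  -- "whisper" or "parakeet"
    use_gpu -- mirrors the UseGpuForDictation setting (on by default)
    os_name -- "Darwin", "Windows" or "Linux"; defaults to the current OS
    """
    os_name = os_name or platform.system()
    if not use_gpu or engine != "whisper":
        return "CPU"  # Parakeet runs on CPU on all platforms for now
    accelerated = {
        "Darwin": "Metal + Apple Neural Engine",  # macOS
        "Windows": "CUDA (NVIDIA GPUs)",
    }
    return accelerated.get(os_name, "CPU")  # Linux: no GPU acceleration


print(speech_backend("whisper", os_name="Darwin"))    # Metal + Apple Neural Engine
print(speech_backend("parakeet", os_name="Windows"))  # CPU
```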
Cloud models
Choosing a cloud speech or text model routes that step to a provider over the network:
- InkSpoke Platform — the built-in cloud provider. Refinement through it uses InkSpoke's Responses API; it's the default text model during your Pro trial.
- BYOK (bring your own key) — add any OpenAI-compatible provider with your own API key on the AI Models → Providers tab (Pro). Your key is stored in your operating system's keychain, never in a settings file. Requests go directly to your provider under your account.
Cloud speech falls back to local
Cloud speech recognition is designed to fail safe. If a cloud upload doesn't succeed, InkSpoke falls back to your local model so you still get a transcript, and the failed upload is queued for retry rather than lost.
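The fail-safe behaviour follows a common try, queue, fall back pattern. The sketch below shows that pattern in general terms; `transcribe`, `retry_queue`, and the two stand-in recognizers are hypothetical, and treating a failed upload as an `OSError` is an assumption made for the example.

```python
from collections import deque
from typing import Callable

retry_queue: deque = deque()  # failed uploads waiting for a later retry


def transcribe(audio: bytes,
               cloud: Callable[[bytes], str],
               local: Callable[[bytes], str]) -> str:
    """Try the cloud recognizer first; on failure, queue the upload and go local."""
    try:
        return cloud(audio)
    except OSError:                # assumed failure type: network-level errors
        retry_queue.append(audio)  # keep the failed upload for retry, don't lose it
        return local(audio)        # the user still gets a transcript


# Stand-in recognizers for demonstration only.
def failing_cloud(audio: bytes) -> str:
    raise ConnectionError("upload failed")  # ConnectionError is a subclass of OSError


def local_model(audio: bytes) -> str:
    return "local transcript"


print(transcribe(b"\x00\x01", failing_cloud, local_model))  # local transcript
print(len(retry_queue))                                     # 1
```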
Meeting recording transcribes on-device only — cloud transcription for live meetings is coming soon. Cloud transcription is available when you import an audio or video file.
What stays private
InkSpoke keeps your data on your machine by default:
- Your audio never leaves your device when you use a local speech model. Audio is uploaded only when you choose a cloud speech model.
- Your history, recordings, and workspaces are stored locally (the History screen is even labelled "Local only") unless you turn on cloud sync.
- Cloud sync is opt-in and end-to-end encrypted. It's off by default (CloudSyncEnabled = false) and requires you to be signed in. When on, your workspaces and settings are encrypted on your device with a key held in your OS keychain, so the servers store ciphertext they can't read.
- API keys and sync keys live in the OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service), never in the plain-text settings.json.
You also have a Privacy Tier setting (Settings → Configuration → General) that sets your overall posture. It defaults to LocalShield, with HybridIntelligence and PrivacyCloud as the other levels.
Even with local speech, if your refinement model is cloud-based, your transcribed text is sent to that provider. Sending custom vocabulary to a cloud speech model is gated by a separate opt-in (CustomVocabularyCloud). For a fully private loop, keep both the speech model and the refinement model on-device.
Choosing your setup: privacy vs. accuracy and speed
Mix and match the two steps to land where you want on the privacy/performance trade-off:
| Setup | Speech | Refinement | What leaves your device | Best when |
|---|---|---|---|---|
| Fully on-device | Local (Whisper / Parakeet) | Local LLM | Nothing | Privacy is paramount, or you're offline. Quality and speed depend on your hardware and model size. Local LLM needs Pro. |
| Hybrid (audio stays home) | Local | Platform or BYOK cloud | Transcribed text only | You want strong refinement quality but never want to upload audio. This is closest to the default. |
| Fully cloud | Cloud provider | Cloud provider | Audio + text | You're on modest hardware and want the fastest, most accurate results, and you're comfortable using a provider. |
A good rule of thumb: keep speech on-device (it's free and private), and only reach for the cloud on the refinement step where a larger model helps most. You can change either model at any time — nothing is locked in.
Settings that affect this
| Setting | Default | What it does |
|---|---|---|
| Active speech model | Whisper Small (local) | Picking a cloud speech model switches ASR to cloud (AsrProvider.Mode). |
| UseGpuForDictation | On | GPU acceleration for on-device speech (Metal / CUDA; no effect on Linux). |
| CloudSyncEnabled | Off | Opt-in, end-to-end-encrypted sync of workspaces and settings. |
| PrivacyTier | LocalShield | Your overall privacy posture (LocalShield / HybridIntelligence / PrivacyCloud). |
| CustomVocabularyCloud | — | Gate for sending your custom vocabulary to a cloud speech model. |
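To see how these defaults combine with a user's own choices, the sketch below overlays a settings file on the documented defaults. It assumes settings.json is a flat JSON object keyed by setting name, which this page does not confirm; `DEFAULTS`, `effective_settings`, and the sample file contents are hypothetical.

```python
import json

# Defaults taken from the table above. CustomVocabularyCloud has no documented
# default there, so it is left out rather than guessed.
DEFAULTS = {
    "UseGpuForDictation": True,
    "CloudSyncEnabled": False,
    "PrivacyTier": "LocalShield",
}


def effective_settings(settings_json: str) -> dict:
    """Overlay user-provided values on the documented defaults."""
    return {**DEFAULTS, **json.loads(settings_json)}


# Hypothetical file contents: the user has only turned on cloud sync.
user_file = '{"CloudSyncEnabled": true}'
print(effective_settings(user_file))
# {'UseGpuForDictation': True, 'CloudSyncEnabled': True, 'PrivacyTier': 'LocalShield'}
```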
Next steps
- Models and providers — the full catalog: on-device, Platform, and BYOK.
- Audio and models settings — download models, pick your defaults, tune the GPU toggle.
- Account, sync, and updates — turn on encrypted cloud sync and manage devices.
- Synced data and privacy — view or delete your end-to-end-encrypted data from the web.