Moonshine is a family of speech-to-text models optimized for fast and accurate automatic speech recognition (ASR) on resource-constrained devices. It is well-suited to real-time, on-device applications like live transcription and voice command recognition. English Moonshine obtains lower word error rates (WER) than the similarly sized Whisper Tiny and Whisper Base on the OpenASR leaderboard, and non-English Moonshine variants outperform Whisper Small and Medium, which are 9x and 28x larger, respectively.
Moonshine processes audio segments 5x to 15x faster than Whisper while maintaining the same (or significantly better!) WER/CER. This is because its compute requirements scale with the length of the input audio: shorter input is processed faster, whereas Whisper models process everything in 30-second chunks.
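To make the scaling argument concrete, here is a rough sketch of how much audio each approach has to process for a short clip. It models only input length (a fixed 30-second window versus the clip's real duration) and is not a latency measurement. The function names are illustrative and not part of any Moonshine or Whisper API.

```python
import math

WHISPER_CHUNK_SECONDS = 30.0  # fixed window size mentioned above


def seconds_processed_fixed_chunks(duration_s, chunk_s=WHISPER_CHUNK_SECONDS):
    """Seconds of audio processed when input is padded to whole fixed-size chunks."""
    chunks = max(1, math.ceil(duration_s / chunk_s))
    return chunks * chunk_s


def seconds_processed_variable(duration_s):
    """Seconds of audio processed when compute scales with the real clip length."""
    return duration_s


def padding_overhead(duration_s):
    """How many times more audio the fixed-chunk approach has to process."""
    return seconds_processed_fixed_chunks(duration_s) / seconds_processed_variable(duration_s)


if __name__ == "__main__":
    for duration in (2.5, 5.0, 10.0, 30.0):
        print(f"{duration:>5.1f}s clip -> {padding_overhead(duration):.1f}x more audio with fixed chunks")
```

Under this simplified model a 5-second voice command means 6x more input for a fixed-window model, which is why short utterances benefit the most.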
The unquantized Base model has 62M parameters (around 400MB), while Tiny has 27M parameters (around 190MB).
Moonshine currently supports 8 languages. Below is a performance summary. Results for Arabic, Chinese, Japanese, and Korean are character error rates (CER); all others are WER.
| Language | Tag | Moonshine Tiny (27M) | Moonshine Base (62M) | Whisper Tiny (39M) | Whisper Base (74M) | Whisper Small (244M) | Whisper Medium (769M) |
|---|---|---|---|---|---|---|---|
| Arabic | ar | 24.76 | | 52.40 | 48.25 | 32.44 | 25.44 |
| English | | 12.66 | 10.07 | 12.81 | 10.32 | | |
| Chinese | zh | 32.77 | | 68.51 | 59.13 | 46.76 | 40.41 |
| Japanese | ja | 15.69 | | 96.71 | 72.69 | 40.94 | 27.88 |
| Korean | ko | 9.85 | | 23.92 | 15.93 | 9.87 | 7.68 |
| Spanish | es | | TBA | | | | |
| Ukrainian | uk | 19.70 | | 66.77 | 48.56 | 25.93 | 16.51 |
| Vietnamese | vi | 15.92 | | 96.4 | 52.79 | 26.46 | 18.49 |
 
Read the paper for more details on our non-English flavors of Moonshine.
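For readers unfamiliar with the metrics, the sketch below shows how WER and CER are typically computed: the edit distance between reference and hypothesis, divided by the reference length, here expressed as a percentage (assuming the table reports percentages). It is a minimal illustration, not the leaderboard's evaluation code, and it skips the text normalization that real benchmarks apply. All function names are our own.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra token in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent, splitting on whitespace."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, ignoring spaces."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return 100.0 * edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deleted word out of six
```

CER is the usual choice for languages where word boundaries are not marked by spaces, which fits the CER rows in the table.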
With the release of new Moonshine languages, we have deprecated the Keras-based moonshine package. We recommend using Hugging Face transformers for vibe-checking the models, and using the ONNX runtime via moonshine-onnx for on-device applications. This table summarizes support:
| Model | Language | transformers | ONNX | Keras (deprecated) | 
|---|---|---|---|---|
| tiny-ar | Arabic | ✅ | ✅ | ❌ | 
| tiny-zh | Chinese | ✅ | ✅ | ❌ | 
| tiny | English | ✅ | ✅ | ✅ | 
| base | English | ✅ | ✅ | ✅ | 
| tiny-ja | Japanese | ✅ | ✅ | ❌ | 
| tiny-ko | Korean | ✅ | ✅ | ❌ | 
| base-es | Spanish | ✅ | ✅ | ❌ | 
| tiny-uk | Ukrainian | ✅ | ✅ | ❌ | 
| tiny-vi | Vietnamese | ✅ | ✅ | ❌ | 
We like uv for managing Python environments, so we use it here. If you don't want to use it, simply skip the uv installation and leave uv off of your shell commands.
First, install uv for Python environment management.
Then create and activate a virtual environment:

    uv venv env_moonshine
    source env_moonshine/bin/activate

Using Moonshine with the ONNX runtime is preferable if you want to run the models on single-board computers (SBCs) like the Raspberry Pi. To use it, run the following:

    uv pip install useful-moonshine-onnx@git+https://git@github.com/moonshine-ai/moonshine.git#subdirectory=moonshine-onnx

You can test Moonshine by transcribing the provided example audio file with the .transcribe function:

    python
    >>> import moonshine_onnx
    >>> moonshine_onnx.transcribe(moonshine_onnx.ASSETS_DIR / 'beckett.wav', 'moonshine/tiny')
    ['Ever tried ever failed, no matter try again, fail again, fail better.']

The first argument is a path to an audio file and the second is the name of a Moonshine model. moonshine/tiny and moonshine/base are English-only models. If you wish to use one of the non-English Moonshine models, just append the IETF language tag to the model name, e.g., moonshine/tiny-ko. See the table for supported languages and their tags.
Moonshine models can be used in many applications, so we've included code samples showing how to use them in different situations. The demo folder in this repository also has more information on them.
Moonshine is supported by the transformers library, as follows:

    import torch
    from transformers import AutoProcessor, MoonshineForConditionalGeneration
    from datasets import load_dataset

    # Load the processor (feature extraction and tokenizer) and the model weights.
    processor = AutoProcessor.from_pretrained("UsefulSensors/moonshine-tiny")
    model = MoonshineForConditionalGeneration.from_pretrained("UsefulSensors/moonshine-tiny")

    # Take one sample from a small LibriSpeech test set.
    ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
    audio_array = ds[0]["audio"]["array"]

    # Convert the waveform to model inputs, generate token IDs, and decode them to text.
    inputs = processor(audio_array, return_tensors="pt")
    generated_ids = model.generate(**inputs)
    transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(transcription)

If you wish to use one of the non-English Moonshine models, just append the IETF language tag to the repo ID, e.g., UsefulSensors/moonshine-tiny-ko. See the table for supported languages and their tags.
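Since the ONNX package and transformers use slightly different naming schemes, a small helper can keep them consistent. The sketch below is hypothetical (none of these functions exist in moonshine-onnx or transformers). It builds both identifiers from a model size and an optional language tag, following the two patterns described above, and rejects combinations that are not in the support table. It assumes every listed model follows the same naming pattern, which we have not verified repo by repo.

```python
# (size, language tag) pairs copied from the support table; None means English.
SUPPORTED = {
    ("tiny", None), ("base", None),
    ("tiny", "ar"), ("tiny", "zh"), ("tiny", "ja"), ("tiny", "ko"),
    ("base", "es"), ("tiny", "uk"), ("tiny", "vi"),
}


def model_suffix(size, lang=None):
    """Return e.g. 'tiny' or 'tiny-ko', validating against the support table."""
    if (size, lang) not in SUPPORTED:
        raise ValueError(f"unsupported combination: size={size!r}, lang={lang!r}")
    return size if lang is None else f"{size}-{lang}"


def onnx_model_name(size, lang=None):
    """Model name in the form passed to moonshine_onnx.transcribe."""
    return f"moonshine/{model_suffix(size, lang)}"


def hf_repo_id(size, lang=None):
    """Repo ID in the form passed to from_pretrained (assumed pattern)."""
    return f"UsefulSensors/moonshine-{model_suffix(size, lang)}"


if __name__ == "__main__":
    print(onnx_model_name("tiny", "ko"))
    print(hf_repo_id("tiny", "ko"))
```

The returned strings can be passed to moonshine_onnx.transcribe and from_pretrained respectively, in place of the hard-coded names used in the examples above.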
You can try the Moonshine ONNX models with live input from a microphone with the live captions demo.
The files for the CTranslate2 versions of Moonshine are available at huggingface.co/UsefulSensors/moonshine/tree/main/ctranslate2, but they require a pull request to be merged before they can be used with the mainline version of the framework. Until then, you should be able to try them with our branch, with this example script.
Use our MoonshineJS library to run Moonshine models in the web browser with a few lines of JavaScript.
We have an active Discord server where we're happy to answer questions, offer support, and generally geek out about voice AI, so please come join the conversations.
All inference code in this repo is released under the MIT license. The English Moonshine models are also released under the MIT license.
All non-English Moonshine variants are released under the Moonshine AI Community License (TL;DR: models are free to use for researchers, developers, small businesses, and creators with less than $1M in annual revenue).
Copies of both licenses are included in this repository.
If you benefit from our work, please cite our paper:

    @misc{jeffries2024moonshinespeechrecognitionlive,
          title={Moonshine: Speech Recognition for Live Transcription and Voice Commands},
          author={Nat Jeffries and Evan King and Manjunath Kudlur and Guy Nicholson and James Wang and Pete Warden},
          year={2024},
          eprint={2410.15608},
          archivePrefix={arXiv},
          primaryClass={cs.SD},
          url={https://arxiv.org/abs/2410.15608},
    }

Please also cite our paper on non-English Moonshine variants if you find them useful:

    @misc{king2025flavorsmoonshinetinyspecialized,
          title={Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices},
          author={Evan King and Adam Sabra and Manjunath Kudlur and James Wang and Pete Warden},
          year={2025},
          eprint={2509.02523},
          archivePrefix={arXiv},
          primaryClass={cs.CL},
          url={https://arxiv.org/abs/2509.02523},
    }
