High-quality on-device transcription. Easily convert speech to text from meetings, lectures, and more.
The transcription is powered by OpenAI’s Whisper model running locally on your device.
The app also includes support for Shortcuts.
- Batch conversion
- Export to karaoke file
Supports audio in 100 languages
- Haitian Creole
- Norwegian Nynorsk
Aiko transcribes audio directly on your device, ensuring complete privacy. It’s perfect for sensitive recordings.
The app uses the Whisper large v2 model on macOS and the medium or small model on iOS depending on available memory.
Divide text into paragraphs
Aiko divides the transcription text by sentences. If you want the text divided into paragraphs, copy the text from Aiko, go to ChatGPT, and use the following prompt.
Divide the text into paragraphs. Don't change the text otherwise: TRANSCRIPTION TEXT
Remove newlines and divide the text into paragraphs. Don't change the text otherwise: TRANSCRIPTION TEXT
Fix missing punctuation
A flaw of the Whisper model is that transcriptions can sometimes be missing punctuation. To fix missing punctuation, copy the text from Aiko, go to ChatGPT, and use this prompt:
Fix the missing punctuation. Don't change the text otherwise: TRANSCRIPTION TEXT
Quickly record and transcribe (macOS)
Use this shortcut to quickly record, transcribe, and copy the result to the clipboard. You can trigger the shortcut from the menu bar or set a global keyboard shortcut for it.
Frequently Asked Questions
Can you use the large v3 model for the Mac app?
The v3 model is worse than v2 in too many cases. I tried releasing v3, but got a lot of emails about the quality being worse, so I ended up reverting it.
I have a feature request, bug report, or some feedback
Can I edit the text in the app?
I don’t plan to support any editing. Export the transcription and edit it in a proper text editor.
How is this better than the built-in transcription on Apple devices?
- Much better accuracy.
- Support for more languages.
- Transcribe audio and video files.
- Export to many different formats, like JSON, CSV, and subtitles.
I found a mistake in the transcription
The app uses the OpenAI Whisper model and I have no control over the quality of its output. You could provide feedback about the problem here.
My language is not in the list of supported languages. Can you support it?
I have no control over the supported languages. You could try to request it here.
The transcription repeats itself many times
This is unfortunately a flaw in the Whisper model and out of my control.
The transcription is missing punctuation
This is unfortunately a flaw in the Whisper model. See the ChatGPT workaround earlier on this page.
The transcription includes a sentence at the end that was not in the audio
This is unfortunately a flaw in the Whisper model. It can sometimes add a sentence like “Thanks for watching!” to the end. There is not much I can do about this.
This issue arises from quirks in the AI’s processing, where it sometimes generates off-topic content, often due to data remnants or misinterpreted context. These are not messages or ‘whispers’ with any underlying meaning; they’re random anomalies that OpenAI is actively working to correct.
The transcription is in Traditional Chinese, but the audio was in Simplified Chinese
I have plans to add a workaround where you can write a prompt to improve this, but I cannot promise when this will happen.
Why must I keep the iOS app open while it transcribes?
iOS apps are fundamentally restricted from operating in the background for extended periods. This ironically even affects Apple’s official apps.
What file formats does it support?
Any audio and video format that macOS and iOS support. For example:
.mov. It does not support
How can I transcribe audio from the Voice Memos app?
macOS: Drag and drop the memo into the Aiko window. Note that because of a macOS bug, this can sometimes crash Aiko. If this happens, try sharing the memo from the Voice Memos app to Aiko instead.
iOS: In the Voice Memos app, tap the memo, tap the … button, tap Share, and choose Aiko in the app list.
Why does it take so long to generate?
Several factors can affect the transcription speed, including the performance of your device and the amount of available memory and CPU. Try closing down other apps or restarting your device before transcribing.
That being said, it’s likely Aiko will become significantly faster in the coming months.
Why does the app take up so much space on disk and memory?
The app delivers the highest quality transcription on the market for 100 different languages. Rather than asking why it’s so large, the real question is how it is so small.
The app is overheating my device
The role of the operating system (iOS/macOS) is to effectively manage the system’s resources and to safeguard against overheating. Apps are designed to utilize as many resources as they need to function optimally. In the event that resource consumption reaches a point that threatens to overheat the device, it’s the operating system’s job to limit such usage automatically.
If your device is experiencing overheating while using the app, it’s important to understand that the issue likely originates either from the operating system’s inability to manage resources effectively, or from an underlying hardware problem. In either case, the app itself is not responsible for the overheating.
The good news is that Aiko will soon take better advantage of the GPU, so CPU usage should be reduced significantly.
Can I delete some of the languages to save space?
This is unfortunately not possible. The model has all the languages stored together in a way that makes it impossible to remove just some languages.
Can you support real-time transcription?
This is something I plan to look into, but I have more popular requests I need to prioritize first.
Can you support naming the people in the audio?
How can I transcribe a Zoom meeting?
The app does not yet support live transcription, but you could record the Zoom meeting, and after the meeting is finished, drop the recording into the Aiko window to transcribe.
How can I transcribe a Messages voice note?
Drag and drop the voice note into Aiko.
How can I transcribe a Telegram voice note?
Telegram voice notes are stored in the Ogg format, which macOS/iOS cannot handle.
Workaround for iOS:
- Download this app.
- In Telegram, share the voice note to “Audio Converter”.
- Select “AAC” as output format and tap the convert button.
- Tap the share button and then choose Aiko.
Workaround for macOS:
- Download this app.
- In Telegram, right-click the voice note and save it.
- Open the saved voice note with the “Audio Converter” app.
- Select “AAC” as output format and click the convert button.
- Save the converted file and open it with Aiko.
I would also recommend sending feedback to Telegram that they should support M4A for voice notes.
How can I export the transcription as subtitles (SRT)?
When the transcription is done, click the share button in the toolbar, and choose “SRT”.
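If you want to check or post-process the exported file, it helps to know what SRT looks like: numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range followed by the text, separated by blank lines. The snippet below is only an illustrative sketch of that layout, not Aiko's actual export code; the `(start, end, text)` segment tuples and both function names are assumptions made up for this example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render hypothetical (start, end, text) tuples as SRT cues numbered from 1."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        cues.append(f"{index}\n{time_range}\n{text.strip()}\n")
    # Each cue ends with a newline, so joining with "\n" leaves a blank line between cues.
    return "\n".join(cues)


if __name__ == "__main__":
    print(to_srt([(0.0, 2.5, "Hello."), (2.5, 4.0, "Thanks for listening.")]))
```

Most video players and editors accept files in this layout, so the same structure is what you should expect to see if you open an exported SRT file in a text editor.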
How can I transcribe a YouTube video?
Download the audio using a service like dirpy and then open the file in Aiko.
The app supports translating to English. Can it support more languages?
The translation support is built into the AI model and it only supports translating to English. You could copy-paste the result into ChatGPT or Google Translate.
Is the app native?
Yes, it’s native and written in Swift and SwiftUI.
Why is this free without ads?
I just enjoy making apps. Consider leaving a nice review on the App Store.
Where can I find the changelog?
Go here and click “Version History”.
Can you localize the app into my language?
I don’t plan to localize the app.
Non-App Store Version
A special version for users who cannot access the App Store. It won’t receive automatic updates. I will update it here once a year.
Download (1.2.0 · 3 GB)
Requires macOS 13 or later