transcribe-audio
Transcribes audio files to text using whisper-cpp (local, offline). Use when converting speech to text, transcribing podcasts, lectures, meetings, or any audio content.
Install this agent skill to your Project
npx add-skill https://github.com/knoopx/pi/tree/main/agent/skills/transcribe-audio
Transcribe Audio
Local audio transcription using whisper-cpp, a C++ port of OpenAI's Whisper.
Quick Start
# Transcribe with the tiny model (fastest, ~75 MB)
nix run nixpkgs#whisper-cpp -- -m models/ggml-tiny.en.bin -f audio.mp3
Available Models
| Model | Size | Speed | Quality |
|---|---|---|---|
| `ggml-tiny.en.bin` | 75 MB | ⚡ Fastest | Basic |
| `ggml-base.en.bin` | 142 MB | ⚡ Fast | Good |
| `ggml-small.en.bin` | 468 MB | 🐌 Medium | Better |
| `ggml-medium.en.bin` | 1.4 GB | 🐌 Slower | Great |
| `ggml-large-v3.bin` | 3.1 GB | 🐌🐌 Slow | Best |
Download a Model
# Example: download base model
curl -L https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin -o ggml-base.en.bin
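The download step above can be wrapped so that a model is fetched only once. This is a minimal sketch, not part of the skill itself: `model_url` and `fetch_model` are hypothetical helper names, the `models` default directory is an assumption, and the sketch assumes the other models in the table follow the same URL pattern as the `ggml-base.en.bin` example.

```shell
# Hypothetical helper: build the download URL for a model name such as
# "tiny.en" or "base.en", assuming the URL pattern of the example above.
model_url() {
  printf 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-%s.bin\n' "$1"
}

# Hypothetical helper: download a model into a directory (default: models)
# unless the file already exists, then print the local path.
fetch_model() {
  name=$1
  dir=${2:-models}
  dest="$dir/ggml-$name.bin"
  mkdir -p "$dir"
  # Skip the network call when the model is already on disk.
  [ -f "$dest" ] || curl -L --fail "$(model_url "$name")" -o "$dest" || return 1
  printf '%s\n' "$dest"
}

# Usage (performs a network download on first run):
#   fetch_model tiny.en models
```

Printing the path makes it possible to pass the result straight to `-m`, for example `-m "$(fetch_model base.en)"`.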
Common Options
| Option | Description |
|---|---|
| `-m MODEL` | Model path |
| `-f FILE` | Input audio file |
| `-t N` | Threads (default: 4) |
| `-l LANG` | Language (en, auto, etc.) |
| `-otxt` | Output to .txt file |
| `-osrt` | Output to .srt subtitle file |
| `-ovtt` | Output to .vtt file |
| `-oj` | Output to JSON |
| `-of PATH` | Output file path (without extension) |
| `-nt` | No timestamps in output |
| `-np` | No prints (results only) |
| `--print-confidence` | Show confidence scores |
Examples
Transcribe with Timestamps (Default)
nix run nixpkgs#whisper-cpp -- -m ggml-base.en.bin -f recording.mp3
Save to Text File
nix run nixpkgs#whisper-cpp -- -m ggml-base.en.bin -f recording.mp3 -otxt -of transcript
Generate Subtitles
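# mp4 is not in the Supported Formats list; if your build cannot decode it, extract the audio track to WAV first
nix run nixpkgs#whisper-cpp -- -m ggml-base.en.bin -f video.mp4 -osrt -of captions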
JSON Output with Confidence
nix run nixpkgs#whisper-cpp -- -m ggml-base.en.bin -f audio.wav -oj -of result --print-confidence
Auto-Detect Language
# Language detection needs a multilingual model (no .en suffix)
nix run nixpkgs#whisper-cpp -- -m ggml-base.bin -f audio.mp3 -l auto
Process Multiple Files
nix run nixpkgs#whisper-cpp -- -m ggml-base.en.bin -f file1.mp3 file2.mp3 file3.mp3
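When each input should get its own transcript file, a loop with a per-file `-of` value is one option. The sketch below is illustrative only: `out_base` and `transcribe_all` are hypothetical helper names, and it uses only the `-otxt`, `-of`, and `-np` options from the table above.

```shell
# Hypothetical helper: derive the -of value (output path without extension)
# from an input path, e.g. lectures/week1.mp3 -> lectures/week1
out_base() {
  dir=$(dirname "$1")
  base=$(basename "$1")
  printf '%s/%s\n' "$dir" "${base%.*}"
}

# Hypothetical helper: transcribe every file given after the model path,
# writing <name>.txt next to each input and stopping at the first failure.
transcribe_all() {
  model=$1
  shift
  for f in "$@"; do
    nix run nixpkgs#whisper-cpp -- -m "$model" -f "$f" -otxt -of "$(out_base "$f")" -np || return 1
  done
}

# Usage:
#   transcribe_all ggml-base.en.bin file1.mp3 file2.mp3 file3.mp3
```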
Offset and Duration
# Start at 30 s and process 60 s (-ot and -d take milliseconds)
nix run nixpkgs#whisper-cpp -- -m ggml-base.en.bin -f audio.mp3 -ot 30000 -d 60000
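Because `-ot` and `-d` expect milliseconds, a small conversion helper can avoid arithmetic slips when thinking in seconds. This is a sketch under that assumption; `clip_args` is a hypothetical name, not a whisper-cpp feature.

```shell
# Hypothetical helper: print -ot/-d arguments for a start time and a
# length given in whole seconds (converted to milliseconds).
clip_args() {
  printf '%s %d %s %d\n' -ot "$(( $1 * 1000 ))" -d "$(( $2 * 1000 ))"
}

# Usage (left unquoted on purpose so the output splits into four arguments):
#   nix run nixpkgs#whisper-cpp -- -m ggml-base.en.bin -f audio.mp3 $(clip_args 30 60)
```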
Supported Formats
flac, mp3, ogg, wav
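For inputs outside this list, converting to WAV first is a common workaround. The sketch below assumes `ffmpeg` is available on `PATH`, which the skill itself does not state; `is_supported` and `to_wav` are hypothetical helper names. It targets 16 kHz mono 16-bit PCM, the sample rate Whisper models operate on.

```shell
# Hypothetical helper: succeed if the file extension is one of the
# formats listed above (case-insensitive).
is_supported() {
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    flac|mp3|ogg|wav) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: convert any ffmpeg-readable input to a
# 16 kHz mono 16-bit WAV next to the original (assumes ffmpeg on PATH).
to_wav() {
  ffmpeg -i "$1" -ar 16000 -ac 1 -c:a pcm_s16le "${1%.*}.wav"
}

# Usage:
#   is_supported video.mp4 || to_wav video.mp4   # writes video.wav
```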
GPU Acceleration
For GPU support, use whisper-cpp-vulkan:
nix run nixpkgs#whisper-cpp-vulkan -- -m ggml-base.en.bin -f audio.mp3