Agent skill
voiceover
Generates audio narration from a text file using Chatterbox TTS. Use when the user wants to generate voiceover/audio from ANY text file.
Install this agent skill to your Project
npx add-skill https://github.com/kedbin/relearning-flow/tree/main/skills/voiceover
Metadata
Additional technical details for this skill
- author: community
- version: 8.0
SKILL.md
Voiceover
Generates voiceover audio from a text file using Chatterbox TTS with voice cloning. Outputs MP3 format directly. Supports automatic deployment and git push!
The Pipeline
[Content]
↓
create-script skill (REQUIRED FOR QUALITY)
- Condenses content ~50%
- Adds paralinguistic tags ([chuckle], [sigh], etc.)
- Rewrites for conversational speech
↓
[filename].txt
↓
voiceover skill (YOU ARE HERE)
- TTS generation with Chatterbox
- Deploy to site
- Git push
↓
[filename].mp3 published
IMPORTANT: For high-quality voiceovers, ALWAYS use the create-script skill first. The --transform flag only does basic markdown stripping—no condensation, no paralinguistic tags.
When to Use This Skill
USE THIS SKILL when the user:
- Has a .txt script file ready (created by the create-script skill)
- Says "voiceover", "generate audio", "create narration"
- Wants to convert a prepared script to audio
IMPORTANT: If the user provides raw content (markdown, URL, article), use the create-script skill FIRST to prepare it, THEN use this skill on the resulting .txt file.
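The routing rule above can be sketched as a small shell function. This is only an illustration: route_input is a hypothetical helper, not part of the skill, and it assumes that any .txt path is a script already prepared by create-script.

```shell
# Hypothetical helper: decide which skill should handle an input.
# Assumption: a local .txt path is a prepared script; URLs and every
# other extension count as raw content that needs create-script first.
route_input() {
  case "$1" in
    http://*|https://*) echo "create-script" ;;  # URLs are raw content
    *.txt)              echo "voiceover" ;;      # prepared script
    *)                  echo "create-script" ;;  # markdown, articles, notes
  esac
}

route_input archive/entry-013.txt   # prints: voiceover
route_input draft.md                # prints: create-script
```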
Project Location
Chatterbox Directory: ~/projects/chatterbox (configure to your setup)
- Script files (.txt): ~/projects/chatterbox/archive/
- Output files (.mp3): ~/projects/chatterbox/archive/
- Voice reference: ~/projects/chatterbox/clone.wav
- Log file: ~/projects/chatterbox/voiceover.log
CLI Arguments
| Argument | Default | Description |
|---|---|---|
| -i, --input | article.txt | Input text file (use .txt from create-script) |
| -o, --output | <input>.mp3 | Output MP3 file (auto-generated if omitted) |
| -v, --voice | clone.wav | Voice reference for cloning |
| -e, --entry | none | Journal entry name (e.g., entry-011) for frontmatter update |
| --deploy | off | Copy MP3 to site public/audio/ after generation |
| --push | off | Git add, commit, and push to remote (implies --deploy) |
| -m, --message | auto | Custom git commit message |
| --preflight | off | Run pre-flight checks only (no generation) |
Instructions
Step 1: Verify the Input File Exists
Ensure the input .txt file exists in the archive/ directory:
ls ~/projects/chatterbox/archive/entry-XXX.txt
If the user provides raw markdown or content, STOP and use create-script first.
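A slightly stricter version of the Step 1 check might look like the sketch below. check_script is a hypothetical helper, and the extra rules (a .txt extension and a non-empty file) are assumptions layered on top of the plain ls check.

```shell
# Hypothetical pre-launch guard; not part of the skill itself.
# Returns 0 if the path ends in .txt and names a non-empty regular file,
# 2 if the extension is wrong, 1 if the file is missing or empty.
check_script() {
  case "$1" in
    *.txt) ;;
    *) echo "not a .txt script: $1" >&2; return 2 ;;
  esac
  [ -f "$1" ] || { echo "missing: $1" >&2; return 1; }
  [ -s "$1" ] || { echo "empty: $1" >&2; return 1; }
  return 0
}

# Usage (path follows the layout described under Project Location):
#   check_script ~/projects/chatterbox/archive/entry-013.txt && echo "ready"
```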
Step 2: Launch in Background
CRITICAL: Use uv run from the chatterbox root directory.
cd ~/projects/chatterbox && nohup uv run python archive/voiceover_script.py \
-i archive/entry-XXX.txt \
-o archive/entry-XXX.mp3 \
--entry entry-XXX \
--push > voiceover.log 2>&1 &
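Because the entry name appears three times in the launch command, one option is to generate the command from a single argument. The sketch below only prints the command and runs nothing. voiceover_cmd is a hypothetical helper, and the character whitelist for entry names is an assumption.

```shell
# Hypothetical helper: print the Step 2 command for one entry so the
# entry name is typed once. Nothing is executed here.
voiceover_cmd() {
  entry="$1"
  case "$entry" in
    ''|*[!A-Za-z0-9_-]*)
      # Assumption: entry names use only letters, digits, "_" and "-".
      echo "invalid entry name: $entry" >&2
      return 1 ;;
  esac
  printf 'uv run python archive/voiceover_script.py -i archive/%s.txt -o archive/%s.mp3 --entry %s --push\n' \
    "$entry" "$entry" "$entry"
}

voiceover_cmd entry-013
```

Since validated names contain no whitespace, the printed command could be launched as in Step 2 with `cd ~/projects/chatterbox && nohup $(voiceover_cmd entry-013) > voiceover.log 2>&1 &`.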
Step 3: Verify It Started (ONE CHECK ONLY)
Wait briefly and check the log once:
sleep 5 && head -10 ~/projects/chatterbox/voiceover.log
Expected output:
Using device: cuda
Loading model...
Fetching 10 files: 100%|██████████| 10/10 [00:00<?, ?it/s]
CRITICAL: DO NOT poll for progress repeatedly. This floods the context window. Trust the script to complete.
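The single startup check can also be reduced to a yes/no answer so that no log text enters the context at all. This is a sketch: started_ok is a hypothetical helper, and the marker strings are taken from the expected output above on the assumption that the script keeps printing them.

```shell
# Hypothetical one-shot startup check. Reads only the first 10 lines of
# the log and succeeds if either startup marker is present.
started_ok() {
  head -n 10 "$1" 2>/dev/null | grep -q -e 'Using device:' -e 'Loading model'
}

# Usage (run once, then move on):
#   sleep 5 && started_ok ~/projects/chatterbox/voiceover.log && echo "started"
```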
Step 4: Inform the User and Move On
Tell the user:
- Voiceover generation launched in background
- Input: archive/entry-XXX.txt
- Output: archive/entry-XXX.mp3
- Will auto-deploy and push when complete
- Desktop notification will appear when done
- Monitor (optional): tail -f ~/projects/chatterbox/voiceover.log
Then you are DONE with this task. Do not wait for completion or check progress again.
What the Script Does (Fire and Forget)
When --push is used, the script automatically:
- Generates the MP3 with voice cloning
- Copies MP3 to your-site/public/audio/
- Updates journal frontmatter with audioUrl
- Runs git pull to sync
- Stages audio file and journal entry
- Commits with message: "Add entry-XXX with audio narration"
- Pushes to GitHub
- Sends desktop notification
You don't need to monitor any of this. The script is self-contained.
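As an illustration of how the commit message is chosen, the sketch below combines the automatic message listed above with the -m, --message override from the CLI table. commit_message is a hypothetical function, and the script's actual logic may differ.

```shell
# Hypothetical sketch of the commit-message choice: a custom message
# (as passed with -m) wins, otherwise the automatic form is used.
commit_message() {
  entry="$1"
  custom="${2:-}"
  if [ -n "$custom" ]; then
    printf '%s\n' "$custom"
  else
    printf 'Add %s with audio narration\n' "$entry"
  fi
}

commit_message entry-013                    # prints: Add entry-013 with audio narration
commit_message entry-013 "Re-record audio"  # prints: Re-record audio
```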
Examples
Example 1: Full workflow with push
cd ~/projects/chatterbox && nohup uv run python archive/voiceover_script.py \
-i archive/entry-013.txt \
-o archive/entry-013.mp3 \
--entry entry-013 \
--push > voiceover.log 2>&1 &
Then verify started:
sleep 5 && head -10 voiceover.log
Done. Move on.
Example 2: Deploy only (no push)
cd ~/projects/chatterbox && nohup uv run python archive/voiceover_script.py \
-i archive/entry-010.txt \
-o archive/entry-010.mp3 \
--deploy > voiceover.log 2>&1 &
Example 3: Generation only
cd ~/projects/chatterbox && nohup uv run python archive/voiceover_script.py \
-i archive/my_script.txt \
-o archive/final_audio.mp3 > voiceover.log 2>&1 &
Example 4: Pre-flight check
cd ~/projects/chatterbox && uv run python archive/voiceover_script.py --preflight
Output Specifications
The script produces an MP3 file with:
- 192kbps bitrate
- Normalized to -19 LUFS
- 0.5 second gaps between chunks
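These figures allow a rough size and timing estimate before generation. The arithmetic below is a sketch under two assumptions that the skill does not state: the MP3 is constant bitrate, and gaps are inserted only between chunks, not at the start or end. Both function names are hypothetical.

```shell
# 192 kbps = 192000 bits per second = 24000 bytes per second.
# Assumption: constant bitrate; headers and tags are ignored.
estimate_mp3_bytes() {
  seconds="$1"
  echo $(( seconds * 192000 / 8 ))
}

# Silence added by 0.5 s gaps, in tenths of a second (integer math).
# Assumption: one gap between each pair of adjacent chunks.
gap_tenths() {
  chunks="$1"
  [ "$chunks" -gt 0 ] || { echo 0; return; }
  echo $(( (chunks - 1) * 5 ))
}

estimate_mp3_bytes 600   # prints: 14400000 (about 14.4 MB for 10 minutes)
gap_tenths 21            # prints: 100 (10.0 seconds of silence)
```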
Desktop Notifications
The script sends notifications via notify-send:
- Success with push: "Voiceover Complete - [file] generated, deployed, and pushed to GitHub!"
- Success with deploy: "Voiceover Complete - [file] generated and deployed"
- Success (generation only): "Voiceover Complete - [file] generated successfully!"
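The three messages map directly onto the run mode. The sketch below shows that mapping. notification_body is a hypothetical function, and splitting the text into a "Voiceover Complete" summary plus a body is an assumption about how the script calls notify-send.

```shell
# Hypothetical mapping from run mode to the notification body text.
notification_body() {
  mode="$1"
  file="$2"
  case "$mode" in
    push)   printf '%s generated, deployed, and pushed to GitHub!\n' "$file" ;;
    deploy) printf '%s generated and deployed\n' "$file" ;;
    *)      printf '%s generated successfully!\n' "$file" ;;
  esac
}

# notify-send takes a summary followed by an optional body, for example:
#   notify-send "Voiceover Complete" "$(notification_body push entry-013.mp3)"
```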
Troubleshooting
| Error | Solution |
|---|---|
| ModuleNotFoundError: No module named 'chatterbox' | Check pyproject.toml package configuration |
| No such file or directory | Verify input file path and existence |
| CUDA out of memory | Reduce chunk size or run on CPU (slower) |
| pydub.exceptions.CouldntEncodeError | Install ffmpeg: sudo apt install ffmpeg |
| Git push fails | Check for uncommitted changes or network issues |
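If a run fails, the table can be applied to the log in one pass instead of reading the log by hand. The sketch below is a hypothetical helper that matches the error strings from the table as fixed text. It assumes those strings appear verbatim in voiceover.log and does not cover the git push case.

```shell
# Hypothetical one-pass log triage: prints the suggested fix for the
# first known error string found in the log.
diagnose_log() {
  log="$1"
  if grep -qF "No module named 'chatterbox'" "$log"; then
    echo "Check pyproject.toml package configuration"
  elif grep -qF "CUDA out of memory" "$log"; then
    echo "Reduce chunk size or run on CPU (slower)"
  elif grep -qF "CouldntEncodeError" "$log"; then
    echo "Install ffmpeg: sudo apt install ffmpeg"
  elif grep -qF "No such file or directory" "$log"; then
    echo "Verify input file path and existence"
  else
    echo "No known error found"
  fi
}

# Usage (once, after a failed run):
#   diagnose_log ~/projects/chatterbox/voiceover.log
```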
Quick Reference
# Launch voiceover with push
cd ~/projects/chatterbox && nohup uv run python archive/voiceover_script.py \
-i archive/entry-XXX.txt \
-o archive/entry-XXX.mp3 \
--entry entry-XXX \
--push > voiceover.log 2>&1 &
# Verify started (ONE CHECK ONLY)
sleep 5 && head -10 voiceover.log
# DONE - do not poll for progress
Critical Reminders
- ONE startup check only - do not poll for progress
- Fire and forget - trust the script to complete
- Desktop notification - user will know when done
- Don't flood context - repeated log checks waste tokens