Agent skill
elevenlabs-tts
This skill converts text to high-quality audio files using the ElevenLabs API. Use this skill when users request text-to-speech generation, audio narration, or voice synthesis with customizable voice parameters (stability, similarity boost) and voice presets (rachel, adam, bella, elli, josh, arnold, ava).
Install this agent skill to your Project
npx add-skill https://github.com/glebis/claude-skills/tree/main/elevenlabs-tts
SKILL.md
ElevenLabs Text-to-Speech
Overview
Generate professional audio files from text using ElevenLabs' advanced text-to-speech API. The skill provides pre-configured voice presets with sensible defaults, voice parameter customization, and direct access to the scripts/elevenlabs_tts.py script for programmatic control.
Quick Start
To generate audio from text:
- Ensure the .env file contains a valid ELEVENLABS_API_KEY
- Execute the script with text:
  python scripts/elevenlabs_tts.py "Your text here"
- Specify voice and output:
  python scripts/elevenlabs_tts.py "Text" --voice adam --output audio/output.mp3
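Before running the script, it can help to confirm that the API key is actually reachable. The following is a minimal sketch of how a key might be read from a .env file using only the standard library. The helper names `read_env_file` and `get_api_key` are hypothetical and are not taken from scripts/elevenlabs_tts.py, whose own loading logic may differ.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(env_path=".env"):
    """Return the key from the process environment or the .env file, else raise."""
    key = os.environ.get("ELEVENLABS_API_KEY") or read_env_file(env_path).get(
        "ELEVENLABS_API_KEY"
    )
    if not key:
        raise RuntimeError("ELEVENLABS_API_KEY is not set in the environment or .env")
    return key
```

Checking the process environment first is a design assumption here; it lets a key exported in the shell take precedence over the file.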
Voice Presets
Seven pre-configured voices are available. See references/api_reference.md for complete voice descriptions:
- rachel (default) - Clear, professional female
- adam - Deep, authoritative male
- bella - Warm, friendly female
- elli - Young, enthusiastic female
- josh - Friendly, conversational male
- arnold - Deep, powerful male
- ava - Expressive, dynamic female
Parameters
Text
The text to convert to speech. Any length is supported.
Voice Selection
Specify the voice using a preset name (e.g., rachel, adam) or a direct ElevenLabs voice ID.
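Accepting either a preset name or a raw voice ID can be handled with a simple lookup that falls back to the input. This is a sketch only: the `VOICE_PRESETS` table and `resolve_voice` function are hypothetical names, the ID strings are placeholders rather than real ElevenLabs voice IDs, and case-insensitive preset matching is an assumption.

```python
# Placeholder values: substitute the real ElevenLabs voice ID for each preset.
VOICE_PRESETS = {
    "rachel": "VOICE_ID_FOR_RACHEL",
    "adam": "VOICE_ID_FOR_ADAM",
    "bella": "VOICE_ID_FOR_BELLA",
    "elli": "VOICE_ID_FOR_ELLI",
    "josh": "VOICE_ID_FOR_JOSH",
    "arnold": "VOICE_ID_FOR_ARNOLD",
    "ava": "VOICE_ID_FOR_AVA",
}
DEFAULT_VOICE = "rachel"


def resolve_voice(voice=None):
    """Map a preset name to its voice ID; pass unknown values through as IDs."""
    if not voice:
        voice = DEFAULT_VOICE
    cleaned = voice.strip()
    # Presets are matched case-insensitively; a raw ID keeps its original case.
    return VOICE_PRESETS.get(cleaned.lower(), cleaned)
```

With this shape, `resolve_voice("adam")` returns the preset's ID, while any string that is not a preset name is treated as a voice ID and returned unchanged.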
Voice Parameters
- stability (0.0-1.0, default 0.5): Lower values create expressive variation; higher values ensure consistency
- similarity_boost (0.0-1.0, default 0.75): Higher values maintain closer adherence to voice characteristics
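Both parameters share the same 0.0-1.0 range, so they can be validated together before a request is made. The sketch below assumes a hypothetical helper named `build_voice_settings`; whether the actual script validates, clamps, or passes values through unchanged is not stated in this document.

```python
def build_voice_settings(stability=0.5, similarity_boost=0.75):
    """Validate both parameters and return them as a settings dict."""
    for name, value in (
        ("stability", stability),
        ("similarity_boost", similarity_boost),
    ):
        # Reject out-of-range values instead of silently clamping them.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return {
        "stability": float(stability),
        "similarity_boost": float(similarity_boost),
    }
```

Raising an error rather than clamping is a design choice in this sketch: it surfaces typos such as `--stability 7` instead of quietly producing a maximally consistent voice.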
Output
Specify the output file path. Default is output.mp3. Directories are created automatically.
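Automatic directory creation is straightforward with pathlib. The following sketch shows one way the behavior described above could work; `save_audio` is a hypothetical function name, and the script's own implementation may differ.

```python
from pathlib import Path


def save_audio(audio_bytes, output_path="output.mp3"):
    """Write audio bytes to output_path, creating parent directories as needed."""
    path = Path(output_path).expanduser().resolve()
    # parents=True creates nested directories; exist_ok=True tolerates reruns.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio_bytes)
    return str(path)
```

Calling `save_audio(data, "audio/narration.mp3")` would create the audio directory if it is missing and return the absolute path of the written file.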
Usage Examples
Basic Python Usage
from scripts.elevenlabs_tts import generate_speech

path = generate_speech(
    text="Hello, this is a test message",
    voice_id="rachel"
)
Command Line
# With default voice
python scripts/elevenlabs_tts.py "Generate this text"
# With custom voice and stability
python scripts/elevenlabs_tts.py "Different voice" --voice adam --stability 0.7
# To custom output path
python scripts/elevenlabs_tts.py "Save here" --output audio/narration.mp3
# List available voices
python scripts/elevenlabs_tts.py "" --list-voices
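The flags used in the commands above could be declared with argparse roughly as follows. This is a sketch under assumptions: `build_parser` is a hypothetical name, only the flags that appear in this document are included, and the defaults mirror the values stated in the Parameters section rather than the script's source.

```python
import argparse


def build_parser():
    """Build a parser covering the flags shown in the command-line examples."""
    parser = argparse.ArgumentParser(
        description="Convert text to speech with ElevenLabs."
    )
    parser.add_argument("text", help="Text to convert to speech")
    parser.add_argument("--voice", default="rachel", help="Preset name or voice ID")
    parser.add_argument("--stability", type=float, default=0.5, help="0.0 to 1.0")
    parser.add_argument("--output", default="output.mp3", help="Output file path")
    parser.add_argument(
        "--list-voices", action="store_true", help="List available voices and exit"
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Because `text` is a required positional argument in this sketch, listing voices still needs a placeholder such as the empty string, which matches the `"" --list-voices` invocation shown above.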
Implementation Notes
- The script handles API communication with error reporting
- Output directories are created automatically if they don't exist
- Returns the absolute path to the generated audio file
- Uses the eleven_monolingual_v1 model by default (can be overridden)
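To make the notes above concrete, here is a sketch of how a request to the ElevenLabs text-to-speech endpoint could be assembled with the standard library. The endpoint path, the xi-api-key header, and the voice_settings body follow the public ElevenLabs REST API as commonly documented; `build_tts_request` is a hypothetical name, and nothing here is taken from the script itself, which may use a different HTTP client.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(
    text,
    voice_id,
    api_key,
    model_id="eleven_monolingual_v1",
    stability=0.5,
    similarity_boost=0.75,
):
    """Build (but do not send) a POST request for one text-to-speech call."""
    payload = {
        "text": text,
        "model_id": model_id,  # overridable, as noted above
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        f"{API_BASE}/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send the request and yield the audio bytes on success; a failed call raises `urllib.error.HTTPError`, which is one place error reporting could hook in.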
Resources
- scripts/elevenlabs_tts.py - Main Python script for text-to-speech generation. Can be imported as a module or executed from the command line.
- references/api_reference.md - Detailed API documentation including voice descriptions, parameter explanations, and usage examples.
- .env and .env.example - Environment configuration for storing API credentials securely.