media-understanding
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/marketing/media-understanding
Media Understanding
Audio Files → faster-whisper (local)
For mp3, wav, m4a, flac, ogg, aac files:
faster-whisper "path/to/audio.mp3" -o /tmp --model large-v3
Options
| Option | Description |
|---|---|
| `-o DIR` | Output directory for .srt file |
| `--model SIZE` | tiny, base, small, medium, large-v3 (default: large-v3) |
| `--language LANG` | Force language (auto-detected by default) |
| `--task transcribe` | Transcribe in original language (default) |
| `--task translate` | Translate to English |
| `--word_timestamps true` | Include word-level timing |
Output: an SRT subtitle file written to the output directory given with -o.
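Because the transcript arrives as an SRT file, downstream steps usually need it as structured segments. The following is a minimal sketch, assuming the standard SRT layout (an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, then one or more text lines, with cues separated by blank lines). The names `Segment` and `parse_srt` are hypothetical helpers and not part of faster-whisper or this skill.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Segment:
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    text: str


def _seconds(hours: str, minutes: str, secs: str, millis: str) -> float:
    """Convert the four captured timestamp fields to seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(secs) + int(millis) / 1000


def parse_srt(content: str) -> List[Segment]:
    """Parse SRT text into segments (hypothetical helper, assumes standard SRT)."""
    segments = []
    normalized = content.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                fields = match.groups()
                # Everything after the timing line is cue text; join wrapped lines.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                segments.append(
                    Segment(_seconds(*fields[:4]), _seconds(*fields[4:]), text)
                )
                break
    return segments
```

To use it, read the .srt file from the output directory and pass its text in, for example `parse_srt(Path(srt_path).read_text(encoding="utf-8"))`, where `srt_path` is wherever the file was written.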
Video Files → Gemini (visual + audio)
For mp4, mov, webm, avi, mkv files or YouTube URLs:
uv run ~/.claude/skills/media-understanding/scripts/understand_video.py \
--source "path/to/video.mp4" \
--prompt "Describe what happens in this video"
Options
| Option | Description |
|---|---|
| `--fast` | Use faster flash model |
| `--fps N` | Frame rate sampling (default: 1 fps) |
| `--start N` | Start time in seconds |
| `--end N` | End time in seconds |
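When the script is called from other code, it can help to assemble the argument list in one place. The following is a rough sketch that uses only the flags listed above. The function name `build_video_command` is hypothetical, and the range checks (non-negative start, end after start, positive fps) are assumptions about sensible input, not documented behaviour of `understand_video.py`.

```python
import os
from typing import List, Optional

# Script location copied from the command shown above.
SCRIPT = "~/.claude/skills/media-understanding/scripts/understand_video.py"


def build_video_command(
    source: str,
    prompt: str,
    fast: bool = False,
    fps: Optional[float] = None,
    start: Optional[float] = None,
    end: Optional[float] = None,
) -> List[str]:
    """Build the argv list for understand_video.py (hypothetical helper)."""
    # Assumed sanity checks; the script itself may validate differently.
    if fps is not None and fps <= 0:
        raise ValueError("fps must be positive")
    if start is not None and start < 0:
        raise ValueError("start must be non-negative")
    if start is not None and end is not None and end <= start:
        raise ValueError("end must be greater than start")

    cmd = [
        "uv", "run", os.path.expanduser(SCRIPT),
        "--source", source,
        "--prompt", prompt,
    ]
    if fast:
        cmd.append("--fast")
    if fps is not None:
        cmd += ["--fps", str(fps)]
    if start is not None:
        cmd += ["--start", str(start)]
    if end is not None:
        cmd += ["--end", str(end)]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing a list instead of a shell string avoids quoting problems when the prompt contains spaces or quotes.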
Example Prompts
- "Summarize this video in 3 bullet points"
- "Transcribe all spoken dialogue with timestamps"
- "What text appears on screen?"
- "Describe the main actions and events"
API Key
The Gemini video path requires the GEMINI_API_KEY environment variable to be set.
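Only the video path needs the key, so a wrapper might pick the tool first and check the environment only when Gemini is involved. The following is a minimal sketch that routes on the file extensions listed in the two sections above. The function names and the set of YouTube hostnames are assumptions made for illustration and are not part of the skill.

```python
import os
from pathlib import PurePath
from typing import Mapping, Optional
from urllib.parse import urlparse

# Extensions taken from the audio and video sections of this document.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm", ".avi", ".mkv"}
# Assumed set of YouTube hostnames; adjust as needed.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def pick_tool(source: str) -> str:
    """Return "faster-whisper" for audio and "gemini" for video or YouTube."""
    if urlparse(source).netloc.lower() in YOUTUBE_HOSTS:
        return "gemini"
    ext = PurePath(source).suffix.lower()
    if ext in AUDIO_EXTENSIONS:
        return "faster-whisper"
    if ext in VIDEO_EXTENSIONS:
        return "gemini"
    raise ValueError(f"unsupported media source: {source!r}")


def preflight(source: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the tool and fail early if Gemini is needed but no key is set."""
    env = os.environ if env is None else env
    tool = pick_tool(source)
    if tool == "gemini" and not env.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is not set; it is required for video")
    return tool
```

Calling `preflight("talk.mp3")` succeeds without a key because audio is transcribed locally, while `preflight("clip.mp4")` raises `RuntimeError` unless `GEMINI_API_KEY` is present in the environment.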