Agent skill
youtube-transcribe-skill
Extract subtitles/transcripts from a YouTube video URL and save as a local file. Use when you need to extract subtitles from a YouTube video.
Install this agent skill to your Project
npx add-skill https://github.com/feiskyer/codex-settings/tree/main/skills/youtube-transcribe-skill
SKILL.md
YouTube Transcript Extraction
Extract subtitles/transcripts from a YouTube video URL and save them as a local file.
Input YouTube URL: $ARGUMENTS
Step 1: Verify URL and Get Video Information
- Verify URL Format: Confirm the input is a valid YouTube URL (supports the `youtube.com/watch?v=` and `youtu.be/` formats).
- Get Video Information:
  - If `yt-dlp` is available, prefer `yt-dlp --get-title "[VIDEO_URL]"`.
  - If using browser automation, extract the title from the page (via snapshot or `document.title`) for file naming.
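The URL check in Step 1 can be sketched as a small pure function. This is an illustrative sketch, not part of the skill: `extractVideoId` is a hypothetical helper name, and the 11-character ID pattern is an assumption about how YouTube video IDs usually look.

```javascript
// Hypothetical helper: returns the video ID for the two URL shapes named in
// Step 1 (youtube.com/watch?v=... and youtu.be/...), or null for anything else.
function extractVideoId(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not a parseable URL at all
  }
  // Treat "www." and "m." hosts the same as the bare domain.
  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  let id = null;
  if (host === "youtube.com" && url.pathname === "/watch") {
    id = url.searchParams.get("v");
  } else if (host === "youtu.be") {
    id = url.pathname.slice(1);
  }
  // Assumption: IDs are 11 characters drawn from [A-Za-z0-9_-].
  return id && /^[A-Za-z0-9_-]{11}$/.test(id) ? id : null;
}
```

A `null` result would mean the input should be rejected before any tool is invoked.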
Step 2: CLI Quick Extraction (Priority Attempt)
Use command-line tools to quickly extract subtitles.
- Check Tool Availability: Execute `which yt-dlp`.
  - If `yt-dlp` is found, proceed to subtitle download.
  - If `yt-dlp` is NOT found, skip immediately to Step 3.
- Execute Subtitle Download (only if `yt-dlp` is found):
  - Tip: Always add `--cookies-from-browser` to avoid sign-in restrictions. Default to `chrome`.
  - Retry Logic: If `yt-dlp` fails with a browser error (e.g., "Could not open Chrome"), ask the user to specify their available browser (e.g., `firefox`, `safari`, `edge`) and retry.

      # Get the title first (try chrome first)
      yt-dlp --cookies-from-browser=chrome --get-title "[VIDEO_URL]"
      # Download subtitles
      yt-dlp --cookies-from-browser=chrome --write-auto-sub --write-sub --sub-lang zh-Hans,zh-Hant,en --skip-download --output "<Video Title>.%(ext)s" "[VIDEO_URL]"
- Verify Results:
  - Check the command exit code.
  - Exit code 0 (Success): Subtitles have been saved locally; the task is complete.
  - Non-zero exit code (Failure):
    - If the error is related to the browser or cookies, ask the user for the correct browser and retry Step 2.
    - For other errors (e.g., video unavailable), proceed to Step 3.
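The download command and the exit-code branching in Step 2 could be modeled roughly as follows. Both function names are hypothetical. The flags are the ones listed in this step, but the stderr patterns in `nextAction` are an assumption about what a browser or cookie failure looks like, not documented `yt-dlp` output.

```javascript
// Build the argument list for the subtitle download in Step 2.
// `title` and `browser` are supplied by the caller.
function buildSubtitleArgs(videoUrl, title, browser = "chrome") {
  return [
    `--cookies-from-browser=${browser}`,
    "--write-auto-sub",
    "--write-sub",
    "--sub-lang", "zh-Hans,zh-Hant,en",
    "--skip-download",
    "--output", `${title}.%(ext)s`,
    videoUrl,
  ];
}

// Hypothetical classifier for the "Verify Results" branch.
// Assumption: browser/cookie failures mention "cookie" or "could not open"
// somewhere in stderr; adjust the pattern to the messages actually observed.
function nextAction(exitCode, stderr = "") {
  if (exitCode === 0) return "done";
  if (/cookies?|could not open/i.test(stderr)) return "ask-browser-and-retry";
  return "fallback-to-step-3";
}
```

Passing the array to something like `execFile("yt-dlp", args)` from `node:child_process` avoids shell quoting problems with titles that contain spaces or quotes.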
Step 3: Browser Automation (Fallback)
When the CLI method fails or yt-dlp is missing, use browser UI automation to extract subtitles.
- Check Tool Availability:
  - Check if `chrome-devtools-mcp` tools (specifically `mcp__chrome__new_page`) are available.
  - CRITICAL CHECK: If `chrome-devtools-mcp` is NOT available AND `yt-dlp` was NOT found in Step 2:
    - STOP execution.
    - Notify the User: "Unable to proceed. Please either install `yt-dlp` (for fast CLI extraction) OR configure `chrome-devtools-mcp` (for browser automation)."
- Initialize Browser Session (if tools are available): Call `mcp__chrome__new_page` to open the video URL.
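The availability check at the start of Step 3 amounts to a small decision table. The sketch below uses a hypothetical `step3Gate` function. The skill does not say what to do when `yt-dlp` was found but failed and no browser tools exist, so that branch is an assumption.

```javascript
const STOP_MESSAGE =
  "Unable to proceed. Please either install yt-dlp (for fast CLI extraction) " +
  "OR configure chrome-devtools-mcp (for browser automation).";

// Hypothetical gate for entering the browser fallback.
function step3Gate({ ytDlpFound, chromeMcpAvailable }) {
  if (chromeMcpAvailable) return { proceed: true };
  if (!ytDlpFound) return { proceed: false, message: STOP_MESSAGE };
  // Not defined by the skill: yt-dlp exists but failed, and there are no
  // browser tools. Assumption: report the CLI failure instead of continuing.
  return {
    proceed: false,
    message: "yt-dlp failed and chrome-devtools-mcp is not available.",
  };
}
```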
3.2 Analyze Page State
Call mcp__chrome__take_snapshot to read the page accessibility tree.
3.3 Expand Video Description
Reason: The "Show transcript" button is usually hidden within the collapsed description area.
- Search the snapshot for a button labeled "...more", "...更多", or "Show more" (usually located in the description block below the video title).
- Call `mcp__chrome__click` to click that button.
3.4 Open Transcript Panel
- Call `mcp__chrome__take_snapshot` to get the updated UI snapshot.
- Search for a button labeled "Show transcript", "显示转录稿", or "内容转文字".
- Call `mcp__chrome__click` to click that button.
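The label searches in 3.3 and 3.4 follow the same pattern: scan the snapshot for a button whose name matches one of several localized labels. The sketch below assumes the snapshot has already been parsed into a list of nodes with hypothetical `uid`, `role`, and `name` fields. The real snapshot format returned by the tool may differ.

```javascript
// Labels taken from steps 3.3 and 3.4.
const MORE_LABELS = ["...more", "...更多", "Show more"];
const TRANSCRIPT_LABELS = ["Show transcript", "显示转录稿", "内容转文字"];

// Normalize a label: map the single-character ellipsis to three dots and trim.
function normalizeLabel(name) {
  return name.replace(/\u2026/g, "...").trim();
}

// Hypothetical node shape: { uid: string, role: string, name: string }.
// Returns the uid of the first matching button, or null if none is found.
function findButtonUid(nodes, labels) {
  const match = nodes.find(
    (node) => node.role === "button" && labels.includes(normalizeLabel(node.name))
  );
  return match ? match.uid : null;
}
```

A `null` result for the transcript labels would suggest either that the description has not been expanded yet or that the video has no transcript.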
3.5 Extract Content via DOM
Reason: Directly reading the accessibility tree for long lists is slow and consumes many tokens; DOM injection is more efficient.
Call mcp__chrome__evaluate_script to execute the following JavaScript:
() => {
  // Select all transcript segment containers
  const segments = document.querySelectorAll("ytd-transcript-segment-renderer");
  if (!segments.length) return "BUFFERING"; // Retry if empty
  // Iterate and format as "timestamp text"; fall back to "" if a node is missing
  return Array.from(segments)
    .map((seg) => {
      const time = seg.querySelector(".segment-timestamp")?.innerText.trim() ?? "";
      const text = seg.querySelector(".segment-text")?.innerText.trim() ?? "";
      return `${time} ${text}`.trim();
    })
    .join("\n");
}
If it returns "BUFFERING", wait a few seconds and retry.
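The wait-and-retry rule can be sketched as a bounded polling loop. Here `evaluate` is a hypothetical async callback standing in for the `mcp__chrome__evaluate_script` call, and the attempt count and delay are illustrative defaults rather than values from the skill.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `evaluate` until it returns something other than "BUFFERING",
// waiting `delayMs` between attempts and giving up after `attempts` tries.
async function pollTranscript(evaluate, { attempts = 5, delayMs = 2000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const result = await evaluate();
    if (result !== "BUFFERING") return result;
    if (i < attempts - 1) await sleep(delayMs);
  }
  throw new Error(`Transcript still buffering after ${attempts} attempts`);
}
```

Bounding the retries keeps the agent from looping forever on a video whose transcript panel never loads.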
3.6 Save and Cleanup
- Use the Write tool to save the extracted text as a local file (e.g., `<Video Title>.txt`).
- Call `mcp__chrome__close_page` to release resources.
Output Requirements
- Save the subtitle file to the current working directory.
- Filename format: `<Video Title>.txt`
- File content format: Each line should be `Timestamp Subtitle Text`.
- Report upon completion: File path, subtitle language, total number of lines.
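The output requirements can be illustrated with two small helpers. Both names are hypothetical. Replacing unsafe characters in the title is an assumption added here, since video titles often contain characters such as `/` or `?` that are not valid in file names on every platform.

```javascript
// Turn a video title into "<Video Title>.txt", replacing characters that are
// unsafe in file names on common platforms (assumption, not part of the skill).
function toFileName(title) {
  const safe = title.replace(/[\\/:*?"<>|]/g, "_").trim();
  return `${safe || "transcript"}.txt`;
}

// Summarize the result for the completion report: file name, language, and
// the number of non-empty "Timestamp Subtitle Text" lines.
function buildReport(title, transcript, language) {
  const lines = transcript.split("\n").filter((line) => line.trim() !== "");
  return { file: toFileName(title), language, lineCount: lines.length };
}
```

The `file` value is relative, so it resolves against the current working directory when the file is written.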