Agent skill

youtube-transcribe-skill

Extract subtitles/transcripts from a YouTube video URL and save as a local file. Use when you need to extract subtitles from a YouTube video.


Stars 168

Forks 27

Install this agent skill in your project

npx add-skill https://github.com/feiskyer/codex-settings/tree/main/skills/youtube-transcribe-skill

SKILL.md

YouTube Transcript Extraction

Extract subtitles/transcripts from a YouTube video URL and save them as a local file.

Input YouTube URL: $ARGUMENTS

Step 1: Verify URL and Get Video Information

Verify URL Format: Confirm the input is a valid YouTube URL (supports youtube.com/watch?v= or youtu.be/ formats).
Get Video Information:
- If yt-dlp is available, prefer yt-dlp --get-title "[VIDEO_URL]".
- If using browser automation, extract the title from the page (via snapshot or document.title) for file naming.
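
The URL check in Step 1 could be sketched as follows. This is a minimal illustration, not part of the skill itself: the function name `parseYouTubeVideoId` is hypothetical, and the 11-character ID pattern is an assumption based on how video IDs commonly look rather than a documented guarantee.

```javascript
// Sketch: accept only youtube.com/watch?v=... and youtu.be/... URLs.
// Returns the video ID, or null when the URL is not in a supported format.
function parseYouTubeVideoId(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not a parseable absolute URL
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") return null;

  // Treat www. and m. hosts the same as the bare domain.
  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  let id = null;
  if (host === "youtube.com" && url.pathname === "/watch") {
    id = url.searchParams.get("v");
  } else if (host === "youtu.be") {
    id = url.pathname.slice(1).split("/")[0];
  }

  // Assumption: IDs are 11 characters from [A-Za-z0-9_-].
  return id && /^[A-Za-z0-9_-]{11}$/.test(id) ? id : null;
}
```

A `null` result means the input should be rejected before any tool is invoked.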

Step 2: CLI Quick Extraction (Priority Attempt)

Use command-line tools to quickly extract subtitles.

Check Tool Availability: Execute which yt-dlp.
- If yt-dlp is found, proceed to subtitle download.
- If yt-dlp is NOT found, skip immediately to Step 3.
Execute Subtitle Download (Only if yt-dlp is found):
- Tip: Always add --cookies-from-browser to avoid sign-in restrictions. Default to chrome.
- Retry Logic: If yt-dlp fails with a browser error (e.g., "Could not open Chrome"), ask the user to specify their available browser (e.g., firefox, safari, edge) and retry.
```bash
# Get the title first (try chrome first)
yt-dlp --cookies-from-browser=chrome --get-title "[VIDEO_URL]"

# Download subtitles
yt-dlp --cookies-from-browser=chrome --write-auto-sub --write-sub --sub-lang zh-Hans,zh-Hant,en --skip-download --output "<Video Title>.%(ext)s" "[VIDEO_URL]"
```
Verify Results:
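- Check the command exit code.
- Exit code 0 (Success): Confirm that a subtitle file was actually written, because yt-dlp can exit 0 with only a warning when no subtitles exist in the requested languages. If a file exists, the subtitles have been saved locally and the task is complete; if not, proceed to Step 3.
- Exit code non-0 (Failure):
  - If the error is related to the browser or cookies, ask the user for the correct browser and retry Step 2.
  - If the error is something else (e.g., video unavailable), proceed to Step 3.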
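
The download command and the result check above can be expressed as two small functions. This is a sketch under stated assumptions: `buildSubtitleArgs` and `nextAction` are hypothetical names, the stderr pattern is a guess at how browser or cookie failures are worded, and the `subtitleWritten` flag is an extra guard for runs that exit 0 without writing a file.

```javascript
// Build the yt-dlp argument list used in Step 2 (same flags as the command above).
function buildSubtitleArgs(videoUrl, title, browser = "chrome") {
  return [
    `--cookies-from-browser=${browser}`,
    "--write-auto-sub",
    "--write-sub",
    "--sub-lang", "zh-Hans,zh-Hant,en",
    "--skip-download",
    "--output", `${title}.%(ext)s`,
    videoUrl,
  ];
}

// Decide what to do after yt-dlp exits.
function nextAction(exitCode, stderr, subtitleWritten = true) {
  if (exitCode === 0) {
    // Assumption: the caller checked whether a subtitle file now exists.
    return subtitleWritten ? "done" : "browser-fallback";
  }
  // Assumption: browser/cookie failures mention one of these phrases.
  if (/cookie|browser|could not open/i.test(stderr)) {
    return "ask-browser-and-retry";
  }
  return "browser-fallback"; // e.g. video unavailable
}
```

In Node.js, the arguments could be passed to `child_process.spawnSync("yt-dlp", args, { encoding: "utf8" })`, whose result exposes `status` and `stderr` for `nextAction`.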

Step 3: Browser Automation (Fallback)

When the CLI method fails or yt-dlp is missing, use browser UI automation to extract subtitles.

Check Tool Availability:
- Check if chrome-devtools-mcp tools (specifically mcp__chrome__new_page) are available.
- CRITICAL CHECK: If chrome-devtools-mcp is NOT available AND yt-dlp was NOT found in Step 2:
  - STOP execution.
  - Notify the User: "Unable to proceed. Please either install yt-dlp (for fast CLI extraction) OR configure chrome-devtools-mcp (for browser automation)."
Initialize Browser Session (If tools are available):

Call mcp__chrome__new_page to open the video URL.
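
The availability checks from Steps 2 and 3 amount to a small planning decision, sketched below. `planExtraction` is a hypothetical helper; only the stop message is taken from the text above.

```javascript
const STOP_MESSAGE =
  "Unable to proceed. Please either install yt-dlp (for fast CLI extraction) " +
  "OR configure chrome-devtools-mcp (for browser automation).";

// Returns the ordered list of methods to try, or a stop message when
// neither yt-dlp nor chrome-devtools-mcp is available.
function planExtraction(hasYtDlp, hasChromeMcp) {
  if (!hasYtDlp && !hasChromeMcp) {
    return { steps: [], message: STOP_MESSAGE };
  }
  const steps = [];
  if (hasYtDlp) steps.push("cli"); // Step 2, tried first
  if (hasChromeMcp) steps.push("browser"); // Step 3, fallback
  return { steps, message: null };
}
```

An empty `steps` array corresponds to the STOP case; otherwise the methods are attempted in order.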

3.2 Analyze Page State

Call mcp__chrome__take_snapshot to read the page accessibility tree.

3.3 Expand Video Description

Reason: The "Show transcript" button is usually hidden within the collapsed description area.

Search the snapshot for a button labeled "...more", "...更多", or "Show more" (usually located in the description block below the video title).
Call mcp__chrome__click to click that button.

3.4 Open Transcript Panel

Call mcp__chrome__take_snapshot to get the updated UI snapshot.
Search for a button labeled "Show transcript", "显示转录稿", or "内容转文字".
Call mcp__chrome__click to click that button.
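
The label search in 3.3 and 3.4 might look like the sketch below. It assumes the snapshot has already been parsed into an array of `{ uid, role, name }` objects; that shape and the helper name `findButton` are hypothetical, and the actual snapshot output may need parsing first.

```javascript
// Find the first button whose accessible name contains one of the labels.
// Matching is case-insensitive; returns null when nothing matches.
function findButton(nodes, labels) {
  const wanted = labels.map((label) => label.toLowerCase());
  const match = nodes.find((node) => {
    if (node.role !== "button") return false;
    const name = (node.name ?? "").toLowerCase();
    return wanted.some((label) => name.includes(label));
  });
  return match ?? null;
}

// Label sets taken from the steps above.
const EXPAND_LABELS = ["...more", "...更多", "Show more"];
const TRANSCRIPT_LABELS = ["Show transcript", "显示转录稿", "内容转文字"];
```

The `uid` of the returned node would then be the target passed to mcp__chrome__click.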

3.5 Extract Content via DOM

Reason: Directly reading the accessibility tree for long lists is slow and consumes many tokens; DOM injection is more efficient.

Call mcp__chrome__evaluate_script to execute the following JavaScript:

```javascript
() => {
  // Select all transcript segment containers
  const segments = document.querySelectorAll("ytd-transcript-segment-renderer");
  if (!segments.length) return "BUFFERING"; // Panel not rendered yet; caller should retry

  // Format each segment as "timestamp text"; use "" if a child node is missing
  return Array.from(segments)
    .map((seg) => {
      const time = seg.querySelector(".segment-timestamp")?.innerText.trim() ?? "";
      const text = seg.querySelector(".segment-text")?.innerText.trim() ?? "";
      return `${time} ${text}`.trim();
    })
    .join("\n");
}
```

If it returns "BUFFERING", wait a few seconds and retry.
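
The wait-and-retry rule can be wrapped in a bounded loop so that a transcript panel that never loads does not stall the run. The sketch below is illustrative: `retryWhileBuffering` is a hypothetical helper, `evaluate` stands for whatever function issues the mcp__chrome__evaluate_script call, and the attempt count and delay are arbitrary defaults.

```javascript
// Call evaluate() until it returns something other than "BUFFERING",
// waiting delayMs between attempts. Throws after the last attempt fails.
async function retryWhileBuffering(evaluate, { attempts = 5, delayMs = 2000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const result = await evaluate();
    if (result !== "BUFFERING") return result;
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Transcript still buffering after ${attempts} attempts`);
}
```

Capping the attempts gives the caller a clear failure to report instead of retrying indefinitely.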

3.6 Save and Cleanup

Use the Write tool to save the extracted text as a local file (e.g., <Video Title>.txt).
Call mcp__chrome__close_page to release resources.

Output Requirements

Save the subtitle file to the current working directory.
Filename format: <Video Title>.txt
File content format: One segment per line, each consisting of the timestamp followed by the subtitle text.
Report upon completion: File path, subtitle language, total number of lines.
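
Video titles often contain characters that are invalid in file names, so the filename rule above usually needs a sanitizing step. The sketch below shows one possible approach; `toSafeFilename` and `countLines` are hypothetical helpers, and the set of replaced characters is an assumption chosen to be safe on common file systems.

```javascript
// Turn a video title into "<Video Title>.txt", replacing characters that
// are not allowed in file names on Windows, macOS, or Linux.
function toSafeFilename(title, ext = "txt") {
  const cleaned = title
    .replace(/[\/\\:*?"<>|\u0000-\u001f]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
  // Fall back to a generic name if nothing usable remains.
  return `${cleaned || "transcript"}.${ext}`;
}

// Count non-empty lines for the completion report.
function countLines(text) {
  return text.split("\n").filter((line) => line.trim() !== "").length;
}
```

For example, `toSafeFilename('Rust: What/Why? "Intro"')` yields `Rust What Why Intro.txt`.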

Maintainer

feiskyer Core maintainer

Source details

Full Name: feiskyer/codex-settings
Branch: main
Path in repo: skills/youtube-transcribe-skill
License: MIT License
Topics: ai claude-code agents codex claude-skills vibe-coding openai agentic-ai copilot litellm spec-driven-development


