Agent skill

explainer

Create explainer videos with narration and AI-generated visuals. Triggers on: "解说视频", "explainer video", "explain this as a video", "tutorial video", "introduce X (video)", "解释一下XX（视频形式）".

Install this agent skill to your Project

npx add-skill https://github.com/marswaveai/skills/tree/main/explainer

Metadata

Additional technical details for this skill

openclaw: { "emoji": "\ud83c\udfac", "requires": { "env": [ "LISTENHUB_API_KEY" ] }, "primaryEnv": "LISTENHUB_API_KEY" }

SKILL.md

When to Use

User wants to create an explainer or tutorial video
User asks to "explain" something in video form
User wants narrated content with AI-generated visuals
User says "explainer video", "解说视频", "tutorial video"

When NOT to Use

User wants audio-only content without visuals (use /speech or /podcast)
User wants a podcast-style discussion (use /podcast)
User wants to generate a standalone image (use /image-gen)
User wants to read text aloud without video (use /speech)

Purpose

Generate explainer videos that combine a single narrator's voiceover with AI-generated visuals. Ideal for product introductions, concept explanations, and tutorials. Supports text-only script generation or full text + video output.

Hard Constraints

No shell scripts. Construct curl commands from the reference files listed under API Reference.
Always read shared/authentication.md for API key and headers
Follow shared/common-patterns.md for polling, errors, and interaction patterns
Always read config following shared/config-pattern.md before any interaction
Never hardcode speaker IDs — always fetch from the speakers API
Never save files to ~/Downloads/ or .listenhub/ — save artifacts to the current working directory with friendly topic-based names (see shared/config-pattern.md § Artifact Naming)
Explainer uses exactly 1 speaker
Mode must be info (for Info style) or story (for Story style) — never slides (use /slides skill instead)
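The last two constraints can be checked mechanically before any request is sent. A minimal sketch, assuming a hypothetical helper named validate_explainer_request that takes the mode and the number of selected speakers; neither the helper nor its messages are part of the skill:

```bash
# Hypothetical pre-flight check; the function name and messages are illustrative.
# $1 = mode, $2 = number of selected speakers
validate_explainer_request() {
  local mode="$1" speaker_count="$2"
  case "$mode" in
    info|story) ;;  # the only modes this skill allows
    slides) echo "mode 'slides' belongs to the /slides skill" >&2; return 1 ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  if [ "$speaker_count" -ne 1 ]; then
    echo "explainer needs exactly 1 speaker, got $speaker_count" >&2
    return 1
  fi
}

# Usage: validate_explainer_request "info" 1 && echo "ok to submit"
```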

Step -1: API Key Check

Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.

Step 0: Config Setup

Follow shared/config-pattern.md Step 0 (Zero-Question Boot).

If file doesn't exist — silently create with defaults and proceed:

bash

mkdir -p ".listenhub/explainer"
echo '{"outputMode":"inline","language":null,"defaultStyle":null,"defaultSpeakers":{}}' > ".listenhub/explainer/config.json"
CONFIG_PATH=".listenhub/explainer/config.json"
CONFIG=$(cat "$CONFIG_PATH")

Do NOT ask any setup questions. Proceed directly to the Interaction Flow.

If file exists — read config silently and proceed:

bash

CONFIG_PATH=".listenhub/explainer/config.json"
[ ! -f "$CONFIG_PATH" ] && CONFIG_PATH="$HOME/.listenhub/explainer/config.json"
CONFIG=$(cat "$CONFIG_PATH")

Setup Flow (user-initiated reconfigure only)

Only run when the user explicitly asks to reconfigure. Display current settings:

Current configuration (explainer):
  Output mode: {inline / download / both}
  Language preference: {zh / en / not set}
  Default style: {info / story / not set}
  Default speaker: {speakerName / use built-in default}

Then ask:

outputMode: Follow shared/output-mode.md § Setup Flow Question.
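Language (optional): "Default language?"
- "Chinese (zh)"
- "English (en)"
- "Choose manually each time" → keep null
Style (optional): "Default style?"
- "Info (informational presentation)"
- "Story (narrative storytelling)"
- "Choose manually each time" → keep null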

After collecting answers, save immediately:

bash

NEW_CONFIG=$(echo "$CONFIG" | jq --arg m "$OUTPUT_MODE" '. + {"outputMode": $m}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")
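One caveat with the save pattern above: if jq fails, NEW_CONFIG is empty and the redirect overwrites the config file with a blank line. A minimal sketch of a more defensive write, assuming a hypothetical helper save_config (not part of the skill or its shared docs) that refuses empty content and replaces the file through a temporary file:

```bash
# Hypothetical helper: write the config only when the new content is non-empty.
# $1 = config path, $2 = new config JSON
save_config() {
  local path="$1" content="$2" tmp
  if [ -z "$content" ]; then
    echo "refusing to overwrite $path with empty config" >&2
    return 1
  fi
  tmp="$path.tmp.$$"
  # Write next to the target first, then rename over it.
  printf '%s\n' "$content" > "$tmp" && mv "$tmp" "$path"
}

# Usage: save_config "$CONFIG_PATH" "$NEW_CONFIG" && CONFIG=$(cat "$CONFIG_PATH")
```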

Interaction Flow

Step 1: Topic / Content

Free text input. Ask the user:

What would you like to explain or introduce?

Accept: topic description, text content, or concept to explain.

Step 2: Language

If config.language is set, pre-fill and show in summary — skip this question. Otherwise ask:

Question: "What language?"
Options:
  - "Chinese (zh)" — Content in Mandarin Chinese
  - "English (en)" — Content in English

Step 3: Style

If config.defaultStyle is set, pre-fill and show in summary — skip this question. Otherwise ask:

Question: "What style of explainer?"
Options:
  - "Info" — Informational, factual presentation style
  - "Story" — Narrative, storytelling approach
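Steps 2 and 3 share the same pre-fill rule: a non-null config value means the question is skipped. A minimal sketch, assuming a hypothetical helper config_default (not defined by the skill) and the config keys language and defaultStyle from the defaults in Step 0:

```bash
# Hypothetical helper: print a saved config value, or nothing when it is null or missing.
# $1 = config JSON, $2 = key ("language" or "defaultStyle")
config_default() {
  printf '%s' "$1" | jq -r --arg key "$2" '.[$key] // empty'
}

# Usage:
#   LANGUAGE=$(config_default "$CONFIG" language)
#   [ -z "$LANGUAGE" ] && echo "ask the language question"
```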

Step 4: Speaker Selection

Follow shared/speaker-selection.md:

If config.defaultSpeakers.{language} is set → use saved speaker silently
If not set → use built-in default from shared/speaker-selection.md for the language
Show the speaker in the confirmation summary (Step 6) — user can change from there if desired
Only show the full speaker list if the user explicitly asks to change voice

Only 1 speaker is supported for explainer videos.
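The saved-or-default rule in this step reduces to a lookup with a fallback. A minimal sketch, assuming the config shape written in After Successful Generation (defaultSpeakers maps a language code to a one-element array of speaker IDs) and a hypothetical helper resolve_speaker. The fallback ID is passed in by the caller because the built-in defaults live in shared/speaker-selection.md:

```bash
# Hypothetical helper: print the saved speaker for a language, else the fallback.
# $1 = config JSON, $2 = language code, $3 = built-in default for that language
resolve_speaker() {
  local saved
  saved=$(printf '%s' "$1" | jq -r --arg lang "$2" '.defaultSpeakers[$lang][0] // empty')
  printf '%s\n' "${saved:-$3}"
}

# Usage: SPEAKER_ID=$(resolve_speaker "$CONFIG" "en" "$BUILTIN_DEFAULT")
```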

Step 5: Output Type

Question: "What output do you want?"
Options:
  - "Text script only" — Generate narration script, no video
  - "Text + Video" — Generate full explainer video with AI visuals

Step 6: Confirm & Generate

Summarize all choices:

Ready to generate explainer:

  Topic: {topic}
  Language: {language}
  Style: {info/story}
  Speaker: {speaker name}
  Output: {text only / text + video}

  Proceed?

Wait for explicit confirmation before calling any API.

Workflow

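Submit (foreground): POST /storybook/episodes with content, speaker, language, and mode, then extract episodeId from the response. This is the value that replaces <id-from-step-1> in the polling commands.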
Tell the user the task is submitted

Poll (background): Run the following exact bash command with run_in_background: true and timeout: 600000. Do NOT use python3, awk, or any other JSON parser — use jq as shown:

bash

EPISODE_ID="<id-from-step-1>"
for i in $(seq 1 30); do
  RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/storybook/episodes/$EPISODE_ID" \
    -H "Authorization: Bearer $LISTENHUB_API_KEY" \
    -H "X-Source: skills" 2>/dev/null)
  STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.processStatus // "pending"')
  case "$STATUS" in
    success|completed) echo "$RESULT"; exit 0 ;;
    failed|error) echo "FAILED: $RESULT" >&2; exit 1 ;;
    *) sleep 10 ;;
  esac
done
echo "TIMEOUT" >&2; exit 2
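For orientation, the loop's wait budget follows from its two constants. A quick sketch of the arithmetic, counting sleep time only (the time spent inside each curl call comes on top):

```bash
attempts=30             # seq 1 30
sleep_seconds=10        # sleep 10 between attempts
tool_timeout_ms=600000  # timeout passed to the background run

budget_seconds=$((attempts * sleep_seconds))
echo "sleep budget: ${budget_seconds}s"
echo "tool timeout: $((tool_timeout_ms / 1000))s"
```

The loop therefore gives up with TIMEOUT after roughly five minutes of waiting plus request time, which stays inside the 600-second tool timeout.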

When notified, download and present script:

Read OUTPUT_MODE from config. Follow shared/output-mode.md for behavior.

inline or both: Present the script inline.

Present:
```
Explainer script generated!

"{title}"

View online: https://listenhub.ai/app/explainer/{episodeId}
```

download or both: Also save the script file. Generate a topic slug following shared/config-pattern.md § Artifact Naming.
- If text-only output: save as {slug}-explainer.md in cwd (dedup if exists)
- If text+video output: create {slug}-explainer/ folder (dedup if exists), write script.md inside
- Present the save path in addition to the above summary.
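The "dedup if exists" rule can be sketched as follows. The actual naming scheme is defined in shared/config-pattern.md § Artifact Naming, which is not reproduced here, so the numeric suffix (-2, -3, and so on) and the helper name dedup_path are assumptions:

```bash
# Hypothetical helper: print the first candidate path that does not exist yet.
# $1 = base name without extension, $2 = extension including the dot ("" for folders)
dedup_path() {
  local base="$1" ext="$2" candidate n=2
  candidate="$base$ext"
  while [ -e "$candidate" ]; do
    candidate="$base-$n$ext"
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}

# Usage:
#   SCRIPT_FILE=$(dedup_path "{slug}-explainer" ".md")   # text-only output
#   OUT_DIR=$(dedup_path "{slug}-explainer" "")          # text + video output
```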

If video requested: POST /storybook/episodes/{episodeId}/video (foreground) → poll again (background) using the exact bash command below with run_in_background: true and timeout: 600000. Poll for videoStatus, not processStatus:

bash

EPISODE_ID="<id-from-step-1>"
for i in $(seq 1 30); do
  RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/storybook/episodes/$EPISODE_ID" \
    -H "Authorization: Bearer $LISTENHUB_API_KEY" \
    -H "X-Source: skills" 2>/dev/null)
  STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.videoStatus // "pending"')
  case "$STATUS" in
    success|completed) echo "$RESULT"; exit 0 ;;
    failed|error) echo "FAILED: $RESULT" >&2; exit 1 ;;
    *) sleep 10 ;;
  esac
done
echo "TIMEOUT" >&2; exit 2

When notified, download and present result:

Present result

Read OUTPUT_MODE from config. Follow shared/output-mode.md for behavior.

inline or both: Display video URL and audio URL as clickable links.

Present:

Explainer video generated!

Video link: {videoUrl}
Audio link: {audioUrl}
Duration: {duration}s
Credits used: {credits}

download or both: Also download the audio file into the {slug}-explainer/ folder.

bash

# -f fails on HTTP errors instead of saving an error page; -L follows redirects
curl -fsSL -o "{slug}-explainer/audio.mp3" "{audioUrl}"

Present:

Saved to the current directory:
  {slug}-explainer/
    script.md
    audio.mp3

After Successful Generation

Update config with the choices made this session:

bash

NEW_CONFIG=$(echo "$CONFIG" | jq \
  --arg lang "{language}" \
  --arg style "{info/story}" \
  --arg speakerId "{speakerId}" \
  '. + {"language": $lang, "defaultStyle": $style, "defaultSpeakers": (.defaultSpeakers + {($lang): [$speakerId]})}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"

Estimated times:

Text script only: 2-3 minutes
Text + Video: 3-5 minutes

API Reference

Speaker list: shared/api-speakers.md
Speaker selection guide: shared/speaker-selection.md
Episode creation: shared/api-storybook.md
Polling: shared/common-patterns.md § Async Polling
Config pattern: shared/config-pattern.md

Composability

Invokes: speakers API (for speaker selection); may invoke /speech for voiceover
Invoked by: content-planner (Phase 3)

Example

User: "Create an explainer video introducing Claude Code"

Agent workflow:

Topic: "Claude Code introduction"
Ask language → "English"
Ask style → "Info"
Fetch speakers, user picks "cozy-man-english"
Ask output → "Text + Video"
Show the summary and wait for explicit confirmation (Step 6)

bash

curl -sS -X POST "https://api.marswave.ai/openapi/v1/storybook/episodes" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Source: skills" \
  -d '{
    "sources": [{"type": "text", "content": "Introduce Claude Code: what it is, key features, and how to get started"}],
    "speakers": [{"speakerId": "cozy-man-english"}],
    "language": "en",
    "mode": "info"
  }'

Poll until text is ready, then generate video if requested.
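Because the topic text comes from the user, it may contain quotes or newlines that would break a hand-written JSON string. A minimal sketch of building the same body with jq, assuming a hypothetical helper build_episode_body; the field names are copied from the request above and nothing else about the API is assumed:

```bash
# Hypothetical helper: print the POST body with every value JSON-escaped by jq.
# $1 = content, $2 = speaker ID (fetched from the speakers API), $3 = language, $4 = mode
build_episode_body() {
  jq -n --arg content "$1" --arg speaker "$2" --arg lang "$3" --arg mode "$4" \
    '{sources: [{type: "text", content: $content}],
      speakers: [{speakerId: $speaker}],
      language: $lang,
      mode: $mode}'
}

# Usage: pass the result to curl with
#   -d "$(build_episode_body "$TOPIC" "$SPEAKER_ID" "en" "info")"
```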

Maintainer

marswaveai Core maintainer

Source details

Full Name: marswaveai/skills
Branch: main
Path in repo: explainer
License: MIT License
Topics: claude cursor skills aigc podcast slides videos
