Agent skill

songsee

Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation.

Stars 56,643
Forks 7,481

Install this agent skill to your Project

npx add-skill https://github.com/NousResearch/hermes-agent/tree/main/skills/media/songsee

Metadata

Additional technical details for this skill

hermes
{
    "tags": [
        "Audio",
        "Visualization",
        "Spectrogram",
        "Music",
        "Analysis"
    ],
    "homepage": "https://github.com/steipete/songsee"
}

SKILL.md

songsee

Generate spectrograms and multi-panel audio feature visualizations from audio files.

Prerequisites

Requires Go:

bash
go install github.com/steipete/songsee/cmd/songsee@latest

Optional: ffmpeg for formats beyond WAV/MP3.

Quick Start

bash
# Basic spectrogram
songsee track.mp3

# Save to specific file
songsee track.mp3 -o spectrogram.png

# Multi-panel visualization grid
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux

# Time slice (start at 12.5s, 8s duration)
songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg

# From stdin
cat track.mp3 | songsee - --format png -o out.png

Visualization Types

Use --viz with comma-separated values:

Type Description
spectrogram Standard frequency spectrogram
mel Mel-scaled spectrogram
chroma Pitch class distribution
hpss Harmonic/percussive separation
selfsim Self-similarity matrix
loudness Loudness over time
tempogram Tempo estimation
mfcc Mel-frequency cepstral coefficients
flux Spectral flux (onset detection)

Multiple --viz types render as a grid in a single image.

Common Flags

Flag Description
--viz Visualization types (comma-separated)
--style Color palette: classic, magma, inferno, viridis, gray
--width / --height Output image dimensions
--window / --hop FFT window and hop size
--min-freq / --max-freq Frequency range filter
--start / --duration Time slice of the audio
--format Output format: jpg or png
-o Output file path

Notes

  • WAV and MP3 are decoded natively; other formats require ffmpeg
  • Output images can be inspected with vision_analyze for automated audio analysis
  • Useful for comparing audio outputs, debugging synthesis, or documenting audio processing pipelines

Expand your agent's capabilities with these related and highly-rated skills.

NousResearch/hermes-agent

agentmail

Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses (e.g. hermes-agent@agentmail.to).

56,643 7,481
Explore
NousResearch/hermes-agent

base

Query Base (Ethereum L2) blockchain data with USD pricing — wallet balances, token info, transaction details, gas analysis, contract inspection, whale detection, and live network stats. Uses Base RPC + CoinGecko. No API key required.

56,643 7,481
Explore
NousResearch/hermes-agent

solana

Query Solana blockchain data with USD pricing — wallet balances, token portfolios with values, transaction details, NFTs, whale detection, and live network stats. Uses Solana RPC + CoinGecko. No API key required.

56,643 7,481
Explore
NousResearch/hermes-agent

one-three-one-rule

Structured decision-making framework for technical proposals and trade-off analysis. When the user faces a choice between multiple approaches (architecture decisions, tool selection, refactoring strategies, migration paths), this skill produces a 1-3-1 format: one clear problem statement, three distinct options with pros/cons, and one concrete recommendation with definition of done and implementation plan. Use when the user asks for a "1-3-1", says "give me options", or needs help choosing between competing approaches.

56,643 7,481
Explore
NousResearch/hermes-agent

fastmcp

Build, test, inspect, install, and deploy MCP servers with FastMCP in Python. Use when creating a new MCP server, wrapping an API or database as MCP tools, exposing resources or prompts, or preparing a FastMCP server for Claude Code, Cursor, or HTTP deployment.

56,643 7,481
Explore
NousResearch/hermes-agent

qdrant-vector-search

High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems requiring fast nearest neighbor search, hybrid search with filtering, or scalable vector storage with Rust-powered performance.

56,643 7,481
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results