{/* This page is auto-generated from the skill’s SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

Songsee

Audio spectrograms/features (mel, chroma, MFCC) via CLI.

Skill metadata

| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | skills/media/songsee |
| Version | 1.0.0 |
| Author | community |
| License | MIT |
| Platforms | linux, macos, windows |
| Tags | Audio, Visualization, Spectrogram, Music, Analysis |

Reference: full SKILL.md

:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::

songsee

Generate spectrograms and multi-panel audio feature visualizations from audio files.

Prerequisites

Requires Go:

go install github.com/steipete/songsee/cmd/songsee@latest

Optional: ffmpeg for formats beyond WAV/MP3.
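
A quick way to confirm the prerequisites before the first run is to check which tools are on PATH. This is a minimal sketch; the `check_tool` helper is a hypothetical name introduced here and is not part of songsee.

```shell
# check_tool is a hypothetical helper: prints "found" or "missing" for a command name.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

echo "go: $(check_tool go)"
echo "songsee: $(check_tool songsee)"
echo "ffmpeg (optional): $(check_tool ffmpeg)"
```

If songsee is reported missing right after `go install`, a likely cause is that the Go binary directory (`$GOBIN`, or `$(go env GOPATH)/bin` by default) is not on PATH.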

Quick Start

# Basic spectrogram
songsee track.mp3
 
# Save to specific file
songsee track.mp3 -o spectrogram.png
 
# Multi-panel visualization grid
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux
 
# Time slice (start at 12.5s, 8s duration)
songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg
 
# From stdin
cat track.mp3 | songsee - --format png -o out.png

Visualization Types

Use --viz with comma-separated values:

| Type | Description |
|---|---|
| spectrogram | Standard frequency spectrogram |
| mel | Mel-scaled spectrogram |
| chroma | Pitch class distribution |
| hpss | Harmonic/percussive separation |
| selfsim | Self-similarity matrix |
| loudness | Loudness over time |
| tempogram | Tempo estimation |
| mfcc | Mel-frequency cepstral coefficients |
| flux | Spectral flux (onset detection) |

Multiple --viz types render as a grid in a single image.

Common Flags

| Flag | Description |
|---|---|
| --viz | Visualization types (comma-separated) |
| --style | Color palette: classic, magma, inferno, viridis, gray |
| --width / --height | Output image dimensions |
| --window / --hop | FFT window and hop size |
| --min-freq / --max-freq | Frequency range filter |
| --start / --duration | Time slice of the audio |
| --format | Output format: jpg or png |
| -o | Output file path |
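
The flags above can be combined in a single invocation. The sketch below wraps one such combination in a hypothetical `render` function with a dry-run switch, so the command can be reviewed before it runs. The numeric values are illustrative assumptions: the units for `--width`/`--height` and `--min-freq`/`--max-freq` are not stated here (presumably pixels and Hz).

```shell
# render is a hypothetical wrapper: $1 = input audio, $2 = output image.
# With DRY_RUN=1 it prints the songsee command instead of executing it.
render() {
  set -- songsee "$1" \
    --viz mel,chroma \
    --style magma \
    --width 1600 --height 900 \
    --min-freq 50 --max-freq 8000 \
    --format png \
    -o "$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Print the command for review; drop DRY_RUN=1 to actually render.
DRY_RUN=1 render track.mp3 mel_chroma.png
```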

Notes

  • WAV and MP3 are decoded natively; other formats require ffmpeg
  • Output images can be inspected with vision_analyze for automated audio analysis
  • Useful for comparing audio outputs, debugging synthesis, or documenting audio processing pipelines
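
For the comparison use case mentioned above, one possible approach is to render the same panels for each file so the resulting images line up side by side. This is a sketch under assumptions: the file names are placeholders, and `image_name` is a hypothetical helper, not a songsee feature.

```shell
# image_name is a hypothetical helper: dir/take1.wav -> take1.png
image_name() {
  base=$(basename "$1")
  echo "${base%.*}.png"
}

# Placeholder file names; replace with the audio outputs being compared.
for f in before.wav after.wav; do
  [ -f "$f" ] || { echo "skip: $f not found"; continue; }
  # Same panels and format for every file so the images are directly comparable.
  songsee "$f" --viz spectrogram,mel,loudness --format png -o "$(image_name "$f")"
done
```

Each resulting PNG can then be inspected with vision_analyze, as noted in the list above.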