bulk-summarize

This skill should be used when the user asks to "summarize videos", "summarize podcasts", "research a topic using media", "bulk summarize content", "scan YouTube channels", "scan podcast feeds", "create podcast notes", "digest conference talks", "summarize Apple Podcasts", or mentions video/podcast research, media summarization, or bulk content processing.

Install

Run this command in your terminal to install the skill:

git clone https://github.com/smerchek/bulk-summarize /tmp/bulk-summarize && cp -r /tmp/bulk-summarize/skills/bulk-summarize ~/.claude/skills/bulk-summarize


name: bulk-summarize
version: 0.1.0

Bulk Media Summarizer

A tool for scanning and summarizing media content across YouTube, podcast RSS feeds, SoundCloud, and other sources supported by yt-dlp. Ideal for podcast research, conference talk digests, tutorial compilations, and topic research.

Prerequisites

Ensure these dependencies are installed:

  • bun: curl -fsSL https://bun.sh/install | bash
  • yt-dlp: brew install yt-dlp or pip install yt-dlp
  • summarize: See https://summarize.sh (GitHub)

Supported Platforms

| Platform | URL Pattern | Notes |
|---|---|---|
| YouTube | youtube.com/@channel, playlist URLs | Has built-in transcripts, fastest |
| RSS Feeds | Direct feed URLs | Works with any podcast RSS feed |
| SoundCloud | soundcloud.com/user/... | Audio-only, auto-transcribed |
| Vimeo | vimeo.com/... | Video |
| Twitch | twitch.tv/videos/... | VODs and clips |

Apple Podcasts

Apple Podcasts URLs don't work directly with yt-dlp. Extract the RSS feed URL first:

  1. Use WebFetch on the Apple Podcasts URL
  2. Find the RSS feed URL in the page (usually in metadata)
  3. Use the RSS feed URL in the config

Example:

  • Apple URL: https://podcasts.apple.com/us/podcast/huberman-lab/id1545953110
  • RSS Feed: https://feeds.megaphone.fm/hubermanlab (use this in config)
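As an alternative to reading the page metadata, the numeric ID in an Apple Podcasts URL can be passed to Apple's iTunes Lookup API, whose JSON response for a podcast normally includes a feedUrl field. The sketch below only parses the ID and builds the lookup URL. The helper names are hypothetical, and this is not how the skill itself resolves feeds.

```typescript
// Hypothetical helpers, not part of bulk-summarize.

// Pull the numeric show ID out of an Apple Podcasts URL (the ".../id1545953110" segment).
function extractApplePodcastId(url: string): string | null {
  const match = url.match(/\/id(\d+)(?:[\/?#]|$)/);
  return match?.[1] ?? null;
}

// Build an iTunes Lookup API URL; the feed is usually found at results[0].feedUrl in the response.
function buildLookupUrl(podcastId: string): string {
  return `https://itunes.apple.com/lookup?id=${podcastId}&entity=podcast`;
}

const showId = extractApplePodcastId(
  "https://podcasts.apple.com/us/podcast/huberman-lab/id1545953110"
);
if (showId !== null) {
  console.log(buildLookupUrl(showId));
}
```

Fetching the printed URL and reading feedUrl from the first result should give the value to place in the config's url field.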

Core Commands

bun run /path/to/bulk-summarize.ts [command]

| Command | Description |
|---|---|
| init [name] | Create starter config file |
| scan | Find content matching keywords |
| summarize | Process pending items |
| combine | Merge all summaries into one document |
| status | Check progress for all sources |
| list | Show configured sources |
| reset [source] | Clear checkpoint data |
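The scan command is described as finding content that matches the configured keywords. The exact matching rule is not documented here, so the following is only a sketch of one plausible rule: a case-insensitive substring test against the item title. matchesKeywords is a hypothetical stand-in, not the tool's implementation.

```typescript
// Hypothetical stand-in for keyword filtering; the tool's real rule may differ.
function matchesKeywords(title: string, keywords: string[]): boolean {
  // Assumption: an empty keyword list matches everything.
  if (keywords.length === 0) return true;
  const haystack = title.toLowerCase();
  return keywords.some((keyword) => haystack.includes(keyword.toLowerCase()));
}

const titles = ["Sleep and Learning", "Weekly News Roundup"];
// Keeps only the titles that mention "sleep", ignoring case.
const matched = titles.filter((title) => matchesKeywords(title, ["sleep"]));
console.log(matched);
```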

Options

| Option | Description |
|---|---|
| -c, --config <file> | Config file (default: bulk-summarize.json) |
| -s, --source <id> | Target specific source |
| -n, --limit <n> | Limit items to process |
| -p, --parallel <n> | Concurrent summarizations (default: 1) |
| -d, --delay <ms> | Delay between items (default: 1000) |
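When driving the tool from a script, the options above can be assembled programmatically. The sketch below builds an argument list for bun. The CliOptions type and buildArgs function are hypothetical, and only the short flags are taken from this document. Options are placed before the command, following the Quick Start examples.

```typescript
// Hypothetical wrapper type; only the flags themselves come from the options list.
interface CliOptions {
  config?: string;   // -c, --config <file>
  source?: string;   // -s, --source <id>
  limit?: number;    // -n, --limit <n>
  parallel?: number; // -p, --parallel <n>
  delay?: number;    // -d, --delay <ms>
}

// Build the arguments that follow "bun" on the command line.
function buildArgs(command: string, options: CliOptions = {}): string[] {
  const args: string[] = ["run", "bulk-summarize.ts"];
  if (options.config !== undefined) args.push("-c", options.config);
  if (options.source !== undefined) args.push("-s", options.source);
  if (options.limit !== undefined) args.push("-n", String(options.limit));
  if (options.parallel !== undefined) args.push("-p", String(options.parallel));
  if (options.delay !== undefined) args.push("-d", String(options.delay));
  args.push(command);
  return args;
}

const summarizeArgs = buildArgs("summarize", {
  config: "my-research.json",
  source: "source-id",
  limit: 5,
});
console.log(["bun", ...summarizeArgs].join(" "));
```

The resulting array can be handed to whatever process-spawning API the calling script uses.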

Workflow

Quick Start

# 1. Create config
bun run bulk-summarize.ts init my-research.json

# 2. Edit config to add sources

# 3. Scan and process
bun run bulk-summarize.ts -c my-research.json scan
bun run bulk-summarize.ts -c my-research.json summarize
bun run bulk-summarize.ts -c my-research.json combine --output research.md

YouTube Channels

# Search for channels
yt-dlp "ytsearch10:TOPIC podcast" --flat-playlist \
  --print "%(channel_url)s %(channel)s" 2>/dev/null | sort -u

# Verify channel
yt-dlp --flat-playlist --print "%(playlist_uploader)s" \
  --playlist-end 1 "https://www.youtube.com/@ChannelName/videos"

Podcast RSS Feeds

# Test RSS feed
yt-dlp --flat-playlist --print "%(title)s" \
  "https://feeds.megaphone.fm/hubermanlab" --playlist-end 3

To get the RSS feed for an Apple Podcasts show, use WebFetch on the Apple Podcasts URL and look for the feed URL in the response (typically in the page metadata or as a direct link).

Config Structure

{
  "name": "Project Name",
  "keywords": ["keyword1", "keyword2"],
  "sources": [
    {
      "id": "source-id",
      "name": "Display Name",
      "url": "https://...",
      "type": "channel",
      "enabled": true
    }
  ],
  "settings": {
    "maxVideosPerSource": 50,
    "summaryLength": "xl",
    "summaryPrompt": "Your prompt.\n\nTitle: {title}",
    "outputDir": "summaries"
  }
}
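For anyone generating or checking configs in code, the structure above can be written down as types. The field types here are inferred from the example only, and the two checks (unique source IDs and http(s) URLs) are assumptions about what a sensible config looks like, not rules the tool is known to enforce. checkConfig and renderPrompt are hypothetical names.

```typescript
// Shapes inferred from the example config; treat them as assumptions.
interface SourceConfig {
  id: string;
  name: string;
  url: string;
  type: string; // only "channel" appears in the example
  enabled: boolean;
}

interface BulkSummarizeConfig {
  name: string;
  keywords: string[];
  sources: SourceConfig[];
  settings: {
    maxVideosPerSource: number;
    summaryLength: string;
    summaryPrompt: string;
    outputDir: string;
  };
}

// Hypothetical pre-flight check: returns a list of problems, empty when none are found.
function checkConfig(config: BulkSummarizeConfig): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  for (const source of config.sources) {
    // Assumption: IDs name the output folders, so duplicates would collide.
    if (seen.has(source.id)) problems.push(`duplicate source id: ${source.id}`);
    seen.add(source.id);
    if (!/^https?:\/\//.test(source.url)) {
      problems.push(`source ${source.id}: url is not http(s)`);
    }
  }
  return problems;
}

// Fill the {title} placeholder shown in the example summaryPrompt.
function renderPrompt(template: string, title: string): string {
  return template.split("{title}").join(title);
}

console.log(renderPrompt("Your prompt.\n\nTitle: {title}", "Episode 1"));
```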

Summary Lengths

| Length | Use Case |
|---|---|
| short | Quick overview (<15 min content) |
| medium | Standard summary |
| long | Detailed notes |
| xl | Comprehensive coverage |
| xxl | Maximum detail (2+ hour content) |
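If the length needs to be chosen per item, a duration-based heuristic is one option. In the sketch below, only the under-15-minute and 2-hour boundaries come from the list above. The 45 and 90 minute cut-offs are illustrative assumptions, and suggestSummaryLength is a hypothetical helper.

```typescript
type SummaryLength = "short" | "medium" | "long" | "xl" | "xxl";

// Only the "<15 min" and "2+ hour" boundaries are documented;
// the 45 and 90 minute cut-offs are illustrative assumptions.
function suggestSummaryLength(durationMinutes: number): SummaryLength {
  if (durationMinutes < 15) return "short";
  if (durationMinutes < 45) return "medium";
  if (durationMinutes < 90) return "long";
  if (durationMinutes < 120) return "xl";
  return "xxl";
}

console.log(suggestSummaryLength(10));  // short
console.log(suggestSummaryLength(150)); // xxl
```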

Audio vs Video Processing

YouTube (has transcripts)

  • Fast processing using existing captions
  • Highest accuracy for transcribed content

Audio Podcasts (RSS, SoundCloud)

  • Requires transcription via Whisper
  • summarize handles this automatically
  • Slightly slower due to transcription step
  • Quality depends on audio clarity

Output Structure

summaries/
  source-id/
    .checkpoint.json    # Progress tracking
    episode1.md         # Individual summaries
    episode2.md
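To inspect progress for one source from a script, the checkpoint location can be derived from the layout above. This is a minimal sketch assuming the checkpoint always sits at <outputDir>/<source-id>/.checkpoint.json. checkpointPath is a hypothetical helper, and the checkpoint file's contents are not documented here.

```typescript
// Hypothetical helper mirroring the layout above: <outputDir>/<source-id>/.checkpoint.json
function checkpointPath(outputDir: string, sourceId: string): string {
  const base = outputDir.replace(/\/+$/, ""); // drop any trailing slashes
  return `${base}/${sourceId}/.checkpoint.json`;
}

console.log(checkpointPath("summaries", "source-id"));
```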

Troubleshooting

Source Not Found

# Test URL directly
yt-dlp --flat-playlist --print "%(title)s" "URL" --playlist-end 3

Transcription Fails

  • Audio quality may be too low for reliable transcription
  • Content may be non-English (specify the language in summarize)
  • Try reducing parallel processing (a lower -p value)

Additional Resources

Reference Files

  • references/config-templates.md - Ready-to-use configs for common use cases
  • references/prompt-patterns.md - Effective prompts by content type
  • references/platform-guide.md - Platform-specific URL formats and tips

Example Files

  • examples/podcast-research.json - Multi-source podcast research
  • examples/conference-talks.json - Conference playlist digest
  • examples/multi-platform.json - Cross-platform research setup