youtube-transcript-analyzer
Use when analyzing YouTube videos for research, learning, or understanding how content relates to a project - downloads transcripts with yt-dlp, chunks long content, and provides context-aware analysis
$ Install
git clone https://github.com/Krosebrook/source-of-truth-monorepo /tmp/source-of-truth-monorepo && cp -r /tmp/source-of-truth-monorepo/plugins/marketplaces/ai-coding-config/.claude/skills/youtube-transcript-analyzer ~/.claude/skills/source-of-truth-monorepo/
Tip: run this command in your terminal to install the skill.
name: youtube-transcript-analyzer
description: Use when analyzing YouTube videos for research, learning, or understanding how content relates to a project - downloads transcripts with yt-dlp, chunks long content, and provides context-aware analysis
YouTube Transcript Analyzer
Overview
Download and analyze YouTube video transcripts to extract insights, understand concepts, and relate content to your work. Uses yt-dlp for reliable transcript extraction with intelligent chunking for long-form content.
When to Use
Use when you need to:
- Understand how a YouTube video/tutorial relates to your current project
- Research technical concepts explained in video format
- Extract key insights from talks, presentations, or educational content
- Compare video content with your codebase or approach
- Learn from video demonstrations without watching the entire video
Prerequisites
Ensure yt-dlp is installed:
# Install via pip
pip install yt-dlp
# Or via homebrew (macOS)
brew install yt-dlp
# Verify installation
yt-dlp --version
Transcript Extraction Process
Download Transcript
Use yt-dlp to extract subtitles/transcripts:
# Download transcript only (no video)
yt-dlp --skip-download --write-auto-sub --sub-format vtt --output "transcript.%(ext)s" URL
# Or get manually created subtitles if available (higher quality)
yt-dlp --skip-download --write-sub --sub-lang en --sub-format vtt --output "transcript.%(ext)s" URL
# Get video metadata for context
yt-dlp --skip-download --print-json URL
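Auto-generated VTT files usually carry inline timing tags and repeat each caption line across consecutive cues, so some cleanup helps before analysis. The following is a minimal sketch, not part of the skill itself: `parse_vtt` is a hypothetical helper name, and it assumes the file has no cue identifiers or NOTE blocks.

```python
import re

# Matches a cue timing line such as "00:01:02.500 --> 00:01:05.000 align:start"
TIMING = re.compile(r"^(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->")
# Matches inline tags such as <00:00:00.480> and <c> ... </c>
TAG = re.compile(r"<[^>]+>")


def parse_vtt(text):
    """Return (start_seconds, line) pairs with tags stripped and repeats dropped.

    Hypothetical helper: assumes no cue identifiers or NOTE blocks.
    """
    cues = []
    start = None
    last_line = None
    for raw in text.splitlines():
        match = TIMING.match(raw.strip())
        if match:
            hours, minutes, seconds, millis = match.groups()
            start = (int(hours or 0) * 3600 + int(minutes) * 60
                     + int(seconds) + int(millis) / 1000)
            continue
        if start is None:
            continue  # still in the header (WEBVTT, Kind:, Language:)
        line = TAG.sub("", raw).strip()
        if not line or line == last_line:
            continue  # blank line, or a caption repeated from the previous cue
        cues.append((start, line))
        last_line = line
    return cues
```

Typical use would be `parse_vtt(open(path, encoding="utf-8").read())`, where `path` is whatever file yt-dlp wrote. Keeping the start time with each line makes it possible to cite timestamps later in the analysis.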
Handle Long Transcripts
For transcripts exceeding 8,000 tokens (roughly 6,000 words or 45+ minutes):
- Split into logical chunks based on timestamp or topic breaks
- Generate a summary for each chunk focusing on key concepts
- Create an overall synthesis connecting themes to the user's question
- Reference specific timestamps for detailed sections
For shorter transcripts, analyze directly without chunking.
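The threshold check and the split can be sketched as follows. This is an illustrative approximation only. The 0.75 words-per-token ratio is derived from the 8,000 tokens to 6,000 words figure above. Fixed time windows stand in for topic-based breaks. The function names and the `(start_seconds, text)` input shape are assumptions, not part of yt-dlp.

```python
def estimate_tokens(text, words_per_token=0.75):
    """Rough token estimate from a word count (6,000 words is about 8,000 tokens)."""
    return round(len(text.split()) / words_per_token)


def chunk_by_time(cues, window_seconds=600):
    """Group (start_seconds, text) pairs into consecutive fixed-length windows."""
    windows = {}
    for start, text in cues:
        index = int(start // window_seconds)
        windows.setdefault(index, []).append(text)
    return [
        {"start": index * window_seconds, "text": " ".join(lines)}
        for index, lines in sorted(windows.items())
    ]


def prepare(cues, token_limit=8000, window_seconds=600):
    """Return one chunk for short transcripts, time-windowed chunks otherwise."""
    full_text = " ".join(text for _, text in cues)
    if estimate_tokens(full_text) <= token_limit:
        return [{"start": 0, "text": full_text}]
    return chunk_by_time(cues, window_seconds)
```

Each chunk keeps its window start in seconds, so a per-chunk summary can point back to the matching part of the video. Windows with no speech are omitted.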
Analysis Approach
Context-Aware Analysis
When analyzing with respect to a project or question:
- Extract the video's core concepts and techniques
- Identify patterns, architectures, or approaches discussed
- Compare with the current project's implementation
- Highlight relevant insights, differences, and potential applications
- Note specific timestamps for key moments
Structured Output
Provide analysis in this format:
Video Overview:
- Title, author, duration
- Main topic and key themes
Key Insights:
- Concept 1 with timestamp
- Concept 2 with timestamp
- Technical approaches explained
Relevance to Your Project:
- Direct applications
- Differences from current approach
- Potential improvements or learnings
Specific Recommendations:
- Actionable items based on video content
- Code patterns or techniques to consider
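The "Video Overview" part of this format can be filled in from the JSON that yt-dlp prints. The sketch below assumes the `title`, `uploader`, `channel`, and `duration` (seconds) fields of that output. The helper names `format_duration` and `video_overview` are hypothetical.

```python
import json


def format_duration(seconds):
    """Format a duration in seconds as H:MM:SS."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def video_overview(metadata_json):
    """Build the overview lines from a yt-dlp JSON string (hypothetical helper)."""
    info = json.loads(metadata_json)
    title = info.get("title", "unknown")
    # Fall back to the channel name when no uploader is reported.
    author = info.get("uploader") or info.get("channel") or "unknown"
    duration = info.get("duration")
    length = format_duration(duration) if duration is not None else "unknown"
    return (
        "Video Overview:\n"
        f"- Title: {title}\n"
        f"- Author: {author}\n"
        f"- Duration: {length}"
    )
```

Main topic and key themes still have to come from the transcript itself, since the metadata does not describe them reliably.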
Example Workflow
# 1. Get video metadata
yt-dlp --skip-download --print-json "https://youtube.com/watch?v=VIDEO_ID" > metadata.json
# 2. Download transcript
yt-dlp --skip-download --write-auto-sub --sub-lang en --sub-format vtt \
  --output "transcript.%(ext)s" "https://youtube.com/watch?v=VIDEO_ID"
# 3. Read and analyze transcript content
# 4. If long: chunk by timestamp ranges (every 10-15 minutes)
# 5. Generate summaries and relate to user's question
Handling Common Issues
No transcript available:
- Some videos lack auto-generated or manual captions
- Inform user and offer alternative approaches (video description, comments)
Multiple languages:
- Prefer English transcripts: --sub-lang en
- If unavailable, check available languages: --list-subs
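The same decision can be made from the metadata JSON before downloading anything. This is a rough sketch that assumes the `subtitles` (manual) and `automatic_captions` (auto-generated) dictionaries in yt-dlp's JSON output, both keyed by language code. The function names are hypothetical.

```python
def pick_subtitle_track(info, preferred="en"):
    """Return (kind, language) for the preferred language, or None if absent.

    Manual subtitles win over automatic captions. Hypothetical helper.
    """
    manual = info.get("subtitles") or {}
    automatic = info.get("automatic_captions") or {}
    for kind, tracks in (("manual", manual), ("auto", automatic)):
        for lang in sorted(tracks):
            # Accept exact matches and regional variants such as "en-US".
            if lang == preferred or lang.startswith(preferred + "-"):
                return kind, lang
    return None


def available_languages(info):
    """Sorted language codes across manual and automatic tracks."""
    manual = info.get("subtitles") or {}
    automatic = info.get("automatic_captions") or {}
    return sorted(set(manual) | set(automatic))
```

A "manual" result corresponds to downloading with --write-sub, and an "auto" result to --write-auto-sub. When `pick_subtitle_track` returns None, `available_languages` gives the list to show the user. An empty list means the video has no captions at all.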
Long processing time:
- Set expectations for videos over 2 hours
- Offer to focus on specific sections if timestamps provided
Best Practices
Focus analysis on practical application rather than comprehensive summaries. Users want to know "how does this help me?", not "what did they say for 90 minutes?"
Extract concrete examples and code patterns when available. Reference specific timestamps so users can jump to relevant sections.
When comparing with project code, be specific about similarities and differences. Vague comparisons like "similar approach" don't add value.
For technical content, identify the underlying patterns and principles rather than surface-level implementation details. Help users understand transferable concepts.
Token Efficiency
For very long transcripts (2+ hours):
- Process in 15-20 minute segments
- Summarize each segment to 200-300 words
- Create final synthesis under 500 words
- Provide detailed analysis only for highly relevant sections
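As a worked example of this budget, the sketch below counts segments and the upper bound on output words for a given video length. The defaults use the upper ends of the ranges above. `plan_segments` is a hypothetical helper name.

```python
import math


def plan_segments(duration_minutes, segment_minutes=20,
                  words_per_summary=300, synthesis_words=500):
    """Estimate the segment count and the maximum words of summary output."""
    segments = math.ceil(duration_minutes / segment_minutes)
    summary_words = segments * words_per_summary
    return {
        "segments": segments,
        "summary_words": summary_words,
        "total_words": summary_words + synthesis_words,
    }


# A 2.5-hour video: 8 segments, up to 2,400 summary words, 2,900 with synthesis
print(plan_segments(150))
```

With 15-minute segments and 200-word summaries, a 2-hour video also comes to 8 segments, with at most 1,600 summary words before the final synthesis.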