Video Processing
410 skills in Content & Media > Video Processing
youtube-download
Downloads a YouTube video using yt-dlp. Use this when you need to download a video from YouTube to the local machine for offline viewing or processing.
vercel-ai-sdk
Guide for Vercel AI SDK v5 implementation patterns including generateText, streamText, useChat hook, tool calling, embeddings, and MCP integration. Use when implementing AI chat interfaces, streaming responses, tool/function calling, text embeddings, or working with convertToModelMessages and toUIMessageStreamResponse. Activates for AI SDK integration, useChat hook usage, message streaming, or tool calling tasks.
research-agent
Use when researching AI agents, LLMs, hosting solutions, OCR technologies, video generation models, or evaluating technology stacks. Apply when user asks to research, compare, evaluate, or investigate technologies, frameworks, models, or tools. Use proactively when technical decisions require research backing.
moai-streaming-ui
Enhanced streaming UI system with progress indicators, status displays, and interactive feedback mechanisms. Use when running long-running operations, displaying progress, providing user feedback, or when visual indicators enhance user experience during complex workflows.
pptx-to-html
Convert PowerPoint (.pptx) presentations to standalone HTML format with FULL style, position, and formatting preservation. Accurately replicates slides with exact fonts, colors, shapes, backgrounds, layouts, hyperlinks, videos, audio, and tables. Use for web-friendly presentations that maintain visual fidelity and interactivity.
ai-dev-guidelines
Comprehensive AI/ML development guide for LangChain, LangGraph, and ML model integration in FastAPI. Use when building LLM applications, agents, RAG systems, sentiment analysis, aspect-based analysis, chain orchestration, prompt engineering, vector stores, embeddings, or integrating ML models with FastAPI endpoints. Covers LangChain patterns, LangGraph state machines, model deployment, API integration, streaming, error handling, and best practices.
obsidian-vault-manager
Manage an Obsidian knowledge base: capture ideas, YouTube videos, articles, and repositories; create study guides; and publish to GitHub Pages. Use smart AI tagging for automatic organization.
video-to-gif
Convert multiple video files (MOV/MP4) into a single merged GIF with customizable speed per segment. Use this skill when users want to: merge multiple videos into one GIF; create demo GIFs from screen recordings; combine video clips with different playback speeds; convert videos to optimized GIFs with compression. Triggers: "create GIF from videos", "merge videos to GIF", "convert MOV to GIF", "combine videos into animated GIF".
youtube-transcript
Use when a YouTube video transcript is needed, e.g. for summarisation or Q&A on the content.
gemini-video-understanding
Analyze and understand videos using Google's Gemini API. Use when the user asks to analyze, understand, describe, summarize, transcribe, or extract information from videos. Supports local video files (MP4, MOV, WebM, etc.) and YouTube URLs. Can answer questions about video content, describe scenes, identify objects/people/actions, extract text/timestamps, and more. Use this skill when user provides a video file path or YouTube link and wants to understand its content.
veo
Video generation with Veo 3.1
youtube-cache
YouTube video cache operations with Qdrant. Auto-checks cache when YouTube URLs detected. Provides semantic search, verification, ingestion, and archive access. Shows metadata by default, transcript on request.
brand-marketing
Use when creating commercial animations, advertising motion, brand identity animation, logo reveals, or marketing video content.
launch-video
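Assists with writing scripts for product launch videos. Use when writing scripts for launch videos, webinars, or online seminars. Supports three-episode and four-episode structures.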
get-youtube-transcript-raw
Capture a YouTube video transcript as raw material using `ytt`, storing it in the raw/ directory with minimal metadata for later distillation.
kaizen
Kailash Kaizen - production-ready AI agent framework with signature-based programming, multi-agent coordination, and enterprise features. Use when asking about 'AI agents', 'agent framework', 'BaseAgent', 'multi-agent systems', 'agent coordination', 'signatures', 'agent signatures', 'RAG agents', 'vision agents', 'audio agents', 'multimodal agents', 'agent prompts', 'prompt optimization', 'chain of thought', 'ReAct pattern', 'Planning agent', 'PEV agent', 'Tree-of-Thoughts', 'pipeline patterns', 'supervisor-worker', 'router pattern', 'ensemble pattern', 'blackboard pattern', 'parallel execution', 'agent-to-agent communication', 'A2A protocol', 'streaming agents', 'agent testing', 'agent memory', 'agentic workflows', 'AgentRegistry', 'OrchestrationRuntime', 'distributed agents', 'agent registry', '100+ agents', 'capability discovery', 'fault tolerance', or 'health monitoring'.
pipeline-dev
Audio/video pipeline development with NAM, Pedalboard, FFmpeg, and Playwright. Use for guitar tone processing, IR convolution, video rendering, and audio effects.
youtube-transcribe
Download YouTube video transcripts with timestamps. Use when asked to transcribe a YouTube video, get transcript, or extract text from a video URL.
long-prompt-guide
Production Brief methodology for complex Veo 3 video scenes. Use when creating scenes with dialogue, character continuity, structured settings, or multi-beat sequences. Provides 11-block framework (Format & Tone, Main Subjects, Wardrobe & Props, Location & Framing, Lighting & Palette, Continuity Rules, Actions & Camera Beats, Montage Plan, Dialogue, Sound & Foley, Finish) for professional, replicable results.
gemini
All-purpose Gemini 3 Pro client with Thinking enabled. Query, analyze files (MP4, PDF, images), YouTube videos, generate/edit images. Uses browser cookies - no API key required.