Video Processing
410 skills in Content & Media > Video Processing
gemini-video-understanding
Analyze videos using Google's Gemini API - describe content, answer questions, transcribe audio with visual descriptions, reference timestamps, clip videos, and process YouTube URLs. Supports 9 video formats, multiple models (Gemini 2.5/2.0), and context windows up to 2M tokens (6 hours of video).
provider-integration-templates
OpenRouter framework integration templates for Vercel AI SDK, LangChain, and OpenAI SDK. Use when integrating OpenRouter with frameworks, setting up AI providers, building chat applications, implementing streaming responses, or when user mentions Vercel AI SDK, LangChain, OpenAI SDK, framework integration, or provider setup.
youtube-embed
YouTube video facade pattern for performance. Lazy-loads the iframe on click and covers poster images, GA4 tracking, and Schema.org markup.
great-prompt-anatomy
Essential framework for creating solid Veo 3 prompts. Use when constructing video prompts, validating prompt completeness, or teaching prompt structure. Defines 8 mandatory components (Subject, Setting, Action, Style/Genre, Camera/Composition, Lighting/Mood, Audio, Constraints) that every prompt must include for professional results.
streaming-patterns
Configure ADK bidi-streaming for real-time multimodal interactions. Use when building live voice/video agents, implementing real-time streaming, configuring LiveRequestQueue, setting up audio/video processing, or when user mentions bidi-streaming, real-time agents, streaming tools, multimodal streaming, or Gemini Live API.
youtube
Extract subtitles, frames, and metadata from YouTube videos. Use when user shares a YouTube URL and wants transcript, screenshots, or video analysis.
streaming-output
Output format markers for the real-time stream formatter. Use when building prompts for streaming analysis to ensure proper progress display. Documents the patterns that StreamFormatter detects and displays.
ui-embed
Embed the chatbot UI inside Docusaurus and connect it to the FastAPI RAG backend. Use when building chat components, handling streaming responses, or integrating chat widgets into MDX pages.
news-collector-agent
Collects daily hot stock market issues and top movers for MeowStreet Wars video production. Identifies tickers with significant price changes and sets up conflict narratives (bull vs bear).
course-scripts
Use for Phase 7 of Course OS - writing production-ready scripts adapted to content type including video scripts, voiceovers, presentation notes, and facilitation guides. Triggers on "/course-scripts", "write scripts", "video script", "presentation notes", or after completing Phase 6.
session-processor
Orchestrate end-to-end processing of D&D session videos from upload through knowledge extraction. Use when the user wants a complete automated workflow to process a new session recording.
video-upload-patterns
Video upload patterns for YouTube, TikTok, and Vimeo. Use when uploading videos to platforms, managing video metadata, scheduling video releases, or handling bulk video uploads.
cloudflare-containers
Deploy and manage Docker containers on Cloudflare's global network alongside Workers. Use when building applications that need to run Python code, process large files (multi-GB zips, video transcoding), execute CLI tools, run AI inference, create code sandboxes, or any workload requiring more memory/CPU than Workers provide. Triggers include requests to run containers, execute arbitrary code, process large files, deploy backend services in Python/Go/Rust, or integrate heavyweight compute with Workers.
video-production-guidelines
Video script writing and production methodology for MYCURE using Apple Keynote presentation principles and two-column AV script format. Auto-activates for video scripts, scene breakdowns, production planning, tutorial videos, demo videos, explainer content. Includes one-message-per-scene principle, visual-audio harmony, 3-second rule, and professional script notation.
course-import
Use for Phase 1 of Course OS - collecting and cataloging source materials including existing course content, reference books, videos, competitor courses, and expert knowledge. Triggers on "/course-import", "import course materials", "add sources", "collect references", or when starting a new course project.
nextjs-advanced-routing
Guide for advanced Next.js App Router patterns including Route Handlers, Parallel Routes, Intercepting Routes, Server Actions, error boundaries, draft mode, and streaming with Suspense. CRITICAL for server actions (action.ts, actions.ts files, 'use server' directive), setting cookies from client components, and form handling. Use when requirements involve server actions, form submissions, cookies, mutations, API routes, `route.ts`, parallel routes, intercepting routes, or streaming. Essential for separating server actions from client components.
langchain
LangChain high-level agent framework. Build agents with tools, memory, and streaming in under 10 lines of code.
transcribe-vaam-video-with-gemini
Transcribe Vaam videos using Google Gemini AI. Takes a Vaam share URL, downloads the video, and returns a full text transcription. Supports any language without translation.
ai-talking-head
Specialized skill for AI talking head and lip-sync video generation. Use when you need presenter videos, UGC-style content, or lip-synced avatars. Triggers on: talking head, presenter video, lip sync, UGC video. Outputs professional talking head videos.