Video Processing
410 skills in Content & Media > Video Processing
google-gemini-api
Integrate the Gemini API using the current SDK (@google/genai v1.27+, NOT the deprecated @google/generative-ai). Supports text generation, multimodal input (images/video/audio/PDFs), function calling, and thinking mode, with a context window of 1M input tokens. Use when: integrating Gemini API, implementing multimodal AI, using thinking mode for reasoning, function calling with parallel execution, streaming responses, deploying to Cloudflare Workers, building chat, or troubleshooting SDK deprecation, context window, model not found, function calling, or multimodal format errors. Keywords: gemini api, @google/genai, gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite, gemini-3-pro-preview, multimodal gemini, thinking mode, google ai, genai sdk, function calling gemini, streaming gemini, gemini vision, gemini video, gemini audio, gemini pdf, system instructions, multi-turn chat, DEPRECATED @google/generative-ai, gemini context window, gemini models 2025, gemini 1m tokens, gemini tool use, parallel function calling, compositional function calling, gemini 3
youtube-strategy
Provides strategic, up-to-date guidance for developing high-leverage YouTube content and channel strategy. Use this skill when the user asks about YouTube strategy or wants YouTube scripts or other YouTube content generated.
video-to-article
Use this skill when the user wants to convert a lecture, presentation, or talk video into text formats (transcript, outline, or article). Trigger when user mentions processing video recordings, creating transcripts from lectures, or generating articles from recorded presentations.
openai-agents
Build AI applications with OpenAI Agents SDK - text agents, voice agents (realtime), multi-agent workflows with handoffs, tools with Zod schemas, input/output guardrails, structured outputs, and streaming. Deploy to Cloudflare Workers, Next.js, or React with human-in-the-loop patterns. Use when: building text-based agents with tools and Zod schemas, creating realtime voice agents with WebRTC/WebSocket, implementing multi-agent workflows with handoffs between specialists, setting up input/output guardrails for safety, requiring human approval for critical actions, streaming agent responses, deploying agents to Cloudflare Workers or Next.js, or troubleshooting Zod schema type errors, MCP tracing failures, infinite loops (MaxTurnsExceededError), tool call failures, schema mismatches, or voice agent handoff constraints.
incremental-fetch
Build resilient data ingestion pipelines from APIs. Use when creating scripts that fetch paginated data from external APIs (Twitter, exchanges, any REST API) and need to track progress, avoid duplicates, handle rate limits, and support both incremental updates and historical backfills. Triggers: 'ingest data from API', 'pull tweets', 'fetch historical data', 'sync from X', 'build a data pipeline', 'fetch without re-downloading', 'resume the download', 'backfill older data'. NOT for: simple one-shot API calls, websocket/streaming connections, file downloads, or APIs without pagination.
chatkit-integration
Foundation skill for integrating OpenAI ChatKit framework with custom backends. This skill should be used for initial ChatKit setup including server implementation, React component integration, authentication, context injection, and database persistence. For streaming UI patterns use chatkit-streaming. For interactive widgets and actions use chatkit-actions.
social-media-bio-generator
Create compelling social media bios for any platform. Use when the user needs bios for Twitter/X, Instagram, LinkedIn, TikTok, YouTube, or any social platform.
mixmi-curation-model
Complete curation and streaming economics, including playlists, radio, mixed content, revenue calculations, and the two-economy system (creation vs. curation).
claude-api
Build with Claude Messages API using structured outputs (v0.69.0+, Nov 2025) for guaranteed JSON schema validation. Covers prompt caching (90% savings), streaming SSE, tool use, model deprecations (3.5/3.7 retired Oct 2025). Use when: building chatbots/agents with validated JSON responses, or troubleshooting rate_limit_error, structured output validation, prompt caching not activating, streaming SSE parsing.
streaming-output
Stream long-form content to markdown files with resume capability. Writes content incrementally with section markers, enabling recovery if context limits are hit. Use when generating long documents (over 1000 lines), B-SPEC or specification writing, multi-section reports, any task where context compaction may occur mid-generation, or when user explicitly requests streaming output. Commands: init, write, status, resume, finalize, repair.
go-grpc
Build gRPC services with Go - protobuf, streaming, interceptors
mixmi-color-system
Platform color palette with semantic meanings, hex codes, accessibility notes, and usage patterns for all content types including loops, songs, playlists, video, and radio
fabric
Native Fabric pattern execution for Claude Code. USE WHEN processing content with Fabric patterns (extract_wisdom, summarize, analyze_claims, threat modeling, etc.). Patterns run natively in Claude's context - no CLI spawning needed. Only use fabric CLI for YouTube transcripts (-y) or pattern updates (-U).
shorts-presentation-skill
Create vertical (9:16) interactive presentations optimized for YouTube Shorts, TikTok, and Instagram Reels. Takes YouTube video URLs to extract facts via Playwright MCP and web research, then generates animated slides you can screen record and narrate. Perfect for quick educational content, fact-reveals, and viral short-form videos.
auvima-video-production
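AuViMa automated video production expert. Use this skill when the user needs to collect video source material, produce video clips, generate voice-cloned narration, or composite the final video. Covers Chrome CDP operations, screen recording, voice-cloning API calls, and the video compositing workflow.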
video-clipper
Cut video segments by timestamp, split videos into chunks, trim start/end, and extract specific scenes with precise frame control.
research-source-processing
Process expert sources (videos, podcasts, articles, books) into structured insights. Use when ingesting new knowledge sources for extraction and analysis.
video-generation-skill
Design video concepts, scripts, shotlists, transitions, and editing notes for VEO, Gemini, and Nano Banana-based pipelines. Use when turning a marketing idea into concrete video assets.
supadata
Supadata API via curl. Use this skill to extract transcripts from YouTube/TikTok/Instagram videos and scrape web content to markdown.
ffmpeg-media-processing
Use when user asks to convert, compress, trim, resize, extract audio, add subtitles, create GIFs, or process video/audio files