Image Processing
912 skills in Content & Media > Image Processing
cloudinary
Upload images and videos to Cloudinary with CDN delivery and transformations. Use this skill for media hosting, optimization, resizing, format conversion, and video concatenation.
replicate-cli
This skill provides comprehensive guidance for using the Replicate CLI to run AI models, create predictions, manage deployments, and fine-tune models. Use this skill when the user wants to interact with Replicate's AI model platform via command line, including running image generation models, language models, or any ML model hosted on Replicate. This skill should be used when users ask about running models on Replicate, creating predictions, managing deployments, fine-tuning models, or working with the Replicate API through the CLI.
docker-containerization
Package applications into secure, portable Docker images with validated pipelines.
reverse-engineering-firmware
Firmware-focused reverse engineering for embedded/IoT images with extraction, partition analysis, and secure handling.
image-gen
Generate images via local SDXL Lightning, OpenAI DALL·E, Replicate, or custom providers with structured prompts and safety checks.
reverse-engineering-firmware-analysis
Extended firmware analysis for embedded/IoT images with deep extraction, emulation, and vulnerability assessment.
imgur
Upload images to Imgur for free hosting. Use this skill when you need to upload images and get public URLs for sharing or embedding in articles.
zai-cli
Z.AI CLI providing:
- Vision: image/video analysis, OCR, UI-to-code, error diagnosis (GLM-4.6V)
- Search: real-time web search with domain/recency filtering
- Reader: web page to markdown extraction
- Repo: GitHub code search and reading via ZRead
- Tools: MCP tool discovery and raw calls
- Code: TypeScript tool chaining
Use for visual content analysis, web search, page reading, or GitHub exploration. Requires Z_AI_API_KEY.
fal.ai
fal.ai AI image generation. Use this skill when you need to use fal, fal.ai, or generate images from text prompts using AI text-to-image models.
brave-search
Brave Search API via curl. Use this skill for privacy-focused web, image, video, and news search with no tracking.
midjourney-replicate-flux
Generate highly detailed, Midjourney-style image prompts optimized for the FLUX 1.1 Pro model on Replicate. Transform basic user descriptions into rich, cinematic prompts with professional photography qualities, dramatic lighting, and editorial-quality aesthetics. Use when users request image generation, need prompt enhancement, or want Midjourney-quality outputs via FLUX 1.1 Pro.
gemini-media
Gemini media and multimodal workflows across image, audio, and video.
concept-forge
Transform nebulous ideas into sharp, testable frameworks through multi-perspective dialectical interrogation. Use when developing vague intuitions, pressure-testing concepts, structuring half-formed frameworks, or distinguishing new ideas from existing concepts. Triggers include "explore this idea," "think through X," or "challenge my thinking."
htmlcsstoimage
HTMLCSStoImage API via curl. Use this skill to generate images from HTML/CSS or capture screenshots of web pages.
analyzing-media
Analyzes PDFs, images, screenshots, diagrams, and documents using Gemini multimodal. Extracts text, tables, forms; interprets visuals, architecture diagrams, flowcharts, ERDs. Use when user mentions PDFs, images, screenshots, document extraction, OCR, visual analysis, diagram interpretation, or form processing. Do not use for web searching or shell commands.
solana-compression
Build with ZK Compression on Solana using Light Protocol. Use when creating compressed tokens, compressed PDAs, or integrating ZK compression into Solana programs. Covers compressed account model, state trees, validity proofs, and client integration with Helius/Photon RPC.
pptx
PowerPoint presentation processing. Format: .pptx (ZIP/XML structure). Capabilities: create presentations, edit slides, layouts/masters, speaker notes, comments, shapes, images, charts, text extraction, template preservation. Actions: create, edit, analyze, extract from presentations. Keywords: PowerPoint, pptx, presentation, slide, layout, master slide, speaker notes, comments, shapes, images, charts, text extraction, template, slide deck, bullet points, title slide, content slide, animation. Use when: creating presentations, editing PowerPoint files, extracting slide content, modifying layouts, adding speaker notes, working with presentation templates.
media-processing
Video/audio/image processing with FFmpeg and ImageMagick. Tools: FFmpeg (video/audio), ImageMagick (images). Capabilities: format conversion, encoding (H.264/H.265/VP9/AV1), streaming (HLS/DASH), filters, effects, thumbnails, watermarks, batch processing, hardware acceleration (NVENC/QSV). Actions: convert, encode, resize, crop, compress, extract, merge, stream, transcode media. Keywords: FFmpeg, ImageMagick, video encoding, audio extraction, image resize, thumbnail, watermark, HLS, DASH, H.264, H.265, VP9, AV1, codec, bitrate, framerate, resolution, aspect ratio, filter, overlay, concat, trim, fade, batch processing. Use when: converting video/audio formats, encoding with specific codecs, generating thumbnails, creating streaming manifests, extracting audio from video, batch processing images, adding watermarks, optimizing file sizes.
gemini-imagegen
Generate and edit images using the Gemini API (Nano Banana). Use this skill when creating images from text prompts, editing existing images, applying style transfers, generating logos with text, creating stickers, product mockups, or any image generation/manipulation task. Supports text-to-image, image editing, multi-turn refinement, and composition from multiple reference images.
docx
Word document processing. Format: .docx (ZIP/XML structure). Capabilities: create documents, edit content, tracked changes, comments, formatting preservation, text extraction, styles, headers/footers, tables, images. Actions: create, edit, analyze, extract from Word documents. Keywords: Word, docx, document, tracked changes, comments, formatting, styles, headers, footers, tables, images, paragraphs, text extraction, template, mail merge, revision history, document comparison. Use when: creating Word documents, editing docx files, working with tracked changes, adding comments, extracting document content, preserving document formatting.