Image Processing
912 skills in Content & Media > Image Processing
newsletter-events-research
Research events from Instagram and Facebook for a local newsletter. Use when scraping event sources, downloading flyer images, or extracting event details.
placeholder-images
Generate SVG placeholder images for prototypes. Use when adding placeholder images for layouts, mockups, or development. Supports simple, labeled, and brand-aware types.
ai-product-photo
Specialized skill for AI product photography. Use when you need professional product shots, hero images, lifestyle photography, or e-commerce visuals. Triggers on: product shot, hero image, e-commerce photo. Outputs production-ready product photography.
pdf-extractor
Expert in PDF content extraction and analysis. **Use whenever the user mentions PDFs, .pdf files, or requests to extract, read, parse, analyze, convert, or process PDF documents.** Handles text extraction, image extraction, converting PDFs to markdown or other formats, batch PDF processing, and analyzing PDF document structure for AI processing. Uses a fast Go binary with Vertex AI Gemini for intelligent image analysis. Supports two methods: the preferred binary-based extraction (default) and an alternative image-based extraction (when explicitly requested). (project, gitignored)
pca-gif-maker
Generates animated GIFs programmatically using Python (Pillow) for the PCA Camocim project. Creates text and badge animations for documentation.
flux-image
FLUX.2 Pro image generation. Use when generating AI images for posts.
vision
Analyzes and processes images using Claude's vision capabilities. Supports OCR, image classification, diagram comparison, chart analysis, visual Q&A, and more. Use when users need to understand, extract, or analyze visual content.
markitdown
Convert various file formats (PDF, Office documents, images, audio, web content, structured data) to Markdown optimized for LLM processing. Use when converting documents to markdown, extracting text from PDFs/Office files, transcribing audio, performing OCR on images, extracting YouTube transcripts, or processing batches of files. Supports 20+ formats including DOCX, XLSX, PPTX, PDF, HTML, EPUB, CSV, JSON, images with OCR, and audio with transcription.
github-attach-images
Attach images to GitHub PRs and issues via a scratch repo
nano-banana-pro
Generate and edit images using Google's Nano Banana Pro (Gemini 3 Pro Image) API. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports both text-to-image generation and image-to-image editing with configurable resolution (1K default, 2K, or 4K for high resolution). DO NOT read the image file first - use this skill directly with the --input-image parameter.
image-gen
Generates images from text descriptions (returns the generation parameters and a preview URL).
gemini-vision
Guide for implementing Google Gemini API image understanding - analyze images with captioning, classification, visual QA, object detection, segmentation, and multi-image comparison. Use when analyzing images, answering visual questions, detecting objects, or processing documents with vision.
marp-slide
Create professional Marp presentation slides with 7 beautiful themes (default, minimal, colorful, dark, gradient, tech, business). Use when users request slide creation, presentations, or Marp documents. Supports custom themes, image layouts, and "make it look good" requests with automatic quality improvements.
ragsharp-query-code-graph
Query the ragsharp code graph for declarations, references, callers, callees, dependencies, and line-number evidence. Triggers: find usages, where defined, callers, callees, dependency path, project deps, type hierarchy, line numbers, evidence.
pdf-author
Generate professional PDF documents using Typst with proper Chinese font rendering (Source Han Serif/Sans) and syntax-highlighted code blocks that break across pages naturally. Includes visual validation by converting PDFs to PNG and inspecting with multimodal capabilities. Use when the user requests PDF generation, document creation with Chinese text, Typst compilation, or when working with Chinese fonts and long code examples that need proper page breaks.
osint
Gathers intelligence from public sources. Use when searching for usernames, geolocating images, investigating social media, analyzing domains, or solving information gathering challenges.
tile-snake-pattern-stitching
SNAKE vs RASTER tile acquisition patterns for image stitching in KINTSUGI
container-registry-setup
Container registry expert. Use for setting up ECR, Harbor, Docker Hub, image security, and CI/CD integration.
add-buyable-item
Add a new one-time shop boost (buyable item) consistent with the app’s cat-petting theme and design-concept-reference.png. Produces the boosts.json entry, icon filename/path, accessible imageDescription, and concise icon generation instructions.
imagemagick
Guide for using ImageMagick command-line tools to perform advanced image processing tasks including format conversion, resizing, cropping, effects, transformations, and batch operations. Use when manipulating images programmatically via shell commands.