Image Processing
912 skills in Content & Media > Image Processing
imggen
Use this skill when users want to generate images using OpenAI's image generation API (DALL-E or gpt-image-1), or extract text from images using OCR. Invoke when users request AI-generated images, artwork, logos, illustrations, visual content from text prompts, or need to extract text/data from images.
file-upload-handling
Implement secure file uploads with validation, size limits, type checking, virus scanning, and UUID naming. Use when handling file uploads like profile photos, documents, or resources.
changelog-infographic
Generate beautiful infographic PNG images from Claude Code changelog summaries. Use this skill after changelog-interpreter has generated a user-friendly summary, to create a visual representation that can be saved and shared.
comfyui
ComfyUI node-based Stable Diffusion interface. GPU-accelerated image generation with custom node support and CivitAI model downloads. Use 'ujust comfyui' for configuration, lifecycle management, and model/node operations.
ai-image-generation
Execute AI image generation with optimal quality. Use when you need to generate images via Replicate API. Triggers on: generate image, create visual, product shot. Outputs generated images for feedback and iteration.
mps
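MPS (Media Processing Suite): watermark removal for PDFs and images (PNG, JPG) and blog optimization (1200px, WebP/JPEG). Automatic PNG compression to 10MB keeps files within the Naver Blog limit. PDFs are processed with automatically calculated DPI to save memory. Supports merging multiple images. Automatically checks for garbled Korean text and spelling errors. Logo insertion is optional (disabled by default). An all-in-one media processing skill.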
gcp-resource-cleanup
Automated cleanup of legacy GCP resources (GKE deployments, Cloud Run services, Artifact Registry images) with safety checks and cost tracking. Use when deploying new versions, ending sprints, or optimizing costs.
instagram-carousel
Turn articles into Instagram carousel concepts with Nano Banana Pro image prompts. Creates carousels that deliver real VALUE, brighten their day, and create AHA moments - not just pretty slides. Triggers on "create a carousel", "turn this into slides", "Instagram carousel from article".
gemini-frontend-assistant
A specialized skill for frontend development tasks using the Gemini CLI. It leverages Gemini's multimodal capabilities for generating UI code (React, Tailwind CSS) from descriptions or images (screenshots).
web-to-markdown
Use ONLY when the user explicitly says: 'use the skill web-to-markdown ...' (or 'use a skill web-to-markdown ...'). Converts webpage URLs to clean Markdown by calling the local web2md CLI (Puppeteer + Readability), suitable for JS-rendered pages.
fix-build-failures
Fix build and compilation errors from TypeScript, webpack, Vite, and Python builds. Use when build/compile checks fail.
openai-responses
This skill provides comprehensive knowledge for working with OpenAI's Responses API, the unified stateful API for building agentic applications. It should be used when building AI agents that preserve reasoning across turns, integrating MCP servers for external tools, using built-in tools (Code Interpreter, File Search, Web Search, Image Generation), managing stateful conversations, implementing background processing, or migrating from Chat Completions API. Use when building agentic workflows, conversational AI with memory, tools-based applications, RAG systems, data analysis agents, or any application requiring OpenAI's reasoning models with persistent state. Covers both Node.js SDK and Cloudflare Workers implementations. Keywords: responses api, openai responses, stateful openai, openai mcp, code interpreter openai, file search openai, web search openai, image generation openai, reasoning preservation, agentic workflows, conversation state, background mode, chat completions migration, gpt-5, polymorphic o
raw-image-processor
Expert guidance for RAW image editing workflows, providing step-by-step processing instructions, adjustment recommendations, and non-destructive editing best practices for professional photography.
image-enhancer
Optimizes images for web performance including format conversion, lazy loading, responsive sizing, and quality enhancement. Use when adding new images, optimizing existing assets, or implementing image components. Triggers on image optimization requests, performance improvements, or asset management.
graphviz-diagrams
Create complex graph visualizations using Graphviz DOT language, with both source code and pre-rendered images.
nano-banana
Generate, edit, and compose images using Google's Gemini 3 Pro Image model (Nano Banana Pro). Use this skill when the user asks to create images, generate visuals, edit photos, compose multiple images, create logos, thumbnails, infographics, product shots, or any image generation task. Supports text-to-image, image editing, multi-image composition (up to 14 images), iterative refinement, aspect ratio control, and Google Search-grounded image generation for real-time data visualization.
lexical-editor-image-management
Implement Lexical Editor with automatic image management using Laravel Observers. Converts base64 images to file storage, deletes unused images, and handles cleanup. Use when building WYSIWYG editors with rich content, managing media uploads in editors, implementing automatic image optimization, or setting up Observer-based storage management for rich text editors.
arch-v
Video production workflow orchestrator for Veo 3. Guides users through creating professional video prompts via two paths: direct text-to-video or an image-to-video pipeline (Imagen 3/4 → Veo 3). Validates prompt completeness, checks for conflicts, and ensures all mandatory components are present. Integrates the camera-movements, great-prompt-anatomy, short-prompt-guide, long-prompt-guide, and imagine skills.
image-prompt-writer
Craft optimized image generation prompts for tacosdedatos illustrations. Use this skill when you need to create a prompt for Gemini image generation, select an appropriate illustration mode, or help someone write a prompt for the tacosdedatos visual style. Outputs ready-to-use prompts with negative prompts included.
btc-trading-bot
Bitcoin trading simulation with technical analysis (EMA/RSI/Bollinger), Monte Carlo projections, and Telegram alerts. Cambodian market focus (USD/KHR 4050). TRIGGERS: BTC backtesting, indicator calculations, Fear & Greed integration, crypto strategy development, CoinGecko/Alternative.me APIs, portfolio simulation, trailing stop optimization, Sharpe/drawdown metrics. ENTRY POINTS: btc_trader.py (365d backtest), btc_simulation.py (60d Monte Carlo), backtest_runner.py (advanced metrics).