🎨

Image Processing

912 skills in Content & Media > Image Processing

vision

Marketplace

subject segmentation, VNGenerateForegroundInstanceMaskRequest, isolate object from hand, VisionKit subject lifting, image foreground detection, instance masks, class-agnostic segmentation, VNRecognizeTextRequest, OCR, VNDetectBarcodesRequest, DataScannerViewController, document scanning, RecognizeDocumentsRequest

CharlesWiltgen/Axiom
142
10
Updated 3d ago

camera-capture

Marketplace

AVCaptureSession, camera preview, photo capture, video recording, RotationCoordinator, session interruptions, deferred processing, capture responsiveness, zero-shutter-lag, photoQualityPrioritization, front camera mirroring

CharlesWiltgen/Axiom
142
10
Updated 3d ago

zai-cli

Marketplace

Z.AI CLI providing:
- Vision: image/video analysis, OCR, UI-to-code, error diagnosis (GLM-4.6V)
- Search: real-time web search with domain/recency filtering
- Reader: web page to markdown extraction
- Repo: GitHub code search and reading via ZRead
- Tools: MCP tool discovery and raw calls
- Code: TypeScript tool chaining

Use for visual content analysis, web search, page reading, or GitHub exploration. Requires Z_AI_API_KEY.

numman-ali/n-skills
124
10
Updated 3d ago

frontend-design-pro

Marketplace

Creates jaw-dropping, production-ready frontend interfaces AND delivers perfectly matched real photos (Unsplash/Pexels direct links) OR flawless custom image-generation prompts for hero images, backgrounds, and illustrations. Zero AI slop, zero fake URLs.

claudekit/frontend-design-pro-demo
121
19
Updated 3d ago

gemini-to-seedream-migration

Migrate AI image generation from Google Gemini 2.5 Flash to BytePlus SeeDream v4.5. Use when: (1) User wants to switch from Gemini to SeeDream/BytePlus for image generation, (2) User asks about migrating image generation APIs or replacing Gemini with BytePlus, (3) User needs cost optimization or better image quality for AI-generated images, (4) User mentions SeeDream, BytePlus, or wants SDK-to-REST API migration for image generation

julianromli/ai-skills
120
17
Updated 3d ago

tiptap

Marketplace

Build rich text editors with Tiptap - headless editor framework with React, shadcn/ui, and Tailwind v4 integration. Includes SSR-safe setup, image uploads to R2, prose styling, collaborative editing, and markdown support. Use when creating blog editors, comment systems, documentation platforms, or Notion-like apps, or troubleshooting SSR hydration errors, Tailwind typography issues, or image upload performance.

jezweb/claude-skills
120
18
Updated 3d ago

vercel-blob

Marketplace

Integrate Vercel Blob object storage for file uploads, image management, and CDN-delivered assets in Next.js applications. Supports client-side uploads with presigned URLs and multipart transfers. Use when implementing file uploads (images, PDFs, videos), managing user-generated content, or troubleshooting missing tokens, size limit errors, or client upload failures.

jezweb/claude-skills
120
18
Updated 3d ago

cloudflare-images

Marketplace

Store and transform images with Cloudflare Images API and transformations. Use when: uploading images, implementing direct creator uploads, creating variants, generating signed URLs, optimizing formats (WebP/AVIF), transforming via Workers, or debugging CORS, multipart, or error codes 9401-9413.

jezweb/claude-skills
120
18
Updated 3d ago

markitdown

Convert various file formats (PDF, Office documents, images, audio, web content, structured data) to Markdown optimized for LLM processing. Use when converting documents to markdown, extracting text from PDFs/Office files, transcribing audio, performing OCR on images, extracting YouTube transcripts, or processing batches of files. Supports 20+ formats including DOCX, XLSX, PPTX, PDF, HTML, EPUB, CSV, JSON, images with OCR, and audio with transcription.

smallnest/goskills
116
16
Updated 3d ago

image-optimizer

[TODO: Complete and informative explanation of what the skill does and when to use it. Include WHEN to use this skill - specific scenarios, file types, or tasks that trigger it.]

smallnest/goskills
116
16
Updated 3d ago

tabz-mcp

Marketplace

Control Chrome browser: take screenshots, click buttons, fill forms, download images, inspect pages, capture network requests. Use when user says: 'screenshot this', 'click the button', 'fill the form', 'download that image', 'what page am I on', 'check the browser', 'look at my screen', 'interact with the website', 'capture the page', 'get the HTML', 'inspect element'. Provides MCP tool discovery for tabz_* browser automation tools.

GGPrompts/TabzChrome
115
10
Updated 3d ago

docker

Guide for using Docker - a containerization platform for building, running, and deploying applications in isolated containers. Use when containerizing applications, creating Dockerfiles, working with Docker Compose, managing images/containers, configuring networking and storage, optimizing builds, deploying to production, or implementing CI/CD pipelines with Docker.

einverne/dotfiles
109
19
Updated 3d ago

gemini-image-gen

Guide for implementing Google Gemini API image generation - create high-quality images from text prompts using gemini-2.5-flash-image model. Use when generating images, creating visual content, or implementing text-to-image features. Supports text-to-image, image editing, multi-image composition, and iterative refinement.

einverne/dotfiles
109
19
Updated 3d ago

imagemagick

Guide for using ImageMagick command-line tools to perform advanced image processing tasks including format conversion, resizing, cropping, effects, transformations, and batch operations. Use when manipulating images programmatically via shell commands.

einverne/dotfiles
109
19
Updated 3d ago

gemini-vision

Guide for implementing Google Gemini API image understanding - analyze images with captioning, classification, visual QA, object detection, segmentation, and multi-image comparison. Use when analyzing images, answering visual questions, detecting objects, or processing documents with vision.

einverne/dotfiles
109
19
Updated 3d ago

Unnamed Skill

Capture Unity EditorWindow and save as PNG. Use when you need to: (1) Take a screenshot of Game View, Scene View, Console, Inspector, etc., (2) Capture visual state for debugging or verification, (3) Save editor output as an image file.

hatayama/uLoopMCP
105
8
Updated 3d ago

nanobanana-skill

Generate or edit images using the Google Gemini API via nanobanana. Use when the user asks to create, generate, or edit images with nanobanana, or mentions image generation or editing tasks.

feiskyer/codex-settings
79
13
Updated 3d ago

sc-gemini-imagegen

Marketplace

Generate and edit images using the Gemini API (Nano Banana Pro). Use this skill when creating images from text prompts, editing existing images, applying style transfers, generating logos with text, creating stickers, product mockups, or any image generation/manipulation task. Supports text-to-image, image editing, multi-turn refinement, and composition from multiple reference images.

kylesnowschwartz/SimpleClaude
74
10
Updated 3d ago

site-slides

Generate presentation slides from images or PDF files. Use when user wants to create slides, generate presentations, or convert PDF to slides for the training camp website. Triggers on keywords like "slides", "presentation", "幻灯片", "演示文稿".

tyrchen/geektime-bootcamp-ai
70
42
Updated 3d ago