Image Processing
912 skills in Content & Media > Image Processing
gemini-pdf
Process multimodal documents using Gemini CLI, leveraging Gemini's superior multimodal capabilities. Use for PDFs, scanned documents, image-heavy documents, or any file where visual understanding matters. Ideal for extracting content from complex layouts, tables, diagrams, handwritten notes, or mixed text/image documents. Triggers on PDF processing, document extraction, "use Gemini for this", or when document has visual complexity that benefits from multimodal understanding.
docker-optimization
Optimize Docker images for Python applications including multi-stage builds (70%+ size reduction), security scanning with Trivy, layer caching, and distroless base images. Use when creating Dockerfiles, reducing image size, improving build performance, or scanning for vulnerabilities.
merge-book-cover
Merge a cover image into a PDF book while preserving aspect ratio and matching width. Use when the user wants to "merge cover", "combine pdf", "fix cover size", or "add cover image".
image-generator
Generate professional visuals using Gemini via browser automation with 6-gate quality control. Use when creating chapter illustrations, diagrams, or teaching visuals. NOT for stock photos or decorative images.
gemini-imagegen
Generate and edit images using the Gemini API (Nano Banana). Use this skill when creating images from text prompts, editing existing images, applying style transfers, generating logos with text, creating stickers, product mockups, or any image generation/manipulation task. Supports text-to-image, image editing, multi-turn refinement, and composition from multiple reference images.
nano-banana
AI image generation and editing using Google's Nano Banana (Gemini 2.5 Flash Image) and Nano Banana Pro (Gemini 3 Pro Image) APIs. Use this skill when the user wants to generate, edit, or compose images using AI. Triggers include requests to create images from text descriptions, edit existing images, add/remove elements from photos, apply style transfers, maintain character consistency across images, generate images with text overlays (logos, posters, infographics), or create multi-image compositions. Also use when users mention "Nano Banana", "Gemini image", or want AI-generated visuals.
text-message
Send text messages via Apple Messages app with automatic contact lookup and attachment support. This skill should be used when sending SMS/iMessage to contacts. Supports name-based recipient lookup via Google Contacts integration, image/file attachments, and handles missing recipient/message prompts. CRITICAL - Messages are ALWAYS sent individually to each recipient, NEVER as group messages. REQUIRES E.164 phone format (+1XXXXXXXXXX) for reliable delivery. Integrates Arlen's writing style guide for authentic messaging.
csharp-developer
An expert-level C# developer specializing in modern .NET development, ASP.NET Core, and cloud-native applications. Has mastered C# 14 features, Blazor, and cross-platform development, with an emphasis on performance and Clean Architecture.
resend
Implement email notifications for PhotoVault using Resend and React Email. Use when working with email templates, transactional emails, notification triggers, deliverability issues, or styling email content. Includes PhotoVault branding and template patterns.
gemini-image-generator
Generate images using Google's Gemini API. Use when creating images from text prompts, editing existing images, or combining reference images for AI-generated visual content.
document-processor
Extract and process content from PDFs and DOCX files. Handles large files, OCR for scanned documents, page splitting, and markdown conversion. Use when: (1) Processing PDF references in notes, (2) Extracting text from large documents for analysis, (3) Converting DOCX to markdown, (4) Handling scanned/image PDFs with OCR, (5) Integrating with Obsidian or note-taking workflows, (6) Splitting large documents into manageable chunks. Invoke with: /process-document, /extract-pdf, /extract-docx, or say "use document-processor skill to..."
gpt-image-1-5
Generate and edit images using OpenAI's GPT Image 1.5 model. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports text-to-image generation and image editing with optional mask. DO NOT read the image file first - use this skill directly with the --input-image parameter.
building-static-sites
Use when creating static websites with Next.js static export. Covers YAML-based content, image optimization, form handling, SEO, and deployment to GitHub Pages/Netlify/Vercel.
falai
fal.ai AI image generation. Use this skill when you need to use fal, fal.ai, or generate images from text prompts using AI text-to-image models.
image-enhancer
Improves the quality of images, especially screenshots, by enhancing resolution, sharpness, and clarity. Perfect for preparing images for presentations, documentation, or social media posts.
vite-build-tool
Vite lightning-fast build tool with instant HMR, ESM-first architecture, and zero-config setup for modern web development. Use when building React/Vue/Svelte applications, needing instant hot module replacement, migrating from webpack/CRA, or setting up TypeScript projects.
astro-images
Width-based responsive image patterns for Astro. Aspect ratio independent.
operating-k8s-local
Operates local Kubernetes clusters with Minikube for development and testing. Use when setting up local K8s, deploying applications locally, or debugging K8s issues. Covers Minikube, kubectl essentials, local image loading, and networking.
service-website-generator
Orchestrates automated service-based website generation with local SEO optimization. Creates 200+ service+location pages using parallel agents, Unsplash images via Jina AI, NextJS with dynamic routing, and PostgreSQL database. Use when building service business websites (plumbers, electricians, pressure washing, HVAC, etc.) targeting multiple locations.
radius-scale
Generates border-radius tokens from sharp to pill shapes. Use when creating corner rounding systems, button radius, card corners, or input styling. Outputs CSS, Tailwind, or JSON.