Image Processing
912 skills in Content & Media > Image Processing
local-rag
Index local folders and query them using RAG (Retrieval Augmented Generation). Supports PDF, DOCX, PPTX, XLSX, images with OCR, and text files.
CI/CD Pipeline Management
GitLab CI/CD pipeline optimization, Docker image building, caching strategies, and 3-stage deployment workflow.
event-detection-temporal-intelligence-expert
Expert in temporal event detection, spatio-temporal clustering (ST-DBSCAN), and photo context understanding. Use for detecting photo events, clustering by time/location, shareability prediction, place recognition, event significance scoring, and life event detection. Activate on 'event detection', 'temporal clustering', 'ST-DBSCAN', 'spatio-temporal', 'shareability prediction', 'place recognition', 'life events', 'photo events', 'temporal diversity'. NOT for individual photo aesthetic quality (use photo-composition-critic), color palette analysis (use color-theory-palette-harmony-expert), face recognition implementation (use photo-content-recognition-curation-expert), or basic EXIF timestamp extraction.
writing-dockerfiles
Guides Dockerfile creation and optimization. Use when Dockerfile or Docker Compose is detected. Supports multi-stage builds, cache optimization, security hardening, and image size minimization.
ai
Add AI features to Glide apps using AI columns like Generate Text, Image to Text, Audio to Text. Use when adding AI-powered functionality, text generation, OCR, transcription, or auto-categorization.
vr-avatar-engineer
Expert in photorealistic and stylized VR avatar systems for Apple Vision Pro, Meta Quest, and cross-platform metaverse. Specializes in facial tracking (52+ blend shapes), subsurface scattering, Persona-style generation, Photon networking, and real-time LOD. Activate on 'VR avatar', 'Vision Pro Persona', 'Meta avatar', 'facial tracking', 'blend shapes', 'avatar networking', 'photorealistic avatar'. NOT for 2D profile pictures (use image generation), non-VR game characters (use game engine tools), static 3D models (use modeling tools), or motion capture hardware setup.
reachy-mini-sdk
Comprehensive guide for programming the Reachy Mini robot using the Python SDK v1.2.6. Use when working with Reachy Mini robot control, motion programming, sensor access, audio/video processing, or building AI applications. Covers movement control (head, antennas, body), camera/microphone access, motion recording/playback, coordinate systems, and Hugging Face integration. Essential for robotics development, AI experimentation, and interactive applications.
image-comparison-tool
Compare images with SSIM similarity scoring, pixel difference highlighting, and side-by-side visualization.
image-processing
Process, transform, and analyze images using common operations.
background-remover
Remove backgrounds from images using segmentation. Supports color-based, edge-detection, and AI-assisted removal methods. Batch processing available.
ocr-document-processor
Extract text from images and scanned PDFs using OCR. Supports 100+ languages, table detection, structured output (markdown/JSON), and batch processing.
fsharp-persistence
Implement data persistence using SQLite with Dapper, JSON files, or event sourcing. Use when: "database", "save data", "store", "CRUD", "create table", "query", "SQL", "SQLite", "Dapper", "file storage", "JSON file", "event sourcing", "persistence", "read from database", "write to database", "data access", "repository". Creates code in src/Server/Persistence.fs with async I/O patterns.
Unnamed Skill
Process multimedia files with FFmpeg (video/audio encoding, conversion, streaming, filtering, hardware acceleration) and ImageMagick (image manipulation, format conversion, batch processing, effects, composition). Use when converting media formats, encoding videos with specific codecs (H.264, H.265, VP9), resizing/cropping images, extracting audio from video, applying filters and effects, optimizing file sizes, creating streaming manifests (HLS/DASH), generating thumbnails, batch processing images, creating composite images, or implementing media processing pipelines. Supports 100+ formats, hardware acceleration (NVENC, QSV), and complex filtergraphs. | Use when: processing images, video, audio, FFmpeg, ImageMagick, media conversion.
fsharp-frontend
Implement F# frontend using Elmish MVU architecture with Feliz for React components. Use when: "add UI", "create component", "build form", "frontend", "client-side", "user interface", "view", "display", "render", "Elmish", "Feliz", "button", "input", "state management". Creates Model/Msg/update in src/Client/State.fs and views in src/Client/View.fs. Follows strict MVU pattern with RemoteData for async operations and TailwindCSS/DaisyUI for styling.
docker-multi-stage
Multi-stage builds for optimized, minimal production images with build/runtime separation.
editorial-image-generator
Creates sophisticated HBR-style editorial illustrations for any content using AI understanding and visual analysis. Use when creating conceptual illustrations, analyzing generated images, or compositing logos. Works with any brand configuration. AI-native approach: Claude reasons about content rather than using rigid templates.
scientific-schematics
Create publication-quality scientific diagrams using Nano Banana Pro AI with iterative refinement. AI generation is the default method for all diagram types. Generates high-fidelity images with automatic quality review. Specialized in neural network architectures, system diagrams, flowcharts, biological pathways, and complex scientific visualizations.
pdf-to-markdown-converter
Converts PDF files to Markdown format using PyMuPDF, extracting text content and embedded images. Fast and lightweight. Automatically fixes LaTeX umlauts (¨a → ä, etc.) and converts ß to ss (Swiss German). Use when converting PDFs to Markdown, extracting document content, or processing PDF files for text analysis. Generates one .md file and 0..n .png files for images.
receipt-scanner
Extract vendor, date, items, amounts, and total from receipt images using OCR and pattern matching with structured JSON output.
ecosystem
JavaScript ecosystem including npm, build tools (Webpack, Vite), testing (Jest, Vitest), linting, and CI/CD integration.