Image Processing
912 skills in Content & Media > Image Processing
reachy-mini
Complete SDK for controlling the Reachy Mini robot - head movement, antennas, camera, audio, motion recording/playback. Covers architecture (daemon/client), deployment modes (USB, wireless, simulation, on-Pi), and app distribution. Also includes advanced application patterns: MovementManager, layered motion, audio-reactive movement, face tracking, LLM tool systems, and OpenAI realtime integration. Use when: (1) Writing code to control Reachy Mini, (2) Moving the robot head or antennas, (3) Accessing camera/video, (4) Playing/recording audio, (5) Recording or playing back motions, (6) Looking at points in image or world space, (7) Understanding robot capabilities, (8) Connecting to a real or simulated robot, (9) Building conversational AI apps, (10) Integrating with LLMs/OpenAI, (11) Deploying apps to the robot, (12) Any robotics task with Reachy Mini.
generate-image
Generate and edit images using Google Gemini (Nano Banana). Use when the user asks to create, generate, or edit images. Requires Chrome logged into gemini.google.com.
mermaid-export
Export Mermaid diagrams from documents to SVG images. Use after creating documents with Mermaid code blocks when you need to render them as static images for distribution, GitHub, wikis, or static sites. Processes .md, .html, .mdx, .rst, .adoc files.
chat-integrator
Automatically integrates processed media (audio transcriptions and image summaries) into chat.md files at the correct timestamp position. Use this when you want to merge processed .json audio files and .md image summaries into the daily chat.md conversation log.
resize-images
Resizes images that are too large for the blog. Use when asked to resize images, optimize images, or make images smaller. Can resize to a specific width (e.g., "resize to 300px") or default to 1000px max width. Defaults to the most recent post's assets folder.
drawio
Create and edit draw.io diagrams in XML format. Use when the user wants to create flowcharts, architecture diagrams, sequence diagrams, or any visual diagrams. Handles XML structure, styling, fonts (Noto Sans JP), arrows, connectors, and PNG export.
angular-frontend
Build and implement Angular 18 standalone components, TypeScript services with Signals and RxJS, routing with guards, and Tailwind CSS styling for Photo Map MVP. Use when creating, developing, or implementing TypeScript components, services, guards, forms, HTTP calls, map integration (Leaflet.js), or responsive UI layouts with Tailwind utilities. File types: .ts, .html, .css, .scss
deploy-vps
Deploy a new image to an Ubuntu VPS host using GHCR and a deploy script.
worldcrafter-feature-builder
Build complete features with Server Actions, forms, Zod validation, database CRUD operations, and comprehensive tests. Use when the user requests "add a feature", "build a [feature]", "create [feature] with forms", or needs end-to-end implementation with validation and testing. Scaffolds pages, actions, schemas, loading/error states, and unit/integration/E2E tests. Supports multi-step wizards, image uploads, markdown editing, custom JSON attributes, and relationship management. Do NOT use for simple static pages (use worldcrafter-route-creator), database-only changes (use worldcrafter-database-setup), testing existing code (use worldcrafter-test-generator), or auth-only additions (use worldcrafter-auth-guard).
docx-processor
Process and generate Word documents with formatting, tables, and images. Use when working with Word documents or generating reports.
brand-guide
Generate and maintain brand style guides - colors, fonts, imagery, voice/tone, responsive specs. Use when documenting brand identity or creating style guide pages.
summarize
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
devops
Deploy and manage cloud infrastructure on Cloudflare (Workers, R2, D1, KV, Pages, Durable Objects, Browser Rendering), Docker containers, and Google Cloud Platform (Compute Engine, GKE, Cloud Run, App Engine, Cloud Storage). Use when deploying serverless functions to the edge, configuring edge computing solutions, managing Docker containers and images, setting up CI/CD pipelines, optimizing cloud infrastructure costs, implementing global caching strategies, working with cloud databases, or building cloud-native applications.
docker-backend
Dockerizes backend projects with auto-detection, latest base images via web search, Dockerfile generation, and Makefile with port override support.
rp2350-micropython
Expert RP2350 development with MicroPython, covering Pimoroni Presto hardware, touchscreen interfaces, RGB lighting control, BLE server implementation, and display rendering. Use when developing for RP2350 boards, implementing touch UI, managing BLE communication, or working with RGB backlights.
gemini-image
Generate images from text prompts using fal.ai Gemini 3 Pro. Use when the user asks to create, generate, or make an image from a text description. Supports multiple aspect ratios and resolutions up to 4K.
find-screenshot
Find and attach the newest screenshot PNG in ~/temp/screenshots. Use when a user asks to locate, show, or attach the latest screenshot or wants the newest PNG from the screenshots folder.
notion-to-mdx
Automatically converts Notion markdown exports to Next.js MDX format with proper metadata and formatting. Use this skill when working with Notion markdown files that need to be converted to page.mdx files in this Next.js repository. Handles title extraction, interactive dropdown creation, and automatic image downloading from Notion.
hover-interactions
Use when creating mouse hover effects - button highlights, card lifts, link underlines, image zooms, or any pointer-triggered animation.
veo-frames-to-video
Generate video from first and last frame images using fal.ai Veo 3.1. Use when the user wants to create a video transition between two images, morph between scenes, or generate smooth video connecting a starting and ending frame.