Audio Processing

357 skills in Content & Media > Audio Processing

writing-enhancer

Rephrase or completely rewrite content to match the user's preferred tone, voice, and style.

memorysaver/looplia-core

Updated 2d ago

applying-brand-guidelines

Apply brand voice, tone, and style guidelines to written content across platforms. Use this skill when writing or editing content that needs to reflect a specific brand identity, adapting content for LinkedIn, Substack, X (Twitter), or other platforms while maintaining brand consistency. Triggers include requests to write on-brand content, apply brand voice, adapt content for platforms, or review content for brand alignment.

jamesgray007/hoai-course

Updated 2d ago

openai

OpenAI API via curl. Use this skill for GPT chat completions, DALL-E image generation, Whisper audio transcription, embeddings, and text-to-speech.

vm0-ai/vm0-skills

Updated 1d ago

register-twilio-test-audio

Use when adding new test audio files for Twilio voice calls, uploading audio to S3, or updating the twilio_place_call.py script with new audio options.

Updated 1d ago

research-to-essay

Research-driven essay and post creation with thematic synthesis, citation management, and voice calibration. Use when creating Substack/LinkedIn posts, long-form essays synthesizing multiple sources, or publication-grade writing requiring web search, narrative arc, and proper attribution. Triggers include "research and write about [topic]" or "dig into this idea and write."

leegonzales/AISkills

Updated 1d ago

writing-linkedin-posts

Create engaging, authentic LinkedIn posts like a Top Voice. Use this skill when asked to write LinkedIn content, social media posts for LinkedIn, professional thought leadership content, or help with LinkedIn engagement strategy. Triggers include requests for LinkedIn posts, professional social content, thought leadership pieces, or viral/engaging LinkedIn content.

jamesgray007/hoai-course

Updated 1d ago

prose-polish

Evaluate and elevate writing effectiveness through multi-dimensional quality assessment. Analyzes craft, coherence, authority, purpose, and voice with genre-calibrated thresholds. Use for refining drafts, diagnosing quality issues, generating quality content, or teaching writing principles.

leegonzales/AISkills

Updated 1d ago

discovery-interviews-surveys

Use when validating product assumptions before building, discovering unmet user needs, understanding customer problems and workflows, testing concepts or positioning, researching target markets, identifying jobs-to-be-done and hiring triggers, uncovering pain points and workarounds, or when users mention user research, customer interviews, surveys, discovery interviews, validation studies, or voice of customer.

lyndonkl/claude

Updated 1d ago

brand-guidelines

Create a BRAND_GUIDELINES.md that defines how to communicate with your customer. Requires CUSTOMER.md to exist first. Covers voice, tone, language rules, messaging framework, and copy patterns.

doodledood/claude-code-plugins

Updated 1d ago

ai-multimodal

Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection, segmentation, visual Q&A), video (scene detection, 6hr max, YouTube URLs, temporal analysis), documents (PDF extraction, tables, forms, charts), image generation (text-to-image, editing). Actions: transcribe, analyze, extract, caption, detect, segment, generate from media. Keywords: Gemini API, audio transcription, image captioning, OCR, object detection, video analysis, PDF extraction, text-to-image, multimodal, speech recognition, visual Q&A, scene detection, YouTube transcription, table extraction, form processing, image generation, Imagen. Use when: transcribing audio/video, analyzing images/screenshots, extracting data from PDFs, processing YouTube videos, generating images from text, implementing multimodal AI features.

samhvw8/dotfiles

Updated 1d ago

payment-integration

Payment gateway integration. Providers: SePay (Vietnamese: VietQR, bank transfer, cards), Polar (global SaaS: subscriptions, usage-based billing). SDKs: Node.js, PHP, Python, Go, Laravel, Next.js. Capabilities: checkout flows, subscription management, webhooks, QR code generation, benefit automation, tax compliance. Actions: integrate, implement, configure, handle payments/subscriptions/webhooks. Keywords: payment gateway, SePay, Polar, VietQR, bank transfer, subscription, usage-based billing, checkout, webhook, QR code, API key, OAuth2, product management, customer portal, tax compliance, MoR, recurring payment, invoice. Use when: integrating payment processing, implementing checkout, managing subscriptions, handling payment webhooks, generating payment QR codes, building billing systems.

samhvw8/dotfiles

Updated 1d ago

agent-orchestrator

Spawn, monitor, and manage Claude Code agents in parallel tmux sessions. Supports simple ad-hoc agents and complex DAG-based multi-agent orchestration with wave execution.

stevengonsalvez/claudecode-bootstrap

Updated 1d ago

story-explanation

Create compelling story-format summaries using UltraThink to find the best narrative framing. Supports multiple formats: 3-part narrative, n-length with inline links, abridged 5-line, or comprehensive via Foundry MCP. USE WHEN user says 'create story explanation', 'narrative summary', 'explain as a story', or wants content in Daniel's conversational first-person voice.

jeffh/claude-plugins

Updated 1d ago

ui-audio-theme

Generate cohesive UI audio themes with subtle, minimal sound effects for applications. This skill should be used when users want to create a set of coordinated interface sounds for wallet apps, dashboards, or web applications - generating sounds mapped to UI interaction constants like button clicks, notifications, and navigation transitions using ElevenLabs API.

b-open-io/prompts

Updated 1d ago

brand-voice

Defines and maintains consistent brand communication across all marketing materials. This skill should be used when creating new marketing content, auditing existing materials for voice consistency, onboarding team members to brand guidelines, or when content sounds generic or "off-brand."

Salesably/salesably-marketplace

Updated 1d ago

content-creator

Create SEO-optimized marketing content with consistent brand voice. Includes brand voice analyzer, SEO optimizer, content frameworks, and social media templates. Use when writing blog posts, creating social media content, analyzing brand voice, optimizing SEO, planning content calendars, or when user mentions content creation, brand voice, SEO optimization, social media marketing, or content strategy.

rickydwilson-dcs/claude-skills

Updated 1d ago

Vram-GPU-OOM

GPU VRAM management patterns for sharing memory across services (Ollama, Whisper, ComfyUI). OOM retry logic, auto-unload on idle, and service signaling protocol.

lawless-m/claude-skills

Updated 1d ago

ai-multimodal

Process and generate multimedia content using the Google Gemini API for better vision capabilities. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (better image analysis than Claude models, captioning, reasoning, object detection, design extraction, OCR, visual Q&A, segmentation, multiple-image handling), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), generating images (text-to-image with Imagen 4, editing, composition, refinement), and generating videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots (in place of Claude's default vision capabilities, falling back to them only if needed), processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.

The1Studio/theone-training-skills

Updated 1d ago

Whisper-Transcription

Audio transcription using local whisper.cpp server with CUDA acceleration. HTTP API for speech-to-text conversion.

lawless-m/claude-skills

Updated 1d ago

content-atomizer

Repurposes single content pieces into multiple formats for maximum distribution while maintaining brand voice. This skill should be used when maximizing ROI from pillar content, filling content calendars efficiently, reaching audiences across multiple platforms, or when creating original content for every channel feels unsustainable.

Salesably/salesably-marketplace

Updated 1d ago