
Audio Processing

357 skills in Content & Media > Audio Processing

writing-enhancer

Marketplace

Rephrase or completely rewrite content to match the user's preferred tone, voice, and style.

memorysaver/looplia-core
9
0
Updated 2d ago

applying-brand-guidelines

Apply brand voice, tone, and style guidelines to written content across platforms. Use this skill when writing or editing content that needs to reflect a specific brand identity, adapting content for LinkedIn, Substack, X (Twitter), or other platforms while maintaining brand consistency. Triggers include requests to write on-brand content, apply brand voice, adapt content for platforms, or review content for brand alignment.

jamesgray007/hoai-course
8
3
Updated 2d ago

openai

Marketplace

OpenAI API via curl. Use this skill for GPT chat completions, DALL-E image generation, Whisper audio transcription, embeddings, and text-to-speech.

vm0-ai/vm0-skills
8
0
Updated 1d ago

register-twilio-test-audio

Use when adding new test audio files for Twilio voice calls, uploading audio to S3, or updating the twilio_place_call.py script with new audio options.

cncorp/arsenal
8
0
Updated 1d ago

research-to-essay

Research-driven essay and post creation with thematic synthesis, citation management, and voice calibration. Use when creating Substack/LinkedIn posts, long-form essays synthesizing multiple sources, or publication-grade writing requiring web search, narrative arc, and proper attribution. Triggers include "research and write about [topic]" or "dig into this idea and write."

leegonzales/AISkills
8
2
Updated 1d ago

writing-linkedin-posts

Create engaging, authentic LinkedIn posts like a Top Voice. Use this skill when asked to write LinkedIn content, social media posts for LinkedIn, professional thought leadership content, or help with LinkedIn engagement strategy. Triggers include requests for LinkedIn posts, professional social content, thought leadership pieces, or viral/engaging LinkedIn content.

jamesgray007/hoai-course
8
3
Updated 1d ago

prose-polish

Evaluate and elevate writing effectiveness through multi-dimensional quality assessment. Analyzes craft, coherence, authority, purpose, and voice with genre-calibrated thresholds. Use for refining drafts, diagnosing quality issues, generating quality content, or teaching writing principles.

leegonzales/AISkills
8
2
Updated 1d ago

discovery-interviews-surveys

Marketplace

Use when validating product assumptions before building, discovering unmet user needs, understanding customer problems and workflows, testing concepts or positioning, researching target markets, identifying jobs-to-be-done and hiring triggers, uncovering pain points and workarounds, or when users mention user research, customer interviews, surveys, discovery interviews, validation studies, or voice of customer.

lyndonkl/claude
8
1
Updated 1d ago

brand-guidelines

Marketplace

Create a BRAND_GUIDELINES.md that defines how to communicate with your customer. Requires CUSTOMER.md to exist first. Covers voice, tone, language rules, messaging framework, and copy patterns.

doodledood/claude-code-plugins
7
0
Updated 1d ago

ai-multimodal

Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection, segmentation, visual Q&A), video (scene detection, 6hr max, YouTube URLs, temporal analysis), documents (PDF extraction, tables, forms, charts), image generation (text-to-image, editing). Actions: transcribe, analyze, extract, caption, detect, segment, generate from media. Keywords: Gemini API, audio transcription, image captioning, OCR, object detection, video analysis, PDF extraction, text-to-image, multimodal, speech recognition, visual Q&A, scene detection, YouTube transcription, table extraction, form processing, image generation, Imagen. Use when: transcribing audio/video, analyzing images/screenshots, extracting data from PDFs, processing YouTube videos, generating images from text, implementing multimodal AI features.

samhvw8/dotfiles
7
2
Updated 1d ago

payment-integration

Payment gateway integration. Providers: SePay (Vietnamese: VietQR, bank transfer, cards), Polar (global SaaS: subscriptions, usage-based billing). SDKs: Node.js, PHP, Python, Go, Laravel, Next.js. Capabilities: checkout flows, subscription management, webhooks, QR code generation, benefit automation, tax compliance. Actions: integrate, implement, configure, handle payments/subscriptions/webhooks. Keywords: payment gateway, SePay, Polar, VietQR, bank transfer, subscription, usage-based billing, checkout, webhook, QR code, API key, OAuth2, product management, customer portal, tax compliance, MoR, recurring payment, invoice. Use when: integrating payment processing, implementing checkout, managing subscriptions, handling payment webhooks, generating payment QR codes, building billing systems.

samhvw8/dotfiles
7
2
Updated 1d ago

agent-orchestrator

Spawn, monitor, and manage Claude Code agents in parallel tmux sessions. Supports simple ad-hoc agents and complex DAG-based multi-agent orchestration with wave execution.

stevengonsalvez/claudecode-bootstrap
6
4
Updated 1d ago

story-explanation

Marketplace

Create compelling story-format summaries using UltraThink to find the best narrative framing. Supports multiple formats: 3-part narrative, n-length with inline links, abridged 5-line, or comprehensive via Foundry MCP. USE WHEN user says 'create story explanation', 'narrative summary', 'explain as a story', or wants content in Daniel's conversational first-person voice.

jeffh/claude-plugins
6
0
Updated 1d ago

ui-audio-theme

Generate cohesive UI audio themes with subtle, minimal sound effects for applications. This skill should be used when users want to create a set of coordinated interface sounds for wallet apps, dashboards, or web applications - generating sounds mapped to UI interaction constants like button clicks, notifications, and navigation transitions using ElevenLabs API.

b-open-io/prompts
6
2
Updated 1d ago

brand-voice

Marketplace

Defines and maintains consistent brand communication across all marketing materials. This skill should be used when creating new marketing content, auditing existing materials for voice consistency, onboarding team members to brand guidelines, or when content sounds generic or "off-brand."

Salesably/salesably-marketplace
5
0
Updated 1d ago

content-creator

Create SEO-optimized marketing content with consistent brand voice. Includes brand voice analyzer, SEO optimizer, content frameworks, and social media templates. Use when writing blog posts, creating social media content, analyzing brand voice, optimizing SEO, planning content calendars, or when user mentions content creation, brand voice, SEO optimization, social media marketing, or content strategy.

rickydwilson-dcs/claude-skills
5
2
Updated 1d ago

Vram-GPU-OOM

GPU VRAM management patterns for sharing memory across services (Ollama, Whisper, ComfyUI). OOM retry logic, auto-unload on idle, and service signaling protocol.

lawless-m/claude-skills
5
0
Updated 1d ago

ai-multimodal

Process and generate multimedia content using the Google Gemini API for better vision capabilities. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (better image analysis than Claude models, captioning, reasoning, object detection, design extraction, OCR, visual Q&A, segmentation, handling multiple images), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), generating images (text-to-image with Imagen 4, editing, composition, refinement), and generating videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots (in place of Claude's default vision capabilities, falling back to them only if needed), processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.

The1Studio/theone-training-skills
5
4
Updated 1d ago

Whisper-Transcription

Audio transcription using local whisper.cpp server with CUDA acceleration. HTTP API for speech-to-text conversion.

lawless-m/claude-skills
5
0
Updated 1d ago

content-atomizer

Marketplace

Repurposes single content pieces into multiple formats for maximum distribution while maintaining brand voice. This skill should be used when maximizing ROI from pillar content, filling content calendars efficiently, reaching audiences across multiple platforms, or when creating original content for every channel feels unsustainable.

Salesably/salesably-marketplace
5
0
Updated 1d ago