Audio Processing

357 skills in Content & Media > Audio Processing

swarm

Multi-perspective reasoning through Upanishadic Antahkarana voices. Use for complex problems requiring diverse viewpoints and synthesis.

genomewalker/cc-soul
0
0
Updated 5d ago

marketing-writer

Write authentic, conversion-focused marketing content for product features and launches. Use when Maurice ships a feature, needs landing page copy, tweet threads, launch emails, or any marketing content. Automatically analyzes codebase to understand features and value props. Brand voice is casual, direct, no corporate buzzwords - focuses on real benefits in simple language.

skycruzer/fleet-management-v2
0
0
Updated 5d ago

brand-voice

Define or extract a consistent brand voice that other skills can use. Two modes - Extract (analyze existing content you're proud of) or Build (strategically construct a voice from scratch). Use when starting a project, when copy sounds generic, or when output needs to sound like a specific person/brand. Triggers on: what's my voice, analyze my brand, help me define my voice, make this sound like me, voice guide, brand personality. Outputs a voice profile that can be fed into direct-response-copy and other content skills.

GroundMountCompany/groundmounts-app
0
0
Updated 5d ago

typescript-taste

Apply rigorous TypeScript type design with strong inference, minimal constraints, and sound fallbacks.

iplaylf2/khora
0
0
Updated 5d ago

audio-quality-checker

Analyze the WaveCap-SDR audio stream to assess tuning quality, detect silence, noise, proper audio, or distortion. Use when checking if SDR channels are properly configured or debugging audio issues.

majiayu000/claude-skill-registry
0
0
Updated 5d ago

cover-letter-voice

Develop an authentic cover letter narrative using philosophy, patterns, and the job's cultural requirements.

majiayu000/claude-skill-registry
0
0
Updated 5d ago

ui-token-first

Enforce UI token usage for Espresso Engineered frontend work. Use when editing Svelte/SvelteKit UI, styling typography, voice lines, headers, cards, surfaces, or layout so styles come from frontend/src/lib/ui tokens instead of app.css or ad-hoc CSS.

nickabeelee/espresso-engineered
0
0
Updated 5d ago

dheplab-newsletter

DHEPLab Newsletter content pipeline for LinkedIn and Substack. Creates, optimizes, and manages thought leadership content establishing DHEPLab as the premier voice in digital health economics. Use for LinkedIn posts, Substack newsletters, content calendar management, and engagement tracking.

sysylvia/ssylvia-website
0
0
Updated 5d ago

recipe-builder

Create and manage WaveCap-SDR recipe templates for common capture scenarios. Use when setting up new band plans, creating presets for trunking systems, or building reusable multi-channel configurations for marine/aviation/broadcast monitoring.

majiayu000/claude-skill-registry
0
0
Updated 5d ago

nnt-compiler

Work with the NNT (Nakul Notation Tool) compiler - parse music notation shorthand, query musical structures, and export to MusicXML, ABC, and other formats for PhD research and educational content

theslyprofessor/claude-skills
0
0
Updated 5d ago

audio-transcribe

Marketplace

Convert audio or video to text using Whisper, with word-level timestamps. Use when the user wants to convert speech, audio, or video to text, generate subtitles, transcribe audio, or recognize speech.

InfQuest/vibe-ops-plugin
0
0
Updated 5d ago

ai-multimodal

Process and generate multimedia content using Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

Leoph1688/ClaudeKit
0
0
Updated 5d ago

gemini-audio

Guide for implementing Google Gemini API audio capabilities - analyze audio with transcription, summarization, and understanding (up to 9.5 hours), plus generate speech with controllable TTS. Use when processing audio files, creating transcripts, analyzing speech/music/sounds, or generating natural speech from text.

levanminhduc/LuongHoaThoNew
0
0
Updated 5d ago

audio-effect

Create standard SuperCollider audio effects for Bice-Box (delays, reverbs, filters, distortions). Provides templates, ControlSpecs, common patterns, and MCP workflow for safely creating/updating effects.

majiayu000/claude-skill-registry
0
0
Updated 5d ago

widget-tester

Expert assistant for testing the embeddable Bible widget functionality in the KR92 Bible Voice project. Use when creating widget tests, validating embed API responses, testing reference formats, checking audio integration, or creating regression test cases.

majiayu000/claude-skill-registry
0
0
Updated 5d ago

ai-multimodal

Process and generate multimedia content using Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

nodays-off/rack-reserve
0
0
Updated 5d ago

content-research-writer

Creates high-quality content (blog posts, tweets, newsletters, documentation) that matches the user's writing style and voice. Performs web research to find citations and supporting evidence. Use when user requests blog posts, marketing content, newsletters, tweets, or any written content that should sound authentic and be well-researched.

breverdbidder/life-os
0
0
Updated 5d ago

wavecap-audio

Analyze recorded audio files from WaveCap. Use when the user wants to inspect audio recordings, check audio quality, list available recordings, or get audio file metadata.

TobiasWooldridge/WaveCap
0
0
Updated 5d ago

livekit-stt-selfhosted

Marketplace

Build self-hosted speech-to-text APIs using Hugging Face models (Whisper, Wav2Vec2) and create LiveKit voice agent plugins. Use when building STT infrastructure, creating custom LiveKit plugins, deploying self-hosted transcription services, or integrating Whisper/HF models with LiveKit agents. Includes FastAPI server templates, LiveKit plugin implementation, model selection guides, and production deployment patterns.

Okeysir198/P20251122-claude-skills
0
0
Updated 5d ago

elevenlabs-agents

Work with ElevenLabs Conversational AI agents - initiate calls, retrieve transcripts, manage phone numbers, and analyze agent conversations. Use when building or testing voice AI applications.

taskcrew/elevenlabs-agents
0
0
Updated 5d ago