Audio Processing

357 skills in Content & Media > Audio Processing

swarm

Multi-perspective reasoning through Upanishadic Antahkarana voices. Use for complex problems requiring diverse viewpoints and synthesis.

genomewalker/cc-soul
0
0
Updated 5d ago

marketing-writer

Write authentic, conversion-focused marketing content for product features and launches. Use when Maurice ships a feature, needs landing page copy, tweet threads, launch emails, or any marketing content. Automatically analyzes codebase to understand features and value props. Brand voice is casual, direct, no corporate buzzwords - focuses on real benefits in simple language.

skycruzer/fleet-management-v2
0
0
Updated 5d ago

brand-voice

Define or extract a consistent brand voice that other skills can use. Two modes - Extract (analyze existing content you're proud of) or Build (strategically construct a voice from scratch). Use when starting a project, when copy sounds generic, or when output needs to sound like a specific person/brand. Triggers on: what's my voice, analyze my brand, help me define my voice, make this sound like me, voice guide, brand personality. Outputs a voice profile that can be fed into direct-response-copy and other content skills.

GroundMountCompany/groundmounts-app
0
0
Updated 5d ago

typescript-taste

Apply rigorous TypeScript type design with strong inference, minimal constraints, and sound fallbacks.

iplaylf2/khora
0
0
Updated 5d ago

audio-quality-checker

Analyze the WaveCap-SDR audio stream to assess tuning quality, detect silence, noise, proper audio, or distortion. Use when checking if SDR channels are properly configured or debugging audio issues.

majiayu000/claude-skill-registry
0
0
Updated 5d ago

cover-letter-voice

Develop an authentic cover letter narrative using philosophy, patterns, and the job's cultural requirements.

majiayu000/claude-skill-registry
0
0
Updated 5d ago

ui-token-first

Enforce UI token usage for Espresso Engineered frontend work. Use when editing Svelte/SvelteKit UI, styling typography, voice lines, headers, cards, surfaces, or layout so styles come from frontend/src/lib/ui tokens instead of app.css or ad-hoc CSS.

nickabeelee/espresso-engineered
0
0
Updated 5d ago

dheplab-newsletter

DHEPLab Newsletter content pipeline for LinkedIn and Substack. Creates, optimizes, and manages thought leadership content establishing DHEPLab as the premier voice in digital health economics. Use for LinkedIn posts, Substack newsletters, content calendar management, and engagement tracking.

sysylvia/ssylvia-website
0
0
Updated 5d ago

recipe-builder

Create and manage WaveCap-SDR recipe templates for common capture scenarios. Use when setting up new band plans, creating presets for trunking systems, or building reusable multi-channel configurations for marine/aviation/broadcast monitoring.

majiayu000/claude-skill-registry
0
0
Updated 5d ago

nnt-compiler

Work with the NNT (Nakul Notation Tool) compiler - parse music notation shorthand, query musical structures, and export to MusicXML, ABC, and other formats for PhD research and educational content

theslyprofessor/claude-skills
0
0
Updated 5d ago

audio-transcribe

Marketplace

Convert audio/video to text using Whisper, with word-level timestamps. Use when the user wants to convert speech, audio, or video to text, generate subtitles, transcribe audio, or recognize speech.

InfQuest/vibe-ops-plugin
0
0
Updated 5d ago

ai-multimodal

Process and generate multimedia content using Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

Leoph1688/ClaudeKit
0
0
Updated 5d ago

gemini-audio

Guide for implementing Google Gemini API audio capabilities - analyze audio with transcription, summarization, and understanding (up to 9.5 hours), plus generate speech with controllable TTS. Use when processing audio files, creating transcripts, analyzing speech/music/sounds, or generating natural speech from text.

levanminhduc/LuongHoaThoNew
0
0
Updated 5d ago

audio-effect

Create standard SuperCollider audio effects for Bice-Box (delays, reverbs, filters, distortions). Provides templates, ControlSpecs, common patterns, and MCP workflow for safely creating/updating effects.

majiayu000/claude-skill-registry
0
0
Updated 5d ago

widget-tester

Expert assistant for testing the embeddable Bible widget functionality in the KR92 Bible Voice project. Use when creating widget tests, validating embed API responses, testing reference formats, checking audio integration, or creating regression test cases.

majiayu000/claude-skill-registry
0
0
Updated 5d ago

ai-multimodal

Process and generate multimedia content using Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

nodays-off/rack-reserve
0
0
Updated 5d ago

content-research-writer

Creates high-quality content (blog posts, tweets, newsletters, documentation) that matches the user's writing style and voice. Performs web research to find citations and supporting evidence. Use when user requests blog posts, marketing content, newsletters, tweets, or any written content that should sound authentic and be well-researched.

breverdbidder/life-os
0
0
Updated 5d ago

wavecap-audio

Analyze recorded audio files from WaveCap. Use when the user wants to inspect audio recordings, check audio quality, list available recordings, or get audio file metadata.

TobiasWooldridge/WaveCap
0
0
Updated 5d ago

livekit-stt-selfhosted

Marketplace

Build self-hosted speech-to-text APIs using Hugging Face models (Whisper, Wav2Vec2) and create LiveKit voice agent plugins. Use when building STT infrastructure, creating custom LiveKit plugins, deploying self-hosted transcription services, or integrating Whisper/HF models with LiveKit agents. Includes FastAPI server templates, LiveKit plugin implementation, model selection guides, and production deployment patterns.

Okeysir198/P20251122-claude-skills
0
0
Updated 5d ago

elevenlabs-agents

Work with ElevenLabs Conversational AI agents - initiate calls, retrieve transcripts, manage phone numbers, and analyze agent conversations. Use when building or testing voice AI applications.

taskcrew/elevenlabs-agents
0
0
Updated 5d ago