Audio Processing
357 skills in Content & Media > Audio Processing
openai-latam-audiobook
Create a complete audiobook using OpenAI GPT-4o-mini translation to Argentine Spanish. Full pipeline: translate, TTS, video with background image. Auto-scales parallelism based on file size. Use when the user wants to create an audiobook in LATAM Spanish from Russian text.
asr
Implement speech-to-text (ASR/automatic speech recognition) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to transcribe audio files, convert speech to text, build voice input features, or process audio recordings. Supports base64 encoded audio files and returns accurate text transcriptions.
campaign-page-copy
Generates full campaign page content structured for Kickstarter/Indiegogo using positioning, product, and voice assets.
bigquery-object-table-agent
Guide to analyzing unstructured data (audio, images, etc.) with BigQuery Object Tables and building an Audio Analytics Agent. Covers GCS data integration, metadata caching, AI model integration, and ADK agent implementation patterns.
channel-optimizer
Auto-tune channel parameters to find optimal offset, squelch, and AGC settings for best audio quality. Use when setting up new channels, improving weak signals, or finding the sweet spot for demodulation settings.
podcast-production
Podcast production patterns and workflows. Use when recording podcasts, editing audio, transcribing episodes, generating show notes, RSS feed management, or podcast distribution.
rust-candle-whisper
Implement native Rust ML inference with Candle framework. Use when building GPU-accelerated ML pipelines without Python dependencies.
transcribe-audio-to-text
Transcribe audio files to text using the audinota CLI.
wavecap-hallucination
Configure WaveCap hallucination detection and prevention. Use when Whisper outputs gibberish, repeated phrases, or phantom text on silent audio.
brand-strategy
This skill should be used when translating research insights into actionable brand strategy frameworks. Use this when developing positioning statements, messaging architectures, audience strategies, or voice guidelines based on completed research. This skill provides strategic synthesis workflows, validation frameworks, and strategy document templates for evidence-based brand strategy development.
sag
ElevenLabs text-to-speech with a macOS-style say UX.
marketing-writer
Create marketing content optimized for both human readers and LLM discovery (GEO/AEO). Use when the user needs to write or improve marketing materials including landing page copy, tweet threads, launch emails, blog posts, or feature announcements. Automatically analyzes the user's codebase to understand product features and value propositions. Applies casual, direct brand voice and Generative Engine Optimization principles to maximize visibility in AI search results.
claude-hook-builder
Interactive hook creator for Claude Code. Triggers when the user mentions creating hooks, PreToolUse, PostToolUse, hook validation, hook configuration, or settings.json hooks, or wants to automate tool execution workflows.
livekit-voice-agent
Guide for building production-ready LiveKit voice AI agents with multi-agent workflows and intelligent handoffs. Use when creating real-time voice agents that need to transfer control between specialized agents, implement supervisor escalation, or build complex conversational systems.
content-publishing
Automated content publishing pipeline for ID8Labs. Generates essays in Eddie's voice, publishes to id8labs.app, and distributes to social media (X, LinkedIn). Triggers on keywords like release, announce, publish, essay, research article, content pipeline.
writer
Generate content in your authentic voice across emails, blogs, social media, and reports.
m4b-audiobook-builder
Build and merge M4B audiobooks on Linux from multiple audio files or multi-part M4B sets, with chapter generation, metadata normalization, UTF-8/Russian encoding handling, and validation. Use when combining MP3/M4A/AAC/FLAC/OGG/WAV into one M4B, merging split M4B parts, or fixing audiobook chapters and metadata.