Audio Processing
357 skills in Content & Media > Audio Processing
amcs-producer-notes-generator
Generate production notes defining song structure, hooks, instrumentation hints, per-section tags, and mix parameters. Aligns with style spec and blueprint production guidelines. Use when creating arrangement, dynamics, and audio engineering guidance for composition and rendering.
Create LinkedIn posts for lead generation and thought leadership. Use when projecting ideas to LinkedIn, creating post series, or reviewing LinkedIn content. Includes LinkedIn-specific voice (punchier, more personal), format rules, and weekly rhythm guidance.
blog-post-reviewer
Review and provide feedback on blog posts to ensure they match the established writing voice and style guidelines.
blog-voice-review
Review blog content for authentic voice and tone. Checks if content sounds like Fabio's conversational, honest technical writing style. Trigger phrases: "voice", "voice review", "tone", "sounds like me", "authentic", "check voice", "voice check"
torchaudio
Audio signal processing library for PyTorch. Covers feature extraction (spectrograms, mel-scale), waveform manipulation, and GPU-accelerated data augmentation techniques. (torchaudio, melscale, spectrogram, pitchshift, specaugment, waveform, resample)
traktor-dj-autonomous
Complete autonomous DJ system for Traktor Pro 3 with MIDI control, intelligent track selection, energy flow management, and professional mixing workflows. Use when working on DJ automation, Traktor integration, autonomous music mixing, or building DJ agents that need real-time performance capabilities.
content-research-writer
Write blog posts/newsletters in YOUR voice with research and citations
brand-guidelines
Brand voice and visual identity guidelines
print-styles
Write print-friendly CSS using @media print. Use when creating printable pages, invoices, receipts, articles, or any content users might print.
prose-generation
Guidelines for generating coherent book prose including voice and tone, paragraph structure, transition techniques, and chapter flow patterns. Use when writing book content, structuring chapters, or maintaining narrative consistency.
ai-multimodal
Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.
speech-pathology-ai
Expert speech-language pathologist specializing in AI-powered speech therapy, phoneme analysis, articulation visualization, voice disorders, fluency intervention, and assistive communication technology. Activate on 'speech therapy', 'articulation', 'phoneme analysis', 'voice disorder', 'fluency', 'stuttering', 'AAC', 'pronunciation', 'speech recognition', 'mellifluo.us'. NOT for general audio processing, music production, or voice acting coaching without clinical context.
whisper-transcribe
Transcribes audio and video files to text using OpenAI's Whisper CLI with contextual grounding. Converts audio/video to text, transcribes recordings, and creates transcripts from media files. Use when asked to "whisper transcribe", "transcribe audio", "convert recording to text", or "speech to text". Uses markdown files in the same directory as context to improve transcription accuracy for technical terms, proper nouns, and domain-specific vocabulary.
cross-repo-sync
Sync files between BobTheSkull5 and BobFast5 repositories, especially audio files and shared components. Use when syncing audio, copying static files, or managing dual-repo workflow.
gray-swan-ipi-wave-2-executor
Execute Indirect Prompt Injection attacks for Gray Swan AI Arena Wave 2 with pre-built payloads, model profiling, and evidence collection automation
swarm
Multi-perspective reasoning through Upanishadic Antahkarana voices. Use for complex problems requiring diverse viewpoints and synthesis.
marketing-content-guidelines
Social media content creation framework for MYCURE products. Auto-activates for social media posts, marketing campaigns, brand voice content, Instagram/LinkedIn/Twitter/Facebook posts. Includes Hook-Body-CTA-Hashtags structure, content pillars strategy, platform-specific formatting, and character count validation for optimal engagement.
create-script
Transforms content into a voiceover-ready script optimized for Chatterbox TTS. Use when the user provides ANY content for voiceover - URLs, raw text, video scripts, notes, or asks to "create a script" for audio.
notebooklm-superskill
Generate slide decks, audio podcasts, infographics, and video overviews from NotebookLM notebooks. Customizable by audience, format, language (80+), orientation, and visual themes. Use when asked to generate slides, create podcast, make infographic, video overview, or automate NotebookLM content creation.