source-detection
Detect document source type (Zoom, Slack, JIRA, Confluence, email, meeting notes) from content patterns. Use when processing raw documents during intake or when identifying document types.
$ インストール
git clone https://github.com/majiayu000/claude-skill-registry /tmp/claude-skill-registry && cp -r /tmp/claude-skill-registry/skills/data/source-detection ~/.claude/skills/claude-skill-registry// tip: Run this command in your terminal to install the skill
name: source-detection description: Detect document source type (Zoom, Slack, JIRA, Confluence, email, meeting notes) from content patterns. Use when processing raw documents during intake or when identifying document types.
Source Detection Patterns
Use these patterns to identify document source types from plain text content.
Detection Priority
Check patterns in this order (first match wins with high confidence):
- Zoom transcripts (most distinctive patterns)
- Slack conversations
- JIRA tickets
- Confluence/wiki pages
- Email threads
- Meeting notes
- General notes (default)
Zoom Transcript
High confidence indicators (any 2+ = zoom):
- Timestamps in
HH:MM:SSformat (e.g.,00:15:32,01:23:45) - Speaker labels with colon:
John Smith:followed by text WEBVTTheader at start of file- Multiple speakers with consistent formatting
Medium confidence indicators (supporting evidence):
- Phrases: "you're on mute", "can you hear me", "let me share my screen"
- "unmute", "muted", "screen share"
- Time-based speaker transitions
Example patterns:
WEBVTT
00:00:15.000 --> 00:00:20.000
John Smith: Good morning everyone, let's get started.
00:00:21.000 --> 00:00:25.000
Jane Doe: Hi John, can you hear me okay?
Slack Conversation
High confidence indicators (any 2+ = slack):
- Channel references:
#channel-name - User mentions:
@usernameor@User Name - Timestamp formats:
10:30 AM,2:45 PM,14:30 - Thread markers: "replied to a thread", "Thread", "in thread"
Medium confidence indicators:
- Emoji reactions:
:thumbsup:,:+1:,:smile: - Bot messages with structured formatting
- "Slack" mentioned in content
- Day/date groupings: "Monday, January 15th"
Example patterns:
#data-pipeline
Monday, January 15th
@john.smith 10:30 AM
Hey team, quick update on the migration
@jane.doe 10:32 AM
Thanks! :+1: Can you share the timeline?
replied to a thread:
@john.smith: Sure, here's the breakdown...
JIRA Ticket
High confidence indicators (any 2+ = jira):
- Ticket ID pattern:
[A-Z]{2,10}-\d+(e.g.,PROJ-1234,DATA-567) - Section headers: "Summary:", "Description:", "Acceptance Criteria:"
- Status fields: "Status:", "Priority:", "Assignee:", "Reporter:"
Medium confidence indicators:
- Status values: "To Do", "In Progress", "Done", "Blocked", "In Review"
- "Story Points:", "Sprint:", "Epic Link:"
- "Components:", "Labels:", "Fix Version:"
- Atlassian/JIRA mentioned
Example patterns:
PROJ-1234: Implement user authentication
Status: In Progress
Priority: High
Assignee: John Smith
Reporter: Jane Doe
Description:
As a user, I want to log in securely...
Acceptance Criteria:
- [ ] User can log in with email/password
- [ ] Session expires after 30 minutes
Confluence/Wiki
High confidence indicators (any 2+ = confluence):
- Page/Space references: "Page:", "Space:", "Parent:"
- Wiki-style headers with
=signs:== Section == - Confluence macros:
{code},{info},{note},{panel} - Table of contents patterns
Medium confidence indicators:
- Breadcrumb paths:
Space > Parent Page > This Page - "Last modified by", "Created by" with timestamps
- Atlassian/Confluence mentioned
- Wiki-style links:
[[Page Name]]or[Link Text|Page Name]
Example patterns:
Space: Engineering
Page: Architecture Overview
Created by: John Smith | Last modified: Jan 15, 2024
= Architecture Overview =
== Components ==
{info}
This document describes the high-level architecture.
{info}
Email Thread
High confidence indicators (any 2+ = email):
- Headers: "From:", "To:", "Subject:", "Date:", "Sent:"
- Email addresses:
name@domain.com - Reply indicators: "Re:", "RE:", "Fwd:", "FW:"
Medium confidence indicators:
- CC/BCC fields
- Signature blocks (lines starting with
--or common sign-offs) - "Original Message" or "Forwarded message"
- Quote indicators:
>at start of lines
Example patterns:
From: John Smith <john.smith@company.com>
To: Jane Doe <jane.doe@company.com>
Subject: Re: Quarterly Planning
Date: Mon, Jan 15, 2024 10:30 AM
Hi Jane,
Thanks for sending over the proposal...
--
John Smith
Senior Engineer
Meeting Notes
High confidence indicators (any 2+ = meeting):
- Section headers: "Agenda", "Attendees", "Action Items", "Minutes"
- Structured bullet lists under sections
- Date header at document start
- "Meeting" in title or first lines
Medium confidence indicators:
- Time references for agenda items
- Names followed by task assignments
- "Next meeting:", "Follow-up:"
- Decision markers: "Decided:", "Agreed:", "TODO:"
Example patterns:
Meeting Notes - Sprint Planning
Date: January 15, 2024
Attendees:
- John Smith
- Jane Doe
- Bob Wilson
Agenda:
1. Review previous sprint
2. Plan upcoming work
Action Items:
- [ ] John: Update documentation
- [ ] Jane: Review PRs
General Notes (Default)
If no specific pattern matches with confidence, classify as notes.
Indicators that confirm "notes":
- Unstructured text
- Personal notes style
- Mixed content without clear source indicators
- Bullet points without meeting structure
Confidence Levels
- High: 2+ high-confidence indicators match
- Medium: 1 high-confidence indicator OR 3+ medium-confidence indicators
- Low: Only medium-confidence indicators OR uncertain match
Output Format
When detecting source, return:
source: zoom|slack|jira|confluence|email|meeting|notes
confidence: high|medium|low
indicators:
- "indicator 1 found"
- "indicator 2 found"
Repository
