newsletter-events-research
Research events from Instagram, web aggregators, and Facebook event URLs. Use when scraping event sources, downloading flyer images, or extracting event details.
Install
git clone https://github.com/aniketpanjwani/local_media_tools /tmp/local_media_tools && cp -r /tmp/local_media_tools/skills/newsletter-events-research ~/.claude/skills/local_media_tools
Tip: Run this command in your terminal to install the skill.
<essential_principles>
How This Skill Works
This skill gathers raw event data from configured sources. It does NOT write newsletter content - use newsletter-events-write for that.
Data Sources
- Instagram - Via ScrapeCreators API (requires API key)
- Web Aggregators - Via Firecrawl (requires API key)
- Facebook Events - Pass event URLs directly (e.g., https://facebook.com/events/123456)
Output
Research produces structured data saved to ~/.config/local-media-tools/data/:
- `data/raw/instagram_<handle>.json` - Raw API responses
- `data/images/instagram/<handle>/` - Downloaded flyer images
- `data/events.db` - SQLite database with profiles, posts, events, venues
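To check what a research run produced, the database listed above can be inspected without modifying it. The following is a minimal sketch, assuming only that `events.db` is an ordinary SQLite file. The helper name `list_tables` is hypothetical, and the query reads SQLite's built-in catalog rather than relying on any particular schema.

```python
import sqlite3
from pathlib import Path

# Location taken from the Output section; adjust if your config lives elsewhere.
DATA_DIR = Path.home() / ".config" / "local-media-tools" / "data"


def list_tables(db_path: Path) -> list[str]:
    """Return the table names in a SQLite file, opened read-only."""
    # mode=ro makes the connection read-only, so nothing is inserted or altered.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling `list_tables(DATA_DIR / "events.db")` after a scrape should show which tables exist. Writes should still go through the CLI tools.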
Key Principle
Images are critical. Many venues post event details only in flyer images, not captions. Always analyze downloaded images with Claude's vision.
Image Download Requirement: Instagram CDN URLs return 403 when accessed via WebFetch. Images MUST be downloaded using Python's requests library with proper User-Agent headers, then analyzed locally using the Read tool.
</essential_principles>
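The image download requirement above can be sketched as follows. This is an illustration under stated assumptions, not the skill's own code: the `USER_AGENT` string, the function names, and the `data_dir` parameter are all hypothetical, and only the target layout `images/instagram/<handle>/` comes from the Output section.

```python
from pathlib import Path
from urllib.parse import urlparse

# Assumption: any browser-like User-Agent; the exact value the skill sends is not documented here.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def flyer_path(data_dir: Path, handle: str, url: str) -> Path:
    """Map an image URL to data_dir/images/instagram/<handle>/<filename>."""
    name = Path(urlparse(url).path).name or "image.jpg"
    return data_dir / "images" / "instagram" / handle / name


def download_flyer(url: str, handle: str, data_dir: Path, session=None) -> Path:
    """Download one flyer with an explicit User-Agent and return the local path."""
    if session is None:
        # Third-party dependency; imported here so flyer_path works without it.
        import requests

        session = requests.Session()
    resp = session.get(url, headers={"User-Agent": USER_AGENT}, timeout=30)
    resp.raise_for_status()  # surface a 403 instead of saving an error page
    dest = flyer_path(data_dir, handle, url)
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(resp.content)
    return dest
```

The returned path is what would then be opened locally with the Read tool for vision analysis.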
NEVER use curl or raw API calls. Always use the CLI tools provided:
Instagram:
# Scrape all configured accounts
uv run python scripts/cli_instagram.py scrape --all
# Scrape specific account
uv run python scripts/cli_instagram.py scrape --handle wayside_cider
# List posts from database
uv run python scripts/cli_instagram.py list-posts --handle wayside_cider
# Show database statistics
uv run python scripts/cli_instagram.py show-stats
# Classify posts (single or batch)
uv run python scripts/cli_instagram.py classify --post-id 123 --classification event --reason "Has future date"
uv run python scripts/cli_instagram.py classify --batch-json '[{"post_id": "123", "classification": "event", "reason": "..."}]'
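When many posts are classified at once, hand-writing the `--batch-json` string is error-prone. The sketch below builds that argument from Python data and calls the same CLI shown above. The helper names are hypothetical, and the allowed labels (`event`, `not_event`, `ambiguous`) are assumed from the success criteria in this document rather than from the CLI's own validation.

```python
import json
import subprocess

# Assumption: these are the labels the CLI accepts (taken from the success criteria).
ALLOWED = {"event", "not_event", "ambiguous"}


def build_classify_command(decisions: list[dict]) -> list[str]:
    """Build the argv for a batch classify call from a list of decisions."""
    payload = []
    for d in decisions:
        if d["classification"] not in ALLOWED:
            raise ValueError(f"unknown classification: {d['classification']!r}")
        payload.append(
            {
                "post_id": str(d["post_id"]),  # the CLI example passes IDs as strings
                "classification": d["classification"],
                "reason": d["reason"],
            }
        )
    return [
        "uv", "run", "python", "scripts/cli_instagram.py",
        "classify", "--batch-json", json.dumps(payload),
    ]


def run_classify(decisions: list[dict]) -> None:
    """Run the CLI; passing a list avoids shell-quoting problems in the JSON."""
    subprocess.run(build_classify_command(decisions), check=True)
```

Passing the arguments as a list (rather than one shell string) means quotes inside a `reason` do not need escaping.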
The CLI tools ensure:
- Correct API parameters (`handle`, not `username`)
- Rate limiting (2 calls/second)
- Automatic retry on 429/5xx errors
- Proper database storage with FK relationships
- Raw responses saved to `~/.config/local-media-tools/data/raw/`
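To show what the rate limiting and retry guarantees above amount to, here is a conceptual sketch. It is not the CLI's implementation: the class and function names, the exponential backoff schedule, and the three-attempt limit are all assumptions chosen for illustration. Only the 2 calls/second rate and the 429/5xx retry condition come from the list above.

```python
import time
from typing import Callable

# Status codes treated as retryable: 429 plus the whole 5xx range.
RETRYABLE = {429} | set(range(500, 600))


class RateLimiter:
    """Space calls so that at most `rate` of them start per second."""

    def __init__(self, rate: float = 2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self.clock = clock
        self.sleep = sleep
        self._last = None  # start time of the previous call

    def wait(self) -> None:
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now


def call_with_retry(
    fetch: Callable[[], int],
    limiter: RateLimiter,
    max_attempts: int = 3,
    backoff: float = 1.0,
) -> int:
    """Call fetch() (which returns an HTTP status) and retry on 429/5xx."""
    status = 0
    for attempt in range(max_attempts):
        limiter.wait()
        status = fetch()
        if status not in RETRYABLE:
            return status
        if attempt < max_attempts - 1:
            # Hypothetical schedule: 1s, 2s, 4s, ...
            limiter.sleep(backoff * 2 ** attempt)
    return status  # still failing after the last attempt
```

The clock and sleep functions are injectable so the timing logic can be exercised without real delays. In normal use the CLI handles all of this for you.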
Do NOT:
- Use `curl` to call ScrapeCreators API directly
- Write raw SQL to insert data
- Guess API parameter names

Ask what to research:
- Instagram - Scrape Instagram accounts for events
- Web Aggregators - Scrape web event aggregator sites
- All configured sources - Full research from all sources in config
- Facebook event URLs - Pass specific event URLs to scrape
You can also paste Facebook event URLs directly:
https://facebook.com/events/123456
https://facebook.com/events/789012
Wait for response before proceeding.
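Pasted Facebook event URLs, as described above, may arrive mixed into free text or run together. A minimal sketch for picking them out follows. The URL pattern (numeric event ID, optional `www.` or `m.` prefix) and the function name are assumptions, not part of the skill.

```python
import re

# Assumption: event URLs have the form https://facebook.com/events/<numeric id>.
EVENT_URL = re.compile(r"https?://(?:www\.|m\.)?facebook\.com/events/(\d+)")


def extract_event_urls(text: str) -> list[str]:
    """Return normalized event URLs in first-seen order, without duplicates."""
    seen: dict[str, str] = {}
    for match in EVENT_URL.finditer(text):
        event_id = match.group(1)
        seen.setdefault(event_id, f"https://facebook.com/events/{event_id}")
    return list(seen.values())
```

Because the ID is matched as digits only, two URLs pasted with no separator between them are still recognized as two events.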
<reference_index>
All domain knowledge in references/:
APIs: scrapecreators-api.md, facebook-scraper-api.md, firecrawl-api.md
Detection: event-detection.md
</reference_index>
<workflows_index>
| Workflow | Purpose |
|---|---|
| research-instagram.md | Scrape Instagram, download images, extract events |
| research-facebook.md | Scrape individual Facebook event URLs |
| research-web-aggregator.md | Dispatcher for web scraping (calls scrape + extract) |
| research-web-scrape.md | Phase 1: Scrape pages, return JSON |
| research-web-extract.md | Phase 2: Extract events from JSON, save via CLI |
| research-all.md | Run all research workflows |
</workflows_index>
<success_criteria>
Research is complete when:
- CLI tool used to scrape accounts (not curl)
- Raw data saved to `~/.config/local-media-tools/data/raw/`
- Posts saved to database with profiles
- Posts classified as event/not_event/ambiguous
- Events extracted from classified posts
- Data ready for `newsletter-events-write` skill
</success_criteria>
