chromatin-state-inference
This skill should be used when users need to infer chromatin states from histone modification ChIP-seq data using ChromHMM. It provides workflows for chromatin state segmentation, model training, and state annotation.
$ Install
git clone https://github.com/BIsnake2001/ChromSkills /tmp/ChromSkills && cp -r /tmp/ChromSkills/15.chromatin-state-inference ~/.claude/skills/ChromSkills
Tip: Run this command in your terminal to install the skill.
name: chromatin-state-inference
description: This skill should be used when users need to infer chromatin states from histone modification ChIP-seq data using ChromHMM. It provides workflows for chromatin state segmentation, model training, and state annotation.
ChromHMM Chromatin State Inference
Overview
This skill enables comprehensive chromatin state analysis of histone modification ChIP-seq data using ChromHMM. ChromHMM uses a multivariate Hidden Markov Model to segment the genome into discrete chromatin states based on combinatorial patterns of histone modifications.
Main steps include:
- Refer to Inputs & Outputs to verify necessary files.
- Always prompt user if required files are missing.
- Always prompt user for genome assembly used.
- Always prompt user for the bin size for generating binarized files.
- Always prompt user for the number of states the ChromHMM model should target.
- Always prompt user for the absolute path of the ChromHMM JAR file.
- Run the ChromHMM workflow: Binarization → Learning.
When to use this skill
Use this skill when you need to infer chromatin states from histone modification ChIP-seq data using ChromHMM.
Inputs & Outputs
Inputs
(1) Option 1: BED files of aligned reads
<mark1>.bed
<mark2>.bed
... # Other marks
(1) Option 2: BAM files of aligned reads
<mark1>.bam
<mark2>.bam
... # Other marks
Outputs
chromhmm_output/
binarized/
*.txt
model/
*.txt
... # other files output by ChromHMM
Decision Tree
Step 1: Prepare the cellmarkfile
- Prepare a tab-delimited .txt file (without header) containing the following three columns:
- sample name
- mark name
- name of the BED/BAM file
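As a minimal sketch, the cell mark file can be generated with Python's csv module. The sample name, mark names, and file names below are hypothetical placeholders, not values from this skill.

```python
import csv
import io

def write_cellmarkfile(rows, handle):
    """Write (sample, mark, filename) rows as a headerless, tab-delimited table."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for sample, mark, filename in rows:
        writer.writerow([sample, mark, filename])

# Hypothetical sample, marks, and file names; replace with the real ones.
rows = [
    ("sampleA", "H3K4me3", "H3K4me3.bam"),
    ("sampleA", "H3K27ac", "H3K27ac.bam"),
    ("sampleA", "H3K27me3", "H3K27me3.bam"),
]

# An in-memory buffer keeps the example free of side effects.
buffer = io.StringIO()
write_cellmarkfile(rows, buffer)
cellmark_text = buffer.getvalue()
```

To write to disk instead, pass a handle from open(path, "w", newline="") in place of the StringIO buffer.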
Step 2: Data Binarization
- For BAM inputs: call mcp__chromhmm-tools__binarize_bam with:
  - ChromHMM_path: Path to ChromHMM JAR file, provided by user
  - genome: Provided by user (e.g. hg38)
  - input_dir: Directory containing BAM files
  - cellmarkfile: Cell mark file defining histone modifications
  - output_dir: Output directory (e.g. binarized/)
  - bin_size: Provided by user
- For BED inputs: call mcp__chromhmm-tools__binarize_bed instead.
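How the MCP tools invoke ChromHMM is not described here. The sketch below assumes they wrap ChromHMM's command-line BinarizeBam and BinarizeBed commands, and that the genome name is resolved to a chromosome length file (for example one under the CHROMSIZES directory of the ChromHMM distribution). The function name and all paths are hypothetical; the command is only built, not executed.

```python
def build_binarize_cmd(chromhmm_jar, chrom_sizes, input_dir, cellmarkfile,
                       output_dir, bin_size=200, bam=True):
    """Return the argument list for ChromHMM BinarizeBam or BinarizeBed."""
    subcommand = "BinarizeBam" if bam else "BinarizeBed"
    return [
        "java", "-jar", str(chromhmm_jar),
        subcommand,
        "-b", str(bin_size),   # bin size in base pairs
        str(chrom_sizes),      # chromosome length file for the assembly
        str(input_dir),        # directory holding the BAM or BED files
        str(cellmarkfile),     # tab-delimited cell mark table
        str(output_dir),       # where binarized *.txt files are written
    ]

# Hypothetical paths, for illustration only.
cmd = build_binarize_cmd(
    "/opt/ChromHMM/ChromHMM.jar",
    "/opt/ChromHMM/CHROMSIZES/hg38.txt",
    "aligned_reads",
    "cellmarkfile.txt",
    "chromhmm_output/binarized",
    bin_size=200,
)
```

The resulting list could be passed to subprocess.run(cmd, check=True) once the paths point at real files.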
Step 3: Model Learning
Call mcp__chromhmm-tools__learn_model with:
- ChromHMM_path: Path to ChromHMM JAR file, provided by user
- binarized_dir: Directory containing the binarized files
- num_states: Provided by user (e.g. 15)
- output_model_dir: Output directory (e.g. model_15_states/)
- genome: Provided by user (e.g. hg38)
- threads: (e.g. 4)
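For orientation, the learning step presumably corresponds to ChromHMM's LearnModel command. The sketch below is an assumption about what the MCP tool wraps, not a description of its implementation; the function name and paths are hypothetical, and the command is only assembled.

```python
def build_learn_cmd(chromhmm_jar, binarized_dir, output_model_dir,
                    num_states, genome, bin_size=200, threads=4):
    """Return the argument list for ChromHMM LearnModel."""
    return [
        "java", "-jar", str(chromhmm_jar),
        "LearnModel",
        "-b", str(bin_size),   # should match the bin size used for binarization
        "-p", str(threads),    # maximum number of processors
        str(binarized_dir),
        str(output_model_dir),
        str(num_states),
        genome,                # assembly name, e.g. hg38
    ]

# Hypothetical paths, for illustration only.
learn_cmd = build_learn_cmd(
    "/opt/ChromHMM/ChromHMM.jar",
    "chromhmm_output/binarized",
    "chromhmm_output/model_15_states",
    num_states=15,
    genome="hg38",
)
```

Keeping bin_size identical between binarization and learning matters because the segmentation coordinates are derived from it.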
Parameter Optimization
Number of States
- 8 states: Basic chromatin states
- 15 states: Standard comprehensive states
- 25 states: High-resolution states
- Optimization: Use Bayesian Information Criterion (BIC)
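One way to apply BIC is sketched below, assuming a model with binary emissions per mark, so that a model with k states and m marks has k·m emission, k·(k−1) transition, and k−1 initial-state free parameters. The log-likelihoods, mark count, and bin count are made-up numbers for illustration, and the helper names are hypothetical.

```python
import math

def num_free_parameters(num_states, num_marks):
    """Free parameters of an HMM with one Bernoulli emission per state and mark."""
    emissions = num_states * num_marks
    transitions = num_states * (num_states - 1)
    initial = num_states - 1
    return emissions + transitions + initial

def bic(log_likelihood, num_states, num_marks, num_bins):
    """BIC = p * ln(n) - 2 * ln(L); lower values are preferred."""
    p = num_free_parameters(num_states, num_marks)
    return p * math.log(num_bins) - 2.0 * log_likelihood

# Made-up log-likelihoods keyed by number of states (illustration only).
candidates = {8: -5.20e7, 15: -5.05e7, 25: -5.04e7}
num_marks, num_bins = 6, 15_000_000

scores = {k: bic(ll, k, num_marks, num_bins) for k, ll in candidates.items()}
best_k = min(scores, key=scores.get)
```

With millions of bins the penalty term can be small relative to the likelihood differences, so BIC may keep favoring larger models; biological interpretability of the states is worth weighing alongside it.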
Bin Size
- 200bp: Standard resolution
- 100bp: High resolution (requires more memory)
- 500bp: Low resolution (faster computation)
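To get a feel for how bin size drives memory and runtime, the number of bins can be estimated from chromosome lengths. This is a rough sketch with toy lengths (not a real assembly); it counts whole bins only and ignores how partial bins at chromosome ends are handled.

```python
def count_bins(chrom_lengths, bin_size):
    """Approximate bin count: whole bins per chromosome, summed."""
    return sum(length // bin_size for length in chrom_lengths.values())

# Toy chromosome lengths in base pairs (hypothetical, not a real genome).
toy_lengths = {"chrA": 1_000_000, "chrB": 250_500}

bins = {size: count_bins(toy_lengths, size) for size in (100, 200, 500)}
# bins == {100: 12505, 200: 6252, 500: 2501}
```

Halving the bin size roughly doubles the number of bins the model must process.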
State Interpretation
Common Chromatin States
- Active Promoter: H3K4me3, H3K27ac
- Weak Promoter: H3K4me3
- Poised Promoter: H3K4me3, H3K27me3
- Strong Enhancer: H3K27ac, H3K4me1
- Weak Enhancer: H3K4me1
- Insulator: CTCF
- Transcribed: H3K36me3
- Repressed: H3K27me3
- Heterochromatin: Low signal across marks
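The list above can be turned into a simple lookup for a first-pass annotation. The sketch assumes each state is summarized by per-mark emission probabilities and that a mark counts as present at or above a threshold of 0.5; both the threshold and the function name are assumptions, and combinations not listed above are left unclassified rather than guessed.

```python
# Mark combinations and labels taken from the list above.
LABELS = {
    frozenset({"H3K4me3", "H3K27ac"}): "Active Promoter",
    frozenset({"H3K4me3"}): "Weak Promoter",
    frozenset({"H3K4me3", "H3K27me3"}): "Poised Promoter",
    frozenset({"H3K27ac", "H3K4me1"}): "Strong Enhancer",
    frozenset({"H3K4me1"}): "Weak Enhancer",
    frozenset({"CTCF"}): "Insulator",
    frozenset({"H3K36me3"}): "Transcribed",
    frozenset({"H3K27me3"}): "Repressed",
    frozenset(): "Heterochromatin",  # low signal across all marks
}

def label_state(emissions, threshold=0.5):
    """Map a {mark: emission probability} dict to a candidate label."""
    present = frozenset(m for m, p in emissions.items() if p >= threshold)
    return LABELS.get(present, "Unclassified")

# Hypothetical emission probabilities for one learned state.
example = {"H3K4me3": 0.92, "H3K27ac": 0.81, "H3K4me1": 0.10, "H3K27me3": 0.02}
example_label = label_state(example)
```

Labels produced this way are only a starting point; genomic context such as enrichment near transcription start sites should inform the final annotation.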
Troubleshooting
- Memory errors: Increase the bin size, reduce the number of states, or give Java more heap memory
- Convergence problems: Increase the maximum number of iterations or try a different initialization
- Uninterpretable states: Check input data quality and mark combinations
- Missing chromosomes: Verify chromosome naming consistency
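For the chromosome naming check, a small sketch is shown below. It compares the chromosome column of a BED file against the names in a chromosome length file and reports names with no match (for example "1" versus "chr1"). The function names and example lines are hypothetical.

```python
def bed_chromosomes(bed_lines):
    """Collect chromosome names from the first column of BED lines."""
    chroms = set()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chroms.add(line.split()[0])
    return chroms

def unmatched_chromosomes(bed_lines, chrom_sizes_lines):
    """Return BED chromosome names absent from the chromosome length file."""
    known = {line.split()[0] for line in chrom_sizes_lines if line.strip()}
    return sorted(bed_chromosomes(bed_lines) - known)

# Hypothetical inputs: one read uses "1" instead of "chr1".
bed_lines = ["1\t100\t200", "chr2\t5\t50", "# comment"]
chrom_sizes_lines = ["chr1\t1000", "chr2\t2000"]
missing = unmatched_chromosomes(bed_lines, chrom_sizes_lines)
```

For real files, pass open file handles in place of the lists; a non-empty result suggests the reads and the assembly use different naming schemes.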
