Data Science
1726 skills in Data & AI > Data Science
matplotlib
Create static visualizations including bar charts, line plots, pie charts, and more. Use when generating charts for reports, creating data visualizations, producing publication-quality figures, or embedding plots in PDF/HTML reports.
bioinformatics
Computational biology analysis techniques. Use when analyzing genomic data, sequences, or biological datasets.
thermodynamic-calculations
Thermodynamic calculations for battery materials. Use for Arrhenius analysis, activation energy calculations, ionic conductivity modeling, and temperature-dependent property analysis.
protein-analysis
Analyze protein sequences for properties like molecular weight, isoelectric point, secondary structure, and domains.
pandas
Tabular data manipulation and analysis with DataFrame structures. Read/write CSV/Excel files, filter rows, aggregate groups, merge datasets, pivot tables, and apply transformations. Use when processing exchange rate feeds, analyzing time-series currency data, cleaning and transforming datasets, joining multiple data sources, computing grouped statistics, handling missing values, or working with structured tabular data. Built on NumPy for performance.
biopython
BioPython library for computational molecular biology. Use when working with biological sequence data, parsing bioinformatics file formats (FASTA, GenBank, PDB), performing sequence analysis, or accessing biological databases programmatically.
deseq2
Use DESeq2 for differential gene expression analysis from RNA-seq count data.
biopython
Bioinformatics library for sequence analysis. Use when working with DNA/RNA/protein sequences, FASTA files, or biological data.
pandas
Analyze and transform tabular data using pandas DataFrames. Use when processing invoice data at scale, performing aggregations, merging datasets, cleaning data, or generating statistical summaries from invoice records.
pandas
DataFrame-based data manipulation for time-series analysis, filtering, grouping, and aggregation. Use when loading financial data from CSVs, calculating rolling statistics, resampling time series, joining datasets, pivoting tables, or handling missing data. Critical for processing historical price data and portfolio holdings.
numpy
NumPy library for numerical computing in Python. Use for array operations, linear algebra, statistical calculations, normalization, and mathematical transformations.
gatk
Use GATK (Genome Analysis Toolkit) for variant discovery and genotyping in DNA sequencing data. Use when performing best-practice variant calling on Illumina data, applying BQSR, or joint genotyping.
numpy
NumPy library for numerical computing in Python. Use for array operations, statistical calculations, matrix operations, and numerical analysis of network data.
pandas
Analyze tabular data with pandas. Use for CSV/Excel files, data manipulation, filtering, and aggregation.
test-reporting-analytics
Advanced test reporting, quality dashboards, predictive analytics, trend analysis, and executive reporting for QE metrics. Use when communicating quality status, tracking trends, or making data-driven decisions.
test-design-techniques
Systematic test design with boundary value analysis, equivalence partitioning, decision tables, state transition testing, and combinatorial testing. Use when designing comprehensive test cases, reducing redundant tests, or ensuring systematic coverage.
Unnamed Skill
Professional investment analysis skill for the stock market. Provides 7 core capabilities: (1) Daily market overview, (2) Stock fundamental analysis, (3) Technical chart analysis, (4) Strategy simulation, (5) Risk management, (6) Stock screening, (7) News impact analysis. Use when users need comprehensive stock analysis, investment research, portfolio management, or market analysis.
update-ignored-endpoints
Update the IGNORED_ENDPOINTS.md documentation file with current endpoint coverage analysis. Use when documentation needs to be refreshed or when verifying ignored endpoint status.
tecton
Run Tecton plan and tests via Pants in the data-science repo. Handles long-running commands with proper output capture to avoid truncation.
data-sql-optimization
Production-grade SQL optimization with AI-assisted query analysis, EXPLAIN ANALYZE automation, balanced indexing strategies, performance tuning, schema design, and operations across PostgreSQL, MySQL, SQL Server, Oracle, and SQLite.