Agent skills
bio-alignment-files-bam-statis...

Agent skill

bio-alignment-files-bam-statistics

Stars 2,009

Forks 275

Install this agent skill to your Project

npx add-skill https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-alignment-files-bam-statistics

SKILL.md

name: bio-alignment-files-bam-statistics description: Generate alignment statistics using samtools flagstat, stats, depth, and coverage. Use when assessing alignment quality, calculating coverage, or generating QC reports. tool_type: cli primary_tool: samtools measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes. allowed-tools:

read_file
run_shell_command

BAM Statistics

Generate alignment statistics using samtools and pysam.

Quick Summary Commands

Command	Output	Speed
`flagstat`	Read counts by category	Very fast
`idxstats`	Per-chromosome counts	Very fast (needs index)
`stats`	Comprehensive statistics	Moderate
`depth`	Per-position depth	Slow (full scan)
`coverage`	Per-region coverage	Fast (needs index)

samtools flagstat

Fast summary of alignment flags.

bash

samtools flagstat input.bam

Output:

10000000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
50000 + 0 supplementary
0 + 0 duplicates
9800000 + 0 mapped (98.00% : N/A)
9950000 + 0 paired in sequencing
4975000 + 0 read1
4975000 + 0 read2
9700000 + 0 properly paired (97.49% : N/A)
9750000 + 0 with itself and mate mapped
100000 + 0 singletons (1.01% : N/A)
25000 + 0 with mate mapped to a different chr
10000 + 0 with mate mapped to a different chr (mapQ>=5)

Multi-threaded

bash

samtools flagstat -@ 4 input.bam

Output to File

bash

samtools flagstat input.bam > flagstat.txt

samtools idxstats

Per-chromosome read counts (requires index).

bash

samtools idxstats input.bam

Output format: chrom length mapped unmapped

chr1    248956422    5000000    1000
chr2    242193529    4800000    800
chrM    16569        50000      100
*       0            0          150000

Parse idxstats

bash

# Total mapped reads
samtools idxstats input.bam | awk '{sum += $3} END {print sum}'

# Mitochondrial percentage
samtools idxstats input.bam | awk '
    /^chrM/ {mt = $3}
    {total += $3}
    END {print mt/total*100 "% mitochondrial"}'

samtools stats

Comprehensive statistics including insert size, base quality, and more.

bash

samtools stats input.bam > stats.txt

View Summary Numbers

bash

samtools stats input.bam | grep "^SN"

Key summary fields:

raw total sequences - Total reads
reads mapped - Mapped reads
reads mapped and paired - Properly paired
insert size average - Mean insert size
insert size standard deviation - Insert size spread
average length - Mean read length
error rate - Mismatch rate

Generate Plots (with plot-bamstats)

bash

samtools stats input.bam > stats.txt
plot-bamstats -p plots/ stats.txt

Stats for Specific Region

bash

samtools stats input.bam chr1:1000000-2000000 > region_stats.txt

samtools depth

Per-position read depth.

Basic Depth

bash

samtools depth input.bam > depth.txt

Output: chrom position depth

Depth at Specific Positions

bash

samtools depth -r chr1:1000-2000 input.bam

Include Zero-Depth Positions

bash

samtools depth -a input.bam > depth_with_zeros.txt

Maximum Depth Cap

bash

samtools depth -d 0 input.bam  # No cap (default 8000)

Depth from BED Regions

bash

samtools depth -b regions.bed input.bam

Calculate Mean Depth

bash

samtools depth input.bam | awk '{sum += $3; n++} END {print sum/n}'

samtools coverage

Per-chromosome or per-region coverage statistics (faster than depth).

bash

samtools coverage input.bam

Output columns:

#rname - Reference name
startpos - Start position
endpos - End position
numreads - Number of reads
covbases - Bases with coverage
coverage - Percentage of bases covered
meandepth - Mean depth
meanbaseq - Mean base quality
meanmapq - Mean mapping quality

Coverage for Specific Region

bash

samtools coverage -r chr1:1000000-2000000 input.bam

Coverage from BED

bash

samtools coverage -b regions.bed input.bam

Histogram Output

bash

samtools coverage -m input.bam

pysam Python Alternative

Count Reads

python

import pysam

with pysam.AlignmentFile('input.bam', 'rb') as bam:
    total = mapped = paired = proper = 0
    for read in bam:
        total += 1
        if not read.is_unmapped:
            mapped += 1
        if read.is_paired:
            paired += 1
        if read.is_proper_pair:
            proper += 1

    print(f'Total: {total}')
    print(f'Mapped: {mapped} ({mapped/total*100:.1f}%)')
    print(f'Properly paired: {proper} ({proper/paired*100:.1f}%)')

Per-Chromosome Counts

python

import pysam

with pysam.AlignmentFile('input.bam', 'rb') as bam:
    for stat in bam.get_index_statistics():
        print(f'{stat.contig}: {stat.mapped} mapped, {stat.unmapped} unmapped')

Calculate Depth at Position

python

import pysam

with pysam.AlignmentFile('input.bam', 'rb') as bam:
    for pileup in bam.pileup('chr1', 1000000, 1000001):
        print(f'Position {pileup.pos}: depth {pileup.n}')

Mean Depth in Region

python

import pysam

def mean_depth(bam_path, chrom, start, end):
    depths = []
    with pysam.AlignmentFile(bam_path, 'rb') as bam:
        for pileup in bam.pileup(chrom, start, end, truncate=True):
            depths.append(pileup.n)

    if depths:
        return sum(depths) / len(depths)
    return 0

depth = mean_depth('input.bam', 'chr1', 1000000, 2000000)
print(f'Mean depth: {depth:.1f}x')

Coverage Statistics

python

import pysam

def coverage_stats(bam_path, chrom, start, end):
    covered = 0
    total_depth = 0

    with pysam.AlignmentFile(bam_path, 'rb') as bam:
        for pileup in bam.pileup(chrom, start, end, truncate=True):
            covered += 1
            total_depth += pileup.n

    length = end - start
    pct_covered = covered / length * 100
    mean_depth = total_depth / length if length > 0 else 0

    return {
        'length': length,
        'covered_bases': covered,
        'pct_covered': pct_covered,
        'mean_depth': mean_depth
    }

stats = coverage_stats('input.bam', 'chr1', 1000000, 2000000)
print(f'Coverage: {stats["pct_covered"]:.1f}%')
print(f'Mean depth: {stats["mean_depth"]:.1f}x')

Insert Size Distribution

python

import pysam
from collections import Counter

insert_sizes = Counter()

with pysam.AlignmentFile('input.bam', 'rb') as bam:
    for read in bam:
        if read.is_proper_pair and read.is_read1 and read.template_length > 0:
            insert_sizes[read.template_length] += 1

sizes = list(insert_sizes.keys())
mean_insert = sum(s * c for s, c in insert_sizes.items()) / sum(insert_sizes.values())
print(f'Mean insert size: {mean_insert:.0f}')
print(f'Min: {min(sizes)}, Max: {max(sizes)}')

Quick Reference

Task	Command
Quick counts	`samtools flagstat input.bam`
Per-chrom counts	`samtools idxstats input.bam`
Full stats	`samtools stats input.bam`
Coverage summary	`samtools coverage input.bam`
Per-position depth	`samtools depth input.bam`
Mean depth	`samtools depth input.bam \| awk '{sum+=$3;n++}END{print sum/n}'`

Common Metrics

Metric	Good	Concerning
Mapping rate	>95%	<80%
Proper pair rate	>90%	<70%
Duplicate rate	<20%	>40%
Error rate	<1%	>2%
Coverage uniformity	<2x CV	>3x CV

Related Skills

sam-bam-basics - View alignment files
alignment-indexing - idxstats requires index
duplicate-handling - Check duplicate rates
alignment-filtering - Filter before stats
sequence-io/sequence-statistics - FASTA/FASTQ statistics

Maintainer

FreedomIntelligence Core maintainer

Source details

Full Name: FreedomIntelligence/OpenClaw-Medical-Skills
Branch: main
Path in repo: skills/bio-alignment-files-bam-statistics
Topics: claude-code skills openclaw awesome clawhub openclaw-skills medical nanoclaw

Featured Tools

Join Our Newsletter

Stay updated with the latest AI tools, news, and offers by subscribing to our weekly newsletter.

Recommended Agent Skills

Expand your agent's capabilities with these related and highly-rated skills.

FreedomIntelligence/OpenClaw-Medical-Skills

vcf-annotator

Annotate VCF variants with VEP, ClinVar, gnomAD frequencies, and ancestry-aware context. Generates prioritised variant reports.

2,009 275

Explore

FreedomIntelligence/OpenClaw-Medical-Skills

chemist-analyst

Analyzes events through chemistry lens using molecular structure, reaction mechanisms, thermodynamics, kinetics, and analytical techniques (spectroscopy, chromatography, mass spectrometry). Provides insights on chemical processes, material properties, reaction pathways, synthesis, and analytical methods. Use when: Chemical reactions, material analysis, synthesis planning, process optimization, environmental chemistry. Evaluates: Molecular structure, reaction mechanisms, yield, selectivity, safety, environmental impact.

2,009 275

Explore

FreedomIntelligence/OpenClaw-Medical-Skills

bio-alignment-io

Read, write, and convert multiple sequence alignment files using Biopython Bio.AlignIO. Supports Clustal, PHYLIP, Stockholm, FASTA, Nexus, and other alignment formats for phylogenetics and conservation analysis. Use when reading, writing, or converting alignment file formats.

2,009 275

Explore

FreedomIntelligence/OpenClaw-Medical-Skills

sleep-analyzer

分析睡眠数据、识别睡眠模式、评估睡眠质量，并提供个性化睡眠改善建议。支持与其他健康数据的关联分析。

2,009 275

Explore

FreedomIntelligence/OpenClaw-Medical-Skills

metabolomics-workbench-database

Access NIH Metabolomics Workbench via REST API (4,200+ studies). Query metabolites, RefMet nomenclature, MS/NMR data, m/z searches, study metadata, for metabolomics and biomarker discovery.

2,009 275

Explore

FreedomIntelligence/OpenClaw-Medical-Skills

bio-hi-c-analysis-matrix-operations

Balance, normalize, and transform Hi-C contact matrices using cooler and cooltools. Apply iterative correction (ICE), compute expected values, and generate observed/expected matrices. Use when normalizing or transforming Hi-C matrices.

2,009 275

Explore

Didn't find tool you were looking for?

Search AI Tools

Install this agent skill to your Project

SKILL.md

BAM Statistics

Quick Summary Commands

samtools flagstat

Multi-threaded

Output to File

samtools idxstats

Parse idxstats

samtools stats

View Summary Numbers

Generate Plots (with plot-bamstats)

Stats for Specific Region

samtools depth

Basic Depth

Depth at Specific Positions

Include Zero-Depth Positions

Maximum Depth Cap

Depth from BED Regions

Calculate Mean Depth

samtools coverage

Coverage for Specific Region

Coverage from BED

Histogram Output

pysam Python Alternative

Count Reads

Per-Chromosome Counts

Calculate Depth at Position

Mean Depth in Region

Coverage Statistics

Insert Size Distribution

Quick Reference

Common Metrics

Related Skills

Recommended Agent Skills

vcf-annotator

chemist-analyst

bio-alignment-io

sleep-analyzer

metabolomics-workbench-database

bio-hi-c-analysis-matrix-operations