structural-variant-calling
Install this agent skill to your Project
npx add-skill https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/variant-interpretation-acmg/bioSkills/structural-variant-calling
name: bio-variant-calling-structural-variant-calling
description: Call structural variants (SVs) from short-read sequencing using Manta, Delly, and LUMPY. Detects deletions, insertions, inversions, duplications, and translocations that are too large for standard SNV callers. Use when detecting structural variants from short-read data.
tool_type: cli
primary_tool: manta
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
- read_file
- run_shell_command
Structural Variant Calling (Short Reads)
Manta (Recommended)
# Configure Manta run (creates runWorkflow.py)
configManta.py \
--bam sample.bam \
--referenceFasta reference.fa \
--runDir manta_run
# Execute
manta_run/runWorkflow.py -j 8
# Output: manta_run/results/variants/
# - diploidSV.vcf.gz (germline SVs)
# - candidateSV.vcf.gz (all candidates)
# - candidateSmallIndels.vcf.gz (small indels)
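A quick sanity check on any of these VCFs is to tally records by SV type. The following is a minimal sketch, not part of Manta: `count_svtypes` is a hypothetical helper name, and it assumes every record carries an `SVTYPE=` key in the INFO column.

```shell
# Count records per SVTYPE in an uncompressed VCF read from stdin.
# Assumption: SVTYPE is present in INFO (column 8); records without it
# are reported under "NA".
count_svtypes() {
  awk -F'\t' '
    /^#/ { next }                      # skip header lines
    {
      type = "NA"
      n = split($8, info, ";")
      for (i = 1; i <= n; i++)
        if (info[i] ~ /^SVTYPE=/) type = substr(info[i], 8)
      count[type]++
    }
    END { for (t in count) printf "%s\t%d\n", t, count[t] }
  ' | sort
}

# Example, using the output path from the run above:
#   gzip -dc manta_run/results/variants/diploidSV.vcf.gz | count_svtypes
```

The same helper works on Delly or LUMPY output once it is in uncompressed VCF form.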
Manta Tumor-Normal Mode
# Somatic SV calling
configManta.py \
--tumorBam tumor.bam \
--normalBam normal.bam \
--referenceFasta reference.fa \
--runDir manta_somatic
manta_somatic/runWorkflow.py -j 8
# Output includes:
# - somaticSV.vcf.gz (somatic SVs)
# - diploidSV.vcf.gz (germline SVs)
Manta Options
# WES mode (for exome data)
# --exome: use exome settings
# --callRegions: restrict calling to regions in a bgzip-compressed, tabix-indexed BED
configManta.py \
--bam sample.bam \
--referenceFasta reference.fa \
--exome \
--callRegions regions.bed.gz \
--runDir manta_exome
# RNA-seq mode
# --rna: use RNA-seq settings
configManta.py \
--bam rnaseq.bam \
--referenceFasta reference.fa \
--rna \
--runDir manta_rna
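One common use of `--callRegions` is to limit calling to the primary chromosomes. The sketch below builds such a BED from a FASTA index. `fai_to_primary_bed` is a hypothetical helper, and the contig naming pattern (1..22, X, Y with an optional `chr` prefix) is an assumption that should be checked against the actual reference.

```shell
# Print a BED line (contig, 0, length) for each primary contig in a .fai index.
# Assumption: primary contigs are named 1..22/X/Y with an optional "chr" prefix;
# mitochondrial, unplaced, and decoy contigs are left out.
fai_to_primary_bed() {
  awk -F'\t' -v OFS='\t' '$1 ~ /^(chr)?([0-9]+|X|Y)$/ { print $1, 0, $2 }' "$1"
}

# Example, assuming `samtools faidx reference.fa` has produced reference.fa.fai
# and that bgzip and tabix (htslib) are installed:
#   fai_to_primary_bed reference.fa.fai | bgzip > regions.bed.gz
#   tabix -p bed regions.bed.gz
```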
Delly
# Call SVs
delly call \
-g reference.fa \
-o sv_calls.bcf \
sample.bam
# Convert to VCF
bcftools view sv_calls.bcf > sv_calls.vcf
# Multiple samples (joint calling)
delly call \
-g reference.fa \
-o joint_svs.bcf \
sample1.bam sample2.bam sample3.bam
Delly Somatic Mode
# Call with tumor-normal
delly call \
-g reference.fa \
-o svs.bcf \
tumor.bam normal.bam
# Create sample file: column 1 is the sample ID as it appears in svs.bcf, column 2 is tumor or control
echo -e "tumor\ttumor\nnormal\tcontrol" > samples.tsv
# Filter for somatic
delly filter \
-f somatic \
-o somatic_svs.bcf \
-s samples.tsv \
svs.bcf
Delly SV Types
# Call specific SV type
delly call -t DEL -g reference.fa -o deletions.bcf sample.bam
delly call -t DUP -g reference.fa -o duplications.bcf sample.bam
delly call -t INV -g reference.fa -o inversions.bcf sample.bam
delly call -t BND -g reference.fa -o translocations.bcf sample.bam
delly call -t INS -g reference.fa -o insertions.bcf sample.bam
LUMPY
# Extract split reads and discordant pairs
samtools view -b -F 1294 sample.bam > discordant.bam
samtools view -h sample.bam | \
/path/to/lumpy-sv/scripts/extractSplitReads_BwaMem -i stdin | \
samtools view -Sb - > splitters.bam
# Run LUMPY
lumpyexpress \
-B sample.bam \
-S splitters.bam \
-D discordant.bam \
-o lumpy_svs.vcf
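The `-F 1294` in the discordant-read step is a bitmask of SAM flags to exclude. The sketch below decodes such a value so the filter can be read at a glance. `decode_sam_flag` is a hypothetical helper written for illustration; `samtools flags 1294` reports the same information.

```shell
# Print the SAM flag bits that are set in a decimal flag value.
# Bit names are shortened labels for the flags defined in the SAM specification.
decode_sam_flag() {
  flag=$1
  for entry in 1:paired 2:proper_pair 4:unmapped 8:mate_unmapped \
               16:reverse 32:mate_reverse 64:read1 128:read2 \
               256:secondary 512:qc_fail 1024:duplicate 2048:supplementary; do
    bit=${entry%%:*}
    name=${entry#*:}
    if [ $(( flag & bit )) -ne 0 ]; then
      printf '%s\t%s\n' "$bit" "$name"
    fi
  done
}

# 1294 = 2 + 4 + 8 + 256 + 1024
decode_sam_flag 1294
```

Excluding these bits drops properly paired reads, unmapped reads, reads with an unmapped mate, secondary alignments, and duplicates, leaving the discordant pairs that LUMPY uses as evidence.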
Smoove (LUMPY Wrapper)
# Simplified LUMPY workflow
smoove call \
--name sample \
--fasta reference.fa \
--outdir smoove_output \
-p 8 \
sample.bam
# Output: smoove_output/sample-smoove.genotyped.vcf.gz
Merge Multiple Callers
# Use SURVIVOR to merge callsets
# Create a file listing the VCFs to merge, one path per line (SURVIVOR reads uncompressed VCF)
ls manta_svs.vcf delly_svs.vcf lumpy_svs.vcf > vcf_list.txt
# Merge with parameters
SURVIVOR merge vcf_list.txt 1000 2 1 1 0 50 merged_svs.vcf
# Parameters, in order: max_dist (1000 bp), min_callers (2), type_agree (1 = yes), strand_agree (1 = yes), estimate_dist (0 = no), min_size (50 bp)
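After merging, it is often useful to see which callers agree. SURVIVOR records support in the INFO field as `SUPP` (number of supporting callsets) and `SUPP_VEC` (one 0/1 digit per input VCF). The following sketch tabulates the `SUPP_VEC` patterns; `count_supp_vec` is a hypothetical helper, and it assumes the merged VCF carries `SUPP_VEC` in INFO.

```shell
# Count merged records per SUPP_VEC pattern. Each digit corresponds to one
# input VCF, in the order the files appear in vcf_list.txt (1 = supported).
# Note: `ls` sorts its arguments, so a list built with
# `ls manta_svs.vcf delly_svs.vcf lumpy_svs.vcf` is ordered delly, lumpy, manta.
count_supp_vec() {
  awk -F'\t' '
    /^#/ { next }
    match($8, /SUPP_VEC=[01]+/) {
      vec = substr($8, RSTART + 9, RLENGTH - 9)   # strip the "SUPP_VEC=" prefix
      count[vec]++
    }
    END { for (v in count) printf "%s\t%d\n", v, count[v] }
  ' "$1" | sort -r
}

# Example:
#   count_supp_vec merged_svs.vcf
```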
Filter SV Calls
# Filter by quality
bcftools view -i 'QUAL >= 20' svs.vcf > svs.filtered.vcf
# Filter by size
bcftools view -i 'ABS(SVLEN) >= 50' svs.vcf > svs.min50.vcf
# Filter by SV type
bcftools view -i 'SVTYPE="DEL"' svs.vcf > deletions.vcf
bcftools view -i 'SVTYPE="INS"' svs.vcf > insertions.vcf
bcftools view -i 'SVTYPE="INV"' svs.vcf > inversions.vcf
bcftools view -i 'SVTYPE="DUP"' svs.vcf > duplications.vcf
bcftools view -i 'SVTYPE="BND"' svs.vcf > translocations.vcf
# Keep only PASS
bcftools view -f PASS svs.vcf > svs.pass.vcf
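The size filter above tests `SVLEN`, which breakend (BND) records usually do not carry, so those records are not expected to pass it. Where translocation calls should survive a size filter, one option is a small awk filter that keeps records lacking `SVLEN`. This is a sketch under that assumption, and `filter_min_svlen` is a hypothetical helper, not a bcftools feature.

```shell
# Keep header lines, records without SVLEN (e.g. BND), and records with
# |SVLEN| >= min_len (default 50). Reads an uncompressed VCF from stdin.
filter_min_svlen() {
  awk -F'\t' -v min="${1:-50}" '
    /^#/ { print; next }
    {
      has = 0; len = 0
      n = split($8, info, ";")
      for (i = 1; i <= n; i++)
        if (info[i] ~ /^SVLEN=/) { has = 1; len = substr(info[i], 7) + 0 }
      if (len < 0) len = -len          # deletions are reported with negative SVLEN
      if (!has || len >= min) print
    }
  '
}

# Example:
#   filter_min_svlen 50 < svs.vcf > svs.min50.keepbnd.vcf
```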
Annotate SVs
# AnnotSV annotation
AnnotSV \
-SVinputFile svs.vcf \
-genomeBuild GRCh38 \
-outputFile annotated_svs
# Output includes: genes, DGV, gnomAD-SV, ClinVar
SV Types
| Type | Code | Description |
|---|---|---|
| Deletion | DEL | Sequence removed |
| Insertion | INS | Sequence inserted |
| Inversion | INV | Sequence reversed |
| Duplication | DUP | Sequence duplicated |
| Translocation | BND | Breakend notation; used for inter-chromosomal and other complex rearrangements |
Comparison: Manta vs Delly vs LUMPY
| Feature | Manta | Delly | LUMPY |
|---|---|---|---|
| Speed | Fast | Medium | Medium |
| Sensitivity | High | High | High |
| Small SVs | Good | Moderate | Good |
| Large SVs | Good | Good | Good |
| RNA-seq | Yes | No | No |
| Somatic | Yes | Yes | Limited |
Coverage Guidelines
| Coverage | Detection Ability |
|---|---|
| 10x | Large SVs (>1kb) |
| 30x | Most SVs |
| 50x+ | Small SVs, better breakpoints |
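To place a dataset in this table, mean coverage can be approximated as reads × read length / genome size. The sketch below does that arithmetic. `estimate_coverage` is a hypothetical helper, the input numbers are illustrative only, and the estimate ignores duplicates, unmapped reads, and uneven coverage.

```shell
# Approximate mean depth = number_of_reads * read_length / genome_size.
# Arguments: <number of reads> <read length in bp> <genome size in bp>
estimate_coverage() {
  awk -v n="$1" -v len="$2" -v g="$3" 'BEGIN { printf "%.1f\n", n * len / g }'
}

# Illustrative values: 600 million 150 bp reads on a 3.1 Gb genome
estimate_coverage 600000000 150 3100000000   # prints 29.0
```

A measured depth from the aligned BAM is more reliable; this estimate is only a quick check before alignment statistics are available.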
Long-Read SV Callers
For long-read data (ONT/PacBio HiFi), use callers designed for long reads, which are generally more sensitive for SVs than short-read callers:
| Caller | Best For | Notes |
|---|---|---|
| cuteSV | ONT/HiFi | Fast, accurate for all SV types |
| Sniffles2 | ONT/HiFi | Population-scale, multisample |
| pbsv | PacBio | Official PacBio caller |
See long-read-sequencing/structural-variants for long-read SV workflows.
Related Skills
- long-read-sequencing/structural-variants - Long-read SV calling
- copy-number/cnvkit-analysis - Copy number variants
- variant-calling/filtering-best-practices - Filter VCF files
- alignment-files/alignment-filtering - Prepare BAM files