scfoundation-model-agent

Install this agent skill to your Project

npx add-skill https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/scfoundation-model-agent

SKILL.md

---
name: 'scfoundation-model-agent'
description: 'Unified agent for leveraging single-cell foundation models (scGPT, scBERT, Geneformer, scFoundation) for cross-species annotation, perturbation prediction, and gene network inference.'
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
  - read_file
  - run_shell_command
---

scFoundation Model Agent

The scFoundation Model Agent provides a unified interface to single-cell foundation models for a range of downstream tasks. It integrates scGPT, scBERT, Geneformer, scFoundation, and other models (scTab, UCE) to enable cross-species cell annotation, in silico perturbation prediction, gene regulatory network inference, and batch integration.

When to Use This Skill

Annotating cell types within or across species (human, mouse).
Predicting perturbation effects (knockouts, drug treatments) in silico.
Inferring gene regulatory networks from single-cell data.
Integrating batches without losing biological signal.
Generating cell embeddings for downstream analysis.

Core Capabilities

Cross-Species Cell Annotation: Transfer cell type labels across species using unified embeddings.
In Silico Perturbation: Predict gene expression changes from knockouts/treatments.
Gene Regulatory Network Inference: Discover TF-target relationships from attention patterns.
Batch Integration: Remove technical variation while preserving biology.
Cell Embedding Generation: Generate universal cell representations for any downstream task.
Multi-Model Ensemble: Combine predictions from multiple foundation models.

Supported Foundation Models

Model	Parameters	Training Data	Strengths
scGPT	50M	33M human cells	General purpose, perturbations
Geneformer	10M	30M cells	Chromatin, gene networks
scBERT	20M	1.2M cells	Cell type annotation
scFoundation	100M	50M cells	Large-scale, multi-species
scTab	15M	22M cells	Tabular prediction
UCE (Universal Cell Embeddings)	100M	36M cells	Cross-species transfer

Workflow

1. Input: Single-cell RNA-seq data (AnnData format).
2. Model Selection: Choose appropriate model(s) for task.
3. Preprocessing: Tokenize genes, normalize expression.
4. Inference: Generate embeddings or predictions.
5. Task Execution: Annotation, perturbation, or network inference.
6. Ensemble (Optional): Combine multi-model predictions.
7. Output: Annotated data, predictions, networks.
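
The preprocessing step above (tokenize genes, normalize expression) can be sketched as follows. This is a minimal illustration, assuming total-count normalization followed by log1p and a simplified rank-based tokenization in the spirit of Geneformer's rank encoding. The skill's actual preprocessing is not shown in this document, so `normalize_log1p`, `rank_tokenize`, the vocabulary, and the toy counts are all hypothetical.

```python
import math


def normalize_log1p(counts, target_sum=1e4):
    """Scale one cell's raw counts to target_sum, then apply log1p."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(c / total * target_sum) for gene, c in counts.items()}


def rank_tokenize(expression, vocab, max_len=2048):
    """Return token ids of expressed genes, highest expression first.

    Genes with zero expression or missing from the vocabulary are skipped.
    Ties are broken alphabetically so the result is deterministic.
    """
    expressed = [(g, v) for g, v in expression.items() if v > 0 and g in vocab]
    expressed.sort(key=lambda gv: (-gv[1], gv[0]))
    return [vocab[g] for g, _ in expressed[:max_len]]


# Hypothetical toy vocabulary and cell (raw counts per gene).
vocab = {"TP53": 0, "MDM2": 1, "CDKN1A": 2, "GAPDH": 3}
cell = {"TP53": 5, "MDM2": 0, "GAPDH": 40, "ACTB": 55}

tokens = rank_tokenize(normalize_log1p(cell), vocab)
print(tokens)  # GAPDH before TP53; MDM2 (zero) and ACTB (not in vocab) dropped
```

Real models ship their own vocabularies and tokenizers, so a sketch like this is only useful for reasoning about what the step does, not as a replacement for the model-specific code.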

Example Usage

User: "Use scGPT to predict the effect of CRISPR knockout of TP53 on these cancer cells."

Agent Action:

```bash
python3 Skills/Genomics/scFoundation_Model_Agent/foundation_predict.py \
    --input cancer_cells.h5ad \
    --model scgpt \
    --task perturbation \
    --perturbation "TP53 knockout" \
    --model_checkpoint scgpt_human_gene_v1.pt \
    --output tp53_ko_predictions.h5ad
```

Task-Specific Usage

Cell Type Annotation

```bash
python3 foundation_predict.py \
    --input query_cells.h5ad \
    --model geneformer \
    --task annotation \
    --reference tabula_sapiens.h5ad \
    --output annotated_cells.h5ad
```

Gene Network Inference

```bash
python3 foundation_predict.py \
    --input cells.h5ad \
    --model scgpt \
    --task grn_inference \
    --transcription_factors tf_list.txt \
    --output gene_network.csv
```

Batch Integration

```bash
python3 foundation_predict.py \
    --input multi_batch.h5ad \
    --model scfoundation \
    --task integration \
    --batch_key batch \
    --output integrated.h5ad
```
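
The task-specific commands above share the same shape, so an agent could assemble them programmatically. The sketch below only builds the argument list from the flags shown in this document. The helper name `build_command` is hypothetical, and nothing here confirms which other flags `foundation_predict.py` accepts.

```python
import shlex


def build_command(input_path, model, task, output_path,
                  script="foundation_predict.py", **options):
    """Assemble an argv list for foundation_predict.py.

    Extra keyword arguments become task-specific flags, for example
    batch_key="batch" becomes "--batch_key batch".
    """
    argv = ["python3", script, "--input", input_path,
            "--model", model, "--task", task]
    for flag, value in options.items():
        argv += [f"--{flag}", str(value)]
    argv += ["--output", output_path]
    return argv


# Rebuild the batch integration command shown above.
argv = build_command("multi_batch.h5ad", "scfoundation", "integration",
                     "integrated.h5ad", batch_key="batch")
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` instead of a joined string avoids shell quoting problems with values such as `"TP53 knockout"`.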

Output Formats

Task	Output	Format
Annotation	Cell type labels	.h5ad obs column
Perturbation	Predicted expression	.h5ad layer
GRN	TF-target edges	.csv, .graphml
Integration	Corrected embeddings	.h5ad obsm
Embeddings	Cell representations	.h5ad obsm
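
As one example of consuming these outputs, the GRN edge list in `.csv` form could be loaded as sketched below. The column names `tf`, `target`, and `weight` are assumptions, since the document does not specify the CSV schema, and the sample rows use made-up weights.

```python
import csv
import io
from collections import defaultdict


def load_edges(handle, min_weight=0.0):
    """Map each TF to its (target, weight) pairs, strongest edge first."""
    network = defaultdict(list)
    for row in csv.DictReader(handle):
        weight = float(row["weight"])
        if weight >= min_weight:
            network[row["tf"]].append((row["target"], weight))
    for targets in network.values():
        targets.sort(key=lambda tw: -tw[1])
    return dict(network)


# Toy stand-in for gene_network.csv (assumed columns, made-up weights).
sample = io.StringIO(
    "tf,target,weight\n"
    "TP53,MDM2,0.7\n"
    "TP53,CDKN1A,0.9\n"
    "MYC,CDK4,0.2\n"
)
network = load_edges(sample, min_weight=0.5)
print(network)
```

With a real file, replace the `io.StringIO` object with `open("gene_network.csv", newline="")` and adjust the column names to whatever the tool actually writes.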

Performance Benchmarks

Task	Model	Dataset	Performance
Annotation	scGPT	Tabula Sapiens	93% accuracy
Annotation	Geneformer	HLCA	91% accuracy
Perturbation (R²)	scGPT	Norman 2019	0.87
Integration (kBET)	scFoundation	Multi-atlas	0.92
Cross-species	UCE	Human→Mouse	85% F1
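
For readers comparing their own runs against the table above, the sketch below shows the standard definitions of accuracy, macro-averaged F1, and R². It is an assumption that the benchmarks use these exact definitions (the document does not say, for instance, whether F1 is macro- or micro-averaged), and kBET is not covered here.

```python
def accuracy(y_true, y_pred):
    """Fraction of cells whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Toy labels, not taken from any benchmark dataset.
y_true = ["T cell", "T cell", "B cell", "B cell"]
y_pred = ["T cell", "B cell", "B cell", "B cell"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```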

AI/ML Architecture

Transformer Backbone:

Gene-level tokenization
Attention-based gene interactions
Masked expression prediction pretraining
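
The masked expression prediction objective listed above can be illustrated with a small data-preparation sketch: hide a fraction of a cell's expression values and keep the originals as training targets. The sentinel value, the 15% default, and the function name are assumptions for illustration, and each model uses its own masking scheme.

```python
import random

MASK = -1.0  # hypothetical sentinel marking a hidden expression value


def mask_expression(values, mask_fraction=0.15, seed=0):
    """Hide a random subset of values and return (masked_values, targets).

    targets maps each masked position to the original value the model
    would be trained to reconstruct.
    """
    rng = random.Random(seed)
    n_mask = min(len(values), max(1, round(len(values) * mask_fraction)))
    positions = sorted(rng.sample(range(len(values)), n_mask))
    masked = list(values)
    targets = {}
    for pos in positions:
        targets[pos] = values[pos]
        masked[pos] = MASK
    return masked, targets


# Toy expression vector for one cell (one value per gene token).
values = [0.0, 1.2, 3.4, 0.5, 2.2, 0.0, 4.1, 0.9, 1.7, 2.8]
masked, targets = mask_expression(values, mask_fraction=0.2)
print(masked, targets)
```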

Perturbation Module:

Conditional generation
Counterfactual prediction
Dose-response modeling
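
Counterfactual prediction, as listed above, can be read as comparing a model's output for the same cell with and without the perturbed input. The sketch below shows that comparison for a knockout. The `toy_predict` function is a hypothetical stand-in for a trained model, and zeroing the gene's input is only one way to encode a knockout; the module in this skill may work differently.

```python
def knock_out(expression, gene):
    """Return a copy of the expression profile with one gene set to zero."""
    perturbed = dict(expression)
    perturbed[gene] = 0.0
    return perturbed


def predicted_shift(predict, expression, gene):
    """Per-gene difference between perturbed and baseline predictions."""
    baseline = predict(expression)
    perturbed = predict(knock_out(expression, gene))
    return {g: perturbed[g] - baseline[g] for g in baseline}


def toy_predict(expression):
    # Hypothetical stand-in for a trained model: CDKN1A tracks TP53,
    # GAPDH is unaffected by other genes.
    return {
        "CDKN1A": 0.5 * expression.get("TP53", 0.0),
        "GAPDH": expression.get("GAPDH", 0.0),
    }


shift = predicted_shift(toy_predict, {"TP53": 4.0, "GAPDH": 6.0}, "TP53")
print(shift)
```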

Transfer Learning:

Zero-shot annotation
Few-shot fine-tuning
Domain adaptation
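
Zero-shot annotation, as listed above, is often done by comparing query-cell embeddings to labeled reference embeddings without any fine-tuning. The sketch below uses a nearest-neighbor lookup by cosine similarity. This is one common approach and an assumption here, not necessarily the method this skill implements, and the two-dimensional embeddings are toy values.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def transfer_labels(query, reference, labels):
    """Give each query embedding the label of its most similar reference cell.

    Returns a list of (label, similarity) pairs, one per query cell.
    """
    results = []
    for q in query:
        best = max(range(len(reference)), key=lambda i: cosine(q, reference[i]))
        results.append((labels[best], cosine(q, reference[best])))
    return results


# Toy embeddings; real ones would come from a foundation model.
reference = [[1.0, 0.0], [0.0, 1.0]]
labels = ["T cell", "B cell"]
query = [[0.9, 0.1], [0.2, 0.8]]

annotated = transfer_labels(query, reference, labels)
print([label for label, _ in annotated])
```

Keeping the similarity alongside the label makes it possible to flag low-confidence cells for manual review.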

Prerequisites

Python 3.10+
PyTorch 2.0+
transformers, flash-attn
Scanpy, AnnData
Model-specific weights
GPU with 16GB+ VRAM
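
A quick way to check most of these prerequisites before running the skill is sketched below. The import names (`torch`, `flash_attn`, `scanpy`, `anndata`) are the usual module names for the listed packages. The function name is hypothetical, and the check does not cover package versions, GPU memory, or model weights.

```python
import importlib.util
import sys

REQUIRED_MODULES = ["torch", "transformers", "flash_attn", "scanpy", "anndata"]


def check_environment(min_python=(3, 10)):
    """Report whether the Python version and each required module are available."""
    report = {"python": sys.version_info >= min_python}
    for name in REQUIRED_MODULES:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


report = check_environment()
missing = [name for name, ok in report.items() if not ok]
print("missing:", missing)
```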

Related Skills

Nicheformer_Spatial_Agent - For spatial foundation models
scGPT_Agent - Dedicated scGPT workflows
Cell_Type_Annotation - Traditional annotation methods
Pathway_Analysis - Gene set enrichment

Model Selection Guide

Use Case	Recommended Model	Reason
General annotation	scGPT	Broad training, robust
Cross-species	UCE	Species-agnostic embeddings
Perturbation	scGPT	Best perturbation performance
GRN inference	Geneformer	Attention → regulatory links
Large-scale	scFoundation	Efficient, scalable
Tabular prediction	scTab	Optimized for classification
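
If the agent needs to apply this guide automatically, the table above can be encoded as a simple lookup. The sketch below mirrors the rows exactly. The `recommend` helper is hypothetical, and the returned names are display names, not confirmed values for the `--model` flag.

```python
# Use case -> (recommended model, reason), copied from the selection guide.
RECOMMENDED = {
    "general annotation": ("scGPT", "Broad training, robust"),
    "cross-species": ("UCE", "Species-agnostic embeddings"),
    "perturbation": ("scGPT", "Best perturbation performance"),
    "grn inference": ("Geneformer", "Attention -> regulatory links"),
    "large-scale": ("scFoundation", "Efficient, scalable"),
    "tabular prediction": ("scTab", "Optimized for classification"),
}


def recommend(use_case):
    """Return (model, reason) for a use case, ignoring case and outer spaces."""
    key = use_case.strip().lower()
    if key not in RECOMMENDED:
        raise KeyError(f"No recommendation for use case: {use_case!r}")
    return RECOMMENDED[key]


model, reason = recommend("Cross-species")
print(model, "-", reason)
```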

Special Considerations

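Gene Coverage: Each model was trained on its own gene set, so check how many of your dataset's genes overlap with the model vocabulary before running inference.
Species: Some models are trained only on human cells; use UCE for cross-species work.
Compute: Larger models need significant GPU memory (see Prerequisites).
Fine-Tuning: Task-specific fine-tuning improves performance.
Versioning: Model weights are updated frequently, so record the checkpoint version used for each run.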
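
The gene coverage check mentioned above can be sketched with plain set operations. The function name and the toy gene lists are hypothetical. In practice the dataset genes would come from the AnnData `var_names` and the vocabulary from the chosen model's gene or token file.

```python
def gene_coverage(dataset_genes, model_vocab):
    """Return (fraction of dataset genes in the vocabulary, missing genes)."""
    dataset = set(dataset_genes)
    shared = dataset & set(model_vocab)
    fraction = len(shared) / len(dataset) if dataset else 0.0
    return fraction, sorted(dataset - shared)


# Toy example: one dataset gene is absent from the model vocabulary.
fraction, missing = gene_coverage(
    ["TP53", "MDM2", "ACTB", "XYZ1"],
    {"TP53", "MDM2", "ACTB", "GAPDH"},
)
print(f"{fraction:.0%} covered, missing: {missing}")
```

A low fraction often points to a gene identifier mismatch (for example symbols versus Ensembl IDs) and not to genuinely missing genes, so it is worth checking identifier types first.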

Ensemble Strategies

Strategy	Method	Benefit
Majority Vote	Mode of predictions	Robust to outliers
Weighted Average	Confidence-weighted	Leverages uncertainty
Stacking	Meta-model	Learns model strengths
Attention Fusion	Cross-model attention	Deep integration
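
The first two strategies in the table above can be sketched in a few lines. This is a minimal illustration for a single cell. The function names are hypothetical, ties are broken alphabetically as an arbitrary choice, and summing confidences is only one possible reading of "confidence-weighted".

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Return the most frequent label; ties go to the alphabetically first."""
    counts = Counter(predictions)
    top = max(counts.values())
    return sorted(label for label, c in counts.items() if c == top)[0]


def weighted_vote(predictions):
    """Return the label with the largest summed confidence.

    predictions is a list of (label, confidence) pairs, one per model.
    """
    totals = defaultdict(float)
    for label, confidence in predictions:
        totals[label] += confidence
    return max(sorted(totals), key=totals.get)


# Toy predictions from three models for the same cell.
print(majority_vote(["T cell", "T cell", "NK cell"]))
print(weighted_vote([("T cell", 0.4), ("T cell", 0.3), ("NK cell", 0.9)]))
```

The two strategies can disagree, as in the toy example: two low-confidence votes win the majority vote but lose to a single high-confidence prediction under weighting.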

Author

AI Group - Biomedical AI Platform

Maintainer

FreedomIntelligence Core maintainer

Source details

Full Name: FreedomIntelligence/OpenClaw-Medical-Skills
Branch: main
Path in repo: skills/scfoundation-model-agent
Topics: claude-code skills openclaw awesome clawhub openclaw-skills medical nanoclaw
