Agent skill

single-cell-multi-omics-integration

Quick-reference sheet for OmicVerse tutorials spanning MOFA, GLUE pairing, SIMBA integration, TOSICA transfer, and StaVIA cartography.

Stars 2,009
Forks 275

Install this agent skill to your Project

npx add-skill https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/single-multiomics

SKILL.md

Single-Cell Multi-Omics Tutorials Cheat Sheet

This skill walk-through summarizes the OmicVerse notebooks that cover paired and unpaired multi-omic integration, multi-batch embedding, reference transfer, and trajectory cartography.

MOFA on paired scRNA + scATAC (t_mofa.ipynb)

  • Data preparation: Load preprocessed AnnData objects for RNA (rna_p_n_raw.h5ad) and ATAC (atac_p_n_raw.h5ad) with ov.utils.read, and initialise pyMOFA with matching omics and omics_name lists.
  • Model training: Call mofa_preprocess() to select highly variable features and run the factor model with mofa_run(outfile=...), which exports the learned MOFA+ factors to an HDF5 model file.
  • Result inspection: Reload downstream AnnData, append factor scores via ov.single.factor_exact, and explore factor–cluster associations using factor_correlation, get_weights, and the plotting helpers in pyMOFAART (plot_r2, plot_cor, plot_factor, plot_weights, etc.).
  • Export workflow: Persist factors and weights through the MOFA HDF5 artifact and reuse them by instantiating pyMOFAART(model_path=...) for later annotation or visualisation sessions.
  • Dependencies & hardware: Requires mofapy2; plots optionally rely on pymde/scvi-tools but run on CPU.

MOFA after GLUE pairing (t_mofa_glue.ipynb)

  • Data preparation: Start from GLUE-derived embeddings (rna-emb.h5ad, atac.emb.h5ad), build a GLUE_pair object, and run correlation() to align unpaired cells before subsetting to highly variable features.
  • Model training: Instantiate pyMOFA with the aligned AnnData objects, run mofa_preprocess(), and save the joint factors through mofa_run(outfile='models/chen_rna_atac.hdf5').
  • Result inspection: Use pyMOFAART plus AnnData that now contains the GLUE embeddings to compute factors (get_factors) and visualise variance explained, factor–cluster correlations, and ranked feature weights.
  • Export workflow: Reuse the saved MOFA HDF5 model for downstream inspection; GLUE embeddings can be embedded with scvi.model.utils.mde (GPU-accelerated MDE is optional, sc.tl.umap works on CPU).
  • Dependencies & hardware: Requires both mofapy2 and the GLUE tooling (scglue, scvi-tools, pymde); GPU acceleration only affects optional MDE visualisation.

SIMBA batch integration (t_simba.ipynb)

  • Data preparation: Fetch the concatenated AnnData (simba_adata_raw.h5ad) derived from multiple pancreas studies and pass it, alongside a results directory, to pySIMBA.
  • Model training: Execute preprocess(...) to bin features and build a SIMBA-compatible graph, then call gen_graph() followed by train(num_workers=...) to launch PyTorch-BigGraph optimisation (can scale with CPU workers) and load(...) to resume trained checkpoints.
  • Result inspection: Apply batch_correction() to obtain the harmonised AnnData with SIMBA embeddings (X_simba) and visualise using mde/sc.tl.umap coloured by cell type or batch.
  • Export workflow: Training outputs reside in the workdir (e.g., result_human_pancreas/pbg/graph0); reuse them with simba_object.load(...) for later analyses.
  • Dependencies & hardware: Requires installing simba and simba_pbg (PyTorch BigGraph backend). GPU is optional; make sure adequate CPU threads and memory are available for graph training.

TOSICA reference transfer (t_tosica.ipynb)

  • Data preparation: Download demo AnnData references (demo_train.h5ad, demo_test.h5ad) and required gene-set GMT files via ov.utils.download_tosica_gmt(); confirm datasets are log-normalised before training.
  • Model training: Create pyTOSICA with the reference AnnData, chosen pathway mask, label key, project directory, and batch size; train with train(epochs=...), then persist weights with save() and optionally reload via load().
  • Result inspection: Generate predictions on query AnnData through predicted(pre_adata=...), embed with OmicVerse preprocessing and GPU-enabled mde (UMAP fallback available), and explore pathway attention to interpret transformer heads.
  • Export workflow: Saved project folder keeps model checkpoints and attention summaries; reuse the exported assets to annotate future datasets without retraining from scratch.
  • Dependencies & hardware: Needs TOSICA (PyTorch transformer) plus downloaded gene-set masks; avoid setting depth=2 if memory is constrained. GPU acceleration improves embedding (mde) but training runs on standard PyTorch (CPU/GPU depending on environment).

StaVIA trajectory cartography (t_stavia.ipynb)

  • Data preparation: Load example dentate gyrus velocity data via scvelo.datasets.dentategyrus(), preprocess with OmicVerse (preprocess, scale, pca, neighbours, UMAP) to populate the AnnData matrices used by VIA.
  • Model training: Configure VIA hyperparameters (components, neighbours, seeds, root selection) and instantiate/run VIA.core.VIA on the chosen representation (adata.obsm['scaled|original|X_pca']).
  • Result inspection: Store outputs such as pseudotime (single_cell_pt_markov), cluster graph abstractions, trajectory curves, atlas views, and stream plots through VIA plotting helpers.
  • Export workflow: Persist derived visualisations and animations (e.g., animate_streamplot_ov, animate_atlas) to files (.gif) for reporting; recompute edge bundles via make_edgebundle_milestone when needed.
  • Dependencies & hardware: Relies on scvelo, pyVIA, and OmicVerse plotting; computations are CPU-bound though producing large stream/animation outputs benefits from ample memory.

Expand your agent's capabilities with these related and highly-rated skills.

FreedomIntelligence/OpenClaw-Medical-Skills

vcf-annotator

Annotate VCF variants with VEP, ClinVar, gnomAD frequencies, and ancestry-aware context. Generates prioritised variant reports.

2,009 275
Explore
FreedomIntelligence/OpenClaw-Medical-Skills

chemist-analyst

Analyzes events through chemistry lens using molecular structure, reaction mechanisms, thermodynamics, kinetics, and analytical techniques (spectroscopy, chromatography, mass spectrometry). Provides insights on chemical processes, material properties, reaction pathways, synthesis, and analytical methods. Use when: Chemical reactions, material analysis, synthesis planning, process optimization, environmental chemistry. Evaluates: Molecular structure, reaction mechanisms, yield, selectivity, safety, environmental impact.

2,009 275
Explore
FreedomIntelligence/OpenClaw-Medical-Skills

bio-alignment-io

Read, write, and convert multiple sequence alignment files using Biopython Bio.AlignIO. Supports Clustal, PHYLIP, Stockholm, FASTA, Nexus, and other alignment formats for phylogenetics and conservation analysis. Use when reading, writing, or converting alignment file formats.

2,009 275
Explore
FreedomIntelligence/OpenClaw-Medical-Skills

sleep-analyzer

分析睡眠数据、识别睡眠模式、评估睡眠质量,并提供个性化睡眠改善建议。支持与其他健康数据的关联分析。

2,009 275
Explore
FreedomIntelligence/OpenClaw-Medical-Skills

metabolomics-workbench-database

Access NIH Metabolomics Workbench via REST API (4,200+ studies). Query metabolites, RefMet nomenclature, MS/NMR data, m/z searches, study metadata, for metabolomics and biomarker discovery.

2,009 275
Explore
FreedomIntelligence/OpenClaw-Medical-Skills

bio-hi-c-analysis-matrix-operations

Balance, normalize, and transform Hi-C contact matrices using cooler and cooltools. Apply iterative correction (ICE), compute expected values, and generate observed/expected matrices. Use when normalizing or transforming Hi-C matrices.

2,009 275
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results