Agent skill

bio-systems-biology-context-specific-models


Install this agent skill to your Project

npx add-skill https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-systems-biology-context-specific-models

SKILL.md


name: bio-systems-biology-context-specific-models
description: Build tissue and condition-specific metabolic models using GIMME, iMAT, and INIT algorithms with expression data constraints. Create models that reflect cell-type specific metabolism. Use when building tissue-specific metabolic models or integrating transcriptomics with FBA.
tool_type: python
primary_tool: cobrapy
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:

  • read_file
  • run_shell_command

Context-Specific Models

GIMME Algorithm

python
import cobra
import numpy as np

def gimme(model, expression_data, threshold=0.25, required_growth=0.1):
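    '''Gene Inactivity Moderated by Metabolism and Expression (GIMME)

    Simplified, bound-based approximation of GIMME. It builds a
    context-specific model by:
    1. Restricting flux through reactions whose genes are all lowly expressed
    2. Requiring minimum biomass production

    Published GIMME instead minimizes an expression-weighted flux penalty
    in a linear program; this function only tightens bounds.

    Args:
        model: cobra.Model to constrain (copied, not modified in place)
        expression_data: dict mapping gene_id -> expression value
        threshold: Expression quantile (0-1) below which genes are inactive
                  0.25 = bottom 25% considered inactive
        required_growth: Minimum growth rate to maintain

    Returns:
        Context-specific model with inactive reactions constrained
    '''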
    # Calculate expression threshold
    values = list(expression_data.values())
    cutoff = np.percentile(values, threshold * 100)

    # Identify lowly-expressed genes
    low_expressed = {g for g, v in expression_data.items() if v < cutoff}

    # Create context model
    context_model = model.copy()

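    # Set minimum growth constraint on the model's objective reaction(s)
    # rather than a hard-coded biomass ID, so non-E. coli models also work
    from cobra.util.solver import linear_reaction_coefficients
    for biomass_rxn in linear_reaction_coefficients(context_model):
        biomass_rxn.lower_bound = required_growth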

    # Minimize flux through reactions with low-expressed genes
    for rxn in context_model.reactions:
        genes = {g.id for g in rxn.genes}
        if genes and genes.issubset(low_expressed):
            # This reaction is likely inactive - constrain it
            rxn.upper_bound = min(rxn.upper_bound, 1.0)
            rxn.lower_bound = max(rxn.lower_bound, -1.0)

    return context_model
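
The penalty idea in step 1 can be made concrete without a solver. The sketch below assumes reaction-level expression values are already available and uses the usual GIMME weighting, where a reaction is penalized by how far its expression falls below the cutoff. The names `gimme_penalties` and `inconsistency_score` are illustrative and not part of cobrapy.

```python
def gimme_penalties(reaction_expression, cutoff):
    """Penalty per reaction: distance below the cutoff, zero if at or above it."""
    return {
        rxn_id: max(0.0, cutoff - expr)
        for rxn_id, expr in reaction_expression.items()
    }


def inconsistency_score(fluxes, penalties):
    """Sum of penalty * |flux|; the quantity a GIMME-style LP would minimize."""
    return sum(penalties.get(rxn_id, 0.0) * abs(flux)
               for rxn_id, flux in fluxes.items())


# Hypothetical reaction expression values and fluxes, for illustration only
penalties = gimme_penalties({"R1": 0.1, "R2": 0.5, "R3": 0.25}, cutoff=0.25)
score = inconsistency_score({"R1": -2.0, "R2": 10.0}, penalties)
```

Only `R1` lies below the cutoff here, so it alone contributes to the score. A flux distribution with a lower score agrees better with the expression data.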

iMAT Algorithm

python
def imat(model, expression_data, high_threshold=0.75, low_threshold=0.25):
    '''Integrative Metabolic Analysis Tool (iMAT)

    Maximizes agreement between flux activity and expression:
    - Highly expressed reactions should carry flux
    - Lowly expressed reactions should have zero flux

    The published iMAT formulation is a MILP with binary activity variables;
    this function is a simplified, bound-based approximation of it.
    '''

    # Compute percentile cutoffs once rather than once per reaction
    values = list(expression_data.values())
    high_cutoff = np.percentile(values, high_threshold * 100)
    low_cutoff = np.percentile(values, low_threshold * 100)

    # Classify reactions by expression
    high_expr_rxns = []
    low_expr_rxns = []

    for rxn in model.reactions:
        if rxn.genes:
            # Simplified OR logic: take the max over all genes in the reaction
            # (a GPR-aware version uses min for AND, max for OR).
            # Genes without data default to 0.5, which assumes expression
            # normalized to the 0-1 range.
            gene_expr = [expression_data.get(g.id, 0.5) for g in rxn.genes]
            rxn_expr = max(gene_expr)

            if rxn_expr > high_cutoff:
                high_expr_rxns.append(rxn.id)
            elif rxn_expr < low_cutoff:
                low_expr_rxns.append(rxn.id)

    # Approximate the MILP with bound changes
    # (full iMAT uses binary variables to maximize consistent reactions)
    context_model = model.copy()

    # Force forward flux through highly expressed reactions, skipping any
    # whose upper bound is too low to allow it (avoids lower > upper)
    for rxn_id in high_expr_rxns:
        rxn = context_model.reactions.get_by_id(rxn_id)
        if rxn.upper_bound >= 0.01:
            rxn.lower_bound = max(rxn.lower_bound, 0.01)

    # Constrain lowly expressed reactions
    for rxn_id in low_expr_rxns:
        rxn = context_model.reactions.get_by_id(rxn_id)
        rxn.upper_bound = min(rxn.upper_bound, 0.1)
        rxn.lower_bound = max(rxn.lower_bound, -0.1)

    return context_model, high_expr_rxns, low_expr_rxns
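
The agreement that iMAT maximizes can be scored for any flux distribution. The following is a minimal sketch of that score, assuming fluxes are given as a plain dict. The function name `imat_agreement` and the `epsilon` and `tol` defaults are illustrative choices, not values taken from the skill.

```python
def imat_agreement(fluxes, high_expr_rxns, low_expr_rxns, epsilon=0.01, tol=1e-9):
    """Count reactions whose flux state matches their expression class.

    A highly expressed reaction agrees when |flux| >= epsilon;
    a lowly expressed reaction agrees when |flux| <= tol.
    Reactions missing from `fluxes` are treated as carrying zero flux.
    """
    active_high = [r for r in high_expr_rxns
                   if abs(fluxes.get(r, 0.0)) >= epsilon]
    inactive_low = [r for r in low_expr_rxns
                    if abs(fluxes.get(r, 0.0)) <= tol]
    return len(active_high) + len(inactive_low), active_high, inactive_low


# Hypothetical fluxes, for illustration only
fluxes = {"R1": 1.0, "R2": 0.0, "R3": 0.5, "R4": 0.0}
agreement, active_high, inactive_low = imat_agreement(
    fluxes, high_expr_rxns=["R1", "R2"], low_expr_rxns=["R3", "R4"]
)
```

With a cobrapy solution, `solution.fluxes.to_dict()` gives a dict in the expected shape, and the lists returned by `imat` above can be passed in directly.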

Expression Data Integration

python
def load_expression_data(filepath, gene_col='gene_id', expr_col='TPM'):
    '''Load and normalize expression data

    Accepts:
    - RNA-seq abundances (TPM, FPKM)
    - Microarray intensities
    - Proteomics abundances

    Returns dict mapping gene_id -> normalized expression
    '''
    import pandas as pd

    df = pd.read_csv(filepath)

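    # Log-transform if needed (high dynamic range)
    expr = df[expr_col].to_numpy(dtype=float)
    mean = expr.mean()
    if mean > 0 and expr.max() / mean > 100:
        expr = np.log2(expr + 1)

    # Normalize to 0-1 range (constant input maps to all zeros)
    span = expr.max() - expr.min()
    expr_norm = (expr - expr.min()) / span if span > 0 else np.zeros_like(expr)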

    return dict(zip(df[gene_col], expr_norm))


def aggregate_gene_expression(model, expression_data, method='max'):
    '''Map gene expression to reactions

    Methods:
    - 'max': Use maximum gene expression (OR logic)
    - 'min': Use minimum gene expression (AND logic)
    - 'mean': Average across genes

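    A single method is applied to every gene in the reaction; the GPR
    structure itself is not parsed. A GPR-aware evaluation of
    (A and B) or C would use:
    - min(A, B) for the complex
    - max(complex, C) for the alternatives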
    '''
    rxn_expression = {}

    for rxn in model.reactions:
        if not rxn.genes:
            rxn_expression[rxn.id] = 0.5  # Default for non-enzymatic
            continue

        gene_expr = [expression_data.get(g.id, 0.5) for g in rxn.genes]

        if method == 'max':
            rxn_expression[rxn.id] = max(gene_expr)
        elif method == 'min':
            rxn_expression[rxn.id] = min(gene_expr)
        else:
            rxn_expression[rxn.id] = np.mean(gene_expr)

    return rxn_expression
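
The GPR rule described in the docstring above, min for `and` and max for `or`, can be applied to the rule string itself. Below is a small recursive-descent sketch, assuming rules use only gene IDs, parentheses, and the keywords `and`/`or` (the form cobrapy exposes as `rxn.gene_reaction_rule`). The name `gpr_expression` is hypothetical, and the 0.5 default mirrors the functions above.

```python
import re


def gpr_expression(rule, expression_data, default=0.5):
    """Evaluate a GPR rule string: 'and' -> min, 'or' -> max."""
    tokens = re.findall(r"\(|\)|[^\s()]+", rule)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos].lower() == "or":
            pos += 1
            value = max(value, parse_and())
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos].lower() == "and":
            pos += 1
            value = min(value, parse_atom())
        return value

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError(f"Incomplete GPR rule: {rule!r}")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError(f"Unbalanced parentheses in GPR rule: {rule!r}")
            pos += 1
            return value
        if token == ")" or token.lower() in ("and", "or"):
            raise ValueError(f"Unexpected token {token!r} in GPR rule: {rule!r}")
        # Gene ID: look up its expression, falling back to the default
        return expression_data.get(token, default)

    if not tokens:
        return default  # no gene rule, e.g. a non-enzymatic reaction
    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"Trailing tokens in GPR rule: {rule!r}")
    return result


# (A and B) or C -> max(min(0.9, 0.2), 0.4)
example = gpr_expression("(A and B) or C", {"A": 0.9, "B": 0.2, "C": 0.4})
```

Because `and` binds tighter than `or`, the rule `A and B or C` evaluates the same way as `(A and B) or C`.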

Tissue-Specific Human Models

python
def create_tissue_model(generic_model, gtex_expression, tissue='liver'):
    '''Create tissue-specific model from GTEx expression data

    GTEx provides median TPM for 54 human tissues.
    Download from: https://gtexportal.org/home/datasets
    '''
    import pandas as pd

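    # Load GTEx median expression
    # (a raw GTEx .gct download may need skiprows=2 and the 'Name' column
    # in place of 'gene_id'; check the file header first)
    gtex = pd.read_csv(gtex_expression, sep='\t')

    # Extract tissue column, failing clearly if nothing matches
    matches = [c for c in gtex.columns if tissue.lower() in c.lower()]
    if not matches:
        raise ValueError(f"No GTEx column matches tissue '{tissue}'")
    tissue_col = matches[0]
    expression = dict(zip(gtex['gene_id'], gtex[tissue_col]))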

    # Apply GIMME
    tissue_model = gimme(generic_model, expression, threshold=0.25)

    return tissue_model
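
GTEx gene identifiers are versioned Ensembl IDs such as `ENSG00000141510.16`, while metabolic models that use Ensembl IDs usually store them without the version suffix. When that is the case, the expression dict needs its keys normalized before it is passed to `gimme`. A minimal sketch follows, assuming the model uses unversioned Ensembl IDs; `strip_ensembl_versions` is a hypothetical helper, and keeping the higher value on collisions is an illustrative choice.

```python
def strip_ensembl_versions(expression):
    """Drop the '.N' version suffix from Ensembl-style gene IDs."""
    cleaned = {}
    for gene_id, value in expression.items():
        base_id = str(gene_id).split(".", 1)[0]
        # If several versioned IDs collapse to one base ID, keep the higher value
        cleaned[base_id] = max(value, cleaned.get(base_id, value))
    return cleaned


# Hypothetical TPM values, for illustration only
raw = {
    "ENSG00000141510.16": 12.5,
    "ENSG00000141510.17": 3.0,
    "ENSG00000012048.22": 7.0,
}
cleaned = strip_ensembl_versions(raw)
```

Models keyed by other identifier systems (for example Entrez IDs) need a real ID mapping instead, which this helper does not attempt.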

Validate Context Model

python
def validate_context_model(original, context, expression_data):
    '''Compare original and context-specific models

    Reports:
    1. Growth (objective value) of both models and their ratio
    2. Number of reactions whose bounds were tightened
    3. Total number of reactions
    '''
    # Growth comparison
    orig_growth = original.optimize().objective_value
    context_growth = context.optimize().objective_value

    # Count reactions whose bounds were tightened in either direction
    constrained = 0
    for rxn in context.reactions:
        orig_rxn = original.reactions.get_by_id(rxn.id)
        if (rxn.upper_bound < orig_rxn.upper_bound
                or rxn.lower_bound > orig_rxn.lower_bound):
            constrained += 1

    return {
        'original_growth': orig_growth,
        'context_growth': context_growth,
        # Avoid division by zero when the original model cannot grow
        'growth_ratio': context_growth / orig_growth if orig_growth else float('nan'),
        'constrained_reactions': constrained,
        'total_reactions': len(context.reactions)
    }

Related Skills

  • systems-biology/flux-balance-analysis - Run FBA on context models
  • differential-expression/de-results - Generate expression data
  • single-cell/clustering - Cell-type specific expression
