Agent skill
scientific-critical-thinking
Install this agent skill to your Project
npx add-skill https://github.com/drshailesh88/integrated_content_OS/tree/main/skills/cardiology/scientific-critical-thinking
SKILL.md
Scientific Critical Thinking
Systematic evaluation of research rigor through methodology assessment, bias detection, and evidence quality frameworks.
Triggers
- User asks to evaluate a study's quality
- User needs to assess evidence strength
- User is reviewing trial methodology
- User wants to identify limitations or biases
- User is critiquing research for an editorial
Core Capabilities
1. Methodology Critique
Validity Assessment:
| Type | Question | Red Flags |
|---|---|---|
| Internal | Is the observed effect attributable to the intervention rather than to bias? | Confounders, selection bias |
| External | Can results generalize? | Narrow population, artificial setting |
| Construct | Do measures capture the concept? | Surrogate endpoints, proxy measures |
| Statistical | Are conclusions supported by data? | Underpowered, multiple testing |
Study Design Hierarchy (typically strongest to weakest evidence):
- Systematic reviews/meta-analyses of RCTs
- Individual RCTs
- Cohort studies
- Case-control studies
- Cross-sectional studies
- Case series/reports
- Expert opinion
2. Bias Detection
Cognitive and Reporting Biases in Research:
- Confirmation bias: Interpreting data to support a prior hypothesis
- HARKing: Hypothesizing after the results are known
- Publication bias: Positive results are more likely to be published
- Spin: Overstating or misrepresenting findings
Selection Biases:
- Sampling bias (non-representative)
- Volunteer bias (volunteers differ from non-volunteers, often healthier)
- Attrition bias (differential dropout)
- Survivorship bias (only studying survivors)
Measurement Biases:
- Observer/detection bias
- Recall bias
- Social desirability bias
- Hawthorne effect
Analysis Biases:
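- P-hacking (re-running analyses until a result reaches significance)
- Outcome switching (changing the primary endpoint after seeing data)
- Selective reporting (publishing only favorable outcomes or analyses)
- Data dredging (searching many variables for any association)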
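The cost of untested multiplicity can be made concrete. The sketch below is illustrative only: it assumes independent tests for the family-wise error rate, and it uses the Holm step-down procedure as one common correction, not the only valid one. The function names are hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def holm_reject(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Holm step-down correction; returns one reject decision per input p-value."""
    m = len(p_values)
    # Visit p-values from smallest to largest, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The threshold loosens as hypotheses are rejected: alpha/m, alpha/(m-1), ...
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


# Twenty uncorrected tests at alpha = 0.05 give roughly a 64% chance of a false positive.
print(round(familywise_error_rate(0.05, 20), 3))
# Hypothetical p-values from four secondary endpoints.
print(holm_reject([0.01, 0.04, 0.03, 0.005]))
```

In the second call, only the two smallest p-values survive correction, although all four fall below 0.05 when taken individually.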
3. Statistical Evaluation Checklist
- Sample size adequate? (power analysis done?)
- Statistical test appropriate for data type?
- Multiple comparison correction applied?
- Effect sizes reported (not just p-values)?
- Confidence intervals provided?
- Missing data handled appropriately?
- Assumptions of tests verified?
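For the effect-size and confidence-interval items, a two-arm trial with a binary outcome can be summarized as follows. This is a minimal sketch using normal-approximation (Wald) intervals, which assume reasonably large arms and at least one event per arm. The function name and the event counts are hypothetical, not taken from any real trial.

```python
import math


def risk_summary(events_tx: int, n_tx: int, events_ctl: int, n_ctl: int, z: float = 1.96) -> dict:
    """Absolute risk reduction, relative risk, and NNT with approximate 95% CIs."""
    risk_tx = events_tx / n_tx
    risk_ctl = events_ctl / n_ctl

    # Absolute risk reduction (control minus treatment) with a Wald interval.
    arr = risk_ctl - risk_tx
    se_arr = math.sqrt(risk_tx * (1 - risk_tx) / n_tx + risk_ctl * (1 - risk_ctl) / n_ctl)

    # Relative risk; the interval is built on the log scale and back-transformed.
    rr = risk_tx / risk_ctl
    se_log_rr = math.sqrt(1 / events_tx - 1 / n_tx + 1 / events_ctl - 1 / n_ctl)

    return {
        "arr": arr,
        "arr_ci": (arr - z * se_arr, arr + z * se_arr),
        "rr": rr,
        "rr_ci": (math.exp(math.log(rr) - z * se_log_rr),
                  math.exp(math.log(rr) + z * se_log_rr)),
        # NNT is reported here only when treatment lowers risk.
        "nnt": 1 / arr if arr > 0 else None,
    }


# Hypothetical counts: 100/1000 events on treatment vs 150/1000 on control.
result = risk_summary(100, 1000, 150, 1000)
print(result)
```

Reporting the absolute reduction and NNT alongside the relative risk guards against a large relative effect masking a small absolute benefit.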
4. Evidence Quality Assessment (GRADE)
Quality Levels:
| Level | Meaning | Implications |
|---|---|---|
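| High | Very confident the true effect lies close to the estimate | Can support a strong recommendation |
| Moderate | Moderately confident; the true effect is probably close to the estimate | Often supports a conditional recommendation |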
| Low | Limited confidence | Further research likely |
| Very Low | Little confidence | Estimate highly uncertain |
Downgrade Factors:
- Risk of bias
- Inconsistency across studies
- Indirectness (surrogate outcomes, or populations and interventions that differ from the question)
- Imprecision (wide CIs)
- Publication bias
Upgrade Factors (mainly for observational studies):
- Large effect size
- Dose-response relationship
- All plausible residual confounding would reduce the observed effect
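The rating logic can be approximated as a simple tally. This sketch assumes the usual GRADE starting points (randomized trials start at High, observational studies at Low) and treats each downgrade or upgrade as one level. Real GRADE assessments are judgments made per outcome, so the hypothetical `grade_certainty` function is a bookkeeping aid rather than a substitute for that judgment.

```python
LEVELS = ["Very Low", "Low", "Moderate", "High"]


def grade_certainty(randomized: bool, downgrades: int = 0, upgrades: int = 0) -> str:
    """Tally a certainty level from a starting point plus level changes."""
    start = 3 if randomized else 1  # index into LEVELS: High for RCTs, Low otherwise
    score = start - downgrades + upgrades
    # Clamp so the rating never leaves the four defined levels.
    return LEVELS[max(0, min(len(LEVELS) - 1, score))]


# RCT evidence downgraded once, for example for imprecision.
print(grade_certainty(randomized=True, downgrades=1))   # Moderate
# Observational evidence upgraded once, for example for a large effect.
print(grade_certainty(randomized=False, upgrades=1))    # Moderate
```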
5. Logical Fallacy Detection
Causation Fallacies:
- Post hoc ergo propter hoc (after = because of)
- Correlation ≠ causation
- Reverse causation
- Confounding mistaken for causation
Generalization Errors:
- Hasty generalization (small sample)
- Ecological fallacy (group to individual)
- Exception fallacy (individual to group)
Statistical Fallacies:
- Texas sharpshooter (finding patterns in noise)
- Base rate neglect
- Regression to the mean mistaken for a treatment effect
- Fishing across multiple endpoints
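Base rate neglect is easiest to see with a worked calculation. The sketch below applies Bayes' theorem to a diagnostic test; the sensitivity, specificity, and prevalence values are hypothetical and chosen only to show the effect.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability of disease given a positive test, via Bayes' theorem."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# A test with 90% sensitivity and 90% specificity, applied where prevalence is 1%.
ppv = positive_predictive_value(0.90, 0.90, 0.01)
print(f"{ppv:.1%}")
```

With these inputs only about 8% of positive results are true positives, because false positives from the large healthy majority outnumber true positives from the small diseased group.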
6. Research Design Questions
When evaluating a study, ask:
- Question: Is the research question clear and answerable?
- Design: Is the study design appropriate for the question?
- Population: Is the sample representative of the target population?
- Intervention: Was the intervention clearly defined and consistent?
- Comparison: Was the control group appropriate?
- Outcome: Were outcomes clinically meaningful and measured reliably?
- Follow-up: Was follow-up long enough and complete enough?
- Analysis: Was the analysis appropriate and pre-specified?
7. Claim Evaluation Framework
For any scientific claim:
- Identify the assertion - What exactly is being claimed?
- Evaluate supporting evidence - What studies support it?
- Check logical connection - Does the evidence actually support the claim?
- Assess proportionality - Is the strength of the claim proportional to the evidence?
- Detect overgeneralization - Are the limits of the findings respected?
- Flag warning signs - Conflicts of interest, spin, p-hacking?
Application to Cardiology Content
Evaluating Trial Results
- Check randomization and blinding adequacy
- Assess primary endpoint clinical relevance
- Check whether the primary analysis was intention-to-treat and how per-protocol results compare
- Look for protocol changes mid-trial
- Examine subgroup analyses critically
- Consider funding source influence
For Editorials/Newsletters
- Acknowledge study limitations explicitly
- Don't overstate findings
- Note where evidence is weak
- Distinguish association from causation
- Highlight what questions remain
Critique Output Format
When critiquing research:
- Summary: Brief overview of what the study did
- Strengths: What was done well
- Critical concerns: Major methodological issues
- Important limitations: Secondary concerns
- Minor issues: Small points for completeness
- Overall assessment: Balanced conclusion on reliability
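One way to keep critiques in this shape is a small container that renders the six sections in order. This is a sketch, not part of the skill itself: the `Critique` class, its field names, and the example content are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Critique:
    """Holds the six critique sections in the order listed above."""
    summary: str
    strengths: list[str] = field(default_factory=list)
    critical_concerns: list[str] = field(default_factory=list)
    important_limitations: list[str] = field(default_factory=list)
    minor_issues: list[str] = field(default_factory=list)
    overall_assessment: str = ""

    def render(self) -> str:
        """Return the critique as plain text with one labeled block per section."""
        lines = [f"Summary: {self.summary}"]
        sections = [
            ("Strengths", self.strengths),
            ("Critical concerns", self.critical_concerns),
            ("Important limitations", self.important_limitations),
            ("Minor issues", self.minor_issues),
        ]
        for title, items in sections:
            lines.append(f"{title}:")
            # Empty sections stay visible so that omissions are explicit.
            lines.extend(f"- {item}" for item in (items or ["None noted"]))
        lines.append(f"Overall assessment: {self.overall_assessment}")
        return "\n".join(lines)


# Hypothetical content for illustration only.
critique = Critique(
    summary="Open-label trial of a hypothetical drug versus usual care.",
    strengths=["Pre-registered primary endpoint"],
    critical_concerns=["Unblinded outcome assessment"],
    overall_assessment="Findings are hypothesis-generating.",
)
print(critique.render())
```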