Agent skill
paper-navigator
End-to-end academic paper workflow: disambiguate queries, discover papers (search, citation traversal, recommendations, arXiv monitoring, trending, GitHub search), evaluate (TLDR, citations, code, SOTA), read with structured analysis (3-level strategy), and organize into literature maps or reports. Use when: finding papers, reading a paper, related work, literature survey, citation analysis, research trends, SOTA results, datasets, or literature reports. Do NOT use for writing a literature review section (use paper-writing), comparing research ideas (use idea-tournament), or planning paper structure (use paper-planning).
Install this agent skill to your Project
npx add-skill https://github.com/EvoScientist/EvoSkills/tree/main/skills/paper-navigator
Metadata
Additional technical details for this skill
- tags
-
core research literature papers search citation
- author
- EvoScientist
- version
- 1.0.0
SKILL.md
Paper Navigator
End-to-end paper workflow in five stages:
┌──────────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Disambiguate │ → │ Discover │ → │ Evaluate │ → │ Read │ → │ Organize │
└──────────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
| Stage | Input | Scripts | Output |
|---|---|---|---|
| Disambiguate | user query | (agent-driven: web search + intent analysis) | search plan + resolved identifiers |
| Discover | keywords / author / field | scholar_search, citation_traverse, recommend, author_search, arxiv_monitor, trending, github_search |
candidate list |
| Evaluate | candidate list | scholar_search (TLDR/citations), find_code, sota, dataset_search |
reading list |
| Read | paper ID / URL | fetch_paper + references/reading-strategy.md |
structured notes |
| Organize | multiple notes | literature_report, (agent applies framework) |
literature map / report |
Setup: All scripts use httpx (already in project). Optional env vars for higher rate limits:
S2_API_KEY— Semantic Scholar (request here)JINA_API_KEY— Jina Reader (free tier works without key)GITHUB_TOKEN— GitHub personal access token (for higher rate limits ongithub_search,find_code)HF_TOKEN— HuggingFace token (optional, for higher rate limits onfind_code,dataset_search,sota)
Scripts are in skills/paper-navigator/scripts/. Run via python skills/paper-navigator/scripts/<name>.py.
Stage 0: Disambiguate (Intent Analysis + Background Research)
Before searching, analyze the user's query to understand intent and resolve ambiguous terms. This stage prevents the common failure mode where academic APIs return zero results because the user's term doesn't match any paper title (e.g., "DeepSeek Engram" is a project/module name, not the paper title).
Step 1: Classify User Intent
Determine what the user wants:
| Intent | Signal | Strategy |
|---|---|---|
| Find a specific paper | User gives a title, author, or URL | Direct search (skip to Stage 1 Path A) |
| Explore a topic/field | User asks "what's new in X" or "survey X" | Broad search + trending (Stage 1 Paths A + F) |
| Track recent advances | User asks "latest" or "recent" about X | arXiv monitor + trending (Stage 1 Paths E + F) |
| Find a baseline | User asks for code, SOTA, or implementation | Search + code check (Stage 1 + 2) |
| Ambiguous/colloquial term | User uses a project name, module name, or nickname | This needs special handling (see below) |
| Related work / literature map | User wants connections between papers | Citation traversal + recommend (Stage 1 Paths B + C) |
Step 2: Resolve Ambiguous Terms
When the user's query might be a colloquial name, project name, or module name (rather than a paper title), take these steps:
- Quick academic search — Try
scholar_searchwith the exact query - If zero results — The term is likely a project/colloquial name. Broaden the search:
- Web search (via research-agent or web tools): Search for the exact phrase to find GitHub repos, blog posts, or social media mentions that reveal the actual paper title or arXiv ID
- GitHub search: Run
github_search.py --query "USER_QUERY"to find relevant repositories, which often link to papers
- Extract identifiers — From web/GitHub results, extract:
- Actual paper title (for academic search)
- arXiv ID (for direct paper lookup)
- GitHub repo URL (for code + paper discovery)
- Author names (for author_search)
- Re-enter Discover — Use the resolved identifiers to run the appropriate Stage 1 paths
Step 3: Generate Search Plan
Based on intent + resolved terms, output a search plan:
🔍 Disambiguation Report for "deepseek engram"
├── Intent: Track recent advances (ambiguous term)
├── Resolution: "Engram" is a module name from DeepSeek AI
│ ├── Actual paper: "Conditional Memory via Scalable Lookup" (ArXiv:2601.07372)
│ └── GitHub: https://github.com/deepseek-ai/Engram
└── Search Plan:
├── scholar_search --query "Conditional Memory Scalable Lookup" --sort-by year
├── citation_traverse --paper-id ArXiv:2601.07372 --direction forward
├── github_search --query "deepseek engram"
└── trending --query "conditional memory engram" --period 90
This step is agent-driven (no script) — the orchestrating agent performs the web search and intent analysis, then selects the appropriate scripts.
Stage 1: Discover
Seven discovery paths, ordered by frequency of use.
Path A: Keyword Search (most common)
python scripts/scholar_search.py --query "transformer attention mechanism" --limit 20 --sort-by citations
Options: --year-min/--year-max, --open-access-only, --sort-by relevance|citations|year.
Returns: title, authors, year, citations, TLDR, OA PDF link.
Path B: Citation Traversal (from a seed paper)
# Forward — who cited this paper
python scripts/citation_traverse.py --paper-id ArXiv:1706.03762 --direction forward --limit 20
# Backward — what this paper cites
python scripts/citation_traverse.py --paper-id ArXiv:1706.03762 --direction backward --limit 20
# Co-citation — papers frequently cited alongside this one (sister works)
python scripts/citation_traverse.py --paper-id ArXiv:1706.03762 --direction co-citation --limit 15
Co-citation is the most powerful discovery method — it finds closely related work that keyword search misses.
Path C: "More Like This" Recommendations
python scripts/recommend.py --positive ArXiv:1706.03762,ArXiv:2005.14165 --limit 15
# Optionally exclude certain directions:
python scripts/recommend.py --positive ArXiv:1706.03762 --negative ArXiv:2301.00001 --limit 10
Path D: Author Tracking
python scripts/author_search.py --name "Geoffrey Hinton" --papers --limit 20 --sort-by citations
Path E: New Paper Monitoring
# By category (see references/arxiv-categories.md for codes)
python scripts/arxiv_monitor.py --categories cs.CL,cs.AI --days 3 --limit 30
# By keywords
python scripts/arxiv_monitor.py --keywords "chain of thought,reasoning" --days 7
Path F: Trending Detection
python scripts/trending.py --query "large language models" --period 90 --limit 15
Ranks by citation velocity (citations/month). Useful for finding rapidly rising papers.
Path G: GitHub Search (for unreleased or industry papers)
python scripts/github_search.py --query "deepseek engram" --limit 10
python scripts/github_search.py --query "mamba state space model" --sort stars
Options: --sort stars|updated|relevance, --json.
Useful when:
- Papers haven't been published on arXiv yet
- Industry labs release code before papers
- Looking for implementations, forks, or community extensions
Returns: repo name, description, stars, language, dates, URL, topics.
Citation Graph Visualization
After traversal, visualize with Mermaid (keep ≤30 nodes):
graph TD
SEED["Attention Is All You Need<br/>2017 · 100k+"]
A["BERT · 2018"] --> SEED
B["GPT-2 · 2019"] --> SEED
C["Vision Transformer · 2020"] --> SEED
Stage 2: Evaluate
Goal: filter candidates into a reading list. Use data already returned by Discover scripts plus targeted checks.
Quick Assessment (from scholar_search output)
| Signal | What it tells you |
|---|---|
| TLDR | One-sentence understanding |
| Citation count | Overall impact |
| Influential citations | Quality of impact |
| Year + venue | Recency and authority |
| Open Access PDF | Whether you can read full text |
Code Availability Check
python scripts/find_code.py --arxiv-id 1706.03762
Returns: GitHub URLs, stars, framework, whether official implementation.
Top Models by Task
python scripts/sota.py --task "text-generation" --limit 10
# List available pipeline tags:
python scripts/sota.py --task "translation" --list-tasks
# Sort by likes instead of downloads:
python scripts/sota.py --task "text-generation" --sort likes
Dataset Discovery
python scripts/dataset_search.py --query "sentiment analysis" --limit 10
Reproducibility Assessment
After gathering the above, assess each paper:
| Dimension | Check | Score |
|---|---|---|
| Code | Open-source? Official? Stars? Last update? | |
| Results | Reproduced on SOTA leaderboard? | |
| Data | Dataset publicly available? | |
| Overall | High / Medium / Low / None |
Stage 3: Read
Fetch Full Text
# By paper ID (auto-resolves to best URL via S2 metadata)
python scripts/fetch_paper.py --paper-id ArXiv:1706.03762
# By direct URL
python scripts/fetch_paper.py --url "https://arxiv.org/abs/1706.03762"
# Metadata only (no full text fetch)
python scripts/fetch_paper.py --paper-id ArXiv:1706.03762 --metadata-only
Uses Jina Reader (r.jina.ai) to convert any paper URL to clean Markdown. Works with arXiv HTML, PDF links, and publisher pages.
Choose Reading Depth
| Level | Goal | When to use | Effort |
|---|---|---|---|
| L1 Technical | Can reimplement the method | Building directly on this paper | High |
| L2 Analytical | Understand motivation, design choices, tradeoffs | Most papers in your survey | Medium |
| L3 Contextual | Know what it is and where it fits | Quick scanning, staying current | Low |
Most papers need only L2-L3. Reserve L1 for papers you will build upon.
Detailed reading methodology: references/reading-strategy.md
Take Notes
Use the template at assets/paper-summary-template.md. Save notes to /artifacts/paper-notes/{paper-id}.md.
Key questions to answer:
- What problem does this paper address?
- What is the key contribution (one sentence)?
- What is the key technical insight?
- What are the limitations (stated and unstated)?
- How does this relate to my research?
Stage 4: Organize
After reading multiple papers, build two structures to map the literature.
Novelty Tree
Classify each paper:
| Type | Meaning | Novelty |
|---|---|---|
| 1 | Milestone — defines a new task or paradigm | Highest |
| 2 | New pipeline or data representation | High |
| 3 | New module or component | Medium |
| 4 | Incremental improvement on existing approach | Low |
Challenge-Insight Tree
Build a many-to-many mapping:
- Extract challenges: From each paper, what technical problem does it solve?
- Extract insights: What technique or key idea does it use?
- Build the map:
Challenge: Long-range dependencies in sequences
├── Insight: Self-attention (Transformer)
├── Insight: State-space models (Mamba)
└── Insight: Linear attention approximation
Challenge: Quadratic attention cost
├── Insight: Sparse attention patterns
├── Insight: Linear attention
└── Insight: IO-aware computation (Flash Attention)
- Analyze the map:
- Challenges with many solutions → well-studied area
- Challenges with few solutions → research opportunity
- Insights that solve many challenges → powerful, versatile technique
- Insights not yet applied to a challenge → potential for transfer
Save to /artifacts/literature-tree.md and update incrementally.
Generate Literature Report
Use the literature_report.py script to generate a structured, intent-adapted report:
# Full survey report (default)
python scripts/literature_report.py --paper-ids ArXiv:2601.07372,ArXiv:2501.12948
# Quick scan — brief table only
python scripts/literature_report.py --paper-ids ArXiv:2601.07372 --intent quick_scan
# Deep dive — full analysis + reading recommendations
python scripts/literature_report.py --paper-ids ArXiv:2601.07372 --intent deep_dive
# Baseline hunt — focus on code + reproducibility
python scripts/literature_report.py --paper-ids ArXiv:2601.07372 --intent baseline_hunt
# Save to file
python scripts/literature_report.py --paper-ids ArXiv:2601.07372 --output /artifacts/report.md
| Intent | Output includes |
|---|---|
survey (default) |
Summary, paper table, citation analysis, novelty tree, challenge-insight tree, recommendations |
quick_scan |
Brief table: title, authors, year, citations, TLDR |
deep_dive |
Everything in survey + per-paper reading level recommendations + detailed notes |
baseline_hunt |
Code availability, SOTA position, dataset access, reproducibility scores |
Common Workflows
Workflow 1: Comprehensive Literature Survey (full pipeline)
"Help me survey transformers in medical imaging"
- Discover:
scholar_search --query "transformer medical imaging" --limit 20 --sort-by citations→ pick top results →citation_traverse --direction forwardon seminal papers - Evaluate: Review TLDR + citations → shortlist top 10 →
find_codeto check reproducibility - Read:
fetch_paperfor top 5 → L2 reading → notes using template - Organize: Classify by novelty type → build challenge-insight tree → output survey report
Workflow 2: Find and Read a Specific Paper
"Find Attention Is All You Need and analyze it"
- Discover:
scholar_search --query "Attention Is All You Need" - Evaluate: Check TLDR + citations
- Read:
fetch_paper→ L1 or L2 reading → notes
Workflow 3: Track Field Developments
"What's new in NLP this week?"
- Discover:
arxiv_monitor --categories cs.CL --days 7+trending --query "NLP" --period 30 - Evaluate: Scan TLDRs, highlight high-potential papers
Workflow 4: Find a Baseline with Code
"I need a baseline for text classification with code"
- Discover:
scholar_search --query "text classification" --sort-by citations - Evaluate:
find_codeon top results +sota --task "text-classification"→ pick one with official code + high downloads - Output: recommended baseline + GitHub link + model page
Workflow 5: Read a Paper by URL
"Read this paper: arxiv.org/abs/2301.12345"
- Read:
fetch_paper --url "https://arxiv.org/abs/2301.12345"→ choose reading level → notes
Workflow 6: Ambiguous Query Resolution
"Find the latest about deepseek engram"
- Disambiguate:
- Intent: ambiguous term (project/module name)
scholar_searchreturns 0 results → broaden search- Web search reveals: GitHub repo
deepseek-ai/Engram, actual paper title "Conditional Memory via Scalable Lookup" - Extract arXiv ID:
2601.07372
- Discover:
scholar_searchwith resolved title +github_searchwith original term +citation_traverseon arXiv ID - Evaluate: Review results, check code via
find_codeor GitHub - Read:
fetch_paperfor top papers - Organize:
literature_report.py --intent surveyto generate structured report
Script Reference
All scripts output Markdown to stdout, errors to stderr. Common flags:
| Flag | Description |
|---|---|
--limit N |
Max results (prevents oversized output) |
--json |
Raw JSON output (for programmatic use) |
Paper ID Formats
Scripts accept multiple ID formats and normalize automatically:
- S2 ID:
649def34f8be52c8b66281af98ae884c09aef38b - arXiv:
ArXiv:1706.03762or1706.03762orhttps://arxiv.org/abs/1706.03762 - DOI:
DOI:10.18653/v1/N18-3011or10.18653/v1/N18-3011
Rate Limits
| API | Without key | With key |
|---|---|---|
| Semantic Scholar | ~100 req / 5 min | ~1 req/s sustained |
| arXiv | 1 req / 3s (courtesy) | N/A |
| Jina Reader | Free tier | Higher with key |
| HuggingFace | 500 req / 300s | Higher with HF_TOKEN |
| GitHub | 10 req/min (unauthenticated) | 5,000 req/hr (set GITHUB_TOKEN) |
Error Handling
All scripts retry on 429 (rate limit) and 5xx errors with exponential backoff (2s, 4s, 8s). Non-retryable errors print to stderr and exit.
Integration
- research-ideation: After organizing papers with the novelty tree and challenge-insight tree, feed gaps into research-ideation for idea generation.
- experiment-pipeline: After finding a baseline via Workflow 4, hand off to experiment-pipeline.
- literature-review: The paper notes and literature tree from Stage 3-4 serve as input for literature-review skill's formal write-up.
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
paper-writing
Guides writing academic papers section by section using an 11-step workflow with LaTeX templates and counterintuitive writing tactics. Covers Abstract, Introduction, Method, Experiments, Related Work, Conclusion, and Supplementary. Use when: user asks to write or draft a paper section, needs LaTeX templates, wants to improve academic writing quality, optimize novelty framing, or mentions 'write introduction', 'draft method', 'paper writing'. Do NOT use for pre-submission review (use paper-review), experiment execution (use experiment-pipeline), or paper planning/story design (use paper-planning).
evo-memory
Manages persistent research memory across ideation and experimentation cycles. Maintains two stores: Ideation Memory M_I (feasible/unsuccessful directions) and Experimentation Memory M_E (reusable strategies for data processing, model training, architecture, debugging). Three evolution mechanisms: IDE (after idea-tournament), IVE (after experiment failure — classifies failures as implementation vs fundamental), ESE (after experiment success — extracts reusable strategies). Use when: updating memory after completing idea tournaments or experiment pipelines, classifying why a method failed (implementation vs fundamental failure), starting a new research cycle needing prior knowledge, user mentions 'update memory', 'classify failure', 'what worked before', 'research history', 'evolution'. Do NOT use for running experiments (use experiment-pipeline), debugging experiment code (use experiment-craft), or generating ideas (use idea-tournament).
paper-review
Guides self-review of YOUR OWN academic paper before submission with adversarial stress-testing. Core method: 5-aspect checklist (contribution sufficiency, writing clarity, results quality, testing completeness, method design), counterintuitive protocol (reject-first simulation, delete unsupported claims, score trust, promote limitations, attack novelty), reverse-outlining, and figure/table quality checks. Use when: user wants to self-review or self-check their own paper draft before submission, stress-test their claims, prepare for reviewer criticism, or mentions 'self-review', 'check my draft', 'is my paper ready'. Do NOT use for writing a peer review of someone else's paper, and do NOT use after receiving actual reviews (use paper-rebuttal instead).
experiment-craft
Use this skill when the user wants to debug, diagnose, or systematically iterate on an experiment that already exists, or when they need a structured experiment log for tracking runs, hypotheses, failures, results, and next steps during active research. Apply it to underperforming methods, training that will not converge, regressions after a change, inconsistent results across datasets, aimless experimentation without progress, and questions like 'why doesn't this work?', 'no progress after many attempts', or 'how should I investigate this failure?'. Also use it for setting up practical experiment logging/record-keeping that supports debugging and iteration. Do not use it for designing a brand-new experiment pipeline or full experiment program (use experiment-pipeline), generating research ideas, fixing isolated coding/syntax errors, or writing retrospective summaries into research memory/notes/knowledge bases.
experiment-pipeline
Guides structured 4-stage experiment execution with attempt budgets and gate conditions: Stage 1 initial implementation (reproduce baseline), Stage 2 hyperparameter tuning, Stage 3 proposed method validation, Stage 4 ablation study. Integrates with evo-memory (load prior strategies, trigger IVE/ESE) and experiment-craft (5-step diagnostic on failure). Use when: user has a planned experiment, needs to reproduce baselines, organize experiment workflow, or systematically validate a method. Do NOT use for debugging a specific experiment failure (use experiment-craft) or designing which experiments to run (use paper-planning).
academic-slides
Use this skill for creating or refining an academic slide deck and the talk built around it: structuring a conference talk, thesis defense, lab meeting, or paper-to-slides deck; deciding the narrative arc and slide breakdown; improving slide design and visual hierarchy; planning rehearsal, timing, Q&A, and backup slides; or generating the .pptx. Reach for it when the user is shaping the presentation itself. Do not use for writing the paper, producing standalone speaker notes/scripts/transcripts, making posters, creating isolated figures/charts outside a slide deck, or building non-academic presentations.
Didn't find tool you were looking for?