Sponsored by

Find leads on Reddit on auto pilot

Agent skills
codebase-exploration

Agent skill

codebase-exploration

Explore and understand codebases using SocratiCode semantic search, dependency graphs, and context artifacts. Use when exploring code, understanding architecture, finding functions/types, analyzing dependencies, searching database schemas or API specs, or when socraticode/codebase_search tools are available. Activates when the user asks about code structure, wants to find where a feature lives, or needs to understand how code is organized.

View SKILL.md on GitHub Repository

Stars 808

Forks 110

Install this agent skill to your Project

npx add-skill https://github.com/giancarloerra/SocratiCode/tree/main/skills/codebase-exploration

SKILL.md

SocratiCode Codebase Exploration

Use SocratiCode MCP tools to explore codebases efficiently. The core principle: search before reading — the index gives you a map of the codebase in milliseconds; raw file reading is expensive and context-consuming.

Workflow

1. Start most explorations with `codebase_search`

Hybrid semantic + keyword search (vector + BM25, RRF-fused) runs in a single call.

Broad queries for orientation: "how is authentication handled", "database connection setup", "error handling patterns"
Precise queries for symbol lookup: exact function names, constants, type names
Prefer search results to infer which files to read — do not speculatively open files
Use fileFilter to narrow to a specific file path, languageFilter for a specific language
Adjust minScore (default 0.10) for precision vs recall — lower for more results, higher for stricter matching

When to use grep instead: If you already know the exact identifier, error string, or regex pattern, grep/ripgrep is faster and more precise — no semantic gap to bridge. Use codebase_search when exploring, asking conceptual questions, or when you don't know which files to look in.

2. Follow the graph before following imports

Use codebase_graph_query to see what a file imports and what depends on it before diving into its contents. This prevents unnecessary reading of transitive dependencies.

codebase_graph_query — imports and dependents for any file (pass relative path)
codebase_graph_stats — architecture overview: total files, edges, most connected files, orphans, language breakdown
codebase_graph_circular — find circular dependencies (these cause subtle runtime bugs; check proactively when debugging unexpected behavior)
codebase_graph_visualize — Mermaid diagram color-coded by language, circular deps highlighted in red

The graph is auto-built after indexing. Use codebase_graph_status to check if the graph is ready.

3. Read files only after narrowing via search

Once search results clearly point to 1-3 files, read only the relevant sections. Never read a file just to find out if it's relevant — search first.

A single codebase_search call returns ranked, deduplicated snippets from across the entire codebase in milliseconds. This gives you a broad map at negligible token cost — far cheaper than opening files speculatively.

4. Leverage context artifacts for non-code knowledge

Projects can define a .socraticodecontextartifacts.json config to expose database schemas, API specs, infrastructure configs, architecture docs, and other project knowledge that lives outside source code.

codebase_context — list available artifacts (names, descriptions, paths, index status)
codebase_context_search — semantic search across all artifacts (or filter with artifactName)
Artifacts are auto-indexed on first search and auto-detect staleness

Run codebase_context early to see what's available. Use codebase_context_search before asking about database structure, API contracts, or infrastructure.

5. Check status if something seems wrong

codebase_status — check index status, progress, watcher state, graph status
If search returns no results, the project may not be indexed yet
If the watcher is inactive, results may be stale — run codebase_update or start the watcher

6. Get an overview of all tools

codebase_about — quick reference of all SocratiCode tools and a typical workflow

Goal → Tool Quick Reference

Goal	Tool
Understand what a codebase does / where a feature lives	`codebase_search` (broad query)
Find a specific function, constant, or type	`codebase_search` (exact name) or grep
Find exact error messages, log strings, or regex patterns	grep / ripgrep
See what a file imports or what depends on it	`codebase_graph_query`
Get architecture overview (files, edges, most connected)	`codebase_graph_stats`
Spot circular dependencies	`codebase_graph_circular`
Visualize module structure	`codebase_graph_visualize`
Check graph build status	`codebase_graph_status`
Verify index is up to date	`codebase_status`
Discover available schemas, specs, configs	`codebase_context`
Find database tables, API endpoints, infra configs	`codebase_context_search`
Quick overview of all tools	`codebase_about`

For full parameter details on every tool, see references/tool-reference.md.

Maintainer

giancarloerra Core maintainer

Source details

Full Name: giancarloerra/SocratiCode
Branch: main
Path in repo: skills/codebase-exploration
License: GNU Affero General Public License v3.0
Topics: ai claude mcp openai gemini code-graph docker qdrant semantic-search vector-search ai-assistant embeddings vector-database ollama ast codebase-analysis codebase-intelligence semantic vector-embeddings vectorization

Featured Tools

Join Our Newsletter

Stay updated with the latest AI tools, news, and offers by subscribing to our weekly newsletter.

Recommended Agent Skills

Expand your agent's capabilities with these related and highly-rated skills.

giancarloerra/SocratiCode

codebase-management

Set up, index, and manage SocratiCode codebase indexing. Use when the user wants to index a project, check infrastructure health, start/stop file watching, configure context artifacts, troubleshoot indexing issues, manage the code graph, or any SocratiCode administrative task. Activates when the user mentions indexing, setting up search, SocratiCode infrastructure, or managing the codebase index.

davila7/claude-code-templates

verl-rl-training

Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when implementing RLHF, GRPO, PPO, or other RL algorithms for LLM post-training at scale with flexible infrastructure backends.

davila7/claude-code-templates

openrlhf-training

High-performance RLHF framework with Ray+vLLM acceleration. Use for PPO, GRPO, RLOO, DPO training of large models (7B-70B+). Built on Ray, vLLM, ZeRO-3. 2× faster than DeepSpeedChat with distributed architecture and GPU resource sharing.

davila7/claude-code-templates

gguf-quantization

GGUF format and llama.cpp quantization for efficient CPU/GPU inference. Use when deploying models on consumer hardware, Apple Silicon, or when needing flexible quantization from 2-8 bit without GPU requirements.

davila7/claude-code-templates

Claude Code Guide

Master guide for using Claude Code effectively. Includes configuration templates, prompting strategies "Thinking" keywords, debugging techniques, and best practices for interacting with the agent.

davila7/claude-code-templates

qdrant-vector-search

High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems requiring fast nearest neighbor search, hybrid search with filtering, or scalable vector storage with Rust-powered performance.

Didn't find tool you were looking for?