Agent skill
unknown-kcns008-cluster-skills
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/development/unknown-kcns008-cluster-skills
SKILL.md
OSGrep - Semantic Code Search
Purpose: Provides fast, local, natural-language search over the repository using osgrep CLI for semantic code analysis without requiring MCP servers.
When to Use: When you need to find code patterns, understand implementations, locate features, or perform natural-language searches across the codebase.
Overview
OSGrep is a lightweight semantic grep tool that enables natural-language search over codebases. It uses local embeddings to understand code semantically, making it ideal for finding implementations, patterns, and features without exact string matching.
Unlike traditional grep, osgrep understands the meaning of your query and returns semantically relevant results, making it perfect for exploratory code analysis in cluster-code projects.
Installation & Setup
Initial Installation
npm install -g osgrep
Model Download (One-Time Setup)
osgrep setup
✅ GOOD: Run osgrep setup once after installation to download models
⚠️ NOTE: This is optional but significantly improves search performance
Verification
osgrep doctor # Verify installation and models
osgrep list # Show indexed repositories
Core Search Patterns
Basic Natural Language Search
# Find authentication validation logic
osgrep "how do we validate auth?"
# Locate pagination helpers
osgrep "pagination helper"
# Find error handling patterns
osgrep "error handling"
# Search for feature flags
osgrep "feature flags"
✅ GOOD: Use natural language queries that describe what you're looking for ❌ BAD: Don't use exact string matching patterns like traditional grep
Machine-Readable JSON Output
# Get structured JSON output for programmatic processing
osgrep --json "pagination helper"
# Save results to file for analysis
osgrep --json "authentication logic" > /tmp/osgrep-results.json
💡 TIP: Use --json when you need to process results programmatically or integrate with other tools
Controlling Search Depth
# Get more results per file (default varies)
osgrep "error handling" --per-file 3
# Get comprehensive results per file
osgrep "kubernetes integration" --per-file 5
Compact Output
# Show only file paths (no context)
osgrep "feature flags" --compact
✅ GOOD: Use --compact when you just need to know which files contain the pattern
💡 TIP: Combine with other tools like xargs for batch operations
Index Management
Keeping Index Fresh
# Re-index modified files before searching (recommended)
osgrep --sync "recent changes"
# Manual full re-index of the repository
osgrep index
⚠️ IMPORTANT: Always run from the repository root so auto-store detection works correctly
✅ BEST PRACTICE: Use --sync before important searches to ensure index is up-to-date
❌ BAD: Forgetting to sync after major code changes leads to stale results
Index Storage
OSGrep stores embeddings in ~/.osgrep/data/<store> by default.
# Clean slate - remove all indexed data for current repo
rm -rf ~/.osgrep/data/<store-name>
# Then re-index
osgrep index
Hot Daemon (Performance Optimization)
Starting the Daemon
# Start background daemon to keep embeddings warm
osgrep serve # Default port: 4444
# Custom port
osgrep serve --port 5000
✅ GOOD: Use daemon for repeated searches during active development sessions 💡 TIP: The CLI automatically falls back to on-demand mode if daemon isn't running
When to Use Daemon
- ✅ Active development sessions with frequent searches
- ✅ When working on large refactoring tasks
- ✅ During code review sessions
- ❌ One-off searches (daemon overhead not worth it)
- ❌ CI/CD pipelines (use direct mode)
Cluster-Code Specific Patterns
Finding Kubernetes Integration Points
# Locate kubectl client usage
osgrep "kubectl client initialization"
# Find pod management logic
osgrep "how do we manage pods?"
# Locate resource fetching patterns
osgrep "fetching kubernetes resources"
LLM Provider Integration
# Find LLM provider implementations
osgrep "LLM provider registration"
# Locate streaming response handlers
osgrep "streaming LLM responses"
Plugin System
# Find plugin registration logic
osgrep "plugin registration and lifecycle"
# Locate hook implementations
osgrep "command hooks implementation"
Error Handling & Logging
# Find error handling patterns
osgrep "error handling and recovery"
# Locate logging implementations
osgrep "structured logging patterns"
Integration with Claude Code Workflow
Exploratory Code Analysis
When exploring unfamiliar parts of cluster-code:
# 1. Get overview of a feature area
osgrep "authentication middleware" --per-file 2
# 2. Dive deeper into specific files
osgrep "JWT token validation" --per-file 5
# 3. Export for detailed analysis
osgrep --json "auth implementation" > /tmp/auth-analysis.json
Refactoring Support
# 1. Find all usages of a pattern before refactoring
osgrep "deprecated API calls" --json
# 2. Identify files needing updates
osgrep "old configuration pattern" --compact
# 3. Verify refactoring completeness
osgrep "old pattern that should be gone" # Should return no results
Code Review Assistance
# Find similar implementations for consistency
osgrep "similar validation logic"
# Locate test coverage patterns
osgrep "test cases for error handling"
# Check for security patterns
osgrep "input sanitization"
Best Practices
✅ DO
- Run from repo root: Ensures auto-store detection works correctly
- Use natural language: "how do we handle errors?" not "error.*catch"
- Sync before important searches:
osgrep --sync "query"ensures fresh results - Use --json for automation: Enables programmatic processing and integration
- Start daemon for dev sessions: Keeps embeddings warm for faster repeated searches
- Keep queries focused: Specific queries return more relevant results
❌ DON'T
- Mix with traditional grep syntax: osgrep uses semantic search, not regex
- Forget to index: Run
osgrep indexafter cloning or major changes - Search from subdirectories: Always run from repo root
- Over-rely on cache: Sync regularly when code changes frequently
- Use for exact string matching: Use traditional
greporrgfor that
⚠️ WARNINGS
- Stale Index: If results seem outdated, run
osgrep --sync "query"orosgrep index - Storage Growth: Index data accumulates in
~/.osgrep/data/- clean periodically - Model Download: First
osgrep setuprequires internet and downloads ~100MB - Port Conflicts: Default daemon port 4444 may conflict with other services
Common Workflows
1. Initial Repository Setup
cd /home/user/cluster-code
npm install -g osgrep
osgrep setup # One-time model download
osgrep index # Index the repository
osgrep serve & # Optional: start daemon
2. Daily Development
# Start of day: ensure fresh index
osgrep --sync "startup checks"
# During development: semantic searches
osgrep "how do we parse command line args?"
# Before commit: verify changes
osgrep --sync "new feature I just added"
3. Deep Code Investigation
# Step 1: Broad search
osgrep "authentication flow" --per-file 2 > /tmp/auth-overview.txt
# Step 2: Narrow down
osgrep "JWT token generation" --per-file 5 --json > /tmp/jwt-details.json
# Step 3: Analyze specific files (open in editor)
4. Troubleshooting
# Verify installation
osgrep doctor
# Check what's indexed
osgrep list
# Fresh start
rm -rf ~/.osgrep/data/cluster-code
osgrep index
Performance Tips
Optimize Search Speed
- Use daemon for repeated searches:
osgrep serve & - Limit results: Use
--per-file 1for quick overview - Use --compact: When you only need file paths
- Keep index fresh: Regular
--syncis faster than full re-index
Manage Storage
# Check index size
du -sh ~/.osgrep/data/*
# Clean old repositories
osgrep list
rm -rf ~/.osgrep/data/<unused-repo>
Troubleshooting
Common Issues
Problem: osgrep: command not found
# Solution: Install globally
npm install -g osgrep
Problem: Slow first search
# Solution: Run setup to download models
osgrep setup
Problem: Outdated results
# Solution: Sync or re-index
osgrep --sync "query"
# OR
osgrep index
Problem: Daemon won't start (port conflict)
# Solution: Use different port
osgrep serve --port 5000
Problem: No results for recent code
# Solution: Ensure running from repo root and re-index
cd /home/user/cluster-code
osgrep index
Comparison with Other Tools
OSGrep vs Traditional Grep
| Feature | osgrep | grep/rg |
|---|---|---|
| Search Type | Semantic (meaning-based) | Literal (string/regex) |
| Query Style | Natural language | Regex patterns |
| Setup | Requires indexing | None |
| Speed (indexed) | Very fast | Fast |
| Best For | Exploratory, conceptual | Exact matches |
✅ USE OSGREP when:
- You don't know exact variable/function names
- Exploring unfamiliar code
- Finding conceptually similar code
- Natural language queries work better
✅ USE GREP/RG when:
- You need exact string matches
- Working with regex patterns
- No setup time available
- Searching non-code files
Resources
Official Documentation
- OSGrep GitHub: Check
npm info osgrepfor repository link - CLI Help:
osgrep --help - Diagnostics:
osgrep doctor
Related Cluster-Code Skills
- cluster-dev-guidelines.md: TypeScript development patterns
- skill-developer.md: Creating new skills
Integration Points
OSGrep works seamlessly with:
- Claude Code's exploration workflows
- Traditional grep/rg for hybrid searches
- Your editor's "open file" capabilities
- CI/CD pipelines (direct mode without daemon)
Quick Reference
# Essential Commands
osgrep "natural language query" # Basic search
osgrep --json "query" # JSON output
osgrep --sync "query" # Sync before search
osgrep "query" --per-file 3 # More results per file
osgrep "query" --compact # File paths only
# Index Management
osgrep index # Full re-index
osgrep list # Show indexed repos
osgrep doctor # Verify setup
# Daemon
osgrep serve # Start daemon (port 4444)
osgrep serve --port 5000 # Custom port
# Cleanup
rm -rf ~/.osgrep/data/<store> # Remove index
Last Updated: 2025-11-25 Skill Version: 1.0.0 Compatible with: cluster-code v0.x - v2.x
Didn't find tool you were looking for?