Agent skill
code-clone-assistant
Detects and refactors code duplication using PMD CPD. Use when identifying code clones, addressing DRY violations, or refactoring duplicate code across repositories.
Install this agent skill in your project:
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/development/code-clone-assistant-terrylica-cc-skills
SKILL.md
Code Clone Assistant
Detect code clones and guide refactoring using PMD CPD (exact duplicates) + Semgrep (patterns).
Tools
- PMD CPD v7.17.0+: Exact duplicate detection
- Semgrep v1.140.0+: Pattern-based detection
Tested: October 2025 (30 violations detected across 3 sample files)
Coverage: ~3x more violations than using either tool alone
When to Use
Triggers: "find duplicate code", "DRY violations", "refactor similar code", "detect code duplication", "similar validation logic", "repeated patterns", "copy-paste code", "exact duplicates"
Why Two Tools?
PMD CPD and Semgrep detect different clone types:
| Aspect | PMD CPD | Semgrep |
|---|---|---|
| Detects | Exact copy-paste duplicates | Similar patterns with variations |
| Scope | ✅ Across files | Within a file; across files with Semgrep Pro only |
| Matching | Token-based (ignores formatting) | Pattern-based (AST matching) |
| Rules | ❌ No custom rules | ✅ Custom rules |
Result: Using both finds ~3x more DRY violations than using either tool alone.
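To make "token-based (ignores formatting)" concrete, here is a minimal sketch using Python's standard `tokenize` module. It is an illustration of the general idea only, not PMD CPD's actual implementation; the helper name `code_tokens` and the sample snippets are hypothetical.

```python
import io
import tokenize

# Token types that carry layout or comments rather than code content.
_LAYOUT = {
    tokenize.COMMENT,
    tokenize.NL,
    tokenize.NEWLINE,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def code_tokens(source: str) -> list[str]:
    """Return the token strings of `source`, dropping layout and comments."""
    reader = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(reader)
        if tok.type not in _LAYOUT
    ]


# Hypothetical snippet and a reformatted, commented copy of it.
original = "def total(items):\n    return sum(i.price for i in items)\n"
reformatted = (
    "def total( items ):  # copied from elsewhere\n"
    "\n"
    "    return sum(\n"
    "        i.price for i in items\n"
    "    )\n"
)

# Both snippets reduce to the same token sequence despite the layout changes.
same_tokens = code_tokens(original) == code_tokens(reformatted)
print(same_tokens)
```

A text diff would flag almost every line of the second snippet as different, while the token sequences are identical. That is why a token-based detector can still report reformatted copy-paste as a duplicate.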
Clone Types
| Type | Description | PMD CPD | Semgrep |
|---|---|---|---|
| Type-1 | Exact copies | ✅ Default | ✅ |
| Type-2 | Renamed identifiers | ✅ `--ignore-*` | ✅ |
| Type-3 | Near-miss with variations | ⚠️ Partial | ✅ Patterns |
| Type-4 | Semantic clones (same behavior) | ❌ | ❌ |
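As a rough illustration of the first three clone types, the sketch below shows one hypothetical validation helper and three copies of it. All function names are invented for this example, and whether a given tool reports each pair depends on its settings (for instance the minimum token count).

```python
def normalize_email(email):
    if "@" not in email:
        raise ValueError("invalid address")
    return email.strip().lower()


# Type-1: the body is an exact copy of normalize_email.
def normalize_contact(email):
    if "@" not in email:
        raise ValueError("invalid address")
    return email.strip().lower()


# Type-2: same structure, identifier renamed (email -> addr).
def normalize_sender(addr):
    if "@" not in addr:
        raise ValueError("invalid address")
    return addr.strip().lower()


# Type-3: near-miss copy with one statement added.
def normalize_recipient(addr):
    if "@" not in addr:
        raise ValueError("invalid address")
    addr = addr.replace(" ", "")
    return addr.strip().lower()


for fn in (normalize_email, normalize_contact, normalize_sender, normalize_recipient):
    print(fn.__name__, fn(" A@B.com "))
```

All four functions are candidates for consolidation into a single shared helper, which is the kind of refactoring this skill is meant to guide.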
Quick Start Workflow
```bash
# Step 1: Detect exact duplicates (PMD CPD)
pmd cpd -d . -l python --minimum-tokens 20 -f markdown > pmd-results.md

# Step 2: Detect pattern violations (Semgrep)
semgrep --config=clone-rules.yaml --sarif --quiet > semgrep-results.sarif

# Step 3: Analyze combined results (Claude Code)
# Parse both outputs, prioritize by severity

# Step 4: Refactor (Claude Code with user approval)
# Extract shared functions, consolidate patterns, verify tests
```
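Step 3 could look roughly like the following for the Semgrep half of the results. This is a sketch under assumptions, not the skill's actual analysis code: it relies only on the standard SARIF 2.1.0 layout (`runs[].results[]` with `ruleId`, `level`, `message.text` and `locations`), treats a missing `level` as `warning`, and uses hypothetical rule IDs and file names in the sample data.

```python
# SARIF result levels, most severe first.
_RANK = {"error": 0, "warning": 1, "note": 2, "none": 3}


def load_findings(sarif: dict) -> list[dict]:
    """Flatten SARIF results into dicts sorted by severity, file, and line."""
    findings = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Use the first reported location; tolerate results without one.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            findings.append({
                "rule": result.get("ruleId", "unknown"),
                "level": result.get("level", "warning"),  # assumption: default
                "file": physical.get("artifactLocation", {}).get("uri", ""),
                "line": physical.get("region", {}).get("startLine", 0),
                "message": result.get("message", {}).get("text", ""),
            })
    return sorted(
        findings,
        key=lambda f: (_RANK.get(f["level"], 1), f["file"], f["line"]),
    )


# Hypothetical sample shaped like a SARIF 2.1.0 document.
sample = {
    "version": "2.1.0",
    "runs": [{"results": [
        {
            "ruleId": "duplicate-validation",
            "level": "warning",
            "message": {"text": "Repeated validation logic"},
            "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": "app/users.py"},
                "region": {"startLine": 12},
            }}],
        },
        {
            "ruleId": "duplicate-error-handling",
            "level": "error",
            "message": {"text": "Repeated error handling"},
            "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": "app/orders.py"},
                "region": {"startLine": 40},
            }}],
        },
    ]}],
}

ranked = load_findings(sample)
for finding in ranked:
    print(f'{finding["level"]:8} {finding["file"]}:{finding["line"]} {finding["rule"]}')
```

To run this against the real output of Step 2, load the file with `json.load(open("semgrep-results.sarif"))` and pass the resulting dict to `load_findings`. The PMD CPD report is written as Markdown in Step 1, so it would need separate handling.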
Reference Documentation
For detailed information, see:
- Detection Commands - PMD CPD and Semgrep command details
- Complete Workflow - Detection, analysis, and presentation phases
- Refactoring Strategies - Approaches for addressing violations