improve-prompt

Your task is to analyze prompts systematically and improve them using proven techniques. You balance practical improvements with avoiding over-engineering. You are decisive, specific, and focused on measurable quality gains.

Audience: The improved prompts and analysis will be read by:

The original prompt author (who needs to understand what changed and why)
Other developers who may maintain or modify the prompt
Non-technical stakeholders who may review prompt changes

Background: This methodology is based on:

Anthropic's official prompt engineering documentation (2024-2025)
Techniques validated for Claude 3/Claude 3.5/Claude 4 model families
Common failure patterns observed in production prompts

Success Criteria: An improved prompt is successful when:

Claude's output quality measurably increases
The prompt is clearer to humans reading it
Each improvement can be explained in 1-2 sentences
Length increase is proportional to quality gain (aim for <3x original length)

Analyze a prompt and produce an improved version based on Anthropic's prompt engineering best practices.

Process Overview

Extract - Isolate the prompt from its source (code, markdown, etc.)
Analyze - Run automated checks + apply judgment checklist with visible reasoning
Improve - Rewrite the prompt with all relevant techniques
Document - Save improved prompt and summary of changes

CRITICAL FIRST STEP: Before any analysis, extract the raw prompt into a temporary file.

Prompts may be embedded in:

Python files (as strings, f-strings, triple-quoted strings)
JavaScript/TypeScript (template literals, string variables)
Markdown files (mixed with documentation)
Config files (YAML, JSON)
Direct user input

Actions:

Read the source file: {{source_file_path}}
Identify the prompt boundaries (look for string delimiters, variable assignments, etc.)
Extract ONLY the prompt text itself (not the code around it)
Save to: {{temp_prompt_file}} (default: /tmp/prompt_to_analyze.txt)

bash

# Create temp file for extracted prompt
# Path: {{temp_prompt_file}}

If the prompt is already standalone text, skip extraction.

Step 1: Run Automated Checks

Execute the Python check scripts on the extracted prompt:

bash

# Check for XML tags
python3 {{scripts_dir}}/checks/xml_tags.py {{temp_prompt_file}}

# Check for template variables
python3 {{scripts_dir}}/checks/variables.py {{temp_prompt_file}}

Note the results - they inform but don't dictate decisions.

Step 2: Apply Judgment Checklist

Read CHECKLIST.md and work through each item systematically.

CRITICAL: You must output your reasoning as you evaluate each technique.

Use this thinking structure:

Decision: [Add XML tags / Keep as-is / Modify existing] Rationale: [1-2 sentences explaining why]

2. Variables Analysis

Script result: [What the script found] Current state: [Describe what you observe] Criteria evaluation:

Is this a template for reuse: [yes/no]
Are there dynamic parts: [yes/no - what are they?]
Any placeholder descriptions: [yes/no]

Decision: [Add variables / Keep as-is / Modify existing] Rationale: [1-2 sentences explaining why]

3. Role/Persona Analysis

[Continue same pattern for all 10 checklist items...]

4. Examples Analysis

...

5. Chain of Thought Analysis

...

6. Clarity Analysis

...

7. Output Format Analysis

...

8. Constraints Analysis

...

9. Context Analysis

...

10. Task Decomposition Analysis

...

Overall Assessment

Confidence level: [High/Medium/Low] Primary improvements needed:

[Most critical improvement]
[Second priority]
[Third priority]

Techniques to skip:

[Technique]: [Why it's not needed for this prompt]

<improvement_plan> Based on the above analysis, I will:

[Specific action with concrete details]
[Specific action with concrete details]
[Specific action with concrete details] </improvement_plan>

Important: Your thinking must be visible in the output. This ensures thorough analysis and makes your decisions transparent and reviewable.

Checklist Summary

Structure Checks

XML Tags: Would structure help this prompt? (Script shows current state)
Variables: Are there dynamic parts? (Script shows current state)

Content Checks

Role/Persona: Does this task benefit from expertise framing?
Examples: Would input/output examples clarify expectations?
Chain of Thought: Is this a reasoning task that needs step-by-step?

Quality Checks

Clarity: Are there vague terms to make specific?
Output Format: Is the expected format explicitly stated?
Constraints: Are boundaries and exclusions clear?
Context: Is necessary background provided?

Architecture Check

Task Decomposition: Should this be split into multiple prompts?

Step 3: Write the Improved Prompt

Create the improved version applying all relevant techniques identified in your <improvement_plan>.

Step 4: Save Output Files

Create two files in {{output_directory}} (default: same directory as source prompt):

File 1: `{{output_prefix}}_improved.md`

markdown

# Improved Prompt

## Original Prompt
[The original prompt text]

## Improved Version
[The fully rewritten prompt with all improvements]

## Usage Notes
[Any notes about variables, how to use, etc.]

File 2: `{{output_prefix}}_summary.md`

markdown

# Prompt Improvement Summary

## Automated Check Results
- **XML Tags**: [Found X tags / None found]
- **Variables**: [Found X variables / None found]

## Changes Made

### Added
- [What was added and why]

### Modified
- [What was changed and why]

### Removed
- [What was removed and why, if anything]

## Techniques Applied
- [Technique 1]: [Brief rationale]
- [Technique 2]: [Brief rationale]

## Techniques Considered but Skipped
- [Technique]: [Why it wasn't needed]

## Confidence
[High/Medium/Low] - [Brief explanation]

What You MUST NOT Do

Skip the extraction step for embedded prompts
Make changes without reasoning through the checklist first
Add techniques solely to make the prompt longer
Over-engineer prompts that are already effective
Add examples to simple, unambiguous tasks
Add chain of thought to straightforward formatting tasks
Guess at improvements without evidence from the checklist analysis

Scope Boundaries

Focus ONLY on the prompt text itself (not surrounding code architecture)
Improve the prompt, not the task it describes
Evaluate based on Anthropic's documented best practices, not personal preference
Consider the specific use case, not generic "best practices"

Length Guidelines

Improved prompts: typically 150-300% of original length (not 500%+)
_summary.md: 200-400 words
Each technique rationale: 1-2 sentences maximum

Quality Standards

Be decisive: make clear yes/no decisions, not "maybe" or "it depends" without resolution
Be specific: every vague term should become concrete
Be concise: don't add words that don't add value
Be practical: improvements should make the prompt more effective, not just longer
Be respectful: explain changes constructively, don't criticize the original

Variable	Default	Description
`{{source_file_path}}`	(user provided)	Path to file containing the prompt
`{{temp_prompt_file}}`	`/tmp/prompt_to_analyze.txt`	Temporary file for extracted prompt
`{{scripts_dir}}`	`scripts`	Directory containing check scripts
`{{output_directory}}`	(same as source)	Where to save output files
`{{output_prefix}}`	`improved_prompt`	Prefix for output filenames

CHECKLIST.md - Full evaluation checklist with decision criteria
reference/techniques.md - Prompt engineering techniques reference

Example 1: Simple Prompt Enhancement

Before:

Write a good summary of this article.

After:

xml

<instructions>
Summarize the following article in 2-3 paragraphs (150-200 words total).

Focus on:
- The main argument or thesis
- Key supporting evidence
- The conclusion or implications

Write for a general audience. Avoid jargon.
</instructions>

<article>
{{article_content}}
</article>

<output_format>
Structure your summary as:
1. Opening paragraph: Main thesis and context
2. Body paragraph: Key evidence and arguments
3. Closing: Conclusions and significance
</output_format>

Techniques applied:

Added XML structure (<instructions>, <article>, <output_format>)
Specified word count (150-200 words)
Defined focus areas (thesis, evidence, conclusion)
Added output structure with numbered sections
Made variable explicit ({{article_content}})
Added audience and tone guidance

Example 2: Code Review Prompt

Before:

Review this Python code and tell me if there are any issues.

```python
{{user_code}}


**After:**
```xml
<role>
You are a senior Python developer conducting a code review. Focus on practical issues that affect production reliability.
</role>

<instructions>
Review the following Python code for:
1. Security vulnerabilities (SQL injection, XSS, hardcoded secrets)
2. Performance issues (O(n²) algorithms, memory leaks, unnecessary I/O)
3. Code quality (PEP 8 compliance, missing type hints, unclear naming)
4. Logic errors or unhandled edge cases

For each issue found, provide:
- **Severity**: Critical / High / Medium / Low
- **Line number(s)**: Where the issue occurs
- **Description**: What the problem is
- **Fix**: Specific recommendation
</instructions>

<code language="python">
{{user_code}}
</code>

<output_format>
## Security Issues
[List or "None found"]

## Performance Issues
[List or "None found"]

## Code Quality Issues
[List or "None found"]

## Logic/Edge Case Issues
[List or "None found"]

## Summary
- **Total issues**: X
- **Critical/High priority**: Y
- **Overall assessment**: [Brief 1-2 sentence verdict]
</output_format>

<constraints>
- Focus on issues that matter in production, not style nitpicks
- If code looks solid, say so clearly rather than inventing problems
- Limit to top 10 issues if many are found
</constraints>

Techniques applied:

Added expert role (senior Python developer)
Structured categories for review focus
Specified output format with clear sections
Added severity classification
Included constraints to prevent over-critique
Added language attribute to code block

Example 3: Already Good Prompt (No Changes Needed)

Before:

xml

<instructions>
Analyze the sentiment of this customer review. Classify as positive, negative, or neutral.
Provide a confidence score (0-100).
</instructions>

<review>
{{customer_review}}
</review>

<output_format>
{
  "sentiment": "positive|negative|neutral",
  "confidence": 85,
  "reasoning": "Brief explanation"
}
</output_format>

After: No changes needed.

Why this prompt is already effective:

Clear XML structure separating instructions, input, and output format
Specific output format (JSON with exact keys)
Explicit classification options (positive/negative/neutral)
Quantified confidence (0-100 scale)
Variable for dynamic content ({{customer_review}})
Reasoning requirement for transparency

Lesson: Not every prompt needs improvement. Recognize when a prompt already follows best practices and avoid adding complexity that doesn't add value.

Example 4: Over-Engineering Warning

Before:

Translate this to Spanish: {{text}}

What NOT to do (over-engineered):

xml

<role>You are a professional translator specializing in English-to-Spanish translation with 20 years of experience.</role>
<instructions>
<task>Translate the following text from English to Spanish.</task>
<guidelines>
<step1>Read the input text carefully to understand context</step1>
<step2>Consider cultural nuances and regional variations</step2>
<step3>Choose appropriate register (formal/informal)</step3>
<step4>Produce the translation maintaining original meaning</step4>
<step5>Review for accuracy and naturalness</step5>
</guidelines>
</instructions>
<input>{{text}}</input>
<constraints>
<constraint>Maintain original meaning exactly</constraint>
<constraint>Use formal register unless informal is clearly appropriate</constraint>
<constraint>Preserve all formatting</constraint>
<constraint>Do not add explanations unless asked</constraint>
</constraints>
<output_format>Provide only the translated text.</output_format>

What TO do (appropriate improvement):

Translate the following text to Spanish (Spain, formal register):

{{text}}

Provide only the translation, no explanations.

Lesson: For simple, well-defined tasks, keep it simple. The over-engineered version adds 200+ words but provides no meaningful improvement for a straightforward translation. Only add structure when it adds value.

Example 5: Task Decomposition Recommendation

Before:

Analyze this research paper. First summarize it, then list the methodology flaws, then suggest improvements, then write an abstract, then create a citation in APA format.

After: Recommend splitting into chained prompts.

Why decompose:

5 distinct tasks requiring different focus
Output would be very long and hard to review
Each task benefits from full attention
Easier to iterate on individual components

Recommended chain:

Prompt 1: Summary

xml

<instructions>
Summarize this research paper in 200-300 words, covering:
- Research question
- Methodology
- Key findings
- Conclusions
</instructions>

<paper>{{paper_content}}</paper>

Prompt 2: Methodology Critique (uses output from Prompt 1)

xml

<instructions>
Review the methodology of this research paper. Identify:
- 3-5 potential flaws or limitations
- Severity of each (major/minor)
- Impact on conclusions
</instructions>

<paper>{{paper_content}}</paper>
<summary>{{summary_from_prompt_1}}</summary>

Prompt 3: Improvements (uses outputs from Prompts 1-2) ...and so on.

Lesson: When a prompt tries to do too much, break it into focused steps. Each prompt in the chain gets Claude's full attention, improving quality across all outputs.

Search AI Tools

Install this agent skill to your Project

SKILL.md

Process Overview

Step 1: Run Automated Checks

Step 2: Apply Judgment Checklist

2. Variables Analysis

3. Role/Persona Analysis

4. Examples Analysis

5. Chain of Thought Analysis

6. Clarity Analysis

7. Output Format Analysis

8. Constraints Analysis

9. Context Analysis

10. Task Decomposition Analysis

Overall Assessment

Checklist Summary

Structure Checks

Content Checks

Quality Checks

Architecture Check

Step 3: Write the Improved Prompt

Step 4: Save Output Files

File 1: `{{output_prefix}}_improved.md`

File 2: `{{output_prefix}}_summary.md`

What You MUST NOT Do

Scope Boundaries

Length Guidelines

Quality Standards

Example 1: Simple Prompt Enhancement

Example 2: Code Review Prompt

Example 3: Already Good Prompt (No Changes Needed)

Example 4: Over-Engineering Warning

Example 5: Task Decomposition Recommendation

Search AI Tools

Install this agent skill to your Project

SKILL.md

Process Overview

Step 1: Run Automated Checks

Step 2: Apply Judgment Checklist

2. Variables Analysis

3. Role/Persona Analysis

4. Examples Analysis

5. Chain of Thought Analysis

6. Clarity Analysis

7. Output Format Analysis

8. Constraints Analysis

9. Context Analysis

10. Task Decomposition Analysis

Overall Assessment

Checklist Summary

Structure Checks

Content Checks

Quality Checks

Architecture Check

Step 3: Write the Improved Prompt

Step 4: Save Output Files

File 1: {{output_prefix}}_improved.md

File 2: {{output_prefix}}_summary.md

What You MUST NOT Do

Scope Boundaries

Length Guidelines

Quality Standards

Example 1: Simple Prompt Enhancement

Example 2: Code Review Prompt

Example 3: Already Good Prompt (No Changes Needed)

Example 4: Over-Engineering Warning

Example 5: Task Decomposition Recommendation

File 1: `{{output_prefix}}_improved.md`

File 2: `{{output_prefix}}_summary.md`