Agent skill

sql-query-optimization

Stars 109
Forks 14

Install this agent skill to your Project

npx add-skill https://github.com/secondsky/claude-skills/tree/main/plugins/sql-query-optimization/skills/sql-query-optimization

SKILL.md

SQL Query Optimization

Status: Production Ready ✅ Last Updated: 2025-12-15 Latest Versions: PostgreSQL 17, MySQL 8.4 Dependencies: None


Quick Start (10 Minutes)

1. Identify Slow Query

sql
-- PostgreSQL: Enable pg_stat_statements
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find slowest queries
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

2. Analyze with EXPLAIN

sql
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM orders WHERE user_id = 123;

-- Look for:
-- - Seq Scan on large tables → needs index
-- - High "Rows Removed by Filter" → poor selectivity
-- - Temp read/written → increase work_mem

3. Create Index

sql
-- Add missing index
CREATE INDEX CONCURRENTLY idx_orders_user
ON orders(user_id);

-- Verify improvement
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 123;
-- Execution time should drop 10-100x

Critical Rules

Always Do ✓

Rule Why Example
Index foreign keys JOINs need indexed columns CREATE INDEX idx_orders_user ON orders(user_id)
Use EXPLAIN ANALYZE before production Verify query plan is optimal EXPLAIN (ANALYZE, BUFFERS) <query>
Select specific columns Reduces data transfer 90% SELECT id, name FROM users not SELECT *
Add LIMIT to unbounded queries Prevents memory exhaustion SELECT * FROM logs ORDER BY id LIMIT 100
Use prepared statements Prevents SQL injection + faster db.query('SELECT * FROM users WHERE id = $1', [id])
Run ANALYZE after bulk operations Updates query planner statistics ANALYZE table_name
Monitor pg_stat_statements Track query performance over time Review daily for regressions
Use connection pooling Reduces connection overhead 10x new Pool({ max: 20 })

Never Do ✗

Anti-Pattern Problem Fix
SELECT * in production Fetches unnecessary columns Select specific columns only
Leading wildcard LIKE '%term%' Cannot use index Use full-text search instead
String concatenation for SQL SQL injection vulnerability Use parameterized queries
No LIMIT on large results Memory exhaustion Always add LIMIT + pagination
N+1 queries in loops Network latency × N Use JOIN or batch loading
Ignoring EXPLAIN output Deploy slow queries to production Always EXPLAIN before deploy
Multiple INSERTs in loop Slow bulk operations Use batch INSERT with multiple VALUES
OFFSET for pagination O(n) time, scans skipped rows Use cursor-based pagination

Top 7 Critical Errors

1. Sequential Scan on Large Table

Symptom: Seq Scan on orders (cost=0.00..150000.00) on 1M+ rows Cause: No index on filter column Fix: CREATE INDEX idx_orders_column ON orders(column) Impact: 10-100x faster

2. Missing Index on Foreign Key

Symptom: Slow JOINs (5+ seconds) Cause: Foreign key columns not indexed Fix: CREATE INDEX idx_orders_user_id ON orders(user_id) Impact: 50-500x faster JOINs

3. N+1 Query Problem

Symptom: 1 + N queries for N records Cause: ORM lazy loading in loop Fix: Use JOIN or eager loading: SELECT u.*, o.* FROM users u LEFT JOIN orders o ON u.id = o.user_id Impact: N queries → 1 query

4. Leading Wildcard LIKE

Symptom: WHERE name LIKE '%search%' sequential scan Cause: Index cannot match middle of string Fix: Use full-text search (GIN index) or trigrams Impact: 100-1000x faster

5. SELECT * in Production

Symptom: High network traffic, slow responses Cause: Fetches all 50 columns instead of needed 3 Fix: SELECT id, name, email (explicit column list) Impact: 90% less data transfer

6. Missing LIMIT on Large Results

Symptom: Server out of memory, query timeout Cause: Attempting to return 5M rows Fix: SELECT * FROM logs WHERE ... LIMIT 100 + pagination Impact: Constant memory usage

7. Stale Statistics After Bulk Load

Symptom: Wrong query plan chosen despite index Cause: PostgreSQL statistics outdated Fix: ANALYZE table_name after bulk operations Impact: Correct query plan selection

See references/error-catalog.md for all 12 errors with detailed solutions.


Common Patterns Summary

Pattern Use Case Example Performance
B-Tree Index Equality, range, sort queries CREATE INDEX idx ON t(col) Default, best general purpose
Composite Index Multi-column WHERE clauses CREATE INDEX idx ON t(c1, c2) 5-50x faster than single index
Covering Index Include all query columns CREATE INDEX idx ON t(c1) INCLUDE (c2) 2-10x faster (no heap fetch)
Partial Index Filter subset of rows CREATE INDEX idx ON t(c) WHERE status='active' 50-90% smaller index
JOIN Rewrite Replace IN subquery INNER JOIN users u ON o.user_id = u.id 5-20x faster than subquery
Batch INSERT Bulk data loading INSERT INTO t VALUES (..),(..) 10-100x faster than individual
Cursor Pagination Large offset performance WHERE id > last_id LIMIT 100 Constant time vs O(n)

Configuration Summary

PostgreSQL Config

sql
-- Increase work_mem for complex queries (reloadable - no restart needed)
SET work_mem = '256MB';

-- Increase shared_buffers for better caching (25% of RAM)
ALTER SYSTEM SET shared_buffers = '8GB';

-- IMPORTANT: shared_buffers requires a full PostgreSQL server restart!
-- This setting is NOT reloadable via pg_reload_conf()
--
-- To apply shared_buffers change:
-- 1. Stop PostgreSQL:   sudo systemctl stop postgresql
-- 2. Start PostgreSQL:  sudo systemctl start postgresql
-- OR use:              sudo systemctl restart postgresql
--
-- Verify the change took effect:
-- SHOW shared_buffers;

-- Enable auto-vacuum (reloadable - can use pg_reload_conf)
ALTER SYSTEM SET autovacuum = on;

-- Reload config (ONLY works for parameters that don't require restart)
-- This will NOT reload shared_buffers - restart required for that!
SELECT pg_reload_conf();

MySQL Config

ini
# my.cnf
[mysqld]
innodb_buffer_pool_size = 8G  # 70% of RAM
max_connections = 500
slow_query_log = 1
long_query_time = 1

When to Load References

Performance Analysis:

  • Load references/explain-analysis.md when: Reading EXPLAIN output, understanding query plans, analyzing buffer statistics, comparing PostgreSQL vs MySQL EXPLAIN
  • Load references/performance-monitoring.md when: Setting up monitoring, tracking slow queries over time, monitoring cache hit ratios, identifying bloated tables

Index Optimization:

  • Load references/index-strategies.md when: Choosing index type (B-Tree, GIN, GiST, Hash), creating composite indexes, determining column order, using covering indexes, implementing partial indexes, monitoring index usage

Query Optimization:

  • Load references/query-rewrites.md when: Rewriting slow queries, converting subqueries to JOINs, eliminating N+1 queries, implementing pagination, optimizing LIKE queries, batching operations

Systematic Process:

  • Load references/optimization-workflow.md when: Following step-by-step optimization process, creating optimization hypothesis, measuring improvements, monitoring long-term performance

Error Resolution:

  • Load references/error-catalog.md when: Debugging specific errors (sequential scans, missing indexes, N+1 queries, etc.), understanding root causes, implementing verified solutions

Using Bundled Resources

Templates (Copy-Paste SQL)

bash
# EXPLAIN query templates
templates/explain-query.sql

# Index creation patterns
templates/index-examples.sql

# Query rewrite examples
templates/query-rewrites.sql

# Monitoring queries
templates/monitoring-queries.sql

References (Deep Dives)

bash
# Comprehensive guides
references/error-catalog.md              # All 12 errors + solutions
references/explain-analysis.md           # Reading query plans
references/index-strategies.md           # Index types & selection
references/query-rewrites.md             # Before/after optimizations
references/performance-monitoring.md     # Long-term monitoring
references/optimization-workflow.md      # Systematic process

Dependencies

PostgreSQL Extensions:

  • pg_stat_statements - Query performance tracking (built-in)
  • pg_trgm - Trigram similarity search (optional, for fuzzy matching)

MySQL:

  • performance_schema - Performance monitoring (enabled by default in 8.0+)

No additional dependencies required.


Known Issues Prevention

Issue Symptom Prevention
Sequential scans Seq Scan on 1M+ rows Index filter columns before production
Missing FK indexes Slow JOINs Always index foreign keys
N+1 queries 1+N database calls Use JOIN or eager loading
Leading wildcards LIKE '%x%' slow Use full-text search (GIN)
SELECT * bloat High network traffic Select specific columns
No LIMIT Memory exhaustion Always LIMIT unbounded queries
Stale statistics Wrong query plans ANALYZE after bulk operations
Wrong index order Index exists but not used Match query pattern
Missing composite Multiple WHERE slow Create composite index
No connection pool High latency Implement pooling (20-50 connections)
SQL injection Security vulnerability Use prepared statements only
Temp spills Disk I/O on sorts Increase work_mem

Complete Setup Checklist

Production Deployment:

  • Enable pg_stat_statements or performance_schema
  • Index all foreign key columns
  • Index columns in WHERE, JOIN, ORDER BY clauses
  • Replace SELECT * with specific columns
  • Add LIMIT to all unbounded queries
  • Use prepared statements (parameterized queries)
  • Implement connection pooling (20-50 connections)
  • Configure work_mem (256MB-1GB per connection)
  • Configure shared_buffers (25% of RAM for PostgreSQL)
  • Enable slow query logging (threshold: 100-1000ms)
  • Run EXPLAIN ANALYZE on all critical queries
  • Set up daily monitoring of pg_stat_statements
  • Schedule ANALYZE after nightly bulk operations
  • Monitor cache hit ratio (target: >99%)
  • Review and drop unused indexes monthly

Production Example

Before Optimization:

sql
-- Query: Fetch user orders
SELECT * FROM orders WHERE user_id = 123;

-- Performance:
-- Execution time: 2500ms
-- Seq Scan on orders (1M rows scanned)
-- Network: 50MB transferred
-- No index on user_id

After Optimization:

sql
-- Add index
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);

-- Optimize query
SELECT id, total, status, created_at
FROM orders
WHERE user_id = 123
LIMIT 100;

-- Performance:
-- Execution time: 12ms (208x faster!)
-- Index Scan using idx_orders_user_id (100 rows)
-- Network: 50KB transferred (1000x less!)
-- Covering index with INCLUDE

Result: 208x faster execution, 1000x less data transfer


For comprehensive optimization guidance, error resolution, and production patterns, load the appropriate reference files listed in "When to Load References" above.

Expand your agent's capabilities with these related and highly-rated skills.

Didn't find tool you were looking for?

Be as detailed as possible for better results