Agent skill: mxcp-expert-raw-labs-raw-labs-claude-mark

Install this agent skill into your project:

```shell
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/development/mxcp-expert-raw-labs-raw-labs-claude-mark
```
# MXCP Expert Skill

MXCP is an enterprise framework for building production-ready AI tools with SQL and Python.
## MXCP Mindset

Internalize these before implementing anything:
- MXCP is opinionated - There's ONE right way to do most things. Don't invent patterns.
- If it's common, MXCP provides it - Auth, testing, data access, policies. Check before building.
- Schema docs are truth - When unsure about syntax, read the schema doc. Don't guess.
- Validate constantly - Run `mxcp validate` after every file change. Errors compound.
- Read before writing - 2 minutes reading docs saves 20 minutes debugging.
## Pre-Implementation Checklist

Before writing ANY YAML or code:
- Read common-mistakes.md - saves 90% of debugging time
- Read the relevant schema doc (tool.md, resource.md, or prompt.md)
- Check if MXCP already provides this feature (see Capabilities table)
- Know the required fields and valid types
## MXCP Capabilities
| Category | Features | When to Use |
|---|---|---|
| Endpoints | Tools, Resources, Prompts | Tools=actions/queries, Resources=data by URI, Prompts=message templates |
| Languages | SQL, Python | SQL=database/simple, Python=complex logic/APIs |
| Data Access | DuckDB (local files, HTTP, S3, PostgreSQL, MySQL, SQLite) | Connect to any data source via DuckDB extensions |
| Data Transform | dbt (seeds, SQL models, Python models) | Clean, test, materialize static data |
| Security | OAuth, CEL policies, audit logs | Authentication + authorization |
| Quality | validate, test, lint, evals | Ensure correctness and LLM usability |
| Deployment | stdio, streamable-http | Local dev (stdio), production (HTTP) |
## Reference Documentation
| Category | Key References |
|---|---|
| Getting Started | quickstart, hello-world |
| Endpoints | sql-endpoints, python-endpoints |
| Schemas | tool, resource, prompt |
| Quality | testing, validation, linting |
| Security | authentication, policies |
| Operations | configuration, deployment |
| Reference | cli, sql, python, type-system |
| Integrations | dbt, duckdb, excel |
## Quick Reference: What Docs to Read
| When implementing... | Read first |
|---|---|
| Any YAML | common-mistakes.md |
| Tools, Resources, Prompts | tool.md, resource.md, prompt.md |
| Authentication/Authorization | authentication.md, policies.md |
| Tests | testing.md |
| Data access (files, DBs) | duckdb.md |
| Data transformation | dbt.md |
| Python endpoints | python.md |
| SQL endpoints | sql.md |
## Implementation Methodology

Follow this methodology for every MXCP project. Run `mxcp validate` after EVERY file change.
### Step 0: Project Setup

```shell
mkdir my-project && cd my-project
uv venv && source .venv/bin/activate
uv pip install mxcp
mxcp init --bootstrap
mxcp validate  # Verify setup
```
### Step 1: Task Analysis & Data Ingestion
Analyze the task first:
- What is the user trying to accomplish?
- Is data ingestion needed? What format (CSV, Excel, API, database)?
- Is the data properly structured or does it need transformation?
- What questions will users need answered? (Design schema accordingly)
Decision: Ingest or query directly?
| Data Characteristic | Approach | Why |
|---|---|---|
| Static/one-time (loaded once) | Ingest with dbt | Data quality tests, transformations, persistence |
| Dynamic/changing (files updated) | DuckDB direct read | Always reads latest data, no sync needed |
Ingestion approaches (for static data):
| Scenario | Approach |
|---|---|
| Simple CSV, static reference data | mxcp dbt seed |
| Excel, complex transformations | dbt Python models |
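To make the seed path concrete: a dbt seed is just a CSV placed in the dbt seeds directory plus declared data tests, following standard dbt conventions. A minimal sketch (the `countries` seed name and its columns are illustrative):

```yaml
# seeds/schema.yml - standard dbt tests attached to a countries.csv seed
version: 2
seeds:
  - name: countries
    columns:
      - name: iso_code
        tests: [not_null, unique]
      - name: name
        tests: [not_null]
```

After seeding, `mxcp dbt test` runs these checks against the loaded table.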
Direct read approaches (for dynamic data):

```sql
-- DuckDB reads files directly - always gets latest data
SELECT * FROM read_csv_auto('data/sales.csv');
SELECT * FROM read_parquet('data/*.parquet');
SELECT * FROM read_json_auto('https://api.example.com/data.json');
```
Connect to external databases via DuckDB:

```sql
-- PostgreSQL (requires postgres extension)
ATTACH 'postgresql://user:pass@host:5432/db' AS pg (TYPE postgres);
SELECT * FROM pg.public.users;

-- MySQL (requires mysql extension)
ATTACH 'host=localhost user=root database=mydb' AS mysql (TYPE mysql);
SELECT * FROM mysql.orders;
```
See duckdb.md for S3, HTTP auth, and secret management.
After ingestion (if using dbt), verify:

```shell
mxcp dbt test                              # Data quality tests
mxcp query "SELECT * FROM table LIMIT 5"   # Manual verification
```
### Step 2: Implementation
Choose endpoint type based on use case:
| Use Case | Endpoint Type | Example |
|---|---|---|
| Query data, perform actions | Tool | get_customer, create_order |
| Access data by URI/path | Resource | employee://{id}/profile |
| Reusable message templates | Prompt | data_analysis with Jinja2 |
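For instance, a resource definition might look like the sketch below. It mirrors the tool schema shown in the Golden Path example later in this document; treat the exact field names as assumptions and confirm them against resource.md before use:

```yaml
# resources/employee_profile.yml - illustrative sketch, not a verified schema
mxcp: 1
resource:
  uri: "employee://{id}/profile"
  description: Employee profile record, addressed by employee ID.
  parameters:
    - name: id
      type: string
      description: Employee identifier taken from the URI
  source:
    file: ../sql/employee_profile.sql
```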
Choose implementation language:
| Scenario | Language | Reference |
|---|---|---|
| Database queries, aggregations, file reading | SQL | sql-endpoints.md |
| Complex logic, external APIs, ML, file processing | Python | python-endpoints.md |
Development cycle for each endpoint:

```shell
# 1. Create the YAML definition
mxcp validate   # Fix errors immediately

# 2. Create the implementation (SQL or Python)
mxcp validate   # Validate again

# 3. Manual verification
mxcp run tool NAME --param key=value

# 4. Add tests and run
mxcp test
```
Python code requirements:
- Modular, maintainable code
- Each module independently testable
- Use `pytest` for Python logic testing
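For example, a helper kept in its own module can be exercised directly with pytest. The module and function names below are illustrative, not part of MXCP:

```python
# python/normalize.py - hypothetical helper, independently testable
def normalize_email(raw: str) -> str:
    """Lowercase and trim an email address for consistent matching."""
    return raw.strip().lower()


# tests/test_normalize.py
def test_normalize_email():
    assert normalize_email("  John@Example.COM ") == "john@example.com"
```

Run it with `pytest tests/` alongside the MXCP-level checks.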
### Step 3: Metadata Quality

Tools will be used by LLMs. Ensure clear metadata:
- name: Descriptive, follows `snake_case`
- description: Clear purpose, when to use, what it returns
- parameters: Each has description, correct type, examples
- return: Documented structure with property descriptions
```yaml
tool:
  name: search_customers
  description: |
    Search customers by name or email. Returns matching customer records
    with contact info and account status. Use for customer lookups.
  parameters:
    - name: query
      type: string
      description: Search term (matches name or email, case-insensitive)
      examples: ["john", "smith@example.com"]
```
### Step 4: Validation

Run after every file change:

```shell
mxcp validate
mxcp validate --debug   # For detailed errors
```
### Step 5: Linting

Check metadata quality for LLM consumption:

```shell
mxcp lint
```
Address all warnings about descriptions, examples, and documentation.
### Step 6: Evals (Only if Requested)

Create evals only if the user explicitly asks:

```shell
mxcp evals   # AI behavior testing
```
### Step 7: Security & Features (Only if Requested)

Implement only if the user requests authentication, policies, or observability:
- Authentication: Configure in `~/.mxcp/config.yml` (see Security Features section)
- Policies: Add CEL expressions to tool definitions
- Observability: Configure OpenTelemetry

Test security with simulated user context:

```shell
mxcp run tool NAME --param key=value \
  --user-context '{"role": "admin", "email": "test@example.com"}'
```
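As an illustration of the policy idea only — the structure below is an assumption, and the authoritative schema is in policies.md — a CEL condition attached to a tool definition might look like:

```yaml
# Inside a tool definition (hypothetical field names; verify in policies.md)
policies:
  input:
    - condition: "user.role != 'admin'"
      action: deny
      reason: Only admins may call this tool
```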
### Step 8: Deployment (Only if Requested)

Implement only if the user explicitly asks for deployment:

| Transport | Use Case | Command |
|---|---|---|
| `stdio` | Local dev, Claude Desktop | `mxcp serve` (default) |
| `streamable-http` | Production, web clients | `mxcp serve --transport streamable-http --port 8000` |

See Deployment for Docker, systemd, production setup.
## Definition of Done

A project is complete when:
- `mxcp validate` passes with no errors
- `mxcp test` passes with all tests green
- `mxcp dbt test` passes (if using dbt)
- `mxcp lint` shows no critical issues
- Manual verification with `mxcp run` confirms expected behavior
- Security tested with `--user-context` (if auth/policies configured)
## Testing Requirements

| Test Type | Must Verify | Reference |
|---|---|---|
| MXCP endpoint | Valid inputs, edge cases (nulls, boundaries), error handling | testing.md |
| dbt data | `not_null`, `unique`, `relationships`, `accepted_values` | dbt.md |
| Python modules | Unit tests with `pytest` | - |
## Critical: Use the Default Database

MXCP automatically creates and manages a DuckDB database. Do not configure a custom database path unless the user explicitly asks for it.

When you run `mxcp init`, MXCP creates:
- A database at `data/db-default.duckdb` (or `data/db-{profile}.duckdb`)
- All tables, seeds, and dbt models go into this database automatically
Use the default (no database configuration needed):

```yaml
# mxcp-site.yml - Minimal config
mxcp: 1
project: my-project
profile: default
# Database is automatically created at data/db-default.duckdb
```
Only configure `duckdb.path` if the user explicitly requests it (e.g., shared database, specific location, read-only mode). Do not proactively add database configuration.
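When a custom location is explicitly requested, the override presumably sits in the profile configuration. The key nesting below is an assumption — verify it in configuration.md:

```yaml
# mxcp-site.yml - explicit database path, only when the user asks for it
mxcp: 1
project: my-project
profile: default
profiles:
  default:
    duckdb:
      path: /srv/shared/analytics.duckdb
```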
## Common Mistakes
Root cause of most errors: implementing without reading docs first.
Before implementing, always:
- Read the relevant schema doc (tool.md, resource.md, prompt.md)
- Check common-mistakes.md for known pitfalls
- Run `mxcp validate` after every change

Valid types: `string`, `number`, `integer`, `boolean`, `array`, `object`
SQL syntax: Verify DuckDB-specific syntax in duckdb.md. See common-mistakes.md for pitfalls.
## Project Structure

```
mxcp-project/
├── mxcp-site.yml   # Project configuration (required)
├── tools/          # Tool definitions (.yml)
├── resources/      # Resource definitions (.yml)
├── prompts/        # Prompt definitions (.yml)
├── sql/            # SQL implementations
├── python/         # Python implementations
├── evals/          # LLM evaluation tests
└── data/           # Database files (db-default.duckdb)
```
Directory rules:
- Tools MUST be in `tools/*.yml`
- Resources MUST be in `resources/*.yml`
- Prompts MUST be in `prompts/*.yml`
- SQL files in `sql/`, referenced via relative paths
- Python files in `python/`, referenced via relative paths
## Golden Path: Complete Tool Example

This shows a complete, correct tool with all required fields and tests:

```yaml
# tools/get_customer.yml
mxcp: 1
tool:
  name: get_customer
  description: Get customer by ID. Returns customer profile with contact info.
  parameters:
    - name: customer_id
      type: integer
      description: The customer's unique identifier
  return:
    type: object
    properties:
      id: {type: integer}
      name: {type: string}
      email: {type: string}
  source:
    file: ../sql/get_customer.sql
  tests:
    - name: existing_customer
      arguments: [{key: customer_id, value: 1}]
      result_contains: {id: 1}
    - name: not_found
      arguments: [{key: customer_id, value: 99999}]
      result: null
```

```sql
-- sql/get_customer.sql
SELECT id, name, email FROM customers WHERE id = $customer_id
```
SQL vs Python: Use SQL for queries/aggregations. Use Python (`language: python`) for complex logic, APIs, ML.
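As a sketch of the Python route — assuming the tool's YAML sets `language: python` and points `source.file` at a module whose function name matches the tool name (verify the exact wiring in python-endpoints.md):

```python
# python/enrich_customer.py - hypothetical module backing an enrich_customer tool
def enrich_customer(customer_id: int) -> dict:
    """Branching logic that would be awkward to express in SQL."""
    tier = "gold" if customer_id < 100 else "standard"
    return {"id": customer_id, "tier": tier}
```

As with SQL tools, exercise it via `mxcp run tool enrich_customer --param customer_id=1` plus plain pytest for the module itself.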
## Security Features

CRITICAL: Use MXCP built-in security. NEVER write custom authentication code.

| Feature | Built-in Solution | Reference |
|---|---|---|
| Authentication | OAuth in `~/.mxcp/config.yml` | authentication.md |
| Access Control | CEL policies in YAML | policies.md |
| User Context | SQL: `get_username()`, `get_user_email()` | sql.md |
| External APIs | SQL: `get_user_external_token()` | authentication.md |
| Audit Logs | Built-in logging | auditing.md |
Supported OAuth providers: GitHub, Google, Atlassian, Salesforce, Keycloak
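Provider setup lives in `~/.mxcp/config.yml`. The fragment below shows only the general shape — every key name in it is an assumption; take the real schema from authentication.md:

```yaml
# ~/.mxcp/config.yml - illustrative shape only, not a verified schema
projects:
  my-project:
    profiles:
      default:
        auth:
          provider: github
          github:
            client_id: ${GITHUB_CLIENT_ID}
            client_secret: ${GITHUB_CLIENT_SECRET}
```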
## CLI Quick Reference

```shell
# Project
mxcp init --bootstrap            # Create new project
mxcp list                        # List all endpoints

# Quality
mxcp validate                    # Check structure
mxcp test                        # Run tests
mxcp lint                        # Check metadata
mxcp evals                       # AI behavior tests

# Running
mxcp serve                       # Start MCP server
mxcp run tool NAME --param k=v   # Run tool manually

# Database
mxcp query "SELECT 1"            # Execute SQL

# Operations
mxcp drift-snapshot              # Create baseline
mxcp drift-check                 # Detect changes
mxcp log --since 1h              # Query audit logs
```
## Troubleshooting

```shell
mxcp validate --debug        # Detailed validation errors
mxcp run tool NAME --debug   # Debug tool execution
mxcp list                    # See available endpoints
```

Common issues: YAML syntax, missing required fields, invalid types, file paths.
## Project Templates

Complete runnable examples in assets/project-templates/. Start with:
- `python-demo` - Python endpoint patterns
- `covid_owid` - Data workflow with dbt

```shell
cp -r assets/project-templates/python-demo my-project
cd my-project
mxcp validate && mxcp test
```
See Configuration for mxcp-site.yml and config.yml options.