Agent skill
autopredict
Wrap the howdymary/autopredict Polymarket trading-agent repo. Use when you need to scan live Polymarket markets, inspect structural event mispricing, evaluate a market with your own fair probability, run reproducible backtests against a JSON dataset, tune strategy parameters safely, or review the repo's paper/live trading scaffolds and failure modes.
Install this agent skill to your Project
npx add-skill https://github.com/ckorhonen/claude-skills/tree/main/skills/autopredict
AutoPredict
Quick Start — Simple Examples
New to AutoPredict? Start here before reading the full docs.
1. Scan what's trending on Polymarket right now
python3 predict.py --top 10
Shows the 10 most active markets with spreads, depth, and overround signals.
2. Show me the 5 most liquid markets
python3 predict.py --top 5 --verbose
Lists markets sorted by liquidity with full execution details.
3. Browse multi-outcome events for structural mispricing
python3 predict.py --events --top 10
Checks whether event probabilities sum to more or less than 100%.
4. What does the order book look like for a specific market?
python3 predict.py --fair 0.55 <condition_id>
Replace <condition_id> with the market's Polymarket condition ID and pass your own fair probability estimate via --fair; AutoPredict then evaluates the trade.
Run python3 predict.py --help for all flags. No credentials are required for live reads.
AutoPredict is an execution framework for prediction-market trading. It is not a forecasting model.
- You provide fair_prob.
- The repo evaluates execution quality: side, order type, size, spread, depth, slippage, and risk.
- Live market reads require internet but no credentials.
- Real trading is scaffolded, not production-ready.
This skill was audited against the upstream repository layout and command surface, not just the README.
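To make the split between forecasting and execution concrete, here is a minimal sketch of how a user-supplied fair_prob turns into an edge against the quoted book. The function name edge_and_side and its arguments are hypothetical and are not taken from the repo; AutoPredict's own decision logic also weighs order type, depth, slippage, and risk, and may differ.

```python
def edge_and_side(fair_prob: float, best_bid: float, best_ask: float):
    """Return (side, edge) for a binary YES contract, or (None, 0.0) if no edge.

    Illustrative only: this is not AutoPredict's implementation.
    Prices and probabilities are expressed on a 0..1 scale.
    """
    if not 0.0 < fair_prob < 1.0:
        raise ValueError("fair_prob must be strictly between 0 and 1")

    buy_edge = fair_prob - best_ask   # edge from buying YES at the ask
    sell_edge = best_bid - fair_prob  # edge from selling YES at the bid

    if buy_edge > 0 and buy_edge >= sell_edge:
        return "buy", buy_edge
    if sell_edge > 0:
        return "sell", sell_edge
    return None, 0.0  # fair value sits inside the spread
```

With a fair_prob inside the bid/ask spread the sketch returns no trade, which is the point: the edge only exists relative to your own probability estimate.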
What Is Real vs Scaffold
Reliable entry points
- python3 predict.py scans live Polymarket markets.
- python3 predict.py --events inspects multi-outcome event overround / underround.
- python3 predict.py --fair 0.60 <condition_id> evaluates one market using your explicit probability.
- python3 -m autopredict.cli backtest --dataset ... runs an offline backtest.
- python3 -m autopredict.cli score-latest prints the most recent saved metrics JSON.
Partially implemented or scaffold-only
- python3 -m autopredict.cli learn analyze only works if you already have JSONL trade logs. Plain CLI backtests do not create those logs.
- python3 -m autopredict.cli learn tune and learn improve are placeholders that point to a nonexistent scripts/learn_and_improve.py.
- python3 -m autopredict.cli trade-live is intentionally disabled by config.
- scripts/run_paper.py and scripts/run_live.py are deployment scaffolds. run_live.py uses a MockVenueAdapter, so it is not a real exchange adapter.
Repo Map
- predict.py: live Polymarket scanner and one-off evaluation path.
- autopredict/cli.py: packaged CLI used for backtest, score-latest, and learning commands.
- run_experiment.py: simple offline backtest harness used by autopredict.cli backtest.
- strategy_configs/*.json: strategy knobs for offline experiments.
- autopredict/_defaults/datasets/sample_markets.json: bundled sample dataset. Use this when you need a known-good backtest input.
- autopredict/learning/tuner.py: reusable grid-search API. Better than the stub CLI.
- scripts/run_paper.py, scripts/run_live.py: paper/live monitoring templates.
Decision Tree
1. Choose the workflow
If the user wants to:
- Find liquid live markets or structural event mispricing: use predict.py via scripts/scan_markets.sh.
- Evaluate one market with a known fair probability: use predict.py --fair ...
- Compare strategy configs or produce reproducible metrics: use python3 -m autopredict.cli backtest --dataset ... via scripts/run_backtest.sh.
- Sweep parameters safely: use scripts/tune_params.sh. Do not use python3 -m autopredict.cli learn tune.
- Inspect trade logs: use python3 -m autopredict.cli learn analyze --log-dir ... only if JSONL logs already exist.
- Discuss paper/live deployment scaffolds: read docs/DEPLOYMENT.md, configs/*.yaml, and the Python runners before claiming the repo can trade live.
2. Choose the command surface
- Use predict.py for live reads and one-off agent evaluation.
- Use python3 -m autopredict.cli ... for reproducible offline backtests.
- Avoid python3 -m autopredict.backtest.cli ...; that submodule has brittle import behavior in the current repo state.
3. Choose the data source
- For a quick smoke test: use autopredict/_defaults/datasets/sample_markets.json.
- For real research: require a user-supplied dataset of historical snapshots.
- If the user has no dataset and wants strategy performance claims, stop and say the repo cannot produce a valid backtest without one.
Setup
Preferred helper:
bash skills/autopredict/scripts/setup.sh --dir /tmp/autopredict
Manual setup:
git clone https://github.com/howdymary/autopredict.git /tmp/autopredict
cd /tmp/autopredict
python3 -m pip install -e .
python3 predict.py --help
python3 -m autopredict.cli --help
After setup, keep work inside the cloned repo when invoking upstream commands.
Opinionated Workflows
Workflow A: Fast live market triage
Use this when the user wants ideas, not a PnL claim.
cd /tmp/autopredict
python3 predict.py --top 10 --verbose
python3 predict.py --events --top 10
Interpretation:
- Prefer markets with tight spreads and visible depth.
- Treat event underround as a structural clue, not automatic free money.
- Only move to trade evaluation once you can justify a fair_prob.
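The overround / underround signal from the --events scan can be reproduced by hand as a sanity check. The sketch below assumes the outcomes of an event are mutually exclusive and exhaustive, and that you supply one price per outcome on a 0..1 scale; whether predict.py uses mid, bid, or ask prices is not stated here, so treat the choice of price as your own assumption. The function name overround is hypothetical.

```python
from typing import List


def overround(outcome_prices: List[float]) -> float:
    """Sum of outcome prices minus 1.0.

    Positive values mean overround (prices sum to more than 100%),
    negative values mean underround (prices sum to less than 100%).
    Illustrative helper, not part of the AutoPredict codebase.
    """
    if not outcome_prices:
        raise ValueError("need at least one outcome price")
    return sum(outcome_prices) - 1.0


# Three-outcome event quoted at 40%, 35%, and 30%: roughly 5 points of overround.
signal = overround([0.40, 0.35, 0.30])
```

A negative result is only a structural clue: fees, spread, and available depth on each leg decide whether anything is actually capturable.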
Workflow B: Evaluate a single conviction
Use this when the user already has a thesis on one market.
cd /tmp/autopredict
python3 predict.py --fair 0.60 <condition_id>
Important caveat:
- predict.py --fair constructs AutoPredictAgent(AgentConfig()) directly.
- That means it uses default agent parameters, not strategy_configs/baseline.json or your edited JSON config.
- Use it as a default-policy sanity check, not as proof that a tuned config behaves the same way.
Workflow C: Backtest a strategy config
Use this when the user wants reproducible metrics or config comparisons.
cd /tmp/autopredict
python3 -m autopredict.cli backtest \
--config strategy_configs/baseline.json \
--dataset autopredict/_defaults/datasets/sample_markets.json
python3 -m autopredict.cli score-latest
Opinionated rule:
- Always pass --dataset.
- The repo default config.json sets "default_dataset": null.
- Running python3 -m autopredict.cli backtest with no dataset currently throws a TypeError.
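If you drive the backtest from Python rather than the shell, the "always pass --dataset" rule can be enforced before the CLI is ever invoked. This is a minimal sketch: build_backtest_cmd and run_backtest are hypothetical helper names, the command line mirrors the one shown above, and the /tmp/autopredict default is just the clone location used in Setup.

```python
import subprocess
from typing import List, Optional


def build_backtest_cmd(config: str, dataset: Optional[str]) -> List[str]:
    """Build the backtest command, refusing to proceed without a dataset."""
    if not dataset:
        # Fail early with a clear message instead of the upstream TypeError.
        raise ValueError("pass a dataset explicitly; the repo default is null")
    return [
        "python3", "-m", "autopredict.cli", "backtest",
        "--config", config,
        "--dataset", dataset,
    ]


def run_backtest(config: str, dataset: Optional[str],
                 repo_dir: str = "/tmp/autopredict") -> subprocess.CompletedProcess:
    """Run the backtest from inside the cloned repo; raises on non-zero exit."""
    cmd = build_backtest_cmd(config, dataset)
    return subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling run_backtest("strategy_configs/baseline.json", "autopredict/_defaults/datasets/sample_markets.json") would then reproduce the smoke test above, assuming the repo is installed at repo_dir.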
Workflow D: Tune parameters
Use the bundled helper instead of the stub CLI:
bash skills/autopredict/scripts/tune_params.sh \
--dir /tmp/autopredict \
--dataset autopredict/_defaults/datasets/sample_markets.json \
--param min_edge 0.03,0.05,0.08 \
--param aggressive_edge 0.10,0.12,0.15
Opinionated tuning rules:
- Start with 1-2 parameters, not 6.
- Prefer sharpe or total_pnl only after sample size is reasonable.
- Reject “best” configs with too few trades.
- Save every run; do not trust memory or terminal output.
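The tuning rules above can be sketched in a few lines: enumerate a small grid, then refuse to crown a winner that traded too little. This is not the repo's GridSearchTuner or the tune_params.sh implementation. The result schema (params, num_trades, sharpe) and the min_trades=30 threshold are assumptions chosen for illustration.

```python
from itertools import product
from typing import Dict, List, Optional, Sequence


def grid(param_values: Dict[str, Sequence[float]]) -> List[Dict[str, float]]:
    """Expand {name: [values]} into a list of every parameter combination."""
    names = list(param_values)
    combos = product(*(param_values[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def pick_best(results: List[dict], metric: str = "sharpe",
              min_trades: int = 30) -> Optional[dict]:
    """Return the best result by metric, ignoring runs with too few trades.

    Each result is assumed to look like
    {"params": {...}, "num_trades": int, "sharpe": float} (hypothetical schema).
    """
    eligible = [r for r in results if r["num_trades"] >= min_trades]
    if not eligible:
        return None  # nothing trustworthy; do not report a "best" config
    return max(eligible, key=lambda r: r[metric])


# The 3 x 3 sweep from the tune_params.sh example: nine combinations.
candidates = grid({
    "min_edge": [0.03, 0.05, 0.08],
    "aggressive_edge": [0.10, 0.12, 0.15],
})
```

Keeping the grid small makes it cheap to save every run and to see which single parameter moved the metric.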
Workflow E: Review learning / deployment scaffolds
Use this when the user asks about self-improvement, paper trading, or live trading.
- autopredict.learning.tuner.GridSearchTuner is real and reusable.
- python3 -m autopredict.cli learn tune is just a message, not a tuning engine.
- scripts/run_paper.py is a monitoring loop template; it does not fetch real markets or execute the full agent logic.
- scripts/run_live.py requires confirmation and safety flags, but still uses MockVenueAdapter, so it cannot trade a real venue out of the box.
Strategy Knobs That Matter
Main JSON parameters in strategy_configs/*.json:
- min_edge: minimum edge before any trade is considered.
- aggressive_edge: threshold for using market orders more aggressively.
- max_risk_fraction: position sizing as fraction of bankroll.
- max_position_notional: hard dollar cap per order.
- min_book_liquidity: minimum visible depth required.
- max_spread_pct: spread filter.
- max_depth_fraction: cap as fraction of visible depth.
- split_threshold_fraction: start slicing when order is too large relative to depth.
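One plausible reading of how the three sizing knobs interact is that the order size is bounded by the tightest of the caps. The sketch below illustrates that reading only; it has not been checked against the repo's sizing code, the function name capped_order_size is hypothetical, and it assumes bankroll, depth, and the notional cap are all expressed in the same dollar units.

```python
def capped_order_size(bankroll: float, visible_depth: float,
                      max_risk_fraction: float,
                      max_position_notional: float,
                      max_depth_fraction: float) -> float:
    """Return the smallest of the three caps (assumed interaction, not verified).

    - risk cap:     bankroll * max_risk_fraction
    - notional cap: max_position_notional
    - depth cap:    visible_depth * max_depth_fraction
    """
    risk_cap = bankroll * max_risk_fraction
    depth_cap = visible_depth * max_depth_fraction
    return min(risk_cap, max_position_notional, depth_cap)


# Illustrative inputs: the depth cap binds when the book is thin.
thin_book = capped_order_size(1000.0, 100.0, 0.02, 25.0, 0.10)
deep_book = capped_order_size(1000.0, 400.0, 0.02, 25.0, 0.10)
```

Under this reading, lowering max_depth_fraction shrinks orders only in thin books, which is why it is a gentler lever than cutting the risk caps.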
Opinionated tuning guidance:
- Lower min_edge only if trade count is too low.
- Raise aggressive_edge if slippage is the dominant problem.
- Lower max_depth_fraction before touching risk caps when market impact is the problem.
- Do not loosen spread and liquidity filters at the same time; you will not know which one caused the regression.
Failure Modes and Edge Cases
Backtest failures
- TypeError before the backtest starts: almost always because no --dataset was passed and default_dataset is null.
- No metrics.json found under state directory: score-latest was run before a successful backtest.
- Malformed JSON errors: invalid strategy config or dataset schema.
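Malformed-JSON failures are cheap to catch before launching a backtest. The following sketch checks only that a file exists and parses as JSON; it does not validate the dataset or config schema, which is repo-specific and not reproduced here. The helper name check_json is hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def check_json(path: str) -> Optional[str]:
    """Return None if the file parses as JSON, otherwise a short error message."""
    p = Path(path)
    if not p.is_file():
        return f"{path}: file not found"
    try:
        json.loads(p.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        # lineno/colno point at the first character the parser rejected.
        return f"{path}: invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None
```

Running it on both the strategy config and the dataset before the backtest separates syntax errors from schema errors, which the backtest itself will still surface.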
Learning workflow failures
- learn analyze reports no logs: expected unless you created JSONL logs with TradeLogger or a scaffold that writes them.
- learn tune / learn improve prints advice only: expected. Those subcommands are placeholders.
- Docs mention scripts/learn_and_improve.py: that script does not exist in the audited upstream repo.
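Before calling learn analyze, it is worth confirming that the log directory actually contains trade logs. This sketch assumes the logs are non-empty files with a .jsonl extension sitting directly in the directory you pass via --log-dir; the extension and layout are assumptions, and has_jsonl_logs is a hypothetical helper.

```python
from pathlib import Path


def has_jsonl_logs(log_dir: str) -> bool:
    """True if log_dir exists and holds at least one non-empty *.jsonl file."""
    d = Path(log_dir)
    if not d.is_dir():
        return False
    return any(f.is_file() and f.stat().st_size > 0 for f in d.glob("*.jsonl"))
```

If it returns False after a plain CLI backtest, that is the expected state rather than a bug, since those backtests do not write JSONL logs.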
Live / paper trading confusion
- Paper trading is not the same as live market scanning: run_paper.py is a loop scaffold, not an end-to-end paper execution engine over Polymarket.
- Live trading docs sound complete, but adapter is mock: run_live.py cannot place real venue orders without extra implementation.
- trade-live CLI is disabled: config.json defaults live_trading_enabled to false.
Command-path gotchas
- Do not use python3 -m autopredict.backtest.cli in this repo state unless you are ready to debug import-path issues.
- Do not assume root docs and packaged CLI are fully synchronized. The package path under autopredict/ is the safer source of truth.
- Do not claim config changes affect predict.py --fair unless you verified the code path. It currently ignores strategy_configs/*.json.
Helper Scripts Bundled With This Skill
- scripts/setup.sh: clone, install, verify, and smoke-test the repo.
- scripts/scan_markets.sh: wrapper around predict.py for live scan / --events / --fair paths.
- scripts/run_backtest.sh: safe backtest wrapper that always provides a dataset or fails with a useful error.
- scripts/tune_params.sh: grid-search wrapper that bypasses the upstream stub tuning CLI.
Recommended Agent Behavior
When using this skill:
- Lead with the limitation that AutoPredict optimizes execution, not prediction.
- Ask where fair_prob comes from before discussing edges as if they were alpha.
- Require a dataset for any serious backtest claim.
- Separate “works in the repo” from “documented in the repo”.
- Treat paper/live trading as architecture review unless the user is explicitly asking to extend the scaffold.
Autoresearch Pairing
Use this skill with autoresearch when the user wants disciplined tuning.
Recommended setup:
- Define the target metric, usually sharpe, total_pnl, or avg_slippage_bps.
- Use scripts/run_backtest.sh or scripts/tune_params.sh as the experiment workload.
- Keep one hypothesis per run.
- Store configs and metrics under a dated output directory.
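A dated output directory per run can be as simple as the sketch below. The layout (date folder, time subfolder, three files) and the helper name save_run are illustrative assumptions, not a convention defined by the repo or by autoresearch.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def save_run(base_dir: str, hypothesis: str, config: dict, metrics: dict,
             now: Optional[datetime] = None) -> Path:
    """Write one run's hypothesis, config, and metrics under base_dir/YYYY-MM-DD/HHMMSS."""
    now = now or datetime.now(timezone.utc)
    run_dir = Path(base_dir) / now.strftime("%Y-%m-%d") / now.strftime("%H%M%S")
    # exist_ok=False: never silently overwrite an earlier run.
    run_dir.mkdir(parents=True, exist_ok=False)
    (run_dir / "hypothesis.txt").write_text(hypothesis + "\n", encoding="utf-8")
    (run_dir / "config.json").write_text(
        json.dumps(config, indent=2, sort_keys=True), encoding="utf-8")
    (run_dir / "metrics.json").write_text(
        json.dumps(metrics, indent=2, sort_keys=True), encoding="utf-8")
    return run_dir
```

Storing the hypothesis next to the config and metrics keeps the "one hypothesis per run" rule auditable after the fact.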
Good autoresearch prompt framing:
- “Optimize aggressive_edge and max_depth_fraction for lower slippage without collapsing trade count.”
- “Improve Sharpe on this dataset while keeping max drawdown below 35%.”
Bad framing:
- “Make it profitable” with no dataset.
- “Tune everything” with no metric priority.