quantitative forecasting · risk-sized allocation
A personal quantitative trading system. Ensemble models, position sizing, and forward-return projections for equities and crypto. One user, one account — published here to talk about the engineering decisions, not to sell access.
python · duckdb · polars · pytorch · lightgbm / xgboost · fastapi · next.js
Compression is the product. The Projector takes a dollar amount and returns ranked trades with Kelly-sized allocations and Monte Carlo P10–P90 bands across four time horizons. Underneath is a ~16,000-line codebase — models, features, pipelines, on-chain metrics. None of it is exposed to the user. Sophisticated in the stack, simple at the surface.
Walk-forward ensemble training. No lookahead bias — each prediction window uses only data available at that point in time. Nine models span statistical, tree-based, and deep learning families; a meta-ensemble calibrates their agreement into a single confidence score. Outputs are probability distributions, not point estimates.
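As a rough illustration of the walk-forward idea, the sketch below generates rolling train/test index windows over time-ordered rows, so every test window sits strictly after its training window. The function name, the fixed window sizes, and the step rule are assumptions made for this example; the actual pipeline's windowing is not described here.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, train_size: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs over rows sorted by time.

    Hypothetical helper: each test window starts right after its
    training window ends, so no future row leaks into training.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train_end = start + train_size
        train_idx = list(range(start, train_end))
        test_idx = list(range(train_end, train_end + test_size))
        yield train_idx, test_idx
        # Slide forward by one test window (an assumed step size).
        start += test_size


splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
for train_idx, test_idx in splits:
    print("train", train_idx, "-> test", test_idx)
```

Each model in an ensemble would be fit on `train_idx` and scored only on `test_idx`; the out-of-sample predictions are what a meta-ensemble could then be calibrated on.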
Half-Kelly sizing. Full Kelly maximizes long-term growth but assumes the edge is known exactly, and its drawdowns are too deep for a real portfolio. Under the usual continuous-return approximation, half-Kelly keeps about three quarters of the full-Kelly growth rate at half the volatility. The Projector allocates the user's amount across the top-ranked trades at half-Kelly weights.
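A minimal sketch of the sizing step, assuming the continuous-return Kelly fraction (expected return divided by variance) as the starting point. The tickers, the numbers, the long-only floor, and the no-leverage cap are all illustrative assumptions, not the Projector's actual rules.

```python
from typing import Dict, Tuple


def half_kelly_weights(
    trades: Dict[str, Tuple[float, float]],
    fraction: float = 0.5,
    max_total: float = 1.0,
) -> Dict[str, float]:
    """Map {name: (expected_return, volatility)} to portfolio weights.

    Uses the continuous approximation f* = mu / sigma^2, scaled by
    `fraction` (0.5 = half-Kelly). Negative edges get zero weight.
    """
    weights: Dict[str, float] = {}
    for name, (mu, sigma) in trades.items():
        kelly = mu / sigma**2 if sigma > 0 else 0.0
        weights[name] = max(0.0, fraction * kelly)
    total = sum(weights.values())
    # Assumed cap: scale down proportionally rather than use leverage.
    if total > max_total:
        weights = {k: w * max_total / total for k, w in weights.items()}
    return weights


def allocate(amount: float, trades: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Split a dollar amount across trades at half-Kelly weights."""
    return {k: round(amount * w, 2) for k, w in half_kelly_weights(trades).items()}


# Hypothetical per-period (mu, sigma) estimates for three made-up tickers.
example_trades = {"AAA": (0.02, 0.20), "BBB": (0.01, 0.10), "CCC": (-0.01, 0.15)}
allocation = allocate(10_000, example_trades)
print(allocation)
```

Whatever the weights leave unallocated stays in cash, which is part of why fractional Kelly draws down less than full Kelly.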
Multi-horizon Monte Carlo. Four time horizons, four simulations, each shown as a P10–P90 band rather than a point estimate. The user sees the plausible range, not false precision.
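To make the band idea concrete, here is a small standard-library sketch that simulates terminal returns per horizon and reads off the 10th, 50th, and 90th percentiles. It assumes i.i.d. normal daily log-returns, and the horizon labels, drift, and volatility are placeholder values; the system's real return model and horizons are not specified here.

```python
import math
import random
import statistics
from typing import Dict, Tuple


def simulate_band(
    daily_mu: float,
    daily_sigma: float,
    horizon_days: int,
    n_paths: int = 5000,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (P10, P50, P90) of simulated simple returns over the horizon.

    Assumes i.i.d. normal daily log-returns, so the horizon log-return
    is normal with mean mu*h and standard deviation sigma*sqrt(h).
    """
    rng = random.Random(seed)
    drift = daily_mu * horizon_days
    vol = daily_sigma * math.sqrt(horizon_days)
    returns = [math.exp(rng.gauss(drift, vol)) - 1.0 for _ in range(n_paths)]
    deciles = statistics.quantiles(returns, n=10)  # nine cut points
    return deciles[0], deciles[4], deciles[8]


# Hypothetical horizons in trading days; the real four are not named in the text.
HORIZONS: Dict[str, int] = {"1w": 5, "1m": 21, "3m": 63, "1y": 252}
bands = {
    label: simulate_band(daily_mu=0.0005, daily_sigma=0.02, horizon_days=days)
    for label, days in HORIZONS.items()
}
for label, (p10, p50, p90) in bands.items():
    print(f"{label}: P10={p10:+.1%}  P50={p50:+.1%}  P90={p90:+.1%}")
```

The band widens with the horizon, which is the point of showing a range instead of a single number.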
DuckDB write-lock between CLI and API. The CLI's training runs and the FastAPI backend were contending for the same database handle — either path could lock out the other mid-write. A busy_timeout config didn't help and got reverted. Fix: a LazyDatabase wrapper — every query opens a short-lived read-write handle, serialized through a thread lock with a CHECKPOINT after each write. WAL stays bounded; CLI and API coexist.
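One way the wrapper described above could look, as a hedged sketch rather than the actual LazyDatabase: it takes a connection factory instead of importing DuckDB directly, so the method names, the factory parameter, and the read/write split are assumptions of this example. `CHECKPOINT` is a real DuckDB statement; everything else is illustrative.

```python
import threading
from contextlib import closing
from typing import Any, Callable, List, Optional, Sequence


class LazyDatabase:
    """Open a short-lived connection per query instead of holding one open."""

    def __init__(self, connect: Callable[[], Any]) -> None:
        # `connect` is a factory, e.g. lambda: duckdb.connect("some.duckdb")
        # (the file name is a placeholder).
        self._connect = connect
        # Serializes queries issued from threads inside this process.
        self._lock = threading.Lock()

    @staticmethod
    def _run(con: Any, sql: str, params: Optional[Sequence[Any]]) -> Any:
        # Only pass parameters when there are some.
        return con.execute(sql, list(params)) if params else con.execute(sql)

    def read(self, sql: str, params: Optional[Sequence[Any]] = None) -> List[tuple]:
        with self._lock, closing(self._connect()) as con:
            return self._run(con, sql, params).fetchall()

    def write(self, sql: str, params: Optional[Sequence[Any]] = None) -> None:
        with self._lock, closing(self._connect()) as con:
            self._run(con, sql, params)
            # Flush the write-ahead log into the main file so it stays small.
            con.execute("CHECKPOINT")
```

The thread lock only orders work within one process. Between the CLI and the API, the benefit in this sketch comes from each handle being released as soon as its query finishes, so neither side holds the file for long.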
Walk-forward re-training cadence. Too frequent and the model overfits the most recent regime; too infrequent and the signal decays. The current cadence re-trains on a rolling window; a bias auditor flags any prediction whose training window overlapped its evaluation window. No silent lookahead.
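The overlap check a bias auditor performs can be sketched as a closed-interval intersection test. The record fields, names, and dates below are hypothetical, chosen only to show the check; the real auditor's data model is not described in the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class PredictionRecord:
    """Hypothetical audit record; all date bounds are inclusive."""
    prediction_id: str
    train_start: date
    train_end: date
    eval_start: date
    eval_end: date


def has_lookahead(rec: PredictionRecord) -> bool:
    """True when the training and evaluation windows share any day."""
    return rec.train_start <= rec.eval_end and rec.eval_start <= rec.train_end


def audit(records: Iterable[PredictionRecord]) -> List[str]:
    """Return the ids of predictions whose windows overlap."""
    return [r.prediction_id for r in records if has_lookahead(r)]


records = [
    PredictionRecord("clean", date(2024, 1, 1), date(2024, 3, 31),
                     date(2024, 4, 1), date(2024, 4, 30)),
    PredictionRecord("leaky", date(2024, 1, 1), date(2024, 4, 15),
                     date(2024, 4, 1), date(2024, 4, 30)),
]
flagged = audit(records)
print(flagged)
```

Running a check like this after every re-train turns lookahead from a silent failure into an explicit list of ids to investigate.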
Live data wiring across endpoints. The Polygon intraday overlay had to land on every read path — Projector, backtest, my-plays, ticker comparison — without duplicating fetch logic. Fix: a single live-data provider hydrated at the edge, consumed by every page through a shared hook.
The model isn't the product. The decision is.