
BidOptic Market Calibration Engine

BidOptic is a locally deployed, zero-egress simulation engine that calibrates Real-Time Bidding (RTB) strategies against your actual market conditions — including bimodal network transit spikes, publisher floor dynamics, delayed attribution windows, and latency-driven bid timeouts — before any strategy reaches production traffic.

DSP teams provide a standardised 11-column log extract from their existing infrastructure (with an optional 12th column, bid_latency_ms, strongly recommended for accurate latency modelling). BidOptic ingests that data, trains a suite of seven predictive ML models and one clustering algorithm to reconstruct your market's ground truths, and returns a calibrated simulation sandbox your engineers can query repeatedly at zero marginal cost.

The core simulation engine is delivered as a high-performance C-compiled SDK within the container. The bidding agent and value-estimation models are yours—you interact with the engine entirely through clean Python interfaces (ClientEnricher and ClientStrategy). BidOptic provides the market.


How It Works

Step 1 — Data Ingestion & Validation

Your data engineering team runs the BidOptic Local Schema Validator against a historical log extract. The validator checks column presence, data types, null rates, and statistical quality gates before a single model is trained. No data leaves your environment at any point in this step.
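The kind of checks involved can be sketched in a few lines. Note the column names and the 5% null-rate gate below are illustrative assumptions, not the validator's actual schema:

```python
# Hypothetical sketch of the checks a schema validator performs;
# column names and the 5% null-rate gate are invented for illustration.
def validate_extract(rows, required_columns, max_null_rate=0.05):
    """Check column presence and per-column null rates for a log extract."""
    errors = []
    for col in required_columns:
        if col not in rows[0]:
            errors.append(f"missing column: {col}")
            continue
        nulls = sum(1 for row in rows if row.get(col) in (None, ""))
        if nulls / len(rows) > max_null_rate:
            errors.append(f"{col}: null rate exceeds gate")
    return errors

rows = [
    {"auction_id": "a1", "win_price": 1.2},
    {"auction_id": "a2", "win_price": None},
]
print(validate_extract(rows, ["auction_id", "win_price", "bid_latency_ms"]))
```

A clean extract returns an empty error list; anything else blocks calibration before any model is trained.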

Step 2 — Local Calibration & Docker Deployment

The encrypted Docker image is loaded into your VPC. The calibration pipeline trains eight internal ground-truth models (seven predictive models plus the Audience DNA clustering model) using only your provided logs:

| Model | Algorithm | What It Learns |
| --- | --- | --- |
| Ghost Market (Win Rate) | XGBoost | Competitive win probability against anonymous market participants |
| Audience DNA | MiniBatchKMeans | Distinct behavioural user segments (e.g., Whale, High Intent, Toxic) from historical intent signals and average order value |
| Supply Floor Price | LightGBM Quantile | Publisher-level floor price distributions and clearing dynamics |
| Click-Through Rate | LightGBM Binary | CTR curves by publisher, ad size, user segment, and time-of-day |
| Conversion Rate | LightGBM Binary | CVR signal by segment and inventory quality |
| Lifetime Value | LightGBM Tweedie | Expected revenue per conversion, accounting for zero-inflation |
| Conversion Delay | XGBoost AFT | Post-click conversion timing via censored survival analysis |
| Infrastructure Latency Twin | LightGBM Tweedie | Bimodal round-trip latency distribution, including the network bifurcation threshold |
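To make the "bimodal" latency claim concrete, a two-mode mixture is a reasonable mental model of what the Latency Twin learns. The parameters below (15% spike probability, 40 ms vs. 220 ms modes, 120 ms threshold) are invented for illustration, not values from the engine:

```python
import random

# Toy stand-in for the Infrastructure Latency Twin: a two-mode mixture
# (fast transit vs. congested spike) gives the bimodal shape it models.
# All parameters here are illustrative assumptions.
def sample_latency_ms(p_spike=0.15, base_mu=40.0, spike_mu=220.0):
    if random.random() < p_spike:
        return random.gauss(spike_mu, 30.0)  # congested "spike" mode
    return random.gauss(base_mu, 8.0)        # fast local-transit mode

random.seed(0)
samples = [sample_latency_ms() for _ in range(10_000)]
below = sum(1 for s in samples if s < 120) / len(samples)
print(f"fraction below a 120 ms bifurcation threshold: {below:.2f}")
```

Bids falling in the slow mode are the ones at risk of timing out, which is what drives the latency leak rates in the results report.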

The resulting calibrated environment is written to your local output directory. The container exits. No network connection is opened at any stage.

Step 3 — Strategy Execution & Output Generation

Your team plugs in their own value-estimation model (ClientEnricher) and bidding logic (ClientStrategy) via the two published Python interfaces. The sandbox runs the configured simulation period and emits a structured results report containing spend curves, ROAS trajectories, latency leak rates, and segment-level performance breakdowns.
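A minimal pair of plugin classes might look like the sketch below. The method names and the bid formula are illustrative assumptions, not the actual published interface signatures:

```python
# Hypothetical sketch of client-side plugins; method names and the
# bid formula are assumptions, not BidOptic's actual signatures.
class MyEnricher:
    def enrich(self, bid_request):
        # Attach pCTR/pCVR/LTV estimates from your own models (stubbed here).
        return {**bid_request, "pctr": 0.012, "pcvr": 0.03, "ltv": 45.0}

class MyStrategy:
    def bid(self, enriched):
        # Expected value per impression in CPM terms, shaded to 80%.
        ev = enriched["pctr"] * enriched["pcvr"] * enriched["ltv"]
        return round(ev * 1000 * 0.8, 4)

request = {"auction_id": "a1", "floor": 0.5}
bid = MyStrategy().bid(MyEnricher().enrich(request))
print(bid)
```

The point of the split is that the enricher owns value estimation while the strategy owns price logic, so either can be swapped independently between simulation runs.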


Key Capabilities

Plugin Interfaces: ClientEnricher and ClientStrategy. BidOptic defines two abstract interfaces — ClientEnricher for your pCTR/pCVR/LTV models, and ClientStrategy for your bid calculation logic. The simulation measures and penalises your actual inference latency, so performance figures account for real compute overhead.
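How latency penalisation could work in principle is easy to sketch: time the strategy call and drop bids that exceed the timeout. The timing rule below is a hypothetical illustration; the engine's actual accounting may differ:

```python
import time

# Hypothetical latency-penalty sketch: slow inference forfeits the bid.
def timed_bid(strategy_fn, request, timeout_ms=100.0):
    t0 = time.perf_counter()
    bid = strategy_fn(request)
    elapsed_ms = (time.perf_counter() - t0) * 1000
    if elapsed_ms > timeout_ms:
        return None, elapsed_ms  # bid arrives too late and is dropped
    return bid, elapsed_ms

bid, ms = timed_bid(lambda request: 1.25, {"auction_id": "a1"})
print(bid, ms < 100)
```

Under a rule like this, a model that is accurate but slow can underperform a cruder model that answers inside the timeout — exactly the trade-off the simulation is meant to expose.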

Strategy Win Rate vs. Market Win Rate. The calibration report shows your historical market win rate — the fraction of auctions you won in the training data. The simulation's strategy win rate will differ, and intentionally so: it reflects what your bidding logic actually bids, not the market's capacity to be won. A low strategy win rate means your bids are not competitive against the calibrated floor and ghost market — that is the signal the simulation is designed to surface.
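The gap between the two figures reduces to a simple comparison. In this invented example, every auction was winnable in the historical data, but a flat bid only clears half of them:

```python
# Illustrative only: all numbers are invented.
clearing_prices = [1.0, 2.5, 0.8, 3.0]  # what the market cleared at
your_bid = 1.2                           # what your strategy actually bids
strategy_win_rate = sum(your_bid > p for p in clearing_prices) / len(clearing_prices)
print(strategy_win_rate)
```

A low figure here is not a simulation error; it is the simulation reporting that the bids are uncompetitive.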

Creative Scenario Testing. Each simulation run can specify a named creative with a configured CTR multiplier, ad fatigue rate, and segment-level appeal boost. This allows you to model the impact of creative improvements — for example, a premium video unit with a 1.5× base CTR lift — without touching live campaigns.
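A creative scenario of this shape can be sketched as a small config plus an effective-CTR rule. The field names and the multiplicative fatigue model below are assumptions for illustration:

```python
# Hypothetical creative-scenario config; field names are assumptions.
creative = {
    "name": "premium_video_a",
    "ctr_multiplier": 1.5,                 # the 1.5x base CTR lift above
    "fatigue_rate": 0.02,                  # per-impression decay of the lift
    "segment_boost": {"High Intent": 1.2}, # segment-level appeal boost
}

def effective_ctr(base_ctr, creative, impressions_seen, segment="Standard"):
    # Multiplicative model: lift decays with exposure, boosted per segment.
    lift = creative["ctr_multiplier"] * (1 - creative["fatigue_rate"]) ** impressions_seen
    return base_ctr * lift * creative["segment_boost"].get(segment, 1.0)

print(round(effective_ctr(0.01, creative, impressions_seen=0), 5))
```

Running the same strategy with and without the creative config isolates the creative's contribution to ROAS from everything else in the market.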

User Segment Intelligence. The calibration pipeline clusters your user base into behavioural segments (Whale, High Intent, Engaged, High Value, Volume, Toxic, Standard). Simulation output is broken down by segment, allowing you to identify which cohorts drive disproportionate ROAS and prioritise inventory accordingly.

First-Price / Second-Price Switching. The auction type distribution is a configurable parameter. You can test the same strategy across pure first-price, pure second-price, or any blended market structure.
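The difference between the two clearing rules is worth pinning down, since it changes what a winning bid actually pays. A minimal sketch, with the blended case drawing the type per auction from the configured distribution:

```python
# Clearing rule sketch: what the winner pays under each auction type.
def clearing_price(your_bid, best_other, floor, auction_type):
    if your_bid <= max(best_other, floor):
        return None  # lost the auction
    if auction_type == "first_price":
        return your_bid                 # pay what you bid
    return max(best_other, floor)       # second price: runner-up or floor

print(clearing_price(2.0, 1.4, 0.5, "first_price"))   # pays 2.0
print(clearing_price(2.0, 1.4, 0.5, "second_price"))  # pays 1.4
```

The same bids therefore produce identical win rates but different spend curves under the two regimes, which is why the auction-type mix materially changes ROAS.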

Delayed Attribution Modelling. A censored Accelerated Failure Time (AFT) model is trained on your conversion_timestamp data. The sandbox correctly attributes conversions that fall outside a naive same-session window, producing ROAS figures that reflect your actual purchase cycle rather than a truncated attribution window. Conversions with realistic multi-day delays are bridged across simulation steps automatically.
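The bridging behaviour can be illustrated with a toy stand-in for the trained AFT model: draw a delay for each conversion and credit it to the simulation step it lands in, rather than truncating it to the click's own step. The log-normal distribution and its parameters here are invented for illustration:

```python
import math
import random

# Toy stand-in for the AFT delay model (parameters invented; median 24 h):
# each conversion is credited to the step it lands in, not the click's step.
random.seed(7)
STEP_HOURS = 24

delays = [random.lognormvariate(math.log(24), 1.0) for _ in range(5)]
step_offsets = [int(d // STEP_HOURS) for d in delays]  # 0 = same step
print(step_offsets)
```

Any offset greater than zero is a conversion a naive same-session window would have dropped, which is exactly the revenue the AFT model restores to the ROAS figures.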