Evaluation & Deliverables

The Evaluation Period Structure

A BidOptic Evaluation Agreement is structured as a time-limited engagement. The exact Evaluation Period duration is negotiated and agreed in writing between the Parties before the evaluation begins. The deliverable is a signed Evaluation Comparison Report that you prepare and deliver to BidOptic, presenting three primary metrics from your own run against your calibrated baseline. These metrics form the basis of the commercial conversation at the end of the Evaluation Period.

What the Container Produces

At the end of each run, BidOptic writes three files to your client/output/ directory. All files are written locally — they never leave your VPC.

client/output/
├── market_intelligence.json   ← Segment profiles, economic viability verdict, dataset summary
├── latency_profile.json       ← Bid latency distribution, timeout analysis, forfeited revenue
└── simulation_history.csv     ← Per-step spend, wins, clicks, conversions, and ROAS (captured from the first seed run, Seed 42)

market_intelligence.json is the primary calibration output. It contains three sections:

dataset_summary — row count, historical win rate, conversion count, days of data, and average daily impression volume. Note: historical_win_rate_pct reflects your market's observed win rate from the training data. Your strategy's win rate in simulation will differ based on your bidding logic. This discrepancy is expected and represents the exact signal the simulation is designed to surface.
economic_viability — a plain-English viability verdict, global CTR and CVR rates, the implied auction-to-conversion rate, and your calibrated daily spend and suggested 7-day budget guardrail.
segment_intelligence — one entry per user segment, sorted by average order value descending. Each entry includes a heuristic label (Whale, High Intent, Engaged, High Value, Volume, Toxic, or Standard) and a one-sentence strategy recommendation — both derived from the relative metrics within your dataset — as well as population share, CTR, CVR, and average order value.

latency_profile.json contains the latency audit results: total auctions simulated, bids exceeding the timeout threshold, latency leak percentage, and forfeited revenue in your campaign currency.

simulation_history.csv contains the full per-step telemetry from Seed 42, suitable for plotting spend curves and ROAS trajectories over the simulation period.
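As a rough illustration of how the per-step telemetry could be turned into a spend curve, the sketch below parses CSV text and accumulates spend over steps. The column names (step, spend, roas) and the sample rows are assumptions made for this example; check the header row of your own simulation_history.csv before reusing it.

```python
import csv
import io
from itertools import accumulate


def load_history(csv_text):
    """Parse CSV text into a list of dict rows, one per simulation step."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def cumulative_spend(rows, spend_col="spend"):
    """Running total of spend; 'spend' is a hypothetical column name."""
    return list(accumulate(float(row[spend_col]) for row in rows))


# Made-up sample standing in for the real file contents.
sample = "step,spend,roas\n0,10.0,1.5\n1,12.5,1.8\n2,7.5,2.1\n"
rows = load_history(sample)
curve = cumulative_spend(rows)
print(curve)
```

To run it against the real file, read client/output/simulation_history.csv into a string first and pass the actual spend column name.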

What You Learn During the Evaluation

At the end of the Evaluation Period, the Client delivers an Evaluation Comparison Report to BidOptic containing projected vs. actual ROAS, the calculated ROAS Projection Error, the win rate sanity check, and the Client's determination of whether the Accuracy Threshold was met. The three primary metrics presented in that report are described below.

1. Latency Leak Percentage

The fraction of technically winnable auctions lost because your bidding stack's response time exceeded the exchange timeout threshold.

The simulation measures your actual enrich + bid wall-clock time at every step. It then applies the calibrated Infrastructure Latency Twin (trained on your bid_latency_ms column, or synthesised from market priors if absent) to derive a total round-trip time distribution. Auctions where the modelled total latency exceeds the BID_TIMEOUT_MS threshold are counted as latency losses, even if your bid price would have won.

A Latency Leak of 3–8% is common in production DSP deployments. Reducing it is often the highest-leverage engineering intervention available. The forfeited revenue figure in latency_profile.json translates this percentage into a monetary figure in your campaign currency.
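The arithmetic behind the Latency Leak Percentage can be sketched as follows. This is a minimal illustration of the definition above, not BidOptic's implementation: the Auction fields, the 100 ms timeout, and the way forfeited revenue is summed are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Auction:
    total_latency_ms: float  # modelled round-trip latency (hypothetical field)
    price_would_win: bool    # True if the bid price alone would have won
    expected_revenue: float  # hypothetical per-auction revenue estimate


def latency_leak(auctions, bid_timeout_ms):
    """Return (leak percentage, forfeited revenue) over winnable auctions."""
    winnable = [a for a in auctions if a.price_would_win]
    if not winnable:
        return 0.0, 0.0
    # A winnable auction is lost when modelled latency exceeds the timeout.
    lost = [a for a in winnable if a.total_latency_ms > bid_timeout_ms]
    leak_pct = 100.0 * len(lost) / len(winnable)
    forfeited = sum(a.expected_revenue for a in lost)
    return leak_pct, forfeited


# Made-up auctions; 100 ms stands in for the BID_TIMEOUT_MS value.
auctions = [
    Auction(80.0, True, 1.0),
    Auction(130.0, True, 2.0),
    Auction(95.0, True, 0.5),
    Auction(140.0, False, 3.0),  # too slow, but not winnable on price anyway
    Auction(60.0, True, 1.5),
]
leak_pct, forfeited = latency_leak(auctions, bid_timeout_ms=100.0)
print(leak_pct, forfeited)
```

Only auctions that would have won on price count towards the denominator, which matches the "technically winnable" wording in the definition.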

2. Strategy Funnel Accuracy

The simulation reconstructs your bidding funnel (Strategy Win Rate, CTR, CVR, and resulting CPA/ROAS) based on the decisions made by your Strategy.

The pre-calibration baseline compares these simulated funnel metrics against your actual DSP logs. The post-tuning figure reflects the funnel improvement after one or more iterations informed by segment-level calibration data. This quantifies exactly where your existing strategy is leaking value (e.g., high win rate but zero clicks, or winning expensive inventory with low CVR).
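For reference, the funnel metrics named above can be derived from raw counts as in the sketch below. It uses the conventional industry definitions (CTR as clicks per won impression, CVR as conversions per click, CPA as spend per conversion, ROAS as revenue per unit of spend). Whether BidOptic uses exactly these denominators is an assumption, and the input numbers are invented.

```python
def funnel_metrics(bids, wins, clicks, conversions, spend, revenue):
    """Compute funnel ratios; a ratio is None when its denominator is zero."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "win_rate": ratio(wins, bids),
        "ctr": ratio(clicks, wins),
        "cvr": ratio(conversions, clicks),
        "cpa": ratio(spend, conversions),
        "roas": ratio(revenue, spend),
    }


# Made-up counts for one simulated period.
metrics = funnel_metrics(
    bids=1000, wins=200, clicks=10, conversions=2, spend=50.0, revenue=150.0
)
print(metrics)
```

Running the same function over your DSP logs and over the simulated counts gives two dictionaries that can be compared metric by metric to see where the funnel diverges.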

3. Calibrated ROAS Projection

The ROAS trajectory produced by the simulation under your configured budget, KPI mode, and bidding strategy — expressed as a mean and variance across the 3-seed multi-reality run.

The multi-seed design intentionally introduces market variance between runs. A tight variance band indicates a robust strategy; a wide band indicates strategies that are sensitive to market noise and warrant further tuning before production deployment.
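A minimal sketch of the mean-and-variance summary across seeds is shown below. The ROAS values and the seed numbers other than 42 are invented for illustration, and the use of sample variance (rather than population variance) is an assumption, since the document does not say which one the container reports.

```python
import statistics


def summarise_roas(roas_by_seed):
    """Summarise final ROAS across seed runs (sample variance, n - 1)."""
    values = list(roas_by_seed.values())
    return {
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),
        "stdev": statistics.stdev(values),
    }


# Hypothetical final ROAS per seed; only Seed 42 is named in the document.
roas_by_seed = {42: 2.0, 43: 2.5, 44: 3.0}
summary = summarise_roas(roas_by_seed)
print(summary)
```

A standard deviation that is small relative to the mean corresponds to the tight band described above.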

This figure is a calibrated estimate grounded in your own data. It is not a guarantee of production performance.

The Evaluation Comparison Report

Commercial Trigger

A signed Evaluation Agreement includes a defined, objective commercial trigger based on simulation accuracy — not a subjective benchmark comparison.

The evaluation protocol is:

Calibrate the simulation on your first 8 weeks of historical auction data.
Implement your production Enricher and Strategy using the exact same models and logic you ran during the 9th week of live traffic (the Holdout Week).
Run the simulation against the calibrated environment for the equivalent of week 9.
Compare the simulation's projected ROAS to your actual week-9 ROAS. Win rate is reported as a secondary diagnostic metric in the Evaluation Comparison Report and does not constitute a pass/fail criterion.

If the error between the simulation's predictions and your actual week-9 results is within the pre-agreed accuracy threshold (defined in writing before the evaluation begins), you may convert to a paid licence by sending written confirmation to BidOptic. We offer flexible quarterly, annual, or multi-year subscription terms, with significant early-adopter discounts applied for our Design Partners.

If the error exceeds the threshold, the evaluation closes. You provide BidOptic with a small error-summary file (containing only the delta metrics, not your raw logs) so the calibration models can be improved. No licence fee is charged.
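The pass/fail check described above can be sketched as follows. The document does not give the formula for the ROAS Projection Error, so this example assumes a relative absolute error against the actual week-9 ROAS. The threshold values and ROAS figures are placeholders; your agreed threshold and error definition are whatever the Parties fixed in writing.

```python
def roas_projection_error(projected_roas, actual_roas):
    """Relative absolute error (an assumed definition, not the contractual one)."""
    if actual_roas == 0:
        raise ValueError("actual ROAS must be non-zero")
    return abs(projected_roas - actual_roas) / abs(actual_roas)


def accuracy_threshold_met(projected_roas, actual_roas, threshold):
    """True when the projection error is within the agreed threshold."""
    return roas_projection_error(projected_roas, actual_roas) <= threshold


# Placeholder figures: simulation projected 2.7, week 9 actually delivered 3.0.
error = roas_projection_error(2.7, 3.0)
print(round(error, 4))
print(accuracy_threshold_met(2.7, 3.0, threshold=0.15))
print(accuracy_threshold_met(2.7, 3.0, threshold=0.05))
```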

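The trigger threshold and licence terms are agreed in writing before the evaluation starts. The container does not transmit any evaluation output to BidOptic: you run the container and you own the results. The only items you deliver are the Evaluation Comparison Report and, if the threshold is missed, the error-summary file described above.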

Post-Commercial Metrics (ECS Programme — Optional)

Design Partners who have elected to participate in the Enhanced Case Study Programme agree to deliver aggregate performance metrics to BidOptic within 120 days of the Commercial Licence Start Date. This window accounts for the full 90-day post-adoption measurement period plus 30 days to compile and deliver the report. These metrics are the delta between the 90 days immediately before and the 90 days immediately after going live with BidOptic: specifically, average ROAS, average CPA, and average bid win rate, each expressed as a before/after difference. Raw auction logs, absolute figures, and individual-level data are never shared. Delivery of these delta metrics is the condition of retaining the ECS pricing for subsequent renewals. This programme is entirely optional; clients who do not elect to participate proceed on the standard Design Partner pricing.
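One way to compute the before/after deltas is sketched below. The metric keys and all figures are invented for illustration. The sketch returns both the absolute and the relative difference, because the document does not specify which form the delivered report must use.

```python
def before_after_delta(before, after):
    """Per-metric difference between two 90-day averages."""
    deltas = {}
    for name, before_value in before.items():
        diff = after[name] - before_value
        deltas[name] = {
            "absolute": diff,
            "relative_pct": 100.0 * diff / before_value,
        }
    return deltas


# Made-up 90-day averages before and after adoption.
before = {"roas": 2.0, "cpa": 40.0, "win_rate": 0.20}
after = {"roas": 2.5, "cpa": 30.0, "win_rate": 0.25}
deltas = before_after_delta(before, after)
print(deltas["roas"], deltas["cpa"])
```

Sharing only the relative_pct values keeps absolute business figures out of the delivered file.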

Case Study Participation

Upon conversion to a commercial licence, BidOptic's case study rights fall into two categories.

Evaluation delta (all Design Partners).

Regardless of ECS election, BidOptic publishes the delta between the simulation's projected ROAS for the Holdout Week and the client's actual Holdout Week ROAS — i.e., the accuracy proof. This is the core case study: it shows the gap between prediction and ground truth, not the client's absolute business figures. Individual quotes from named contacts require that person's written approval before publication.

Commercial delta (ECS-elected Design Partners only).

Clients who elect the ECS Programme additionally share the before/after delta — the difference between the 90 days preceding BidOptic adoption and the 90 days following it. BidOptic may publish these as relative differences (e.g., "ROAS improved by X%"). Absolute figures, conversion rates, and bid volumes are never disclosed without the client's separate written consent.

Frequently Asked Questions

Do we need to share our bidding logic with BidOptic? No. Your Enricher and Strategy implementations run entirely inside your environment. BidOptic never sees your code.

Can we run multiple simulation episodes during the evaluation? Yes. The container is not episode-limited. You can re-run with different config parameters, different creative scenarios, or revised bidding logic as many times as you choose within the evaluation period. Calibration is cached after the first run, so subsequent runs go straight to simulation.

What happens to the trained models after the evaluation? They remain in your client/output/ directory. BidOptic has no copy. If you choose not to proceed to a licence, you can delete them. If you proceed, they form the baseline for the production deployment.

Can we test a strategy we have not yet built? Yes. You can implement a stub Strategy that encodes any bid logic you want to evaluate, including approaches that are not yet integrated into your production stack.

Why is the simulation win rate lower than the calibration report win rate? The calibration win rate reflects your historical market data: the fraction of auctions you won across all bids in your logs. The simulation win rate reflects what your specific Strategy bids in the simulated market. If your Strategy bids below the calibrated floor price on many auctions, it will win fewer of them. This discrepancy is diagnostic information, not a model error. Use market_intelligence.json segment data to identify which publishers and segments your current bids are losing, then adjust your Enricher or bid logic accordingly.

Ready to Proceed?

Once you have run the Open-Source Schema Validator against a sample of your logs and generated a passing receipt, you are ready to begin the evaluation.

Next Steps

Once your dataset passes validation, follow these steps to begin your evaluation:

Submit your Receipt: Send your bidoptic_receipt_*.json file to berlik@bidoptic.com.
Provide Host Specs: Include your target Linux host specifications (machine-id and CPU core count). This allows us to mint your hardware-locked Evaluation License.
Initialize Sandbox: We will securely deliver your bidoptic.tar.gz container and Python SDK bundle. You can begin your Evaluation Period immediately upon receipt.
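The host specifications requested in the steps above could be collected with a short script such as the one below. It assumes that "machine-id" refers to the systemd identifier stored in /etc/machine-id, which the document does not state explicitly. Note also that os.cpu_count() reports logical CPUs, which may differ from the physical core count. Confirm with BidOptic which values the licence expects.

```python
import os
from pathlib import Path


def host_specs(machine_id_path="/etc/machine-id"):
    """Collect machine-id (assumed systemd path) and logical CPU count."""
    path = Path(machine_id_path)
    machine_id = path.read_text().strip() if path.exists() else None
    return {"machine_id": machine_id, "cpu_cores": os.cpu_count()}


specs = host_specs()
print(specs)
```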

Technical Support & Inquiries

If you encounter issues during validation or have questions regarding the architecture:

Email: berlik@bidoptic.com
LinkedIn: Csanád Berlik
Main Site: bidoptic.com