@grzesir grzesir commented Oct 29, 2025

DO NOT MERGE – ThetaData cache diagnostics

Summary

  • add a reusable tooltip builder so trades in terminal statuses (cash_settled, assigned, etc.) render markers/tooltips in performance reports
  • capture broker diagnostics when quotes are missing and add a scripted ThetaData vs Polygon coverage report with accompanying investigation notes
  • update parity/backtest artefacts while preserving the temporary [THETA][DEBUG] logging used for cache validation

Evidence

  • Weekly Momentum cold vs warm runs: logs/WeeklyMomentumOptionsStrategy_2025-10-28_22-25_gYiGRT_logs.csv, logs/WeeklyMomentumOptionsStrategy_2025-10-28_22-27_lhcW2o_logs.csv
  • Cache verification: run_cache_validation.sh → logs/pandas_verification_results.json
  • Data coverage script: python3 scripts/compare_option_data_coverage.py
  • Polygon parity commands: pytest -s tests/backtest/test_thetadata_vs_polygon.py -k "stock_price_comparison or option_price_comparison"
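
For reviewers who have not opened the coverage script, the idea behind a ThetaData vs Polygon coverage comparison can be sketched as a set difference over the trading days each source returns. This is a minimal illustration only: the function name `coverage_gaps`, its return keys, and the sample dates are hypothetical and are not taken from scripts/compare_option_data_coverage.py.

```python
from datetime import date


def coverage_gaps(theta_days, polygon_days):
    """Compare the trading days returned by two data sources.

    Hypothetical helper: reports days that one source has and the other lacks,
    plus the share of days both sources cover.
    """
    theta, polygon = set(theta_days), set(polygon_days)
    union = theta | polygon
    return {
        "missing_in_theta": sorted(polygon - theta),
        "missing_in_polygon": sorted(theta - polygon),
        # Ratio of shared days to all days seen in either source.
        "overlap_ratio": len(theta & polygon) / len(union) if union else 1.0,
    }


# Illustrative dates only, not real coverage results.
theta_days = [date(2025, 10, 20), date(2025, 10, 21), date(2025, 10, 23)]
polygon_days = [
    date(2025, 10, 20),
    date(2025, 10, 21),
    date(2025, 10, 22),
    date(2025, 10, 23),
]
report = coverage_gaps(theta_days, polygon_days)
print(report)
```

A per-contract version would apply the same comparison keyed by option symbol, which makes it easier to see whether gaps cluster around particular expirations.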

Test Status (2025-10-28 @ 23:30 EDT)

  • python3 -m pytest (local) → FAILED after 24m03s
    • cash/price drift assertions in tests/backtest/test_backtesting_broker_processing.py (market & trailing-stop fills now using different bar values)
    • env auto-select coverage in tests/test_backtesting_data_source_env.py (log expectations not met when suite runs end-to-end)
    • polygon legacy backtest parity (tests/backtest/test_polygon.py::TestPolygonBacktestFull::test_polygon_restclient) – option fill price now 6.30 vs expected 6.10
    • ThetaData process health check trio – pass in isolation but fail in the full run; needs follow-up triage
  • GitHub Actions: LintAndTest #18895406385 FAILED with the same backtesting_broker cash assertions plus strategy log AttributeErrors (order id missing in env auto-select tests)
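
To put the polygon parity failure in proportion, the 6.30 vs 6.10 fill price can be expressed as an absolute and a relative drift. The sketch below is only a worked calculation; the helper names and the 1% tolerance are assumptions for illustration and do not reflect the thresholds used in tests/backtest/test_polygon.py.

```python
def price_drift(actual, expected):
    """Return (absolute, relative) difference between two fill prices."""
    abs_diff = abs(actual - expected)
    rel_diff = abs_diff / abs(expected)
    return abs_diff, rel_diff


def within_tolerance(actual, expected, rel_tol=0.01):
    """True when the relative drift is at or below rel_tol (assumed 1% default)."""
    return price_drift(actual, expected)[1] <= rel_tol


# Values from the failing assertion: observed 6.30, expected 6.10.
abs_diff, rel_diff = price_drift(6.30, 6.10)
print(f"absolute drift: {abs_diff:.2f}, relative drift: {rel_diff:.2%}")
print("within 1% tolerance:", within_tolerance(6.30, 6.10))
```

The drift works out to about 0.20 in price, or roughly 3.3% of the expected fill, which is larger than rounding noise and consistent with the fill being taken from a different bar value.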

Outstanding

  • investigate the market/trailing-stop fill deltas introduced by the new diagnostics before trimming logs
  • resolve the env auto-select assertions so logs show ThetaData/Polygon selection deterministically when the full suite runs
  • retest locally + CI once fixes land; only after green runs should we remove [THETA][DEBUG] instrumentation and prune investigative artefacts

Description by Korbit AI

What change is being made?

Introduce and wire in the new ThetaData pandas-based backtesting path, enhance cache/parity diagnostics, and broaden internal utilities to support richer time-shift and data-caching behavior across backtesting.

Why are these changes being made?

Enable pandas-only ThetaData backtesting with improved cache validation, richer diagnostics, and better handling of time shifts and asset lookups to improve parity with other data feeds and aid troubleshooting. This work lays groundwork for Polars parity later, and adds instrumentation to surface cache coverage, data availability, and ordering behavior during backtests.


korbit-ai bot commented Oct 29, 2025

Korbit doesn't automatically review large (3000+ lines changed) pull requests such as this one. If you want me to review anyway, use /korbit-review.

@grzesir grzesir changed the title from "[DO NOT MERGE] ThetaData cache hardening for pandas backtests" to "ThetaData cache diagnostics" Oct 29, 2025
@grzesir grzesir merged commit 44f3d22 into dev Oct 29, 2025
1 check passed