This repository contains code and data artifacts for "Hybrid Mixed Integer Linear Programming for Large-Scale Join Order Optimisation", accepted at VLDB 2026.
HybridMILPConvBenchmarks.py applies our MILP method to queries from conventional query optimisation benchmarks (TPC-H [1], TPC-DS [2], LDBC [3], Join Order Benchmark [4]). HybridMILPScalabilityBenchmarks.py applies it to large tree queries generated by Neumann and Radke [5], to benchmark scalability to large query loads. We rely on the Gurobi MILP solver, which requires user access data to be specified in config.py.

The Experiments folder contains all queries and the results yielded by our approach: (1) the raw results produced by the MILP step of our method, (2) the subproblems extracted from the raw solutions to complete the optimisation of the raw MILP results, together with the corresponding subproblem solutions, and (3) the complete results, which combine the raw MILP solutions with the subproblem solutions. The Experiments/Thresholds folder contains cost approximation thresholds derived from the preliminary adaptive solution, in accordance with our hybrid method.

Finally, the Scripts folder contains utility scripts, such as code to convert problems into the format required by the complementary adaptive method of Neumann and Radke [5], which our hybrid algorithm applies.
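
As a rough illustration of the Gurobi access data mentioned above, the following sketch assumes a Gurobi Web License Service (WLS) licence. It reads the credentials from environment variables and passes them to gurobipy. The environment variable names and both helper functions are hypothetical and are not taken from this repository's config.py, which may expect different fields. WLSACCESSID, WLSSECRET and LICENSEID are Gurobi's own parameter names.

```python
import os


def gurobi_params_from_env(env=os.environ):
    """Collect Gurobi WLS credentials from (hypothetical) environment variables."""
    # Gurobi parameter name -> assumed environment variable name
    required = {
        "WLSACCESSID": "GRB_WLSACCESSID",
        "WLSSECRET": "GRB_WLSSECRET",
        "LICENSEID": "GRB_LICENSEID",
    }
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise KeyError("missing Gurobi credentials: " + ", ".join(missing))
    params = {name: env[var] for name, var in required.items()}
    params["LICENSEID"] = int(params["LICENSEID"])  # LICENSEID is an integer parameter
    return params


def make_gurobi_env(params):
    """Create a Gurobi environment from the collected parameters."""
    # Imported lazily so the helper above can be used without Gurobi installed.
    import gurobipy as gp

    return gp.Env(params=params)
```

With the three variables set, make_gurobi_env(gurobi_params_from_env()) would return an environment that can be passed to gp.Model(env=...). Other licence types, such as a local licence file, do not need these parameters.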
[1] Transaction Processing Performance Council. 2023. TPC Benchmark H. https://www.tpc.org
[2] Transaction Processing Performance Council. 2023. TPC Benchmark DS. https://www.tpc.org
[3] Renzo Angles, Peter Boncz, Josep-Lluis Larriba-Pey, Irini Fundulaki, Thomas Neumann, Orri Erling, Peter Neubauer, Norbert Martínez-Bazan, Venelin Kotsev, and Ioan Toma. 2014. The Linked Data Benchmark Council: A graph and RDF industry benchmarking effort. ACM SIGMOD Record 43 (05 2014), 27–31. https://doi.org/10.1145/2627692.2627697
[4] Viktor Leis, Bernhard Radke, Andrey Gubichev, Atanas Mirchev, Peter Boncz, Alfons Kemper, and Thomas Neumann. 2018. Query optimization through the looking glass, and what we found running the join order benchmark. The VLDB Journal 27 (2018), 643–668.
[5] Thomas Neumann and Bernhard Radke. 2018. Adaptive Optimization of Very Large Join Queries. In Proceedings of the 2018 International Conference on Management of Data (Houston, TX, USA) (SIGMOD ’18). Association for Computing Machinery, New York, NY, USA, 677–692. https://doi.org/10.1145/3183713.3183733