
Conversation


@Ubospica Ubospica commented Nov 30, 2025

This PR simplifies the logic of all builders, provides a unified abstraction across them, and introduces a simpler abstraction for Runnable. In addition, it:

  1. Renames CUDABuilder to TorchBuilder and adds CPP support to it
  2. Enhances the CI unit test to reduce execution time
  3. Removes get_builder_registry and provides class-level singleton BuilderRegistry.get_instance() instead
  4. Provides a hash function for Solution
  5. Provides utilities for tests
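The class-level singleton in item 3 can be sketched as follows. This is a minimal illustration of the `get_instance()` pattern, not the actual `BuilderRegistry` internals from `flashinfer_bench/compile/registry.py`:

```python
import threading
from typing import Optional


class BuilderRegistry:
    """Minimal sketch of a class-level singleton registry."""

    _instance: Optional["BuilderRegistry"] = None
    _lock = threading.Lock()

    def __init__(self) -> None:
        self._builders = []  # discovered builders, ordered by priority

    @classmethod
    def get_instance(cls) -> "BuilderRegistry":
        # Double-checked locking so concurrent callers share one instance.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance


# Every call site receives the same object:
assert BuilderRegistry.get_instance() is BuilderRegistry.get_instance()
```

Compared with a module-level `get_builder_registry()` function, the class-level accessor keeps the singleton's lifetime and locking encapsulated in the class itself.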

Signed-off-by: Ubospica [email protected]

Summary by CodeRabbit

  • New Features

    • Added TorchBuilder for compiling CUDA solutions via PyTorch extensions.
    • Introduced Solution hashing for deterministic caching and entry-point extraction utilities.
  • Bug Fixes

    • Improved runtime rank validation and error handling with better error messages.
    • Removed legacy CUDA builder; replaced with TorchBuilder.
  • Improvements

    • Simplified builder registry access pattern.
    • Enhanced metadata tracking and cleanup workflows.



coderabbitai bot commented Nov 30, 2025

Warning

Rate limit exceeded

@Ubospica has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 1 minute and 45 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 734657f and a5e1233.

📒 Files selected for processing (3)
  • flashinfer_bench/bench/utils.py (6 hunks)
  • flashinfer_bench/compile/builder.py (1 hunks)
  • flashinfer_bench/compile/builders/torch_builder.py (1 hunks)

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This PR performs a major architectural refactor of the build system, replacing a function-based registry pattern with a class-based singleton, introducing a metadata model for runtime tracking, refactoring builder lifecycle methods, removing CUDABuilder while adding TorchBuilder, and standardizing variable names throughout (defn → definition, sol → solution).

Changes

Cohort / File(s) Summary
Core Builder Architecture
flashinfer_bench/compile/builder.py, flashinfer_bench/compile/registry.py, flashinfer_bench/compile/__init__.py, flashinfer_bench/compile/utils.py
Introduced BuildError exception; refactored Builder base class with new public API (is_available(), can_build(), build(), get_package_name_and_build_path()); removed per-solution caching; added BuilderRegistry singleton with priority-based builder selection and build_reference() for definition code compilation; added utility functions write_sources_to_path() and create_package_name()
Builder Implementations
flashinfer_bench/compile/builders/__init__.py, flashinfer_bench/compile/builders/python_builder.py, flashinfer_bench/compile/builders/triton_builder.py, flashinfer_bench/compile/builders/tvm_ffi_builder.py
Updated all builder classes to extend new Builder base with standardized lifecycle; PythonBuilder now manages cleanup via _get_cleaner(); TritonBuilder delegates to PythonBuilder with metadata override; TVMFFIBuilder refactored to use RunnableMetadata and cleaner-based resource management
New Torch Builder
flashinfer_bench/compile/builders/torch_builder.py
Added TorchBuilder class for CUDA kernel compilation via PyTorch C++/CUDA extensions with DependencyManager for cuBLAS, cuDNN, CUTLASS discovery; includes source filtering, cleaner creation, and error handling
Removed CUDA Builder
flashinfer_bench/compile/builders/cuda_builder.py
Entire CUDA builder implementation removed (replaced by TorchBuilder); eliminated CUDA dependency discovery, source validation, compilation via PyTorch extensions, and per-build cache management
Runnable & Metadata
flashinfer_bench/compile/runnable.py
Introduced RunnableMetadata Pydantic model storing build_type, definition, solution, misc; refactored Runnable with new constructor signature (callable, metadata, cleaner); added call_value_returning() for DPS-oriented output allocation and device management; updated cleanup() method for idempotent resource cleanup
Data Models
flashinfer_bench/data/solution.py, flashinfer_bench/data/trace_set.py
Added SupportedLanguages.CPP; introduced hash() method and entry-point extraction helpers (get_entry_path(), get_entry_symbol()) to Solution and BuildSpec; renamed loop variables defn → definition in trace_set.py
Apply Layer
flashinfer_bench/apply/key.py, flashinfer_bench/apply/runtime.py, flashinfer_bench/apply/table.py
Flattened nested Union in ApplyKey.feats type; renamed parameter defn → definition throughout; updated ApplyKeyFactory.specialize() and runtime to use BuilderRegistry.get_instance() instead of function-based registry accessor; added runtime rank validation in _materialize_axes()
Benchmark & Evaluation
flashinfer_bench/bench/benchmark.py, flashinfer_bench/bench/runner/*.py, flashinfer_bench/bench/evaluators/*.py, flashinfer_bench/bench/utils.py
Standardized variable naming across all files (defn → definition, sol → solution); migrated registry usage to BuilderRegistry.get_instance(); added traceset_root optional parameter to baseline building; updated evaluator signatures and DeviceBaseline field (defn → definition); refactored utils functions with consistent parameter naming
CLI & Examples
flashinfer_bench/cli/main.py, examples/fi_gqa_e2e_example.py, examples/win_at_p.py
Updated variable names for clarity (ts → trace_set, defn → definition, sol → solution); adjusted BuildSpec target_hardware from "gpu" to "cuda"; removed blank lines in docstrings
Test Refactoring
tests/compile/test_*.py, tests/bench/test_*.py, tests/data/test_*.py, tests/conftest.py
Added autouse fixtures for temporary cache directory setup; migrated tests to new builder API (build() instead of _build()); updated Runnable construction with new parameter names; added test_torch_builder.py with CPU/CUDA kernel tests; removed test_cuda_builder.py; added pytest marker requires_torch_cuda; introduced helper fixture detection for CUDA availability; updated test data (BuildSpec target_hardware "gpu"→"cuda")
Configuration
pyproject.toml
Updated torch dependency spec from "==" to ">="; added "apache-tvm-ffi>=0.1.2"; added pytest marker definition for requires_torch_cuda; added setuptools_scm fallback version
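To make the refactored lifecycle concrete, here is a hedged sketch of the base-class API and priority-based selection described in the walkthrough above. Method names follow the summary table; the bodies, `priority` attribute, and `select_builder` helper are illustrative assumptions, not the project's actual code:

```python
from abc import ABC, abstractmethod
from typing import Any, List


class BuildError(RuntimeError):
    """Raised when a builder fails to produce a runnable."""


class Builder(ABC):
    priority: int = 0  # higher priority wins during registry selection

    @abstractmethod
    def is_available(self) -> bool:
        """Whether this builder's toolchain is usable in the current environment."""

    @abstractmethod
    def can_build(self, definition: Any, solution: Any) -> bool:
        """Whether this builder supports the given solution (e.g. its language)."""

    @abstractmethod
    def build(self, definition: Any, solution: Any) -> Any:
        """Compile/load the solution and return a runnable, or raise BuildError."""


def select_builder(builders: List[Builder], definition: Any, solution: Any) -> Builder:
    # Pick the highest-priority builder that is both available and applicable.
    for b in sorted(builders, key=lambda b: b.priority, reverse=True):
        if b.is_available() and b.can_build(definition, solution):
            return b
    raise BuildError("no applicable builder for this solution")
```

Under this shape, caching by `Solution.hash()` lives in the registry rather than in individual builders, which matches the review comments below about `build_with_cache` no longer existing on `Builder`.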

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

Key areas requiring detailed attention:

  • Builder base class refactoring (flashinfer_bench/compile/builder.py): Verify new abstract methods and constructor signature are correctly implemented across all subclasses; confirm get_package_name_and_build_path() logic is sound
  • TorchBuilder implementation (flashinfer_bench/compile/builders/torch_builder.py): New complex feature with CUDA dependency discovery and build orchestration; verify error handling, path resolution, and cleaner resource management
  • Registry singleton pattern (flashinfer_bench/compile/registry.py): Validate priority-based builder selection, cache management, and build_reference() logic with pseudo-solution creation
  • RunnableMetadata and Runnable refactor (flashinfer_bench/compile/runnable.py): Confirm metadata field population across all builders and call_value_returning() device/shape inference logic
  • TVMFFIBuilder refactoring (flashinfer_bench/compile/builders/tvm_ffi_builder.py): Complex builder with file locking and multi-stage build; verify new metadata construction and cleaner integration
  • Comprehensive variable renaming: Spot-check representative files across layers to ensure defn/sol renaming is consistent and no references were missed
  • Test coverage: Verify new test_torch_builder.py and test_utils.py adequately cover the new functionality; confirm removal of test_cuda_builder.py doesn't leave gaps
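The call_value_returning() logic flagged above wraps a destination-passing-style (DPS) kernel: outputs are preallocated from shapes inferred from the definition, passed into the kernel, and then returned to the caller. A minimal sketch of the idea, using plain lists instead of torch tensors (all names here are hypothetical):

```python
from typing import Callable, List, Sequence


def call_value_returning(kernel: Callable[..., None],
                         out_shapes: Sequence[int],
                         *inputs: List[float]) -> List[List[float]]:
    """Destination-passing style: allocate outputs, pass them to the
    kernel, and hand them back to the caller as return values."""
    outputs = [[0.0] * n for n in out_shapes]
    kernel(*inputs, *outputs)  # kernel writes into the preallocated buffers
    return outputs


def add_kernel(a, b, out):
    # A DPS kernel: returns nothing, writes its result into `out`.
    for i in range(len(a)):
        out[i] = a[i] + b[i]


result, = call_value_returning(add_kernel, [3], [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
# result == [11.0, 22.0, 33.0]
```

The real implementation additionally infers dtypes and the target device from the input tensors, which is the device/shape inference logic the checklist asks reviewers to confirm.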

Possibly related PRs

  • PR #68: Directly aligned with this PR's module reorganization and import path updates; shares builder architecture changes
  • PR #117: Concurrent modifications to TVMFFIBuilder implementation and registry imports; tightly coupled changes
  • PR #111: Related through new TorchBuilder and TVMFFIBuilder exposure in compile/builders/__init__.py

Poem

🐇 Hops through the refactored halls,
Where builders now march single-file,
Metadata flows, registry calls—
From CUDA's end to Torch's new style!
Definitions named, solutions tracked,
The pipeline gleams—refactoring packed!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 54.76%, which is below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'refactor: Builder System' directly summarizes the main structural change in the PR—a comprehensive refactoring of the builder system architecture, including new builder classes, API modernization, and integration updates across the codebase.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.


@gemini-code-assist

Summary of Changes

Hello @Ubospica, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request introduces a comprehensive refactoring of the builder system, which is central to how solution implementations are compiled and executed. The changes aim to enhance modularity, improve extensibility, and optimize performance through intelligent caching. Key updates include the introduction of a dedicated TorchBuilder for PyTorch-based CUDA extensions, a redesigned BuilderRegistry for automatic builder discovery and caching, and a more robust Runnable interface. These modifications streamline the integration and execution of diverse solution types, making the system more maintainable and efficient.

Highlights

  • Builder System Refactoring: The core builder system has undergone a significant refactoring, introducing a more modular and extensible architecture for compiling and loading solution implementations.
  • New TorchBuilder: A new TorchBuilder has been added to specifically handle CUDA solutions that leverage PyTorch's C++/CUDA extension system, replacing the previous CUDABuilder.
  • Builder Class Enhancements: The abstract Builder class has been updated with a new constructor, a public build method, and a get_package_name_and_build_path helper, streamlining how builders are defined and interact with the system.
  • Singleton BuilderRegistry with Caching: The BuilderRegistry is now a singleton, automatically discovering available builders based on priority and implementing a robust caching mechanism to prevent redundant builds of identical solutions.
  • Runnable Interface Redesign: The Runnable class has been refactored to use a more consistent API with callable, cleaner, and a new RunnableMetadata model for structured information. It also introduces a call_dps method for destination-passing style functions.
  • Solution Metadata and Hashing: The Solution class now includes methods to extract entry file paths and symbols, supports a new CPP language type, and provides a deterministic hash method crucial for the new caching strategy.
  • Centralized Utility Functions: A new flashinfer_bench/compile/utils.py file has been introduced to house common utility functions like write_sources_to_path and create_package_name, promoting code reuse and consistency across builders.
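A deterministic content hash like the one described in the "Solution Metadata and Hashing" highlight can be sketched as follows. The field names here are illustrative, not the actual Solution schema; the key idea is canonical JSON encoding so identical content always yields the same cache key:

```python
import hashlib
import json
from typing import Dict


def solution_hash(name: str, language: str, sources: Dict[str, str]) -> str:
    """Hash a solution's identity and sources deterministically.

    sort_keys + fixed separators give a canonical JSON encoding, so the
    digest is independent of dict insertion order. sha1 is fine here:
    the hash is a cache key, not a security boundary (this is also why
    the S324 lint finding below is arguably a false positive).
    """
    payload = json.dumps(
        {"name": name, "language": language, "sources": sources},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


h1 = solution_hash("gemm", "cuda", {"kernel.cu": "__global__ void k() {}"})
h2 = solution_hash("gemm", "cuda", {"kernel.cu": "__global__ void k() {}"})
assert h1 == h2  # identical content -> identical cache key
```

Any change to a source file changes the digest, which is what makes the hash usable for the registry's build cache.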


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a major and well-executed refactoring of the builder system. The new design centralizes caching in a BuilderRegistry, simplifies the Builder interface, and improves the structure of concrete builder implementations. The introduction of RunnableMetadata and a unified Runnable class with call_dps support is a great improvement.

I've found a few critical issues that will cause runtime errors: a typo in the Runnable.cleanup method and a missing value in the BuildType literal that will break TorchBuilder. I also noticed that several tests seem to be broken due to the refactoring, as they call a non-existent build_with_cache method. Additionally, I have a couple of suggestions to improve code style and maintainability in TritonBuilder and TVMFFIBuilder.

Overall, this is a high-quality refactoring. Once the identified issues are addressed, the builder system will be much more robust and maintainable.

from flashinfer_bench.data import Definition
from flashinfer_bench.utils import dtype_str_to_torch_dtype

BuildType = Literal["cuda", "tvm_ffi", "python", "triton"]


critical

The BuildType literal is defined as Literal["cuda", "tvm_ffi", "python", "triton"]. However, the new TorchBuilder sets the build_type to "torch". This will cause a pydantic validation error at runtime. The BuildType should be updated to include "torch". Since CUDABuilder has been removed and replaced by TorchBuilder, "cuda" should probably be replaced with "torch".

Suggested change
BuildType = Literal["cuda", "tvm_ffi", "python", "triton"]
BuildType = Literal["torch", "tvm_ffi", "python", "triton"]

Comment on lines 168 to 172
if self._closer:
try:
self._closer()
finally:
self._closer = None


critical

There's a typo in the cleanup method. The __init__ method sets the cleaner function to self._cleaner, but this method attempts to access self._closer. This will result in an AttributeError when cleanup is called.

Suggested change
if self._closer:
try:
self._closer()
finally:
self._closer = None
if self._cleaner:
try:
self._cleaner()
finally:
self._cleaner = None

Comment on lines 48 to 49
r1 = b.build_with_cache(d, s)
r2 = b.build_with_cache(d, s)

critical

This test, along with others in test_python_builder.py, test_triton_builder.py, and test_tvm_ffi_builder.py, has been updated to call build_with_cache. However, this method does not exist on the Builder class after the refactoring. The caching logic has been moved to BuilderRegistry, and the build method on Builder no longer performs caching.

These tests appear to be broken and will fail. They need to be updated to reflect the new design, likely by testing the BuilderRegistry for caching behavior or by adapting the builder tests to call the build method directly without expecting caching at the builder level.

Additionally, DummyBuilder in this file implements _build but the Builder abstract base class now requires build to be implemented, so DummyBuilder is no longer a valid subclass of Builder.

Comment on lines +54 to 85
def build(self, definition: Definition, solution: Solution) -> Runnable:
"""Build a solution into a runnable implementation.
This method compiles/loads the solution's source code and returns a Runnable
object that can be executed with the interface specified by the definition.
Parameters
----------
definition : Definition
The problem definition that specifies the expected interface.
solution : Solution
The solution implementation to build.
Returns
-------
Runnable
An executable wrapper around the built implementation.
Raises
------
BuildError
If the build fails for any reason (compilation errors, missing dependencies, etc.).
"""
...

high

The build method is now abstract, but some test implementations like DummyBuilder in tests/compile/test_builder.py still implement _build, which will cause an error. Please ensure all concrete builder implementations, including those in tests, are updated to implement the public build method.

super().__init__()
self._py_builder = py_builder
def __init__(self) -> None:
Builder.__init__(self, self._KEY_PREFIX, self._BUILD_DIR_NAME)

medium

The __init__ method directly calls Builder.__init__, bypassing the __init__ of its direct parent, PythonBuilder. While this works currently because PythonBuilder.__init__ only calls its superclass's __init__, it's a fragile design. If PythonBuilder.__init__ is modified in the future to include additional setup logic, TritonBuilder would not inherit it. It's more robust to use super().__init__() to respect the method resolution order and make the inheritance chain clearer.

Suggested change
Builder.__init__(self, self._KEY_PREFIX, self._BUILD_DIR_NAME)
super().__init__()
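The fragility being flagged is about Python's method resolution order: `super().__init__()` walks the full MRO, while a direct `Builder.__init__(self, ...)` call skips every intermediate class. A minimal illustration with hypothetical classes:

```python
class Base:
    def __init__(self):
        self.trace = ["Base"]


class Middle(Base):
    def __init__(self):
        super().__init__()
        self.trace.append("Middle")  # setup a grandchild should inherit


class ChildGood(Middle):
    def __init__(self):
        super().__init__()  # runs Middle.__init__, which runs Base.__init__


class ChildBad(Middle):
    def __init__(self):
        Base.__init__(self)  # skips Middle's setup entirely


assert ChildGood().trace == ["Base", "Middle"]
assert ChildBad().trace == ["Base"]  # Middle's logic silently lost
```

TritonBuilder currently plays the role of `ChildBad`: it works only because PythonBuilder's `__init__` happens to add nothing yet.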

Comment on lines +235 to +237
import tvm_ffi
from tvm_ffi.utils import FileLock

medium

The imports for tvm_ffi and tvm_ffi.utils.FileLock are inside the build method. It's generally better practice to place imports at the top of the file for clarity and to avoid repeated import overhead. Since the availability of tvm_ffi is already checked by is_available() before this builder is used, these imports can be safely moved to the top level of the file.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (12)
flashinfer_bench/compile/__init__.py (1)

16-27: Exports look correct; consider sorting __all__ alphabetically.

The new RunnableMetadata export is appropriate. Static analysis flags that __all__ is not sorted (RUF022). If the project enforces isort-style sorting, apply this diff:

 __all__ = [
     "Builder",
     "BuildError",
     "BuilderRegistry",
+    "get_builder_registry",
     "Runnable",
     "RunnableMetadata",
-    "get_builder_registry",
 ]
flashinfer_bench/compile/builders/__init__.py (1)

1-8: LGTM! TorchBuilder correctly replaces CUDABuilder.

The module docstring is helpful and the export list is updated correctly. If the project enforces sorted __all__ (RUF022), consider alphabetical ordering:

-__all__ = ["TorchBuilder", "PythonBuilder", "TritonBuilder", "TVMFFIBuilder"]
+__all__ = ["PythonBuilder", "TorchBuilder", "TritonBuilder", "TVMFFIBuilder"]
flashinfer_bench/compile/utils.py (2)

39-54: Security checks use assertions - consider using explicit exceptions.

Assertions can be disabled with python -O, which would bypass these security checks. The SourceFile model already validates paths at creation time (per the relevant snippet from flashinfer_bench/data/solution.py), so these are truly defensive. However, for defense-in-depth, explicit exceptions are safer:

-        assert not src_path_obj.is_absolute(), f"Absolute path detected: {src.path}"
-        assert ".." not in src_path_obj.parts, f"Path traversal detected: {src.path}"
+        if src_path_obj.is_absolute():
+            raise ValueError(f"Absolute path detected: {src.path}")
+        if ".." in src_path_obj.parts:
+            raise ValueError(f"Path traversal detected: {src.path}")

28-31: Minor: Docstring says "absolute paths" but returned paths may not be resolved.

The docstring states the function returns "List of absolute paths" but the code appends path / src.path without calling .resolve(). If path is relative, the returned paths will also be relative.

-        paths.append(src_path)
+        paths.append(src_path.resolve())

Alternatively, update the docstring to say "paths relative to the provided root" if that's the intended behavior.

Also applies to: 53-53

flashinfer_bench/compile/builders/torch_builder.py (1)

96-131: Robust error handling; consider using spread operator for list construction.

The broad Exception catch (line 125) is appropriate here since dependency discovery is best-effort. For the Windows linker flags (line 123), using the spread operator is cleaner per RUF005:

-                        ldflags = [f"/LIBPATH:{lib_path}"] + lib_names
+                        ldflags = [f"/LIBPATH:{lib_path}", *lib_names]
flashinfer_bench/compile/builder.py (1)

7-7: Unused import Dict.

Dict is imported but not used in this file.

-from typing import Dict, Tuple
+from typing import Tuple
flashinfer_bench/compile/builders/python_builder.py (2)

6-6: Unused import os.

os is imported but not used in this file.

 import importlib
-import os
 import shutil

41-58: get_key and _get_build_path duplicate base class logic.

These methods replicate functionality already provided by get_package_name_and_build_path in the base class. Consider removing them to reduce code duplication, unless there's a specific reason to keep them separate.

flashinfer_bench/compile/registry.py (1)

122-124: Variable hash shadows builtin.

Using hash as a variable name shadows the Python builtin. Consider renaming to sol_hash or solution_hash.

-        hash = sol.hash()
-        if hash in self._cache:
-            return self._cache[hash]
+        sol_hash = sol.hash()
+        if sol_hash in self._cache:
+            return self._cache[sol_hash]

And update the cache assignment on line 130:

-                self._cache[hash] = runnable
+                self._cache[sol_hash] = runnable
flashinfer_bench/compile/builders/triton_builder.py (2)

28-29: Direct call to Builder.__init__ bypasses PythonBuilder.__init__.

Calling Builder.__init__(self, ...) directly instead of super().__init__() is fragile. If PythonBuilder.__init__ ever adds initialization logic beyond calling its parent, TritonBuilder will miss it.

Since PythonBuilder.__init__ currently just calls super().__init__(), this works, but it's a maintenance hazard.

Consider overriding the class variables and letting the parent chain handle initialization:

 class TritonBuilder(PythonBuilder):
     ...
     _KEY_PREFIX: ClassVar[str] = "fib_triton_"
     _BUILD_DIR_NAME: ClassVar[str] = "triton"

     def __init__(self) -> None:
-        Builder.__init__(self, self._KEY_PREFIX, self._BUILD_DIR_NAME)
+        super().__init__()

But this requires PythonBuilder.__init__ to use self._KEY_PREFIX and self._BUILD_DIR_NAME instead of hardcoded class variables:

# In PythonBuilder.__init__:
def __init__(self) -> None:
    super().__init__(self._KEY_PREFIX, self._BUILD_DIR_NAME)

68-69: Mutating metadata.build_type after construction.

While this works because RunnableMetadata isn't frozen, mutating the returned Runnable's metadata is somewhat surprising. Consider passing the correct build_type during construction instead.

An alternative approach would be to override the metadata construction in the build method rather than mutating it post-hoc. However, since this would require more significant refactoring of PythonBuilder.build(), the current approach is acceptable.

flashinfer_bench/compile/builders/tvm_ffi_builder.py (1)

119-122: Consider using explicit checks instead of assertions for path validation.

Assertions can be disabled with python -O. While the comment notes this is defensive (relying on upstream validation), if this code could run in optimized mode, these security checks would be bypassed.

             # Defensive assertion: the path in the solution should be validated by the Solution
             # model validator, but we add this defensive assertion to be safe.
             src_path_obj = Path(src.path)
-            assert not src_path_obj.is_absolute(), f"Absolute path detected: {src.path}"
-            assert ".." not in src_path_obj.parts, f"Path traversal detected: {src.path}"
+            if src_path_obj.is_absolute():
+                raise BuildError(f"Absolute path detected: {src.path}")
+            if ".." in src_path_obj.parts:
+                raise BuildError(f"Path traversal detected: {src.path}")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a7154c and d9c9328.

📒 Files selected for processing (22)
  • examples/win_at_p.py (0 hunks)
  • flashinfer_bench/apply/key.py (1 hunks)
  • flashinfer_bench/compile/__init__.py (1 hunks)
  • flashinfer_bench/compile/builder.py (1 hunks)
  • flashinfer_bench/compile/builders/__init__.py (1 hunks)
  • flashinfer_bench/compile/builders/cuda_builder.py (0 hunks)
  • flashinfer_bench/compile/builders/python_builder.py (1 hunks)
  • flashinfer_bench/compile/builders/torch_builder.py (1 hunks)
  • flashinfer_bench/compile/builders/triton_builder.py (1 hunks)
  • flashinfer_bench/compile/builders/tvm_ffi_builder.py (8 hunks)
  • flashinfer_bench/compile/registry.py (2 hunks)
  • flashinfer_bench/compile/runnable.py (2 hunks)
  • flashinfer_bench/compile/utils.py (1 hunks)
  • flashinfer_bench/data/solution.py (5 hunks)
  • flashinfer_bench/data/trace_set.py (0 hunks)
  • tests/apply/test_runtime.py (2 hunks)
  • tests/compile/test_builder.py (2 hunks)
  • tests/compile/test_python_builder.py (2 hunks)
  • tests/compile/test_runnable.py (2 hunks)
  • tests/compile/test_triton_builder.py (3 hunks)
  • tests/compile/test_tvm_ffi_builder.py (8 hunks)
  • tests/data/test_solution.py (0 hunks)
💤 Files with no reviewable changes (4)
  • flashinfer_bench/data/trace_set.py
  • examples/win_at_p.py
  • flashinfer_bench/compile/builders/cuda_builder.py
  • tests/data/test_solution.py
🧰 Additional context used
🧬 Code graph analysis (8)
flashinfer_bench/compile/__init__.py (3)
flashinfer_bench/compile/builder.py (2)
  • Builder (20-93)
  • BuildError (16-17)
flashinfer_bench/compile/registry.py (2)
  • BuilderRegistry (25-167)
  • get_builder_registry (170-180)
flashinfer_bench/compile/runnable.py (2)
  • Runnable (33-172)
  • RunnableMetadata (16-30)
flashinfer_bench/compile/utils.py (1)
flashinfer_bench/data/solution.py (2)
  • SourceFile (30-61)
  • hash (186-213)
flashinfer_bench/compile/builders/python_builder.py (6)
flashinfer_bench/compile/builder.py (2)
  • build (54-77)
  • get_package_name_and_build_path (79-93)
flashinfer_bench/compile/runnable.py (2)
  • Runnable (33-172)
  • RunnableMetadata (16-30)
flashinfer_bench/compile/utils.py (2)
  • create_package_name (58-91)
  • write_sources_to_path (12-55)
flashinfer_bench/data/definition.py (1)
  • Definition (95-363)
flashinfer_bench/data/solution.py (4)
  • Solution (100-213)
  • SupportedLanguages (13-27)
  • get_entry_path (145-156)
  • get_entry_symbol (158-170)
flashinfer_bench/env.py (1)
  • get_fib_cache_path (46-57)
flashinfer_bench/compile/builders/triton_builder.py (5)
flashinfer_bench/compile/builder.py (3)
  • Builder (20-93)
  • can_build (38-51)
  • build (54-77)
flashinfer_bench/data/solution.py (2)
  • Solution (100-213)
  • SupportedLanguages (13-27)
flashinfer_bench/compile/builders/python_builder.py (2)
  • can_build (37-39)
  • build (97-162)
flashinfer_bench/compile/builders/torch_builder.py (3)
  • is_available (135-147)
  • can_build (149-151)
  • build (241-311)
flashinfer_bench/compile/registry.py (1)
  • build (97-132)
flashinfer_bench/compile/runnable.py (5)
flashinfer_bench/data/definition.py (2)
  • get_var_values (239-279)
  • get_output_shapes (343-363)
flashinfer_bench/utils.py (1)
  • dtype_str_to_torch_dtype (39-45)
flashinfer_bench/compile/builders/python_builder.py (1)
  • cleaner (79-93)
flashinfer_bench/compile/builders/torch_builder.py (1)
  • cleaner (236-237)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (1)
  • cleaner (201-202)
flashinfer_bench/compile/builders/__init__.py (4)
flashinfer_bench/compile/builders/python_builder.py (1)
  • PythonBuilder (19-162)
flashinfer_bench/compile/builders/torch_builder.py (1)
  • TorchBuilder (31-311)
flashinfer_bench/compile/builders/triton_builder.py (1)
  • TritonBuilder (14-70)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (1)
  • TVMFFIBuilder (23-296)
tests/compile/test_builder.py (2)
flashinfer_bench/compile/builders/python_builder.py (2)
  • get_key (41-43)
  • cleaner (79-93)
flashinfer_bench/compile/runnable.py (1)
  • Runnable (33-172)
tests/apply/test_runtime.py (5)
flashinfer_bench/compile/builders/python_builder.py (2)
  • PythonBuilder (19-162)
  • build (97-162)
flashinfer_bench/compile/builders/torch_builder.py (1)
  • build (241-311)
flashinfer_bench/compile/registry.py (1)
  • build (97-132)
flashinfer_bench/data/solution.py (1)
  • Solution (100-213)
flashinfer_bench/compile/runnable.py (1)
  • Runnable (33-172)
🪛 GitHub Actions: .github/workflows/linting.yaml
flashinfer_bench/compile/builders/tvm_ffi_builder.py

[error] 59-59: ruff: F401 'tvm_ffi' imported but unused. Remove unused import.

🪛 Ruff (0.14.6)
flashinfer_bench/compile/__init__.py

20-27: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

flashinfer_bench/data/solution.py

202-202: Probable use of insecure hash functions in hashlib: sha1

(S324)

flashinfer_bench/compile/builders/python_builder.py

85-86: try-except-pass detected, consider logging the exception

(S110)


85-85: Do not catch blind exception: Exception

(BLE001)


125-125: Avoid specifying long messages outside the exception class

(TRY003)


141-141: Avoid specifying long messages outside the exception class

(TRY003)


147-149: Avoid specifying long messages outside the exception class

(TRY003)


153-153: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/registry.py

58-58: Avoid specifying long messages outside the exception class

(TRY003)


72-73: try-except-pass detected, consider logging the exception

(S110)


72-72: Do not catch blind exception: Exception

(BLE001)


132-132: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/builders/torch_builder.py

123-123: Consider [f"/LIBPATH:{lib_path}", *lib_names] instead of concatenation

Replace with [f"/LIBPATH:{lib_path}", *lib_names]

(RUF005)


125-125: Do not catch blind exception: Exception

(BLE001)


211-214: Avoid specifying long messages outside the exception class

(TRY003)


269-272: Avoid specifying long messages outside the exception class

(TRY003)


295-295: Avoid specifying long messages outside the exception class

(TRY003)


300-300: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/runnable.py

128-131: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/builders/__init__.py

8-8: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

tests/compile/test_builder.py

27-27: Unused method argument: definition

(ARG002)


27-27: Unused method argument: solution

(ARG002)

flashinfer_bench/compile/builders/tvm_ffi_builder.py

267-269: Avoid specifying long messages outside the exception class

(TRY003)


293-293: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.10
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.12
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.11
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.13
🔇 Additional comments (36)
flashinfer_bench/apply/key.py (1)

15-15: Clean type annotation refactoring.

Flattening the nested Union to Union[int, float, bool] is more idiomatic and improves readability without affecting runtime behavior.

flashinfer_bench/data/solution.py (2)

24-27: LGTM! New CPP language variant properly added.

The new CPP enum value follows the existing pattern and documentation style.


145-170: LGTM! Clean helper methods for entry point parsing.

Both get_entry_path() and get_entry_symbol() properly extract components from the validated entry point format. Using split("::")[-1] for the symbol is a good defensive choice.
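As a quick standalone illustration of that parsing (the functions below are hypothetical stand-ins; the real helpers are methods on Solution):

```python
# Standalone sketch of the "path::symbol" entry-point parsing described
# above; the real logic lives in Solution.get_entry_path / get_entry_symbol.
ENTRY_SEP = "::"


def get_entry_path(entry_point: str) -> str:
    # The part before "::" is the source-file path within the solution.
    return entry_point.split(ENTRY_SEP)[0]


def get_entry_symbol(entry_point: str) -> str:
    # split("::")[-1] always yields the final component, so a well-formed
    # "path::symbol" entry point gives the symbol name.
    return entry_point.split(ENTRY_SEP)[-1]


print(get_entry_path("kernels/vector_add.py::vector_add"))    # kernels/vector_add.py
print(get_entry_symbol("kernels/vector_add.py::vector_add"))  # vector_add
```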

tests/compile/test_python_builder.py (2)

41-41: LGTM! Correctly updated to use cache-aware build API.

The test properly sets up an isolated cache directory and uses the new build_with_cache method.


80-80: LGTM! Consistent API migration.

Same pattern applied correctly with proper cache cleanup via b.clear_cache().

tests/compile/test_builder.py (3)

21-22: LGTM! Method renamed to match updated builder interface.

The get_key naming is consistent with the production PythonBuilder.get_key method shown in the relevant code snippets.


48-50: LGTM! Cache behavior correctly tested.

The test verifies that build_with_cache returns the same instance on repeated calls (r1 is r2), confirming cache hit behavior.


27-30: Verify metadata parameter type compatibility.

The Runnable constructor expects metadata: RunnableMetadata, but the test passes a plain dict {"dummy": True}. Confirm whether RunnableMetadata accepts dict input (e.g., via TypedDict or dataclass supporting dict initialization). If RunnableMetadata is a TypedDict, this usage is valid; if it's a stricter type, adjust the test accordingly.

tests/compile/test_triton_builder.py (3)

57-57: LGTM! Import guard test correctly uses cache-aware API.

The test verifies that BuildError is raised when Triton is unavailable.


77-77: LGTM! Minimum test updated consistently.

Proper cache directory setup and availability state reset via monkeypatch.setattr.


141-141: LGTM! Vector add test with correct API usage.

The Triton kernel test properly exercises the cached build path and verifies execution correctness.

tests/compile/test_tvm_ffi_builder.py (8)

123-123: LGTM! CPU build test migrated to cache-aware API.

Test correctly uses the isolated_cache fixture and verifies kernel execution.


170-170: LGTM! CUDA GPU test updated consistently.


254-262: LGTM! Builder-level caching test properly validates cache behavior.

The test measures build time vs cache load time, demonstrating the caching benefit.


303-311: LGTM! Cross-builder caching test correctly validates shared cache.

Tests that different TVMFFIBuilder instances can share cached artifacts.


353-353: LGTM! call_dest test updated.


404-404: LGTM! Error handling test for invalid entry point.


436-436: LGTM! Error handling test for invalid sources.


469-469: LGTM! Subdirectory source handling test updated.

flashinfer_bench/compile/__init__.py (1)

1-14: LGTM! Clear and helpful docstring.

The module docstring provides a good overview of the package components and a concise workflow example.

tests/apply/test_runtime.py (1)

246-272: LGTM! Correctly updated to patch the public build method.

The change from _build to build aligns with the refactored builder API. The patch/restore logic is correct.

Note: This test is currently skipped (line 163). Ensure the underlying issue is tracked for resolution.

tests/compile/test_runnable.py (2)

5-5: LGTM! Import updated correctly.


17-27: Test correctly uses the new Runnable API, but there's a potential bug in the Runnable.cleanup() method.

The test construction and cleanup calls are correct. However, the Runnable class may have an inconsistency where the constructor assigns to self._cleaner but the cleanup() method references self._closer. This mismatch would cause the test to fail at runtime with an AttributeError since _closer would be undefined.

flashinfer_bench/compile/utils.py (1)

58-91: LGTM! Package name generation is sound.

The normalization handles edge cases well (digit prefix, special characters). Using the first 6 characters of the SHA1 hash provides reasonable uniqueness for cache keys.
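A minimal sketch of such normalization; the function name and exact replacement rules are assumptions, with only the digit-prefix handling, special-character replacement, and 6-character hash suffix taken from the review above:

```python
import hashlib
import re


def make_package_name(solution_name: str, content: bytes) -> str:
    # Replace characters that are invalid in a Python package name.
    name = re.sub(r"[^0-9a-zA-Z_]", "_", solution_name)
    # Identifiers cannot start with a digit, so prefix one with "_".
    if name and name[0].isdigit():
        name = "_" + name
    # The first 6 hex chars of SHA1 disambiguate cache keys cheaply.
    digest = hashlib.sha1(content).hexdigest()[:6]
    return f"{name}_{digest}"


pkg = make_package_name("2fast-gemm", b"source content")
print(pkg)
```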

flashinfer_bench/compile/builders/torch_builder.py (2)

50-68: LGTM! Clean dependency discovery pattern.

The lazy discovery of CUDA dependencies at init time with graceful fallback is appropriate.


284-311: Build logic is correct and well-structured.

The compilation flow using torch.utils.cpp_extension.load() is appropriate, with proper error handling and metadata construction. The cleaner pattern ensures resources can be released.

flashinfer_bench/compile/builder.py (1)

20-93: Well-structured abstract base class.

The Builder abstraction is clean and provides a good foundation for concrete builders. The get_package_name_and_build_path helper centralizes path computation, reducing duplication across subclasses.

flashinfer_bench/compile/builders/python_builder.py (2)

79-95: Cleaner implementation is sound.

The cleaner correctly handles module unloading, sys.path cleanup, and directory removal. The broad except Exception: pass is acceptable here since cleanup failures shouldn't propagate and prevent further cleanup steps.
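The pattern reads roughly like this self-contained sketch (names are hypothetical; the real implementation is `PythonBuilder._get_cleaner`):

```python
import shutil
import sys
import tempfile
from pathlib import Path


def make_cleaner(package_name: str, build_dir: Path):
    """Sketch of the cleanup steps: module unload, sys.path removal, rmtree."""

    def cleaner() -> None:
        try:
            # Drop the package and all its submodules from the import cache.
            for mod in [m for m in list(sys.modules)
                        if m == package_name or m.startswith(package_name + ".")]:
                sys.modules.pop(mod, None)
            # Stop resolving future imports from the build directory.
            if str(build_dir) in sys.path:
                sys.path.remove(str(build_dir))
        except Exception:
            pass  # cleanup must never raise
        shutil.rmtree(build_dir, ignore_errors=True)

    return cleaner


tmp = Path(tempfile.mkdtemp())
sys.path.append(str(tmp))
make_cleaner("demo_pkg", tmp)()
print(tmp.exists())  # False
```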


97-162: Build method is well-structured with proper error handling.

The build flow correctly validates the entry file, writes sources, imports the module, and constructs the Runnable with appropriate metadata. Error handling properly invokes the cleaner before raising BuildError.

flashinfer_bench/compile/registry.py (3)

62-74: Clear method correctly handles cleanup failures.

The broad exception catch is appropriate here to ensure all runnables are processed even if some cleanup operations fail.


134-167: build_reference creates a well-formed pseudo-solution.

The method correctly constructs a Solution object with appropriate defaults for building reference implementations.


89-94: Missing is_available() in Builder interface.

builder_type.is_available() is called on line 92, but is_available() is not defined as an abstract method in the Builder base class. If any builder subclass doesn't implement this method, this will raise an AttributeError at runtime.

Consider adding is_available() as an abstract static method in Builder, or add a default implementation that returns True:

# In builder.py
+    @staticmethod
+    def is_available() -> bool:
+        """Check if this builder is available in the current environment.
+        
+        Returns True by default. Subclasses should override if they have
+        optional dependencies.
+        """
+        return True
flashinfer_bench/compile/builders/triton_builder.py (1)

31-48: is_available() and can_build() are correctly implemented.

The availability check properly handles import failures, and can_build correctly filters by language type.

flashinfer_bench/compile/runnable.py (2)

16-31: RunnableMetadata model is well-defined.

The Pydantic model provides good structure for metadata with appropriate fields and documentation.


93-159: call_dps implementation is solid.

The DPS wrapper correctly:

  • Validates the presence of the definition in metadata
  • Derives output shapes from input shapes
  • Handles device placement
  • Normalizes return values (None/single/tuple)
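The return-normalization step, taken in isolation, behaves like this sketch (function name hypothetical):

```python
def normalize_outputs(result, out_buffers):
    # DPS kernels may return None (outputs were written into the
    # preallocated buffers), a single value, or a tuple of values.
    if result is None:
        return out_buffers[0] if len(out_buffers) == 1 else tuple(out_buffers)
    return result


print(normalize_outputs(None, ["out_a"]))           # out_a
print(normalize_outputs(None, ["out_a", "out_b"]))  # ('out_a', 'out_b')
print(normalize_outputs("direct", ["out_a"]))       # direct
```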
flashinfer_bench/compile/builders/tvm_ffi_builder.py (1)

206-296: LGTM!

The build method is well-implemented with proper multi-process safety through the FileLock and double-check pattern after acquiring the lock. The caching logic correctly handles concurrent builds, and the Runnable construction with metadata and cleaner follows the new builder pattern.
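The double-check shape is the same one used for in-process caches; here is an in-process sketch with a `threading.Lock` (the PR itself uses a FileLock for cross-process safety):

```python
import threading

_cache = {}
_lock = threading.Lock()


def build_cached(key, build_fn):
    # Cheap lock-free fast path.
    if key in _cache:
        return _cache[key]
    with _lock:
        # Re-check after acquiring the lock: another caller may have
        # finished the build while we were waiting.
        if key in _cache:
            return _cache[key]
        _cache[key] = build_fn()
        return _cache[key]


calls = []
r1 = build_cached("k", lambda: calls.append(1) or "artifact")
r2 = build_cached("k", lambda: calls.append(1) or "artifact")
print(r1, r2, len(calls))  # artifact artifact 1
```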

Comment on lines 186 to 213
    def hash(self) -> str:
        """Compute a deterministic hash of the solution content.

        The hash is computed from all fields that affect the solution's behavior:
        name, definition, language, entry point, dependencies, and all source file
        paths and contents. This ensures that any meaningful change to the solution
        results in a different hash.

        The hash is used for caching build artifacts - solutions with the same hash
        can reuse the same cached build result.

        Returns
        -------
        str
            A SHA1 hash (40 hex characters) uniquely identifying this solution's content.
        """
        h = hashlib.sha1()
        for s in (
            self.name,
            self.definition,
            self.spec.language,
            self.spec.entry_point,
            *self.spec.dependencies,
            *(part for src in self.sources for part in (src.path, src.content)),
        ):
            h.update(s.encode())

        return h.hexdigest()

⚠️ Potential issue | 🟡 Minor

Potential TypeError if dependencies is None.

The dependencies field is typed as Optional[List[NonEmptyString]], meaning it can be explicitly set to None. If that happens, line 208 will raise TypeError: 'NoneType' object is not iterable when unpacking with *self.spec.dependencies.

         h = hashlib.sha1()
         for s in (
             self.name,
             self.definition,
             self.spec.language,
             self.spec.entry_point,
-            *self.spec.dependencies,
+            *(self.spec.dependencies or []),
             *(part for src in self.sources for part in (src.path, src.content)),
         ):
             h.update(s.encode())

Regarding the static analysis hint about SHA1 (S324): SHA1 is appropriate here since it's used for content-based cache keying, not cryptographic security. The concern is a false positive in this context.
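If silencing S324 without a suppression comment is preferred, Python 3.9+ lets the non-security intent be stated directly on the constructor:

```python
import hashlib

# usedforsecurity=False documents that this hash is for content keying,
# not cryptography, and satisfies Ruff S324 / Bandit B324 (Python >= 3.9).
h = hashlib.sha1(usedforsecurity=False)
h.update(b"solution content")
print(len(h.hexdigest()))  # 40
```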

🧰 Tools
🪛 Ruff (0.14.6)

202-202: Probable use of insecure hash functions in hashlib: sha1

(S324)

🤖 Prompt for AI Agents
In flashinfer_bench/data/solution.py around lines 186 to 213, the hash() method
unpacks self.spec.dependencies with *self.spec.dependencies which will raise
TypeError if dependencies is None; change to iterate over self.spec.dependencies
or an empty iterable (e.g. self.spec.dependencies or []) when building the tuple
of fields so None is treated as no dependencies, and ensure each item appended
is a str before calling .encode() (cast or skip None) so h.update(...) always
receives bytes.

@Ubospica Ubospica force-pushed the main-dev/2025-11-30-builders branch from d9c9328 to 37a813f Compare December 2, 2025 20:24
Signed-off-by: Ubospica <[email protected]>
@Ubospica Ubospica force-pushed the main-dev/2025-11-30-builders branch from 37a813f to d11e917 Compare December 2, 2025 20:25

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/compile/test_builder.py (1)

17-30: DummyBuilder does not correctly implement the Builder interface.

The Builder base class now requires implementing the abstract build(definition, solution) method, but DummyBuilder implements _build instead. Additionally:

  1. get_key() (line 21) is not part of the new Builder interface
  2. The Runnable constructor expects metadata: RunnableMetadata, but a plain dict {"dummy": True} is passed (line 29)

Apply this diff to align with the new API:

 class DummyBuilder(Builder):
+    def __init__(self):
+        super().__init__(key_prefix="dummy", build_dir_name="dummy_build")
+
     def can_build(self, solution: Solution) -> bool:
         return True

-    def get_key(self, solution: Solution) -> str:
-        return f"dummy::{solution.name}"
-
-    def _make_closer(self):
-        return lambda: None
-
-    def _build(self, definition: Definition, solution: Solution) -> Runnable:
+    def build(self, definition: Definition, solution: Solution) -> Runnable:
+        from flashinfer_bench.compile.runnable import RunnableMetadata
         return Runnable(
-            callable=lambda **kw: kw, cleaner=self._make_closer(), metadata={"dummy": True}
+            callable=lambda **kw: kw,
+            cleaner=lambda: None,
+            metadata=RunnableMetadata(solution_hash=solution.hash(), misc={"dummy": True}),
         )
♻️ Duplicate comments (5)
flashinfer_bench/compile/builders/python_builder.py (1)

108-112: Redundant build_path computation still present.

The build_path returned from get_package_name_and_build_path on line 108 is immediately overwritten by _get_build_path on line 111. Additionally, _get_build_path doesn't appear to be defined in this class or the base Builder class (based on the provided context), which would cause an AttributeError.

Either use the build_path from get_package_name_and_build_path directly, or define _get_build_path if different logic is needed:

         package_name, build_path = self.get_package_name_and_build_path(solution)
         module_name = package_name + "." + ".".join(Path(entry_file).with_suffix("").parts)

-        build_path = self._get_build_path(package_name)
         write_sources_to_path(build_path, solution.sources)
flashinfer_bench/compile/runnable.py (1)

12-13: BuildType is still missing "torch".

The TorchBuilder sets build_type="torch" (per torch_builder.py), but BuildType only includes "cuda", "tvm_ffi", "python", "triton". This will cause Pydantic validation errors at runtime. The docstring on line 24 mentions "torch" but the type union doesn't include it.

-BuildType = Literal["cuda", "tvm_ffi", "python", "triton"]
+BuildType = Literal["python", "torch", "triton", "tvm_ffi"]

Note: If "cuda" is no longer used (since CUDABuilder was removed per the summary), it should also be removed.

flashinfer_bench/compile/builders/triton_builder.py (1)

28-29: Use super().__init__() instead of bypassing PythonBuilder.__init__.

Directly calling Builder.__init__(self, ...) bypasses the MRO and PythonBuilder.__init__. If PythonBuilder adds initialization logic in the future, TritonBuilder won't inherit it, making the inheritance fragile.

     def __init__(self) -> None:
-        Builder.__init__(self, self._KEY_PREFIX, self._BUILD_DIR_NAME)
+        super().__init__()

Note: This requires PythonBuilder to accept and forward key_prefix and build_dir_name parameters, or TritonBuilder should call super().__init__() if PythonBuilder already does the right thing with its own class-level constants.

flashinfer_bench/data/solution.py (1)

202-211: Potential TypeError if dependencies is explicitly set to None.

The dependencies field is typed as Optional[List[NonEmptyString]], allowing explicit None values. If that happens, line 208 will raise TypeError: 'NoneType' object is not iterable when unpacking.

         h = hashlib.sha1()
         for s in (
             self.name,
             self.definition,
             self.spec.language,
             self.spec.entry_point,
-            *self.spec.dependencies,
+            *(self.spec.dependencies or []),
             *(part for src in self.sources for part in (src.path, src.content)),
         ):
             h.update(s.encode())

Note: The SHA1 static analysis warning (S324) is a false positive—SHA1 is appropriate for content-based cache keying where collision resistance (not cryptographic security) is the goal.

tests/compile/test_builder.py (1)

48-50: build_with_cache method does not exist on Builder.

The Builder base class no longer provides build_with_cache. Caching is now handled by BuilderRegistry. The test should either:

  1. Use builder.build(d, s) directly (no caching at builder level), or
  2. Use BuilderRegistry to test caching behavior

Apply this diff to use the correct API:

-    r1 = b.build_with_cache(d, s)
-    r2 = b.build_with_cache(d, s)
-    assert r1 is r2  # cache hit via get_key
-    b.clear_cache()
+    r1 = b.build(d, s)
+    r2 = b.build(d, s)
+    # Note: Builder no longer caches; each call returns a new Runnable
+    # To test caching, use BuilderRegistry instead

Alternatively, if testing registry-level caching is intended:

from flashinfer_bench.compile.registry import BuilderRegistry

registry = BuilderRegistry([b])
r1 = registry.build_impl(d, s)
r2 = registry.build_impl(d, s)
assert r1 is r2  # cache hit via solution hash
registry.clear()
🧹 Nitpick comments (9)
flashinfer_bench/compile/__init__.py (1)

1-20: LGTM! Clean module reorganization.

The updated docstring accurately describes the new workflow, and the public API surface is well-defined with Builder, BuildError, BuilderRegistry, Runnable, and RunnableMetadata.

The static analysis hint about sorting __all__ (RUF022) is a minor style preference. You may optionally sort it alphabetically for consistency:

-__all__ = ["Builder", "BuildError", "BuilderRegistry", "Runnable", "RunnableMetadata"]
+__all__ = ["BuildError", "Builder", "BuilderRegistry", "Runnable", "RunnableMetadata"]
flashinfer_bench/compile/runnable.py (1)

49-68: Consider renaming callable parameter to avoid shadowing builtin.

The parameter callable on line 51 shadows Python's builtin callable() function. While this works, it can cause confusion and linter warnings.

     def __init__(
         self,
-        callable: Callable[..., Any],
+        func: Callable[..., Any],
         metadata: RunnableMetadata,
         cleaner: Optional[Callable[[], None]] = None,
     ) -> None:
         ...
-        self._callable = callable
+        self._callable = func
flashinfer_bench/compile/builders/tvm_ffi_builder.py (1)

56-63: Remove unused noqa directive.

Static analysis indicates that the # noqa: F401 directive is unused because the F401 rule is not enabled in your Ruff configuration. The import is used for availability checking via the exception handler, so it's not actually unused.

         try:
-            import tvm_ffi  # noqa: F401
+            import tvm_ffi
         except ImportError:
             return False
flashinfer_bench/compile/builders/__init__.py (1)

8-8: Consider sorting __all__ alphabetically.

The static analysis tool flags that __all__ is not sorted. While the current order may be intentional (e.g., priority-based), alphabetical sorting is the conventional approach and improves maintainability.

-__all__ = ["TorchBuilder", "PythonBuilder", "TritonBuilder", "TVMFFIBuilder"]
+__all__ = ["PythonBuilder", "TorchBuilder", "TritonBuilder", "TVMFFIBuilder"]
flashinfer_bench/compile/builders/torch_builder.py (1)

223-240: Consider unloading the compiled module on cleanup.

Unlike PythonBuilder._get_cleaner which removes loaded modules from sys.modules, this cleaner only deletes the build directory. The compiled extension module loaded via torch.utils.cpp_extension.load() will remain in sys.modules, potentially causing issues if the same solution is rebuilt with different code.

     def _get_cleaner(self, build_dir: Path) -> Callable[[], None]:
-        """Create a cleaner function that removes the build directory.
+        """Create a cleaner function that unloads the module and removes the build directory.
 
         Parameters
         ----------
         build_dir : Path
             The directory to delete.
 
         Returns
         -------
         Callable[[], None]
             A function that performs the cleanup.
         """
+        package_name = build_dir.name
 
         def cleaner() -> None:
+            # Unload compiled extension module
+            try:
+                to_delete = [m for m in sys.modules if m == package_name or m.startswith(package_name + ".")]
+                for m in to_delete:
+                    sys.modules.pop(m, None)
+            except Exception:
+                pass
             shutil.rmtree(build_dir, ignore_errors=True)
 
         return cleaner
flashinfer_bench/compile/registry.py (3)

109-111: Variable hash shadows the built-in function.

Using hash as a variable name shadows Python's built-in hash() function, which can cause confusion and subtle bugs.

-        hash = sol.hash()
-        if hash in self._cache:
-            return self._cache[hash]
+        sol_hash = sol.hash()
+        if sol_hash in self._cache:
+            return self._cache[sol_hash]

And update line 117:

-                self._cache[hash] = runnable
+                self._cache[sol_hash] = runnable

163-167: Consider logging exceptions in clear() instead of silently ignoring them.

While catching exceptions during cleanup is appropriate to ensure all runnables are processed, silently swallowing them makes debugging difficult. Consider logging at debug/warning level.

+import logging
+
+logger = logging.getLogger(__name__)
+
     def clear(self) -> None:
         for runnable in self._cache.values():
             try:
                 runnable.cleanup()
-            except Exception:
-                pass
+            except Exception as e:
+                logger.debug("Cleanup failed for runnable: %s", e)
         self._cache.clear()

68-74: Singleton initialization is not thread-safe.

The get_instance() check-then-set pattern has a race condition in multithreaded environments. Two threads could simultaneously see _instance is None and create separate instances.

If thread safety is required, consider using a lock:

+import threading
+
+_instance_lock = threading.Lock()
+
     @classmethod
     def get_instance(cls) -> "BuilderRegistry":
-        if cls._instance is None:
-            builders = []
-            for builder_type in _BUILDER_PRIORITY:
-                if builder_type.is_available():
-                    builders.append(builder_type())
-            cls._instance = BuilderRegistry(builders)
+        if cls._instance is None:
+            with _instance_lock:
+                if cls._instance is None:  # Double-checked locking
+                    builders = []
+                    for builder_type in _BUILDER_PRIORITY:
+                        if builder_type.is_available():
+                            builders.append(builder_type())
+                    cls._instance = BuilderRegistry(builders)
         return cls._instance
flashinfer_bench/compile/builder.py (1)

7-7: Unused import Dict.

Dict is imported but not used in this file.

-from typing import Dict, Tuple
+from typing import Tuple
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d9c9328 and d11e917.

📒 Files selected for processing (29)
  • examples/win_at_p.py (0 hunks)
  • flashinfer_bench/apply/key.py (1 hunks)
  • flashinfer_bench/apply/runtime.py (2 hunks)
  • flashinfer_bench/apply/table.py (4 hunks)
  • flashinfer_bench/bench/benchmark.py (2 hunks)
  • flashinfer_bench/bench/evaluators/default.py (2 hunks)
  • flashinfer_bench/bench/evaluators/sampling.py (2 hunks)
  • flashinfer_bench/bench/runner/isolated_runner.py (2 hunks)
  • flashinfer_bench/bench/runner/persistent_runner.py (3 hunks)
  • flashinfer_bench/compile/__init__.py (1 hunks)
  • flashinfer_bench/compile/builder.py (1 hunks)
  • flashinfer_bench/compile/builders/__init__.py (1 hunks)
  • flashinfer_bench/compile/builders/cuda_builder.py (0 hunks)
  • flashinfer_bench/compile/builders/python_builder.py (1 hunks)
  • flashinfer_bench/compile/builders/torch_builder.py (1 hunks)
  • flashinfer_bench/compile/builders/triton_builder.py (1 hunks)
  • flashinfer_bench/compile/builders/tvm_ffi_builder.py (8 hunks)
  • flashinfer_bench/compile/registry.py (2 hunks)
  • flashinfer_bench/compile/runnable.py (2 hunks)
  • flashinfer_bench/compile/utils.py (1 hunks)
  • flashinfer_bench/data/solution.py (5 hunks)
  • flashinfer_bench/data/trace_set.py (0 hunks)
  • tests/apply/test_runtime.py (2 hunks)
  • tests/compile/test_builder.py (2 hunks)
  • tests/compile/test_python_builder.py (2 hunks)
  • tests/compile/test_runnable.py (2 hunks)
  • tests/compile/test_triton_builder.py (3 hunks)
  • tests/compile/test_tvm_ffi_builder.py (8 hunks)
  • tests/data/test_solution.py (0 hunks)
💤 Files with no reviewable changes (4)
  • tests/data/test_solution.py
  • flashinfer_bench/data/trace_set.py
  • flashinfer_bench/compile/builders/cuda_builder.py
  • examples/win_at_p.py
🚧 Files skipped from review as they are similar to previous changes (6)
  • flashinfer_bench/apply/key.py
  • tests/compile/test_triton_builder.py
  • tests/compile/test_python_builder.py
  • tests/compile/test_runnable.py
  • tests/apply/test_runtime.py
  • flashinfer_bench/compile/utils.py
🧰 Additional context used
🧬 Code graph analysis (14)
flashinfer_bench/compile/builders/torch_builder.py (7)
flashinfer_bench/compile/runnable.py (2)
  • Runnable (33-172)
  • RunnableMetadata (16-30)
flashinfer_bench/compile/builder.py (5)
  • Builder (20-104)
  • BuildError (16-17)
  • can_build (46-59)
  • build (62-85)
  • get_package_name_and_build_path (87-104)
flashinfer_bench/compile/utils.py (1)
  • write_sources_to_path (12-55)
flashinfer_bench/data/solution.py (4)
  • Solution (100-213)
  • SupportedLanguages (13-27)
  • get_entry_path (145-156)
  • get_entry_symbol (158-170)
flashinfer_bench/env.py (1)
  • get_fib_cache_path (46-57)
flashinfer_bench/compile/builders/triton_builder.py (3)
  • is_available (32-44)
  • can_build (46-60)
  • build (62-82)
flashinfer_bench/compile/builders/python_builder.py (4)
  • can_build (37-39)
  • _get_cleaner (41-76)
  • cleaner (60-74)
  • build (78-143)
flashinfer_bench/bench/evaluators/default.py (2)
flashinfer_bench/compile/registry.py (3)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
  • build_reference (121-154)
flashinfer_bench/compile/runnable.py (1)
  • Runnable (33-172)
flashinfer_bench/apply/table.py (1)
flashinfer_bench/compile/registry.py (2)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (3)
flashinfer_bench/compile/builder.py (2)
  • build (62-85)
  • get_package_name_and_build_path (87-104)
flashinfer_bench/compile/runnable.py (2)
  • Runnable (33-172)
  • RunnableMetadata (16-30)
flashinfer_bench/compile/utils.py (1)
  • write_sources_to_path (12-55)
flashinfer_bench/bench/benchmark.py (1)
flashinfer_bench/compile/registry.py (2)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
tests/compile/test_builder.py (1)
flashinfer_bench/compile/runnable.py (1)
  • Runnable (33-172)
flashinfer_bench/bench/runner/isolated_runner.py (2)
flashinfer_bench/compile/registry.py (2)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
flashinfer_bench/compile/runnable.py (1)
  • Runnable (33-172)
flashinfer_bench/apply/runtime.py (1)
flashinfer_bench/compile/registry.py (3)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
  • build (77-82)
tests/compile/test_tvm_ffi_builder.py (1)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (1)
  • TVMFFIBuilder (23-297)
flashinfer_bench/bench/evaluators/sampling.py (2)
flashinfer_bench/compile/registry.py (3)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
  • build_reference (121-154)
flashinfer_bench/compile/runnable.py (1)
  • Runnable (33-172)
flashinfer_bench/compile/builders/python_builder.py (6)
flashinfer_bench/compile/builder.py (5)
  • Builder (20-104)
  • BuildError (16-17)
  • can_build (46-59)
  • build (62-85)
  • get_package_name_and_build_path (87-104)
flashinfer_bench/compile/runnable.py (2)
  • Runnable (33-172)
  • RunnableMetadata (16-30)
flashinfer_bench/compile/utils.py (2)
  • create_package_name (58-91)
  • write_sources_to_path (12-55)
flashinfer_bench/data/solution.py (4)
  • Solution (100-213)
  • SupportedLanguages (13-27)
  • get_entry_path (145-156)
  • get_entry_symbol (158-170)
flashinfer_bench/env.py (1)
  • get_fib_cache_path (46-57)
flashinfer_bench/compile/builders/torch_builder.py (4)
  • can_build (150-152)
  • _get_cleaner (223-240)
  • cleaner (237-238)
  • build (242-312)
flashinfer_bench/compile/runnable.py (5)
flashinfer_bench/data/definition.py (2)
  • get_var_values (239-279)
  • get_output_shapes (343-363)
flashinfer_bench/utils.py (1)
  • dtype_str_to_torch_dtype (39-45)
flashinfer_bench/compile/builders/python_builder.py (1)
  • cleaner (60-74)
flashinfer_bench/compile/builders/torch_builder.py (1)
  • cleaner (237-238)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (1)
  • cleaner (202-203)
flashinfer_bench/compile/__init__.py (3)
flashinfer_bench/compile/builder.py (2)
  • Builder (20-104)
  • BuildError (16-17)
flashinfer_bench/compile/registry.py (1)
  • BuilderRegistry (17-168)
flashinfer_bench/compile/runnable.py (2)
  • Runnable (33-172)
  • RunnableMetadata (16-30)
flashinfer_bench/compile/builders/__init__.py (1)
flashinfer_bench/compile/builders/torch_builder.py (1)
  • TorchBuilder (31-312)
🪛 Ruff (0.14.7)
flashinfer_bench/compile/builders/torch_builder.py

124-124: Consider [f"/LIBPATH:{lib_path}", *lib_names] instead of concatenation

Replace with [f"/LIBPATH:{lib_path}", *lib_names]

(RUF005)


126-126: Do not catch blind exception: Exception

(BLE001)


212-215: Avoid specifying long messages outside the exception class

(TRY003)


270-273: Avoid specifying long messages outside the exception class

(TRY003)


296-296: Avoid specifying long messages outside the exception class

(TRY003)


301-301: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/data/solution.py

202-202: Probable use of insecure hash functions in hashlib: sha1

(S324)

flashinfer_bench/compile/builders/tvm_ffi_builder.py

60-60: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)


268-270: Avoid specifying long messages outside the exception class

(TRY003)


294-294: Avoid specifying long messages outside the exception class

(TRY003)

tests/compile/test_builder.py

27-27: Unused method argument: definition

(ARG002)


27-27: Unused method argument: solution

(ARG002)

flashinfer_bench/compile/builders/python_builder.py

66-67: try-except-pass detected, consider logging the exception

(S110)


66-66: Do not catch blind exception: Exception

(BLE001)


106-106: Avoid specifying long messages outside the exception class

(TRY003)


122-122: Avoid specifying long messages outside the exception class

(TRY003)


128-130: Avoid specifying long messages outside the exception class

(TRY003)


134-134: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/registry.py

51-51: Avoid specifying long messages outside the exception class

(TRY003)


119-119: Avoid specifying long messages outside the exception class

(TRY003)


166-167: try-except-pass detected, consider logging the exception

(S110)


166-166: Do not catch blind exception: Exception

(BLE001)

flashinfer_bench/compile/runnable.py

128-131: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/__init__.py

20-20: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

flashinfer_bench/compile/builders/__init__.py

8-8: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.10
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.11
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.12
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.13
🔇 Additional comments (30)
flashinfer_bench/compile/builders/python_builder.py (2)

34-39: LGTM! Clean builder initialization.

The class properly delegates to the base Builder with appropriate prefix and directory name constants. The can_build method correctly filters for Python solutions.


136-143: LGTM! Well-structured metadata and runnable creation.

The RunnableMetadata is properly constructed with all relevant fields, and the Runnable is returned with the cleaner attached for proper resource management.

flashinfer_bench/compile/runnable.py (1)

161-172: LGTM! Idempotent cleanup implementation.

The cleanup method correctly handles the optional cleaner and ensures idempotence by clearing the reference after use.

flashinfer_bench/compile/builders/tvm_ffi_builder.py (2)

239-270: LGTM! Robust caching with double-check locking.

The implementation correctly uses FileLock for multi-process synchronization and employs the double-check pattern after acquiring the lock to handle race conditions. The cache validation via _check_sources ensures content integrity.


279-297: LGTM! Clean metadata construction and runnable return.

The metadata correctly includes the definition object in misc for DPS-style calls via call_value_returning(). The cleaner is properly attached for resource management.

flashinfer_bench/bench/runner/persistent_runner.py (3)

19-19: LGTM! Import updated for new registry API.

The import correctly uses BuilderRegistry from flashinfer_bench.compile, aligning with the singleton pattern introduced in this PR.


68-68: LGTM! Registry initialization uses singleton pattern.

Correctly uses BuilderRegistry.get_instance() to obtain the shared registry instance.


651-651: LGTM! Worker process correctly initializes registry.

The spawned worker process correctly obtains its own registry instance via BuilderRegistry.get_instance(). Since the registry is initialized per-process, this ensures proper isolation in the multiprocessing context.
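The `get_instance()` pattern referenced here amounts to a lazy class-level singleton. A minimal sketch (the real `BuilderRegistry` additionally registers the available builders on first use):

```python
class BuilderRegistry:
    """Minimal sketch of a lazily created class-level singleton."""

    _instance = None

    def __init__(self):
        self._cache = {}  # runnables keyed by solution hash

    @classmethod
    def get_instance(cls):
        # Created on first access; subsequent calls return the same object.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance
```

Because `_instance` lives on the class, a freshly spawned worker process starts with `_instance = None` and builds its own registry on first access, which is exactly the per-process isolation noted above.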

flashinfer_bench/apply/table.py (2)

10-10: LGTM! Import updated for new registry API.

The import correctly brings in BuilderRegistry and Runnable from the compile module.


118-118: LGTM! Consistent registry usage across all methods.

All three locations (load_or_build, _build, and _prewarm_aot) correctly use BuilderRegistry.get_instance() to obtain the singleton registry before calling build(). The usage pattern is consistent throughout the file.

Also applies to: 172-172, 270-270

tests/compile/test_tvm_ffi_builder.py (2)

123-123: LGTM! Consistent migration to cache-enabled build API.

The update from build() to build_with_cache() aligns with the PR's refactoring objectives. The test correctly exercises the new cache-aware build pathway while maintaining the same verification logic for the runnable output.


254-260: Good test coverage for caching behavior.

The caching tests appropriately verify that build_with_cache reuses cached artifacts both within the same builder instance and across different builder instances. The timing printouts help confirm cache hits during manual debugging.

flashinfer_bench/bench/benchmark.py (1)

6-6: LGTM! Clean migration to singleton registry pattern.

The import change and subsequent usage of BuilderRegistry.get_instance() on line 49 correctly implements the new singleton-based registry access pattern. The benchmark logic remains unchanged.

flashinfer_bench/bench/evaluators/default.py (1)

19-19: LGTM! Consolidated import and singleton access.

The import correctly brings in both BuilderRegistry and Runnable from the compile module. The usage on line 48 properly accesses the singleton to build the reference runnable.

flashinfer_bench/bench/evaluators/sampling.py (1)

21-21: LGTM! Consistent with the refactoring pattern.

The import and usage changes mirror those in default.py, maintaining consistency across evaluators. The Runnable type import is appropriately included for the type annotations used in method signatures.

flashinfer_bench/compile/builders/__init__.py (1)

1-2: Good addition of module docstring.

The docstring clearly describes the module's purpose. This improves documentation for the builders package.

flashinfer_bench/apply/runtime.py (2)

13-13: LGTM!

Import correctly updated to use the new singleton BuilderRegistry class instead of the removed get_builder_registry function.


178-178: LGTM!

Both build invocation sites correctly migrated to use BuilderRegistry.get_instance().build(defn, sol). The singleton pattern ensures consistent builder registration and caching across all call sites.

Also applies to: 186-186

flashinfer_bench/bench/runner/isolated_runner.py (2)

17-17: LGTM!

Import correctly updated to use BuilderRegistry and Runnable from the compile module.


233-244: LGTM!

The worker correctly obtains a fresh BuilderRegistry instance in the spawned subprocess. Since multiprocessing with spawn context creates an isolated process, this properly initializes a new registry per worker, which aligns with the strong isolation design of SubprocessWorker.

flashinfer_bench/compile/builders/triton_builder.py (2)

31-44: LGTM!

The is_available() method correctly detects Triton availability via import check, with proper exception handling for missing dependency.


62-82: LGTM!

The build() method correctly delegates to PythonBuilder.build() and tags the result with build_type = "triton" for proper identification in metadata.

flashinfer_bench/compile/builders/torch_builder.py (4)

51-54: LGTM!

Proper initialization using super().__init__() to correctly invoke the Builder base class with the key prefix and build directory name.


56-133: LGTM!

The dependency discovery logic is well-structured:

  • Handles both Linux and Windows platforms
  • Gracefully continues on discovery failures (lines 126-131)
  • Properly discovers include paths and library paths for cuBLAS, cuDNN, and CUTLASS

The broad except Exception on line 126 is appropriate here since discovery failures should not be fatal—dependencies are only required if a solution explicitly declares them.


298-312: LGTM!

The symbol extraction, metadata creation, and runnable wrapping follow the same pattern as other builders. The metadata correctly includes the binary path for debugging purposes.


285-294: No change needed — build_directory correctly accepts Path objects.

torch.utils.cpp_extension.load() officially supports both string and os.PathLike objects (including pathlib.Path) for the build_directory parameter. Converting to str() is unnecessary.

Likely an incorrect or invalid review comment.

flashinfer_bench/data/solution.py (2)

24-25: LGTM!

New CPP language option correctly added to support pure C++ solutions alongside CUDA.


145-170: LGTM!

Well-documented helper methods that encapsulate entry point parsing logic, making the code more maintainable. The [-1] index in get_entry_symbol() is safe since the entry point format is validated by _validate_entry_point.

flashinfer_bench/compile/builder.py (2)

87-104: LGTM!

The get_package_name_and_build_path method provides a clean, consistent way to generate build paths. The documentation clearly explains the path structure and return values.


20-59: Missing is_available() classmethod required by BuilderRegistry.

The BuilderRegistry.get_instance() in registry.py (line 71) calls builder_type.is_available() on each builder type. However, the Builder base class does not define this method, which will cause an AttributeError if a concrete builder doesn't implement it.

Consider adding an abstract or default is_available() classmethod:

+    @classmethod
+    def is_available(cls) -> bool:
+        """Check if this builder is available in the current environment.
+
+        Returns
+        -------
+        bool
+            True if the builder can be used, False otherwise.
+            Subclasses should override this to check for required dependencies.
+        """
+        return True
+
     @abstractmethod
     def can_build(self, solution: Solution) -> bool:

Verify that all concrete builders implement is_available().

            sources=[SourceFile(path="main.py", content=defn.reference)],
            description="reference",
        )
        return self.build(defn, pseudo)

⚠️ Potential issue | 🟡 Minor

Instance method build_reference calls classmethod build, creating potential confusion.

build_reference is an instance method that calls self.build(defn, pseudo), but build is a classmethod that retrieves the singleton via get_instance(). If self is not the singleton instance, this creates an inconsistency where the instance's _cache isn't used.

Consider using build_impl directly for consistency:

-        return self.build(defn, pseudo)
+        return self.build_impl(defn, pseudo)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        return self.build(defn, pseudo)
+        return self.build_impl(defn, pseudo)
🤖 Prompt for AI Agents
In flashinfer_bench/compile/registry.py around line 154, the instance method
build_reference currently calls the classmethod build via self.build(defn,
pseudo), which fetches the singleton and bypasses the current instance's _cache;
change the call to use the instance implementation directly (call
self.build_impl(defn, pseudo)) so the instance's cache/context is used
consistently, ensuring build_reference operates on this object's state rather
than the singleton.

Signed-off-by: Ubospica <[email protected]>
@Ubospica
Collaborator Author

Ubospica commented Dec 3, 2025

/gemini review

Contributor

@YiyanZhai YiyanZhai left a comment


looks good

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a significant and well-executed refactoring of the builder system. The changes greatly improve the architecture by centralizing build caching in a BuilderRegistry singleton, simplifying the Builder interface, and standardizing the Runnable object. The introduction of solution hashing for caching is a major improvement for performance and correctness. The codebase is now more modular, extensible, and easier to maintain. I have one suggestion to improve the flexibility of the new TorchBuilder.

Comment on lines +158 to +162
arg_order = list(definition.inputs.keys())

def kwarg_adapter(**kwargs):
    args = [kwargs[name] for name in arg_order]
    return callable(*args)

medium

The current implementation of kwarg_adapter only passes input arguments to the compiled function. This limits the TorchBuilder to value-returning kernels and makes it inconsistent with TVMFFIBuilder, which supports destination-passing style (DPS) kernels.

To improve flexibility and align with the Runnable.call_value_returning API, I suggest modifying the adapter to also handle output arguments. This would allow TorchBuilder to be used for both value-returning and DPS kernels, making the builder system more consistent.

Suggested change
-arg_order = list(definition.inputs.keys())
-
-def kwarg_adapter(**kwargs):
-    args = [kwargs[name] for name in arg_order]
-    return callable(*args)
+arg_order = list(definition.inputs.keys()) + list(definition.outputs.keys())
+
+def kwarg_adapter(**kwargs):
+    args = [kwargs[name] for name in arg_order if name in kwargs]
+    return callable(*args)

Signed-off-by: Ubospica <[email protected]>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (3)
flashinfer_bench/compile/builders/triton_builder.py (1)

28-29: Use super().__init__() to respect the MRO.

The current implementation directly calls Builder.__init__, bypassing PythonBuilder.__init__. While this works now because PythonBuilder.__init__ only calls its superclass, this is fragile. If PythonBuilder adds initialization logic in the future, TritonBuilder would miss it.

     def __init__(self) -> None:
-        Builder.__init__(self, self._PACKAGE_PREFIX, self._BUILD_DIR_NAME)
+        super().__init__()

Note: This requires adjusting how the prefix and build dir are passed, or keeping the class variables as overrides that PythonBuilder respects.

flashinfer_bench/compile/registry.py (1)

154-154: Use build_impl for consistency.

build_reference is an instance method but calls self.build(), which is a classmethod that retrieves the singleton via get_instance(). While this works when self is the singleton, it's conceptually inconsistent. Call build_impl directly for clarity.

-        return self.build(definition, pseudo)
+        return self.build_impl(definition, pseudo)
flashinfer_bench/data/solution.py (1)

202-213: Potential TypeError if dependencies is None.

The dependencies field is typed as Optional[List[NonEmptyString]] with default=[], but it can still be explicitly set to None. Line 208 will raise TypeError: 'NoneType' object is not iterable when unpacking.

Apply this diff to handle None safely:

         h = hashlib.sha1()
         for s in (
             self.name,
             self.definition,
             self.spec.language,
             self.spec.entry_point,
-            *self.spec.dependencies,
+            *(self.spec.dependencies or []),
             *(part for src in self.sources for part in (src.path, src.content)),
         ):
             h.update(s.encode())

Note: The SHA1 usage is appropriate here since it's for content-based cache keying, not cryptographic security.
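The `or []` guard matters because unpacking `None` with `*` raises `TypeError`. A self-contained sketch of the content-keyed hash (field names simplified from the real `Solution` model):

```python
import hashlib

def solution_hash(name, language, dependencies, sources):
    # `dependencies` may be None; `or []` makes the unpack safe.
    # `sources` is a list of (path, content) pairs.
    h = hashlib.sha1()
    for s in (
        name,
        language,
        *(dependencies or []),
        *(part for path, content in sources for part in (path, content)),
    ):
        h.update(s.encode())
    return h.hexdigest()
```

Identical inputs hash identically whether `dependencies` is `None` or `[]`, so the cache key stays stable across the two representations.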

🧹 Nitpick comments (17)
flashinfer_bench/cli/main.py (1)

86-89: Tighten solutions loop and address unused def_name

def_name in for def_name, solutions in trace_set.solutions.items(): is unused in the loop body (Ruff B007). You can simplify and silence the warning by iterating over values only:

-    for def_name, solutions in trace_set.solutions.items():
-        for solution in solutions:
-            out_path = output_dir / "solutions" / f"{solution.name}.json"
-            save_json_file(solution, out_path)
+    for solutions in trace_set.solutions.values():
+        for solution in solutions:
+            out_path = output_dir / "solutions" / f"{solution.name}.json"
+            save_json_file(solution, out_path)

This keeps the existing behavior (file path based solely on solution.name) while cleaning up the unused variable.

tests/apply/test_runtime.py (1)

137-139: Build-method patching aligns with new public API

Switching the test hook from _build to build and restoring it in a try/finally block keeps the registry-based cache test valid under the new Builder API and avoids leaks across tests. If you ever want tighter scoping, this could also be expressed via monkeypatch.setattr, but it’s not required here.

Also applies to: 245-273

tests/data/test_definition.py (1)

8-12: Shared sample_reference_code fixture improves test clarity

Centralizing the minimal reference function in a fixture and renaming locals to definition keeps the tests aligned with the refactored API and reduces duplication. You might optionally add a return type hint (-> str) to the fixture for type-checker friendliness, but functionally this looks solid.

Also applies to: 46-50, 73-118

tests/compile/test_utils.py (1)

12-103: Utils test coverage looks good and matches intended semantics

The tests exercise write_sources_to_path (file count, locations, and contents) and create_package_name (prefix, normalization, and deterministic hashing) in a way that closely matches the described behavior. You could optionally also assert that the list of returned paths equals the expected concrete Path objects to make the write-source test a bit stricter, but the current checks are already effective.
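The normalization-plus-hash behavior these tests check can be sketched as follows. This is illustrative only; the real `create_package_name` lives in `flashinfer_bench/compile/utils.py` and may differ in details such as the digest length or normalization rules.

```python
import hashlib
import re

def create_package_name(prefix, name, content):
    # Normalize the name to a valid identifier fragment, then append a
    # short content hash so equal inputs map to the same package name.
    norm = re.sub(r"\W+", "_", name).strip("_").lower()
    digest = hashlib.sha1(content.encode()).hexdigest()[:8]
    return f"{prefix}_{norm}_{digest}"
```

The result is deterministic for equal inputs and a valid Python identifier, which is what the prefix, normalization, and hashing assertions in the test exercise.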

tests/bench/test_persistent_runner.py (1)

83-103: Persistent runner tests correctly track definition and CUDA targets

Updating baselines to assert on baseline.definition, using target_hardware=["cuda"], and passing Solution objects consistently into run_solution keep these tests in sync with the refactored runner and builder APIs. The registry-cache timing test is a nice functional check; if it ever proves flaky, you might relax the assertion slightly (e.g., allow a small epsilon or <=) while still verifying that caching is effective.

Also applies to: 117-139, 166-188, 228-343, 408-467

tests/bench/test_evaluator.py (1)

61-139: Evaluator tests are in sync with the new definition-centric API

Switching all evaluator calls to definition=... and updating sampling expectations to use the expected_probs key keeps the tests aligned with the refactored evaluator contracts. If you ever want these tests to mirror real sampling shapes more closely, you could use a 2D expected_probs tensor, but for the targeted error paths being exercised this isn’t necessary.

Also applies to: 141-195, 197-255

tests/compile/test_torch_builder.py (1)

20-142: TorchBuilder CPU/CUDA tests exercise realistic build paths

These tests provide strong coverage of TorchBuilder: simple CPU bindings, a many-arg overhead scenario, and CUDA kernels (including a cuBLAS-backed matmul) all built and exercised via the same build(definition, solution) entrypoint. The _use_tmp_cache_dir autouse fixture neatly ensures a per-test cache. The only mismatch is naming vs. assertions in test_cpu_overhead: it currently just prints average time without asserting any bound; you could either add a loose upper-bound assertion or rename the test to reflect that it’s mainly a smoke test for many-arg call overhead.

Also applies to: 144-303

flashinfer_bench/apply/table.py (1)

329-329: Consider using sha256 for consistency.

While sha1 is acceptable here for content fingerprinting (non-security context), the final digest computation at line 366 already uses sha256. Using sha256 consistently throughout would eliminate the static analysis warning and provide uniformity.

-                            "sha1": hashlib.sha1(sf["content"].encode("utf-8")).hexdigest(),
+                            "sha256": hashlib.sha256(sf["content"].encode("utf-8")).hexdigest(),
flashinfer_bench/apply/runtime.py (1)

184-186: Redundant definition check in miss policy path.

At line 185, the check if definition and solution is partially redundant. The definition variable was already confirmed to be non-None at line 160 (otherwise we'd have returned at line 163). Only the solution check is needed here.

                 solution = self._trace_set.get_solution(best_sol_name)
-                if definition and solution:
+                if solution:
                     runnable = BuilderRegistry.get_instance().build(definition, solution)
flashinfer_bench/compile/registry.py (2)

109-109: Variable name shadows builtin hash.

The variable hash shadows Python's builtin function. Consider renaming to solution_hash or cache_key.

-        hash = solution.hash()
-        if hash in self._cache:
-            return self._cache[hash]
+        cache_key = solution.hash()
+        if cache_key in self._cache:
+            return self._cache[cache_key]
 
         for builder in self._builders:
             # Choose the first builder that can build the solution
             if builder.can_build(solution):
                 runnable = builder.build(definition, solution)
-                self._cache[hash] = runnable
+                self._cache[cache_key] = runnable
                 return runnable

163-167: Consider logging cleanup failures.

While silently ignoring cleanup errors is reasonable to ensure all runnables are processed, completely silent failures may hide bugs. Consider logging at debug level for diagnosability.

+import logging
+
+logger = logging.getLogger(__name__)
+
         for runnable in self._cache.values():
             try:
                 runnable.cleanup()
-            except Exception:
-                pass
+            except Exception as e:
+                logger.debug("Cleanup failed for runnable: %s", e)
         self._cache.clear()
tests/compile/test_builder.py (1)

35-62: Test assertions don't verify caching behavior.

The test is named test_builder_cache_and_key but only asserts that both build results are not None. Since caching is now handled by BuilderRegistry (not individual builders), consider either:

  1. Renaming the test to reflect what it actually tests (e.g., test_builder_build_returns_runnable)
  2. Or updating assertions to verify the Runnable properties (e.g., metadata fields)
     r1 = builder.build(definition, solution)
     r2 = builder.build(definition, solution)
-    # Both builds return Runnable objects
-    assert r1 is not None
-    assert r2 is not None
+    # Both builds return valid Runnable objects
+    assert r1.metadata.build_type == "python"
+    assert r1.metadata.definition == "test_def"
+    assert r1.metadata.solution == "s1"
+    assert r2 is not None
flashinfer_bench/compile/builders/torch_builder.py (2)

158-175: Consider type safety for raw_callable attribute.

Line 174 dynamically assigns result.raw_callable = callable, but this attribute isn't defined on the Runnable class. While this works in Python, it bypasses type checking and may confuse IDE tooling.

Consider adding raw_callable to RunnableMetadata.misc instead:

         metadata = RunnableMetadata(
             build_type="torch",
             definition=definition.name,
             solution=solution.name,
-            misc={"entry_symbol": symbol, "binary": getattr(ext, "__file__", None)},
+            misc={"entry_symbol": symbol, "binary": getattr(ext, "__file__", None), "raw_callable": callable},
         )
 
         cleaner = self._get_cleaner(build_dir)
 
         result = Runnable(callable=kwarg_adapter, metadata=metadata, cleaner=cleaner)
-        result.raw_callable = callable
         return result

269-270: Consider using list unpacking for clarity.

Minor style improvement as suggested by static analysis.

-                    elif sys.platform == "win32":
-                        ldflags = [f"/LIBPATH:{lib_path}"] + lib_names
+                    elif sys.platform == "win32":
+                        ldflags = [f"/LIBPATH:{lib_path}", *lib_names]
flashinfer_bench/bench/runner/isolated_runner.py (1)

171-174: Consider using logging.exception for consistency.

Line 172 uses LOGGER.error with exc_info=True, while line 174 does the same pattern. Per static analysis, LOGGER.exception is semantically clearer when logging exceptions with tracebacks.

-            LOGGER.error("Worker crashed (EOF) running %s: %s", solution.name, e)
+            LOGGER.exception("Worker crashed (EOF) running %s", solution.name)
         except Exception:
-            LOGGER.error("Unknown error running %s", solution.name, exc_info=True)
+            LOGGER.exception("Unknown error running %s", solution.name)
flashinfer_bench/compile/runnable.py (1)

47-66: Parameter callable shadows Python builtin.

Using callable as a parameter name shadows the builtin callable() function. While this works, it can cause confusion or issues if the builtin is needed within the class.

     def __init__(
         self,
-        callable: Callable[..., Any],
+        fn: Callable[..., Any],
         metadata: RunnableMetadata,
         cleaner: Optional[Callable[[], None]] = None,
     ) -> None:
         """Constructor for the Runnable class.

         Parameters
         ----------
-        callable : Callable[..., Any]
+        fn : Callable[..., Any]
             The callable that is wrapped by the runnable.
         metadata : RunnableMetadata
             The metadata for the runnable.
         cleaner : Optional[Callable[[], None]]
             The cleaner function for the runnable. It will clean up the build artifacts/resources.
         """
-        self._callable = callable
+        self._callable = fn
         self.metadata = metadata
         self._cleaner = cleaner
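A toy illustration of why the rename helps: with the parameter named `fn`, the builtin `callable()` stays usable for validation inside the function (the `wrap` helper below is hypothetical, not part of the codebase).

```python
def wrap(fn):
    # With the parameter named `fn`, the builtin callable() is still
    # reachable; a parameter named `callable` would shadow it here and
    # make this check impossible without workarounds.
    if not callable(fn):
        raise TypeError("expected a callable")
    return fn
```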
flashinfer_bench/compile/builders/python_builder.py (1)

46-81: Consider logging the exception for debuggability.

The try-except-pass pattern at lines 71-72 silently swallows module unload failures. While acceptable for cleanup, logging at debug level would help diagnose issues.

+import logging
+
+logger = logging.getLogger(__name__)
+
 ...
         def cleaner() -> None:
             try:
                 # Unload module and submodules
                 to_delete = [m for m in sys.modules if m == package or m.startswith(package + ".")]
                 for m in to_delete:
                     sys.modules.pop(m, None)
-            except Exception:
-                pass
+            except Exception as e:
+                logger.debug("Failed to unload module %s: %s", package, e)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d11e917 and 98356f2.

📒 Files selected for processing (42)
  • examples/fi_gqa_e2e_example.py (4 hunks)
  • flashinfer_bench/apply/key.py (5 hunks)
  • flashinfer_bench/apply/runtime.py (2 hunks)
  • flashinfer_bench/apply/table.py (7 hunks)
  • flashinfer_bench/bench/benchmark.py (5 hunks)
  • flashinfer_bench/bench/evaluators/default.py (6 hunks)
  • flashinfer_bench/bench/evaluators/evaluator.py (4 hunks)
  • flashinfer_bench/bench/evaluators/lowbit.py (1 hunks)
  • flashinfer_bench/bench/evaluators/registry.py (1 hunks)
  • flashinfer_bench/bench/evaluators/sampling.py (9 hunks)
  • flashinfer_bench/bench/runner/isolated_runner.py (16 hunks)
  • flashinfer_bench/bench/runner/persistent_runner.py (17 hunks)
  • flashinfer_bench/bench/runner/runner.py (2 hunks)
  • flashinfer_bench/bench/utils.py (6 hunks)
  • flashinfer_bench/cli/main.py (1 hunks)
  • flashinfer_bench/compile/builder.py (1 hunks)
  • flashinfer_bench/compile/builders/python_builder.py (1 hunks)
  • flashinfer_bench/compile/builders/torch_builder.py (1 hunks)
  • flashinfer_bench/compile/builders/triton_builder.py (1 hunks)
  • flashinfer_bench/compile/builders/tvm_ffi_builder.py (9 hunks)
  • flashinfer_bench/compile/registry.py (1 hunks)
  • flashinfer_bench/compile/runnable.py (2 hunks)
  • flashinfer_bench/compile/utils.py (1 hunks)
  • flashinfer_bench/data/solution.py (5 hunks)
  • flashinfer_bench/data/trace_set.py (2 hunks)
  • pyproject.toml (2 hunks)
  • tests/apply/test_runtime.py (3 hunks)
  • tests/bench/test_benchmark.py (3 hunks)
  • tests/bench/test_evaluator.py (14 hunks)
  • tests/bench/test_isolated_runner.py (3 hunks)
  • tests/bench/test_persistent_runner.py (11 hunks)
  • tests/compile/test_builder.py (2 hunks)
  • tests/compile/test_cuda_builder.py (0 hunks)
  • tests/compile/test_python_builder.py (4 hunks)
  • tests/compile/test_torch_builder.py (1 hunks)
  • tests/compile/test_triton_builder.py (4 hunks)
  • tests/compile/test_tvm_ffi_builder.py (18 hunks)
  • tests/compile/test_utils.py (1 hunks)
  • tests/conftest.py (1 hunks)
  • tests/data/test_definition.py (6 hunks)
  • tests/data/test_read_traces.py (1 hunks)
  • tests/data/test_solution.py (1 hunks)
💤 Files with no reviewable changes (1)
  • tests/compile/test_cuda_builder.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • flashinfer_bench/bench/benchmark.py
  • tests/data/test_solution.py
  • tests/compile/test_triton_builder.py
  • flashinfer_bench/compile/utils.py
🧰 Additional context used
🧬 Code graph analysis (19)
tests/bench/test_evaluator.py (3)
tests/bench/test_persistent_runner.py (1)
  • _simple_def (42-50)
flashinfer_bench/bench/evaluators/sampling.py (1)
  • SamplingEvaluator (27-212)
flashinfer_bench/bench/evaluators/evaluator.py (1)
  • evaluate (65-106)
tests/conftest.py (2)
flashinfer_bench/compile/builder.py (1)
  • is_available (48-59)
flashinfer_bench/compile/builders/torch_builder.py (1)
  • is_available (42-54)
flashinfer_bench/bench/evaluators/evaluator.py (3)
flashinfer_bench/bench/evaluators/default.py (2)
  • can_evaluate (35-36)
  • build_baseline (39-91)
flashinfer_bench/bench/evaluators/lowbit.py (1)
  • can_evaluate (21-22)
flashinfer_bench/bench/evaluators/sampling.py (2)
  • can_evaluate (31-32)
  • build_baseline (36-82)
flashinfer_bench/cli/main.py (1)
flashinfer_bench/data/json_utils.py (1)
  • save_json_file (11-26)
flashinfer_bench/bench/evaluators/default.py (5)
flashinfer_bench/compile/registry.py (3)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
  • build_reference (121-154)
flashinfer_bench/compile/runnable.py (1)
  • Runnable (31-170)
flashinfer_bench/bench/evaluators/sampling.py (1)
  • can_evaluate (31-32)
flashinfer_bench/data/trace.py (1)
  • Workload (56-70)
flashinfer_bench/utils.py (1)
  • dtype_str_to_torch_dtype (39-45)
flashinfer_bench/compile/registry.py (7)
flashinfer_bench/data/solution.py (5)
  • BuildSpec (64-97)
  • Solution (100-213)
  • SourceFile (30-61)
  • SupportedLanguages (13-27)
  • hash (186-213)
flashinfer_bench/compile/builder.py (5)
  • Builder (20-120)
  • BuildError (16-17)
  • is_available (48-59)
  • build (78-101)
  • can_build (62-75)
flashinfer_bench/compile/builders/python_builder.py (4)
  • PythonBuilder (19-150)
  • is_available (38-40)
  • build (83-150)
  • can_build (42-44)
flashinfer_bench/compile/builders/torch_builder.py (4)
  • TorchBuilder (20-175)
  • is_available (42-54)
  • build (97-175)
  • can_build (56-61)
flashinfer_bench/compile/builders/triton_builder.py (4)
  • TritonBuilder (14-82)
  • is_available (32-44)
  • build (62-82)
  • can_build (46-60)
flashinfer_bench/compile/runnable.py (2)
  • Runnable (31-170)
  • cleanup (159-170)
tests/compile/test_builder.py (3)
  • is_available (19-20)
  • build (25-32)
  • can_build (22-23)
flashinfer_bench/bench/runner/runner.py (1)
flashinfer_bench/data/definition.py (1)
  • Definition (95-363)
flashinfer_bench/bench/evaluators/sampling.py (6)
flashinfer_bench/compile/registry.py (3)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
  • build_reference (121-154)
flashinfer_bench/compile/runnable.py (1)
  • Runnable (31-170)
flashinfer_bench/bench/evaluators/default.py (2)
  • can_evaluate (35-36)
  • build_baseline (39-91)
flashinfer_bench/bench/evaluators/evaluator.py (2)
  • can_evaluate (26-26)
  • build_baseline (30-37)
flashinfer_bench/bench/runner/runner.py (1)
  • DeviceBaseline (26-32)
flashinfer_bench/utils.py (1)
  • dtype_str_to_torch_dtype (39-45)
tests/bench/test_isolated_runner.py (3)
flashinfer_bench/data/solution.py (2)
  • SupportedLanguages (13-27)
  • Solution (100-213)
flashinfer_bench/bench/runner/isolated_runner.py (3)
  • SubprocessWorker (27-204)
  • run_ref (44-60)
  • run_solution (62-198)
flashinfer_bench/bench/config.py (1)
  • BenchmarkConfig (8-64)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (5)
flashinfer_bench/compile/builder.py (2)
  • build (78-101)
  • get_package_name_and_build_path (103-120)
flashinfer_bench/compile/runnable.py (2)
  • Runnable (31-170)
  • RunnableMetadata (13-28)
flashinfer_bench/compile/utils.py (1)
  • write_sources_to_path (12-55)
flashinfer_bench/data/definition.py (1)
  • Definition (95-363)
flashinfer_bench/data/solution.py (1)
  • Solution (100-213)
flashinfer_bench/bench/evaluators/registry.py (4)
flashinfer_bench/bench/evaluators/default.py (1)
  • can_evaluate (35-36)
flashinfer_bench/bench/evaluators/evaluator.py (1)
  • can_evaluate (26-26)
flashinfer_bench/bench/evaluators/lowbit.py (1)
  • can_evaluate (21-22)
flashinfer_bench/bench/evaluators/sampling.py (1)
  • can_evaluate (31-32)
flashinfer_bench/compile/builders/torch_builder.py (4)
flashinfer_bench/compile/builder.py (4)
  • Builder (20-120)
  • can_build (62-75)
  • build (78-101)
  • get_package_name_and_build_path (103-120)
flashinfer_bench/compile/runnable.py (2)
  • Runnable (31-170)
  • RunnableMetadata (13-28)
flashinfer_bench/compile/utils.py (1)
  • write_sources_to_path (12-55)
flashinfer_bench/data/solution.py (3)
  • SupportedLanguages (13-27)
  • get_entry_path (145-156)
  • get_entry_symbol (158-170)
flashinfer_bench/apply/table.py (1)
flashinfer_bench/compile/registry.py (3)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
  • build (77-82)
flashinfer_bench/bench/runner/persistent_runner.py (5)
flashinfer_bench/compile/registry.py (3)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
  • build (77-82)
flashinfer_bench/data/definition.py (1)
  • Definition (95-363)
flashinfer_bench/bench/evaluators/registry.py (1)
  • resolve_evaluator (18-24)
flashinfer_bench/compile/builders/python_builder.py (1)
  • build (83-150)
flashinfer_bench/compile/builders/torch_builder.py (1)
  • build (97-175)
flashinfer_bench/bench/evaluators/lowbit.py (1)
flashinfer_bench/bench/evaluators/evaluator.py (2)
  • can_evaluate (26-26)
  • check_correctness (41-50)
tests/bench/test_benchmark.py (1)
flashinfer_bench/data/json_utils.py (1)
  • save_json_file (11-26)
flashinfer_bench/bench/runner/isolated_runner.py (3)
flashinfer_bench/compile/registry.py (3)
  • BuilderRegistry (17-168)
  • get_instance (56-74)
  • build (77-82)
flashinfer_bench/compile/runnable.py (1)
  • Runnable (31-170)
flashinfer_bench/data/solution.py (1)
  • Solution (100-213)
tests/bench/test_persistent_runner.py (3)
flashinfer_bench/data/solution.py (2)
  • SourceFile (30-61)
  • Solution (100-213)
flashinfer_bench/bench/runner/persistent_runner.py (2)
  • run_solution (264-400)
  • run_workload (508-646)
flashinfer_bench/bench/runner/runner.py (1)
  • run_workload (39-46)
tests/compile/test_python_builder.py (5)
tests/compile/test_torch_builder.py (1)
  • _use_tmp_cache_dir (21-22)
flashinfer_bench/data/solution.py (3)
  • Solution (100-213)
  • BuildSpec (64-97)
  • SourceFile (30-61)
flashinfer_bench/compile/builders/python_builder.py (2)
  • PythonBuilder (19-150)
  • build (83-150)
flashinfer_bench/compile/registry.py (1)
  • build (77-82)
tests/compile/test_builder.py (1)
  • build (25-32)
🪛 GitHub Actions: .github/workflows/linting.yaml
examples/fi_gqa_e2e_example.py

[error] 1-1: isort: files were modified by this hook during pre-commit. Fixing the import order.


[error] 1-1: pre-commit isort hook failed (exit code 1) after making changes.

🪛 Ruff (0.14.7)
tests/conftest.py

23-23: Unused function argument: config

(ARG001)

flashinfer_bench/cli/main.py

86-86: Loop control variable def_name not used within loop body

(B007)

flashinfer_bench/bench/evaluators/default.py

35-35: Unused class method argument: definition

(ARG003)


96-96: Unused class method argument: definition

(ARG003)

flashinfer_bench/compile/registry.py

51-51: Avoid specifying long messages outside the exception class

(TRY003)


119-119: Avoid specifying long messages outside the exception class

(TRY003)


166-167: try-except-pass detected, consider logging the exception

(S110)


166-166: Do not catch blind exception: Exception

(BLE001)

tests/compile/test_builder.py

22-22: Unused method argument: solution

(ARG002)

flashinfer_bench/compile/builders/tvm_ffi_builder.py

60-60: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)


183-185: Avoid specifying long messages outside the exception class

(TRY003)


268-270: Avoid specifying long messages outside the exception class

(TRY003)


294-294: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/bench/evaluators/registry.py

24-24: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/builders/torch_builder.py

125-128: Avoid specifying long messages outside the exception class

(TRY003)


151-151: Avoid specifying long messages outside the exception class

(TRY003)


156-156: Avoid specifying long messages outside the exception class

(TRY003)


270-270: Consider [f"/LIBPATH:{lib_path}", *lib_names] instead of concatenation

Replace with [f"/LIBPATH:{lib_path}", *lib_names]

(RUF005)


272-272: Do not catch blind exception: Exception

(BLE001)


309-312: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/apply/table.py

329-329: Probable use of insecure hash functions in hashlib: sha1

(S324)

flashinfer_bench/data/solution.py

202-202: Probable use of insecure hash functions in hashlib: sha1

(S324)

flashinfer_bench/bench/runner/persistent_runner.py

269-269: Avoid specifying long messages outside the exception class

(TRY003)


393-393: Do not catch blind exception: Exception

(BLE001)

flashinfer_bench/bench/utils.py

173-173: Do not catch blind exception: Exception

(BLE001)


174-174: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


174-174: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/runnable.py

126-129: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/apply/key.py

85-85: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/builders/python_builder.py

71-72: try-except-pass detected, consider logging the exception

(S110)


71-71: Do not catch blind exception: Exception

(BLE001)


111-111: Avoid specifying long messages outside the exception class

(TRY003)


129-129: Avoid specifying long messages outside the exception class

(TRY003)


135-137: Avoid specifying long messages outside the exception class

(TRY003)


141-141: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/bench/runner/isolated_runner.py

172-172: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

flashinfer_bench/apply/runtime.py

162-162: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.12
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.11
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.10
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.13
🔇 Additional comments (103)
flashinfer_bench/cli/main.py (1)

75-85: Variable rename in definitions loop looks good

Using definition instead of a shorter name improves readability, and writing to definitions/{definition.name}.json remains consistent with the existing TraceSet / save_json_file contract. No functional change or regression here.

flashinfer_bench/data/trace_set.py (1)

376-377: LGTM! Naming consistency improvement.

The variable renames from defn to definition align with the project-wide naming convention updates and improve code readability.

Also applies to: 469-470

tests/data/test_read_traces.py (1)

93-93: LGTM! Consistent filename update.

The filename change from "sol.json" to "solution.json" aligns with the broader naming convention updates in the PR.

Also applies to: 99-99

pyproject.toml (4)

27-27: LGTM! More flexible version constraint.

Changing from torch==2.8.0 to torch>=2.8.0 allows compatibility with newer PyTorch versions while maintaining the minimum required version.


31-31: LGTM! New dependency for TVM-FFI support.

The addition of apache-tvm-ffi>=0.1.2 aligns with the expanded builder ecosystem introduced in this PR.


97-97: LGTM! New pytest marker for CUDA tests.

The requires_torch_cuda marker enables conditional test execution based on CUDA availability, improving test reliability across different environments.


101-101: LGTM! Deterministic versioning fallback.

The fallback_version configuration ensures deterministic versioning in CI/test environments where git tags may not be available.

examples/fi_gqa_e2e_example.py (2)

22-22: LGTM! Improved variable naming.

The rename from ts to trace_set improves code clarity and makes the variable purpose more explicit.

Also applies to: 26-26, 83-87


42-42: LGTM! Aligned with CUDA-first testing strategy.

The change from "gpu" to "cuda" aligns with the broader shift to CUDA-specific device targeting throughout the codebase.

tests/bench/test_isolated_runner.py (2)

194-194: LGTM! CUDA-specific device targeting.

The change from ["gpu"] to ["cuda"] aligns with the project-wide shift to CUDA-specific device targeting and is consistent with the new requires_torch_cuda pytest marker.

Also applies to: 224-224


234-234: LGTM! Improved variable naming.

The rename from sol to solution enhances code readability and consistency with the broader naming convention updates.

Also applies to: 241-241

tests/conftest.py (3)

7-21: LGTM! CUDA availability detection.

The _torch_cuda_available() helper properly handles ImportError and checks CUDA availability, enabling conditional test execution.


23-31: LGTM! Automatic CUDA test skipping.

The pytest collection hook automatically skips tests marked with requires_torch_cuda when CUDA is unavailable, improving test reliability across different environments.

Note: The static analysis warning about unused config parameter is a false positive—pytest hooks require this signature.


34-42: LGTM! Test isolation via temporary cache.

The tmp_cache_dir fixture ensures each test uses an isolated cache directory, preventing cache pollution between tests.

tests/bench/test_benchmark.py (2)

218-218: LGTM! CUDA-specific device targeting.

The change from ["gpu"] to ["cuda"] aligns with the project-wide CUDA-first testing strategy and is consistent with the new pytest marker infrastructure.

Also applies to: 233-233


245-246: LGTM! Improved variable naming.

The rename from sol to solution in the loop enhances code readability and consistency.

flashinfer_bench/bench/evaluators/registry.py (1)

18-24: LGTM! Parameter naming consistency.

The rename from defn to definition improves code clarity and aligns with the project-wide naming convention updates.

Note: The static analysis suggestion about exception message length is stylistic and doesn't affect functionality.

flashinfer_bench/bench/runner/runner.py (1)

25-32: Rename to definition is consistent with upstream API

The dataclass field and abstract method parameter renames align Runner and DeviceBaseline with the rest of the codebase (definition terminology) without changing behavior. Call sites (e.g., persistent runner) now read more clearly and type hints remain correct.

Also applies to: 38-46

flashinfer_bench/bench/evaluators/lowbit.py (1)

18-23: Evaluator signature now matches base interface

Renaming the parameters to definition in can_evaluate and check_correctness keeps LowBitEvaluator consistent with the abstract Evaluator API and other evaluators; behavior remains unchanged.

Also applies to: 26-35

flashinfer_bench/apply/table.py (4)

10-10: LGTM on the import update.

The import correctly aligns with the new public API surface, importing BuilderRegistry and Runnable from the compile module.


118-125: LGTM on the cached load path.

The registry usage is correct: retrieving the singleton instance once and reusing it for all builds. The null checks for definition and solution before calling build prevent potential errors.


172-197: LGTM on the build method.

The registry is correctly obtained once and reused for all builds within the method. The conditional build at line 196 properly respects the on_miss_policy setting.


270-293: LGTM on the prewarm AOT method.

The registry usage follows the same efficient pattern. The dual-path warming (ranked solutions by cutoff and def_best fallback) is correctly implemented with proper null checks.

flashinfer_bench/compile/builders/triton_builder.py (2)

31-44: LGTM on availability check.

Standard pattern for checking optional dependency availability via import.


62-81: LGTM on build delegation.

The build method correctly delegates to PythonBuilder.build() and customizes the metadata with build_type="triton". This is an appropriate pattern for specializing the parent builder's behavior.

tests/compile/test_tvm_ffi_builder.py (7)

75-78: LGTM on autouse fixture.

The autouse fixture for tmp_cache_dir correctly ensures test isolation by using a temporary cache directory for all tests in this module.


80-118: LGTM on CPU test.

The test correctly validates the CPU kernel build path with DPS-style invocation. The use of {"type": "var"} for dynamic axis is appropriate.


121-160: LGTM on CUDA test.

The test is correctly marked with requires_torch_cuda and validates the CUDA kernel build and execution path.


163-207: LGTM on can_build tests.

Good coverage of the builder's capability detection, including the new C++ solution test case.


210-310: LGTM on caching tests.

The caching tests validate both same-builder and cross-builder cache reuse, which is important for verifying the new registry's caching behavior.


313-354: LGTM on call_value_returning test.

This test validates the alternative value-returning invocation style, which is useful for kernels that allocate their own outputs.


356-467: LGTM on error handling tests.

Good coverage of error cases: invalid entry point, missing sources, and subdirectory handling.

flashinfer_bench/apply/runtime.py (2)

13-13: LGTM on import update.

The import correctly reflects the new BuilderRegistry API.


166-170: LGTM on key builder caching.

The key builder caching correctly uses definition.name for the cache key, consistent with the variable naming updates.

flashinfer_bench/compile/registry.py (3)

13-14: LGTM on builder priority list.

The priority order (TritonBuilder → PythonBuilder → TVMFFIBuilder → TorchBuilder) correctly routes specialized builders before general ones.


37-53: LGTM on initialization.

Proper validation ensures the registry always has at least one builder. The defensive copy of the builders list (list(builders)) prevents external mutation.


55-74: LGTM on singleton accessor.

The lazy initialization correctly filters builders by availability. If no builders are available, the __init__ validation will raise a clear error, which is appropriate fail-fast behavior.
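
A minimal sketch of the lazy-singleton pattern described here, with hypothetical parameter names; the real `BuilderRegistry` also wires in build caching and reference building:

```python
import threading


class BuilderRegistry:
    _instance = None
    _lock = threading.Lock()

    def __init__(self, builders):
        if not builders:
            raise ValueError("BuilderRegistry requires at least one builder")
        # Defensive copy so callers cannot mutate the registry's list
        self._builders = list(builders)

    @classmethod
    def get_instance(cls, candidate_builders=()):
        # Lazily construct the process-wide instance from the builders
        # whose is_available() environment check passes.
        with cls._lock:
            if cls._instance is None:
                available = [b for b in candidate_builders if b.is_available()]
                cls._instance = cls(available)
            return cls._instance
```

If no candidate passes `is_available()`, `__init__` raises immediately, which is the fail-fast behavior noted above.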

flashinfer_bench/data/solution.py (2)

24-25: LGTM!

The addition of CPP language variant aligns well with the new TorchBuilder that handles both CPP and CUDA languages.


145-170: LGTM!

The get_entry_path() and get_entry_symbol() helper methods provide clean entry-point introspection. The implementation correctly parses the {file_path}::{function_name} format.
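
A sketch of that parsing logic (the function name here is illustrative; the real methods also validate the path against the solution's sources):

```python
def split_entry_point(entry_point: str) -> tuple[str, str]:
    """Split "path/to/file.py::symbol" into its path and symbol parts."""
    path, sep, symbol = entry_point.partition("::")
    if not sep or not path or not symbol:
        raise ValueError(f"Invalid entry point: {entry_point!r}")
    return path, symbol
```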

tests/compile/test_python_builder.py (3)

19-22: LGTM!

The autouse fixture pattern is clean and ensures consistent cache directory setup across all tests in the module.


24-55: LGTM!

The test properly validates the Python builder workflow: creating a Definition and Solution, building via PythonBuilder, and invoking the returned runnable with torch tensors.


58-95: LGTM!

Good coverage of tensor addition with proper dtype specification and torch.allclose assertion.

tests/compile/test_builder.py (1)

17-32: LGTM!

DummyBuilder correctly implements the new Builder interface with is_available(), can_build(), and build() methods. The Runnable construction with RunnableMetadata follows the expected pattern.

flashinfer_bench/bench/runner/persistent_runner.py (5)

19-19: LGTM!

The import correctly switches to the new BuilderRegistry singleton pattern.


68-68: LGTM!

Registry initialization correctly uses the singleton pattern via BuilderRegistry.get_instance().


246-262: LGTM!

The run_ref method correctly uses the renamed definition parameter and passes it through to the evaluator.


285-294: LGTM!

The message payload correctly uses the new field names (definition, solution, solution_name) for IPC with the worker process.


691-725: LGTM!

The worker main loop correctly extracts definition and solution from the message and uses registry.build(definition, solution) to build the runnable.

flashinfer_bench/compile/builders/torch_builder.py (3)

41-54: LGTM!

The is_available() check properly verifies both PyTorch installation and CUDA availability.


56-61: LGTM!

The can_build() method correctly identifies solutions with CUDA or CPP languages.


178-183: LGTM on the TODO marker.

The DependencyManager is appropriately marked as disabled with a TODO for future work. This is a reasonable approach for incremental development.

flashinfer_bench/bench/evaluators/evaluator.py (2)

26-50: LGTM! Parameter naming is now consistent with subclass implementations.

The rename from defn to definition aligns with the concrete implementations in lowbit.py, sampling.py, and default.py shown in the relevant snippets, improving API clarity and consistency.


64-106: LGTM!

The evaluate method correctly propagates the renamed definition parameter to check_correctness.

flashinfer_bench/apply/key.py (5)

15-15: LGTM! Type annotation simplified.

Flattening Union[int, Union[float, bool]] to Union[int, float, bool] is cleaner and semantically equivalent.


42-45: LGTM!

Parameter and attribute rename from defn to definition improves readability.


62-86: LGTM!

Method parameter and internal references consistently renamed.


88-102: Good defensive check added.

The new bounds validation at lines 97-100 prevents an IndexError when dim_idx >= len(shape), providing a clearer error message instead of a cryptic index error.
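
The defensive pattern reads roughly like this sketch (names are illustrative, not the actual code in flashinfer_bench/apply/key.py):

```python
def checked_dim(shape, dim_idx, tensor_name="tensor"):
    """Return shape[dim_idx], raising a descriptive error when out of range."""
    if dim_idx >= len(shape):
        raise ValueError(
            f"dim_idx {dim_idx} is out of range for {tensor_name} "
            f"with shape {tuple(shape)}"
        )
    return shape[dim_idx]
```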


147-149: LGTM!

specialize method parameter consistently renamed to definition.

flashinfer_bench/bench/runner/isolated_runner.py (5)

17-17: LGTM!

Import updated to use the new BuilderRegistry singleton class instead of the removed get_builder_registry function.


44-60: LGTM!

Parameter and call sites consistently renamed from defn to definition.


62-92: LGTM!

Parameter renamed from sol to solution with docstring and internal references updated accordingly.


207-248: LGTM!

Worker main function updated to use:

  • Renamed parameters (definition, solution)
  • New BuilderRegistry.get_instance() singleton access pattern
  • Updated registry.build(definition, solution) API call

414-488: LGTM!

run_workload method consistently updated with renamed definition parameter and propagated through baseline building and solution evaluation flows.

flashinfer_bench/compile/runnable.py (4)

13-28: LGTM! Well-structured metadata model.

The RunnableMetadata class provides clear documentation and type safety for build metadata. The Union[Literal[...], str] for build_type allows extensibility while still documenting known types.


68-89: LGTM!

The __call__ method correctly executes the underlying callable and unpacks single-element tuples for convenience.


91-157: LGTM! Well-designed DPS interface.

The call_value_returning method:

  • Validates that definition exists in metadata
  • Allocates output tensors based on the definition's output shapes and dtypes
  • Enforces single-device constraint for inputs
  • Returns appropriately typed results (None, single tensor, or tuple)
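
The destination-passing-style flow can be illustrated without torch; here a caller-supplied allocator stands in for the definition-driven output-tensor allocation, and all names are hypothetical:

```python
def call_value_returning(fn, alloc_outputs, *inputs):
    """Allocate outputs, invoke the DPS callable as fn(*inputs, *outputs),
    then unwrap: None for no outputs, the value itself for one output,
    and a tuple otherwise."""
    outputs = alloc_outputs()
    fn(*inputs, *outputs)
    if not outputs:
        return None
    return outputs[0] if len(outputs) == 1 else tuple(outputs)
```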

159-170: LGTM!

The cleanup method is correctly implemented as idempotent with proper exception handling.

flashinfer_bench/compile/builder.py (3)

46-59: LGTM!

The is_available static abstract method provides a clean pattern for environment capability checks.


61-101: LGTM!

The can_build and build abstract methods are well-documented with clear parameter descriptions and expected behavior.
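
The interface these methods form can be condensed into an abstract base class; this is a sketch of the shape, not the full signature set:

```python
from abc import ABC, abstractmethod


class Builder(ABC):
    @staticmethod
    @abstractmethod
    def is_available() -> bool:
        """Report whether this builder's toolchain exists in the environment."""

    @abstractmethod
    def can_build(self, solution) -> bool:
        """Report whether this builder handles the solution's language."""

    @abstractmethod
    def build(self, definition, solution):
        """Compile the solution and return a Runnable-like object."""
```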


103-120: LGTM!

The get_package_name_and_build_path method provides a consistent way to derive deterministic package names and build paths for solutions.
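
A sketch of how such a deterministic name might be derived; the actual scheme is defined by `Solution.hash`, so the hash algorithm, prefix convention, and truncation length here are assumptions:

```python
import hashlib
from pathlib import Path


def get_package_name_and_build_path(prefix, solution_name, sources, cache_root):
    """Derive a stable package name from the solution's source content so
    identical solutions map to the same build directory."""
    digest = hashlib.sha256(
        "".join(f"{path}\0{content}" for path, content in sorted(sources)).encode()
    ).hexdigest()[:12]
    package = f"{prefix}{solution_name}_{digest}"
    return package, Path(cache_root) / package
```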

flashinfer_bench/bench/utils.py (6)

119-120: LGTM!

Parameter renamed from defn to definition for consistency. The logic correctly checks for op_type attribute.


123-128: LGTM!

Parameter renamed consistently to definition. Function signature now aligns with other utilities in this file.


143-146: LGTM!

Internal references updated correctly to use definition.outputs for extracting output names and dtypes.


194-194: LGTM!

Internal reference updated correctly to use definition.inputs[name].dtype.


206-216: LGTM!

Function signature and internal usage updated consistently. The parameter ordering (definition, wl, device, stensors) is clear and the body correctly references definition.get_input_shapes and definition.inputs.


230-230: LGTM!

Correctly passes definition to the renamed is_sampling_operation function.

flashinfer_bench/compile/builders/python_builder.py (3)

19-35: LGTM!

Class docstring and constants are well-documented. The _PACKAGE_PREFIX with fib_ prefix correctly avoids Python import collisions.


37-44: LGTM!

is_available() returns True (Python is always available) and can_build() correctly checks for Python language.


83-150: LGTM!

The build() method is well-structured:

  • Validates .py extension
  • Uses helper methods consistently (get_package_name_and_build_path, write_sources_to_path)
  • Properly cleans up on failure in all error paths
  • Creates RunnableMetadata with appropriate fields
  • Returns Runnable with keyword arguments

The past review comment about redundant build_path computation has been addressed - the value from get_package_name_and_build_path is now used directly.
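
The dynamic import at the heart of the Python build path can be sketched with stdlib machinery; the package name and directory layout here are illustrative:

```python
import importlib.util
import sys
from pathlib import Path


def load_built_package(package: str, build_path: Path):
    """Import a freshly written package from its build directory so the
    entry symbol can be looked up on the returned module."""
    spec = importlib.util.spec_from_file_location(package, build_path / "__init__.py")
    module = importlib.util.module_from_spec(spec)
    # Register before exec so intra-package imports resolve
    sys.modules[package] = module
    spec.loader.exec_module(module)
    return module
```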

flashinfer_bench/bench/evaluators/default.py (6)

19-19: LGTM!

Import updated to use the new centralized BuilderRegistry and Runnable from flashinfer_bench.compile.


34-36: LGTM!

Parameter renamed to definition. The static analysis hint about unused argument is a false positive - this is an interface method where subclasses may use the parameter.


38-55: LGTM!

The build_baseline method correctly:

  • Uses definition.outputs for dtype extraction
  • Calls BuilderRegistry.get_instance().build_reference(definition)
  • Passes definition to load_safetensors

60-73: LGTM!

Internal calls updated consistently to pass definition to gen_inputs and use definition.outputs.keys() for output names.


84-91: LGTM!

DeviceBaseline construction updated to use definition=definition, aligning with the renamed field in the dataclass.


93-103: LGTM!

Parameter renamed to definition in check_correctness. The static analysis hint about unused argument is a false positive - definition is part of the interface signature and may be used by subclasses (as seen in SamplingEvaluator).

flashinfer_bench/bench/evaluators/sampling.py (7)

21-21: LGTM!

Import updated to use centralized BuilderRegistry and Runnable.


29-32: LGTM!

can_evaluate correctly delegates to is_sampling_op(definition).


36-82: LGTM!

build_baseline correctly updated:

  • Uses BuilderRegistry.get_instance().build_reference(definition)
  • Passes definition to load_safetensors, gen_inputs, and _detect_thresholding_method
  • DeviceBaseline uses definition=definition

84-108: LGTM!

check_correctness signature and internal usage correctly updated to use definition. Output names and dtypes correctly derived from definition.outputs.


169-172: LGTM!

Correctly passes definition to _sample_token_distributions.


215-228: LGTM!

Helper functions is_sampling_op and _detect_thresholding_method correctly renamed parameter to definition.


291-346: LGTM!

_sample_token_distributions correctly updated:

  • Parameter renamed to definition
  • Internal usage correctly references definition.outputs
flashinfer_bench/compile/builders/tvm_ffi_builder.py (11)

1-16: LGTM!

Module docstring and imports are well-organized. The builder now uses the centralized Builder base class and Runnable/RunnableMetadata types.


19-20: LGTM!

Extension mappings made private with underscore prefix, following Python conventions.


43-54: LGTM!

Class variables and constructor properly initialize the builder with prefix and directory name. Calls super().__init__() correctly.


65-82: LGTM!

can_build() correctly checks for CUDA or C++ language support.


84-136: LGTM!

_check_sources() implements robust cache validation:

  • Checks build directory existence
  • Verifies .so file presence
  • Validates source file content matches
  • Includes defensive path traversal assertions

138-161: LGTM!

_filter_sources() correctly categorizes source files by extension into C++ and CUDA lists.


163-186: LGTM!

_get_entry_symbol() correctly extracts the symbol from entry_point format with proper validation.


188-205: LGTM!

_get_cleaner() returns a simple cleanup function that removes the build directory. Using ignore_errors=True is appropriate for cleanup operations.
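
Such a cleaner can be sketched as a closure handed to the Runnable; idempotency falls out of `ignore_errors=True` (the factory name is hypothetical):

```python
import shutil
from pathlib import Path


def make_cleaner(build_path: Path):
    """Return an idempotent callable that removes the build directory."""
    def cleaner() -> None:
        shutil.rmtree(build_path, ignore_errors=True)
    return cleaner
```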


236-270: Imports inside build() are intentional for conditional loading.

The tvm_ffi imports are placed inside build() rather than at module level because is_available() guards the builder instantiation. This is the correct pattern for optional dependencies. The past review comment suggesting moving imports to top level would break when tvm_ffi is unavailable.


279-304: LGTM!

The runnable construction is well-designed:

  • RunnableMetadata includes the full Definition object in misc for DPS support via call_value_returning()
  • kwargs_adapter correctly maps keyword arguments to positional arguments based on definition input/output order
  • Cleaner is attached for proper resource cleanup
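
The adapter described above amounts to reordering keyword arguments into the positional layout the compiled symbol expects; a sketch with hypothetical names:

```python
def make_kwargs_adapter(input_names, output_names):
    """Build an adapter that flattens keyword tensors into the
    inputs-then-outputs positional order of the compiled function."""
    order = list(input_names) + list(output_names)

    def adapter(**kwargs):
        return [kwargs[name] for name in order]

    return adapter
```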

56-63: Verify linting configuration before removing noqa: F401.

The suggestion to remove this directive assumes F401 is not enabled, but this should be confirmed against the project's actual ruff configuration in pyproject.toml or similar files before making changes. If the directive was added to suppress a CI failure, removing it could reintroduce that failure.

Comment on lines 1 to 3
import torch

import flashinfer


⚠️ Potential issue | 🟡 Minor

Fix import order to resolve pipeline failure.

The isort pre-commit hook detected incorrect import ordering. Ensure imports are sorted according to the project's configuration (standard library, third-party, local imports).

Run the following command to automatically fix the import order:

#!/bin/bash
# Fix import order using isort
isort examples/fi_gqa_e2e_example.py
🧰 Tools
🪛 GitHub Actions: .github/workflows/linting.yaml

[error] 1-1: isort: files were modified by this hook during pre-commit. Fixing the import order.


[error] 1-1: pre-commit isort hook failed (exit code 1) after making changes.

🤖 Prompt for AI Agents
In examples/fi_gqa_e2e_example.py around lines 1 to 3, the import order is
incorrect per the project's isort rules (standard library, third-party, local
imports) which breaks the pre-commit pipeline; reorder the imports accordingly
(put standard library imports first, then third-party like torch, then
local/project imports like flashinfer) or simply run isort on the file (isort
examples/fi_gqa_e2e_example.py) to automatically fix the ordering.

Signed-off-by: Ubospica <[email protected]>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
flashinfer_bench/compile/builder.py (1)

32-44: Docstring parameter name mismatch (already flagged).

Line 37 references key_prefix in the docstring, but the actual parameter name is package_prefix. This has already been noted in a previous review.

🧹 Nitpick comments (3)
tests/data/test_solution.py (1)

64-72: Consider clarifying test intent or removing unused code.

The CUDA Solution is instantiated but not assigned or asserted against. While this implicitly tests that creation doesn't raise an exception, the test intent is unclear—especially given the comment "# CUDA requires build" which suggests specific validation was expected.

Consider either:

  1. Adding a comment explaining what behavior is being validated (e.g., "# Verify CUDA solutions can be created without error"), or
  2. Removing this code block if it no longer serves a testing purpose after the requires_build() removal
flashinfer_bench/compile/builders/python_builder.py (1)

63-79: Consider more specific exception handling in cleanup.

The cleanup logic correctly unloads modules and removes build artifacts. However, the bare except Exception: pass at lines 69-70 silently suppresses all errors during module cleanup. While this might be acceptable in cleanup code, consider being more specific about which exceptions to catch or logging suppressed errors for debugging.

Optional improvement:

         def cleaner() -> None:
             try:
                 # Unload module and submodules
                 to_delete = [m for m in sys.modules if m == package or m.startswith(package + ".")]
                 for m in to_delete:
                     sys.modules.pop(m, None)
-            except Exception:
-                pass
+            except (KeyError, AttributeError):
+                # Module already unloaded or sys.modules modified
+                pass
flashinfer_bench/bench/timing.py (1)

11-38: Clarify multiprocessing lock scope vs intended “multiprocess-safe” behavior

The per-device lock registry uses multiprocessing.Lock stored in module-level globals. This gives you serialization within a single process and, on fork-based start methods, for locks created before forking. However, under spawn (default on Windows/macOS and common with torch.multiprocessing), each process will import this module separately and build its own _device_locks registry and Lock() instances, so benchmarks on the same device across different processes will not be serialized.

If true cross-process serialization is required on all start methods, you may want to:

  • Create and share the locks from the parent (e.g., via multiprocessing.Manager or by passing shared Lock instances into workers), or
  • Document that this locking is per-process and that determinism across processes depends on using fork.

Please double-check that this aligns with how your benchmark workers are launched and your platform targets.

Also applies to: 41-71
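The first option can be sketched as follows: a Manager-backed lock created in the parent and handed to workers stays valid regardless of start method (worker and function names here are illustrative, not the benchmark's actual code):

```python
import multiprocessing as mp

def worker(device_id, lock, results):
    # All workers share one parent-created lock, so work guarded by it
    # is serialized across processes
    with lock:
        results.append(device_id)

def run(n_workers: int = 3):
    # A Manager-backed lock created in the parent can be passed to
    # workers and remains valid under both fork and spawn start methods
    with mp.Manager() as manager:
        lock = manager.Lock()
        results = manager.list()
        procs = [mp.Process(target=worker, args=(i, lock, results))
                 for i in range(n_workers)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return sorted(results)

if __name__ == "__main__":
    print(run())  # → [0, 1, 2]
```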

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 98356f2 and d75ce86.

📒 Files selected for processing (5)
  • flashinfer_bench/bench/timing.py (1 hunks)
  • flashinfer_bench/compile/builder.py (1 hunks)
  • flashinfer_bench/compile/builders/python_builder.py (1 hunks)
  • tests/compile/test_utils.py (1 hunks)
  • tests/data/test_solution.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/compile/test_utils.py
🧰 Additional context used
🪛 GitHub Actions: .github/workflows/linting.yaml
flashinfer_bench/bench/timing.py

[error] 1-1: isort: Import order issues detected and auto-fixed by pre-commit hook.

🪛 Ruff (0.14.7)
flashinfer_bench/compile/builders/python_builder.py

69-70: try-except-pass detected, consider logging the exception

(S110)


69-69: Do not catch blind exception: Exception

(BLE001)


109-109: Avoid specifying long messages outside the exception class

(TRY003)


127-127: Avoid specifying long messages outside the exception class

(TRY003)


133-135: Avoid specifying long messages outside the exception class

(TRY003)


139-139: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.11
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.13
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.12
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.10
🔇 Additional comments (8)
tests/data/test_solution.py (1)

60-61: LGTM: Improved variable naming.

The renaming from sol to solution improves code readability without changing logic.

flashinfer_bench/compile/builder.py (3)

1-18: LGTM!

The module docstring is clear, and the imports align well with the new Builder API. The BuildError exception provides a specific exception type for build failures.


46-101: LGTM!

The abstract methods provide a clean API surface:

  • is_available() allows environment checks before attempting builds
  • can_build() enables builder selection based on solution compatibility
  • build() has comprehensive documentation with parameters, returns, and raises sections

The design follows good abstraction principles.


103-120: LGTM!

The helper method provides a clean abstraction for package naming and path computation, promoting consistency across builder implementations.
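One common pattern for such a helper is deriving a deterministic, import-safe package name from a content hash; a sketch under that assumption (the function name and prefix are hypothetical, not the actual API):

```python
import hashlib

def package_name_for(solution_name: str, source: str) -> str:
    """Sketch: deterministic, import-safe package name from a content hash."""
    # Same (name, source) always maps to the same package, enabling caching
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]
    # Replace characters that are invalid in Python identifiers
    safe = "".join(c if c.isalnum() else "_" for c in solution_name)
    return f"fib_{safe}_{digest}"

print(package_name_for("my-solution", "def main(): pass\n"))
```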

flashinfer_bench/compile/builders/python_builder.py (4)

1-33: LGTM!

The class structure is well-organized:

  • ClassVar constants provide clear configuration
  • __init__ properly delegates to the parent Builder class
  • Imports are appropriate for the new API

35-42: LGTM!

The availability and capability checks are straightforward and correct. Python is always available in the current environment, and the language check properly validates solution compatibility.


81-112: LGTM!

The build method setup is well-structured:

  • Comprehensive docstring with clear error cases
  • Proper validation of the entry file extension
  • Correct package and module name construction

114-148: LGTM!

The build implementation is robust:

  • Proper error handling with cleanup on all failure paths
  • Module stays importable via sys.path while the Runnable is alive
  • Metadata captures essential build information
  • The cleaner ensures resources are released when the Runnable is disposed
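The lifecycle described above can be sketched as a callable paired with a one-shot cleanup hook (a hypothetical shape for Runnable, not the library's actual class):

```python
from typing import Any, Callable, Optional

class Runnable:
    """Hypothetical sketch: a built callable plus metadata and a cleanup hook."""

    def __init__(self, fn: Callable[..., Any],
                 cleaner: Optional[Callable[[], None]] = None,
                 metadata: Optional[dict] = None):
        self._fn = fn
        self._cleaner = cleaner
        self.metadata = metadata or {}

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self._fn(*args, **kwargs)

    def close(self) -> None:
        # Run the cleanup hook at most once when the Runnable is disposed
        if self._cleaner is not None:
            self._cleaner()
            self._cleaner = None

events = []
r = Runnable(lambda x: x + 1, cleaner=lambda: events.append("cleaned"),
             metadata={"builder": "python"})
print(r(41), events)   # → 42 []
r.close()
r.close()              # second close is a no-op
print(events)          # → ['cleaned']
```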

Signed-off-by: Ubospica <[email protected]>
@Ubospica Ubospica merged commit fa236b6 into flashinfer-ai:main Dec 3, 2025
8 checks passed