Branch Elimination + Construction Utils #2201

ThrudPrimrose · 2025-10-29T12:10:44Z

I am working on auto-vectorization. As part of the auto-vectorization, I have written the Branch Elimination pass, which also uses construction utilities. I am trying to make them into the atoms for SDFG manipulation.

I want to discuss the placement of the possible atoms.

I have implemented many helper functions and use them extensively in the offloading and auto-vectorization passes I have been writing. They have been tested on the CloudSC SDFG, and I will write unit tests for most of them later.

I want to discuss where such atoms should live.

I have SDFG-manipulation atoms - these functions aim to provide reusable functions when constructing and changing SDFGs. What I have so far:
move_branch_cfg_up_discard_conditions, copy_state_contents, copy_graph_contents, generate_assignment_as_tasklet_in_state, insert_non_transient_data_through_parent_scopes, get_parent_map_and_loop_scopes, replace_length_one_arrays_with_scalars

I think they should be in the file sdfg/construction_utils.py

Existing string replacements usually fail and error out in bigger SDFGs like CloudSC or VelocityTendencies.
For that, I have written multiple tasklet and interstate edge utilities.
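
To illustrate why plain string replacement is fragile here, below is a minimal sketch comparing `str.replace` with a token-based replacement. The function name `token_replace_sketch` is hypothetical and is not the PR's `token_replace_dict`; the split pattern is the one quoted later in this review.

```python
import re


def token_replace_sketch(code: str, src: str, dst: str) -> str:
    """Replace only whole tokens equal to `src` (hypothetical helper)."""
    # Split on whitespace and brackets; the capture group keeps the
    # delimiters in the result so the string can be re-joined losslessly.
    tokens = re.split(r'(\s+|[()\[\]])', code)
    return ''.join(dst if tok == src else tok for tok in tokens)


expr = "i < N and idx[i] > 0"

# Naive replacement also rewrites the "i" inside "idx".
naive = expr.replace("i", "j")

# Token-based replacement only touches the standalone symbol "i".
token_based = token_replace_sketch(expr, "i", "j")

print(naive)        # j < N and jdx[j] > 0
print(token_based)  # j < N and idx[j] > 0
```

One limitation of this sketch: operators are not delimiters, so an expression written without spaces, such as `i+1`, stays a single token and is left unchanged.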

In tasklet utilities, I have stuff that analyzes tasklets and tasklet codes (tasklet blackbox is not possible for some passes):
classify_tasklet, which returns information on the code of the tasklet, which is essential for auto-vectorization or implementation of the BlockedFP (microscaling) formats. Some of the functions used for tasklets, like tasklet_has_symbol, replace_tasklet_code, are currently in the construction utilities, but should be moved there.

I think they should go to the file sdfg/tasklet_utils.py

For interstate edges, I have ended up writing string utilities: token_replace_dict, token_split_variable_names, tasklet_has_symbol, replace_symbol, extract_bracket_tokens, and remove_bracket_tokens. I guess we can enforce that interstate edges use string expressions that can be parsed into symbolic expressions, and then move these functions to dace/symbolic.py. Or should we put them into something like string_utils.py or code_utils.py?

I am also open to comments on which of them are necessary.

The PR is a draft, and I still have tasks to do:

Berke-Ates · 2025-10-29T12:21:57Z

dace/transformation/passes/eliminate_branches.py

+
+@properties.make_properties
+@explicit_cf_compatible
+class EliminateBranches(ppl.Pass):


I would suggest renaming this pass to BranchEliminationPass to avoid confusion between EliminateBranches and BranchElimination.

alexnick83

I just reviewed the construction_utils file. I will do the rest later. In general, I like this PR.

alexnick83 · 2025-10-30T14:51:40Z

dace/sdfg/construction_utils.py

@@ -0,0 +1,764 @@
+import re


In b4 @phschaad missing copyright comment

alexnick83 · 2025-10-30T14:55:23Z

dace/sdfg/construction_utils.py

+from dace.transformation.passes import FuseStates
+
+
+class BracketFunctionPrinter(PythonCodePrinter):


Where is this used?

I was trying some stuff with the python printers, I might have forgotten it.

alexnick83 · 2025-10-30T15:47:37Z

dace/sdfg/construction_utils.py

+
+
+def move_branch_cfg_up_discard_conditions(if_block: ConditionalBlock, body_to_take: ControlFlowRegion):
+    # Sanity check the ensure apssed arguments are correct


Suggested change

# Sanity check the ensure apssed arguments are correct

# Sanity check the ensure passed arguments are correct

alexnick83 · 2025-10-30T15:50:11Z

dace/sdfg/construction_utils.py

+def copy_graph_contents(old_graph: ControlFlowRegion,
+                        new_graph: ControlFlowRegion) -> Dict[dace.nodes.Node, dace.nodes.Node]:
+    """
+    Deep-copies all nodes and edges from one SDFG state into another.


My understanding is that this method works for ControlFlowRegions in general. Perhaps the docstring should indicate that instead of naming only SDFGStates.

You are right
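
For readers unfamiliar with the pattern, here is a minimal sketch of copying a graph's contents while returning an old-to-new node mapping, which is what the return type in the signature above suggests. `Node`, `Graph`, and `copy_contents_sketch` are hypothetical stand-ins, not DaCe classes or the PR's implementation.

```python
import copy
from typing import Dict, List, Tuple


class Node:
    """Hypothetical stand-in for a graph node."""

    def __init__(self, label: str):
        self.label = label


class Graph:
    """Hypothetical stand-in for a state or control flow region."""

    def __init__(self):
        self.nodes: List[Node] = []
        self.edges: List[Tuple[Node, Node]] = []


def copy_contents_sketch(old: Graph, new: Graph) -> Dict[Node, Node]:
    """Deep-copy nodes and edges of `old` into `new`; return old -> new map."""
    node_map: Dict[Node, Node] = {}
    for node in old.nodes:
        clone = copy.deepcopy(node)
        new.nodes.append(clone)
        node_map[node] = clone
    # Re-create every edge between the copies, not the originals.
    for src, dst in old.edges:
        new.edges.append((node_map[src], node_map[dst]))
    return node_map


old_graph = Graph()
a, b = Node("a"), Node("b")
old_graph.nodes += [a, b]
old_graph.edges.append((a, b))

new_graph = Graph()
mapping = copy_contents_sketch(old_graph, new_graph)
```

Returning the mapping lets the caller find the copy of a particular node afterwards, for example to reconnect it to the rest of the target graph.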

alexnick83 · 2025-10-30T15:51:45Z

dace/sdfg/construction_utils.py

+    # Sanity check the ensure apssed arguments are correct
+    bodies = {b for _, b in if_block.branches}
+    assert body_to_take in bodies
+    assert isinstance(if_block, ConditionalBlock)


If if_block is not a ConditionalBlock, won't the attribute access if_block.branches already fail in line 104?

This is correct, that check is redundant.

alexnick83 · 2025-10-30T16:12:40Z

dace/sdfg/construction_utils.py

+    # Split while keeping delimiters
+    tokens = re.split(r'(\s+|[()\[\]])', string_to_check)
+
+    # Replace tokens that exactly match src


Comment seems to have been copied from the previous method but I don't think it makes sense here.

alexnick83 · 2025-10-30T16:17:46Z

dace/sdfg/construction_utils.py

+
+    array_name_dict = dict()
+    for state in sdfg.all_states():
+        for node in state.nodes():


There is a method that iterates over access nodes only; I think data_nodes.

alexnick83 · 2025-10-30T16:19:39Z

dace/sdfg/construction_utils.py

+                local_arr = state.sdfg.arrays[node.data]
+                print(local_arr.storage)
+                if local_arr.storage == local_storage:
+                    assert len(state.in_edges(node)) <= 1


Why would this hold?

alexnick83 · 2025-10-30T16:26:46Z

dace/sdfg/construction_utils.py

+    sdfg.validate()
+
+
+def tasklet_has_symbol(tasklet: dace.nodes.Tasklet, symbol_str: str) -> bool:


Maybe check the methods related to tasklets, symbols, and code replacement against ScalarToSymbolPromotion's helper methods and code

I am also unsure whether they belong in this file. They look to be general utilities, not something specific to SDFG construction.

I think tasklet-related and common Python-AST-related functionality should go to a file called tasklet_utils. I think I could also move some helpers from ScalarToSymbol to the utility file.

alexnick83 · 2025-10-30T16:30:12Z

dace/sdfg/construction_utils.py

+
+    # Check parent nsdfg
+    parent_nsdfg_node = parent_state.sdfg.parent_nsdfg_node
+    parent_nsdfg_parent_state = _find_parent_state(root_sdfg, parent_nsdfg_node)


Shouldn't this already be stored in parent_state.sdfg.parent_state (or something similar)?

I see the attributes parent_graph, parent_nsdfg_node and parent_sdfg. In this case parent_graph should be the same as the parent state. I did not know it existed; I will use that.

dace/codegen/codegen.py

phschaad · 2025-11-03T10:13:05Z

dace/sdfg/construction_utils.py

+    new_start_block = None
+    new_end_block = None
+
+    for node in body_to_take.nodes():


I like the lambda / more complex predicate function idea

phschaad · 2025-11-03T10:15:19Z

dace/sdfg/construction_utils.py

+
+def move_branch_cfg_up_discard_conditions(if_block: ConditionalBlock, body_to_take: ControlFlowRegion):
+    """
+    Moves a branch of a conditional block up in the control flow graph (CFG),


I think for these methods in general it would be beneficial to have some example in the doc comment. These are going to be central and heavily used functions if we go down the atoms route as intended, so they should be crystal clear to understand. And (maybe that's just me) I was a bit confused first reading the description here as to what CFG gets put into what other CFG from the parameters. (Description is good, but an example would help)

An example is good to add, I agree.
I also think that we need a documentation page that lists all atoms and which file they are in, like an FAQ.

Like a "frequently used utilities" page in the developer guide, where we list memlet path, memlet tree, add mapped tasklet, etc.

Fully agree :-)

phschaad · 2025-11-03T10:15:38Z

dace/sdfg/construction_utils.py

+    """
+    Inserts non-transient data containers into all relevant parent scopes (through all map scopes).
+
+    This function connect data from top-level data


Same comment about the examples then :)

phschaad · 2025-11-03T10:16:11Z

dace/sdfg/construction_utils.py

+                    replace_length_one_arrays_with_scalars(node.sdfg, recursive=True, transient_only=True)
+
+
+def generate_assignment_as_tasklet_in_state(state: dace.SDFGState, lhs: str, rhs: str):


Needs a doc comment

I will not repeat, I think you are right on asking for comments, examples. Assume I approve all of them ;)

phschaad · 2025-11-03T10:20:29Z

dace/sdfg/utils.py

+    # -> then create a [tasklet] that uses the scalar_val as a constant value inside
+    import re
+
+    def _token_replace(code: str, src: str, dst: str) -> str:


Same comment about the tokenization vs. AST - why?

So the issue is, certain expressions make sympy crash.
One example:
(a == 1) == 0
In sympy a relational object can't be compared against integers. There is a very wide assumption (at least in all Fortran SDFGs) that boolean true is equal to one and boolean false is equal to zero. While this holds for almost all programming languages I know, it does not hold for symbolic math and sympy. This means all the SDFGs I work with have expressions that make sympy crash, and I ended up adding a token-based approach as a fallback.

In my opinion, an SDFG should not involve comparisons of booleans against 1s and 0s (only if we really care about integers). I would rather define true and false and use the boolean support that exists in both C and C++.
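
As a small illustration of the assumption being discussed, the sketch below shows in plain Python (not sympy) that comparing a boolean result against `0` behaves like logical negation, because Python treats `True` as `1` and `False` as `0`. The function names are hypothetical.

```python
def integer_style(a: int) -> bool:
    # Compares the boolean result of (a == 1) against the integer 0.
    return (a == 1) == 0


def boolean_style(a: int) -> bool:
    # Same meaning, expressed without any boolean/integer comparison.
    return not (a == 1)


# Both forms agree for every input tried here.
results = [(integer_style(a), boolean_style(a)) for a in (0, 1, 2)]
print(results)  # [(True, True), (False, False), (True, True)]
```

The second form carries the same meaning without relying on booleans being interchangeable with integers.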

@acalotoiu I am pulling you in on this one, because I want to ask: is the comparison against integers necessary? Does the Python frontend generate similar expressions too?

Btw., in these functions I always try something like this:

If the code is Python, try to parse it into a symbolic expression and analyze that (e.g. on interstate edges).

If anything fails, try the token-based approach.

If the code is not Python, try the token-based approach (I really, really do not want to support CPP tasklets, and I would rather crash than implement support for them).
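
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: Python's `ast` module stands in for the symbolic-expression parsing used in the PR, and `replace_symbol_sketch`, `_token_replace`, and `_Rename` are hypothetical names.

```python
import ast
import re


def _token_replace(code: str, src: str, dst: str) -> str:
    # Token-based fallback: split on whitespace and brackets, keep delimiters.
    tokens = re.split(r'(\s+|[()\[\]])', code)
    return ''.join(dst if tok == src else tok for tok in tokens)


class _Rename(ast.NodeTransformer):
    """Rename every identifier equal to `src` to `dst`."""

    def __init__(self, src: str, dst: str):
        self.src, self.dst = src, dst

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.src:
            node.id = self.dst
        return node


def replace_symbol_sketch(code: str, src: str, dst: str,
                          language: str = "python") -> str:
    # Step 3: non-Python code goes straight to the token-based approach.
    if language != "python":
        return _token_replace(code, src, dst)
    try:
        # Step 1: structured parsing (stand-in for symbolic parsing).
        tree = ast.parse(code, mode="eval")
        return ast.unparse(_Rename(src, dst).visit(tree).body)
    except Exception:
        # Step 2: anything that fails falls back to tokens.
        return _token_replace(code, src, dst)


parsed = replace_symbol_sketch("a + 1", "a", "b")
fallback = replace_symbol_sketch("a && c", "a", "b")
non_python = replace_symbol_sketch("arr[i]", "i", "j", language="cpp")
```

Here `"a && c"` is not valid Python, so parsing raises and the token-based path handles it. Note that `ast.unparse` requires Python 3.9 or newer.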

dace/sdfg/utils.py

phschaad · 2025-11-03T10:22:42Z

dace/transformation/interstate/branch_elimination.py

@@ -0,0 +1,1985 @@
+import ast


👀 I'm not going to say the words :D

phschaad · 2025-11-03T10:24:15Z

dace/config_schema.yml

                        title: nvcc Arguments
                        description: Compiler argument flags for CUDA
-                        default: '-Xcompiler -march=native --use_fast_math -Xcompiler -Wno-unused-parameter'
+                        default: '--expt-relaxed-constexpr -Xcompiler -march=native --use_fast_math -Xcompiler -Wno-unused-parameter'


What is the issue when not passing this arg? A warning?

Actually, DaCe needs this flag to compile, because it uses constexpr functions in functions marked as host and device. The NVIDIA compiler is graceful enough to compile it as long as we don't really call those functions (which is rare).
I think this works because unused headers do not necessarily need to be included, and compilers optimize them out.

We get a lot of warnings about this every time we compile GPU code.

About the OpenMP flag: I have some configuration problem. The OpenMP flag shouldn't normally be needed.

phschaad · 2025-11-03T10:24:52Z

dace/config_schema.yml

                        title: Arguments
                        description: Compiler argument flags
-                        default: '-std=c++14 -fPIC -Wall -Wextra -O3 -march=native -ffast-math -Wno-unused-parameter -Wno-unused-label'
+                        default: '-fopenmp -fPIC -Wall -Wextra -O3 -march=native -ffast-math -Wno-unused-parameter -Wno-unused-label'


Wait, why is this needed? We already run with OMP.. right?

Yes, I have some config problem on my system, and I forgot about the flag... The constexpr flag should be the default though, IMHO.

ThrudPrimrose added 5 commits October 29, 2025 12:42

prepare

fc58b3e

Rm explicit vectorize things

f019f98

Fix

eadeda5

Stuff

6a603c4

Add classify tasklet

358aef3

ThrudPrimrose requested review from Berke-Ates, acalotoiu, affifboudaoud, alexnick83, pratyai, tbennun and tim0s and removed request for alexnick83 October 29, 2025 12:14

Berke-Ates reviewed Oct 29, 2025

ThrudPrimrose added 5 commits October 29, 2025 14:03

Rm use of | from type hints

60f7f9d

Add stuff

ce95213

Minor refactor

1913054

Refactor copy/reuse ont ransformations

e504b9d

Try something to fix spurious fails in the runner

f5bdd91

phschaad self-requested a review October 30, 2025 09:29

alexnick83 reviewed Oct 30, 2025

ThrudPrimrose added 6 commits October 31, 2025 15:19

Merge

d0e6f2c

Merge branch 'main' into branch-elimination

d4c7f70

Major refactor

e7e0c85

Pull improvements to the tasklet classification

9268704

Try fix serialization failing but how is that me?

c6f15af

Fix colliding sdfg names

47ba690

phschaad reviewed Nov 3, 2025

Add no autoopt marker and mark branch elimination tests

2db5a92

Use wrapper instead

38aa67c
