Skip to content

Conversation

@ThrudPrimrose
Copy link
Collaborator

I am working on auto-vectorization. As part of the auto-vectorization, I have written the Branch Elimination pass, which also uses construction utilities. I am trying to make them into the atoms for SDFG manipulation.

I want to discuss the placement of the possible atoms.

I have implemented many helper functions for the passes I have been writing, and I extensively use them as helpers in the offloading and auto-vectorization passes I have been writing. They have been tested on the CloudSC SDFG, and I will write unit tests for most of them later.

I want to discuss where such atoms should live

I have SDFG-manipulation atoms - these functions aim to provide reusable functions when constructing and changing SDFGs. What I have so far:
move_branch_cfg_up_discard_conditions, copy_state_contents, copy_graph_contents, generate_assignment_as_tasklet_in_state, insert_non_transient_data_through_parent_scopes, get_parent_map_and_loop_scopes, replace_length_one_arrays_with_scalars

I think they should be in the file sdfg/construction_utils.py

Existing string replacements usually fail and error out in bigger SDFGs like CloudsC or VelocityTendencies
For that, I have multiple tasklet and interesting edge utilities written.

In tasklet utilities, I have stuff that analyzes tasklets and tasklet codes (tasklet blackbox is not possible for some passes):
classify_tasklet, which returns information on the code of the tasklet, which is essential for auto-vectorization or implementation of the BlockedFP (microscaling) formats. Some of the functions used for tasklets, like tasklet_has_symbol, replace_tasklet_code, are currently in the construction utilities, but should be moved there.

I think they should go to the file sdfg/tasklet_utils.py

For interstate edges, I have ended up writing string utilities: token_replace_dict, token_split_variable_names, tasklet_has_symbol, replace_symbol, extract_bracket_tokens, and remove_bracket_tokens. I guess we can enforce that interstate edges use string expressions that can be parsed into symbolic expressions, and then move these functions to dace/symbolic.py. Or should we make these into something like string_utils.py or code_utils.py ?

I am also open to comments on which of them are necessary.

The PR is a draft, and I still have tasks to do:

  • Fix imports for BranchElimination and EliminateBranches
  • Write tests for copy_state_contents
  • Write tests for copy_graph_contents
  • Write tests for move_branch_cfg_up_discard_conditions
  • Write tests for insert_non_transient_data_through_parent_scopes
  • Write tests for token_replace_dict, token_split_variable_names, tasklet_has_symbol, replace_symbol, extract_bracket_tokens, remove_bracket_tokens
  • Write tests for replace_length_one_arrays_with_scalars
  • Write tests for connect_array_names
  • Write tests for generate_assignment_as_tasklet_in_state
  • Write tests for get_parent_map_and_loop_scopes


@properties.make_properties
@explicit_cf_compatible
class EliminateBranches(ppl.Pass):
Copy link
Collaborator

@Berke-Ates Berke-Ates Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest renaming this pass to BranchEliminationPass to avoid confusion between EliminateBranches and BranchElimination.

@phschaad phschaad self-requested a review October 30, 2025 09:29
Copy link
Contributor

@alexnick83 alexnick83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just reviewed the construction_utils file. I will do the rest later. In general, I like this PR.

@@ -0,0 +1,764 @@
import re
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In b4 @phschaad missing copyright comment

from dace.transformation.passes import FuseStates


class BracketFunctionPrinter(PythonCodePrinter):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this used?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying some stuff with the python printers, I might have forgotten it.



def move_branch_cfg_up_discard_conditions(if_block: ConditionalBlock, body_to_take: ControlFlowRegion):
# Sanity check the ensure apssed arguments are correct
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Sanity check the ensure apssed arguments are correct
# Sanity check the ensure passed arguments are correct

def copy_graph_contents(old_graph: ControlFlowRegion,
new_graph: ControlFlowRegion) -> Dict[dace.nodes.Node, dace.nodes.Node]:
"""
Deep-copies all nodes and edges from one SDFG state into another.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that this method works for ControlFlowRegions in general. Perhaps the docstring should indicate that instead of naming only SDFGStates.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right

# Sanity check the ensure apssed arguments are correct
bodies = {b for _, b in if_block.branches}
assert body_to_take in bodies
assert isinstance(if_block, ConditionalBlock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If if_block is not a ConditionalBlock, won't the attribute access if_block.branches already fail in line 104?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct, that check is redundant.

# Split while keeping delimiters
tokens = re.split(r'(\s+|[()\[\]])', string_to_check)

# Replace tokens that exactly match src
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment seems to have been copied from the previous method but I don't think it makes sense here.


array_name_dict = dict()
for state in sdfg.all_states():
for node in state.nodes():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a method that iterates over access nodes only; I think data_nodes.

local_arr = state.sdfg.arrays[node.data]
print(local_arr.storage)
if local_arr.storage == local_storage:
assert len(state.in_edges(node)) <= 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would this hold?

sdfg.validate()


def tasklet_has_symbol(tasklet: dace.nodes.Tasklet, symbol_str: str) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe check the methods related to tasklets, symbols, and code replacement against ScalarToSymbolPromotion's helper methods and code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also unsure whether they belong in this file. They look to be general utilities, not something specific to SDFG construction.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think tasklet-related and common python-ast related functionality should go to a file called tasklet_utils. I think I could move some stuff from ScalarToSymbol from that class to the utility file too,


# Check parent nsdfg
parent_nsdfg_node = parent_state.sdfg.parent_nsdfg_node
parent_nsdfg_parent_state = _find_parent_state(root_sdfg, parent_nsdfg_node)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this already be stored in parent_state.sdfg.parent_state (or something similar?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the attributed parent_graph, parent_nsdfg_node and parent_sdfg. In this case parent_graph should be the same as parent state. I did not know it existed, I will use that.

new_start_block = None
new_end_block = None

for node in body_to_take.nodes():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the lambda / more complex predicate function idea


def move_branch_cfg_up_discard_conditions(if_block: ConditionalBlock, body_to_take: ControlFlowRegion):
"""
Moves a branch of a conditional block up in the control flow graph (CFG),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for these methods in general it would be beneficial to have some example in the doc comment. These are going to be central and heavily used functions if we go down the atoms route as intended, so they should be crystal clear to understand. And (maybe that's just me) I was a bit confused first reading the description here as to what CFG gets put into what other CFG from the parameters. (Description is good, but an example would help)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An example is good to add, I agree.
I also think that we need a documentation page that lists all atoms and in which file they are, like FAQ

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like in the developer guide, "frequently used utilities". And then there we list memset path, memset tree and add mapped tasklets, etc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully agree :-)

"""
Inserts non-transient data containers into all relevant parent scopes (through all map scopes).
This function connect data from top-level data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about the examples then :)

replace_length_one_arrays_with_scalars(node.sdfg, recursive=True, transient_only=True)


def generate_assignment_as_tasklet_in_state(state: dace.SDFGState, lhs: str, rhs: str):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a doc comment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will not repeat, I think you are right on asking for comments, examples. Assume I approve all of them ;)

# -> then create a [tasklet] that uses the scalar_val as a constant value inside
import re

def _token_replace(code: str, src: str, dst: str) -> str:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about the tokenization vs. AST - why?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the issue is, certain expressions make sympy crash.
One example:
(a == 1) == 0
In sympy a relational object can't be compared against integers. There is a very wide assumption (at least in all fortran SDFGs) that boolean true is equal to one and boolean flame is equal to zero. While this holds for almost all programming languages I know, this does not hold for symbolic math and sympy. This means all the SDFGs I work with have expressions that make sympy crash and I ended up adding a token based approach as fallback.

In my opinion, an SDFG should not involve comparisons to 1s and 0s for booleans (only if we really care about integers). I would really define true and false false and use boolean support which exists both in Candy C++.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acalotoiu I am pulling you in in this one, because I want to ask, the comparison against integers, is it necessary? Does python frontend generate similar stuff too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw. In the functions I aways try something like this:

  1. If Python try to parse into symbolicexpr analyze that (e.g. on interstate edges)
  2. If anything fails try token based approach
  3. If not python try token based approach (I really really do not want to support CPP tasklets and I would rather crash than implement support for it)

@@ -0,0 +1,1985 @@
import ast
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 I'm not going to say the words :D

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

title: nvcc Arguments
description: Compiler argument flags for CUDA
default: '-Xcompiler -march=native --use_fast_math -Xcompiler -Wno-unused-parameter'
default: '--expt-relaxed-constexpr -Xcompiler -march=native --use_fast_math -Xcompiler -Wno-unused-parameter'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the issue when not passing this arg? A warning?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually DaCe needs this flag to compile, because it uses constexpr functions in functions marked as host and device. nvidia compiler is graceful enough to compile it as long we don't really call those functions (it is rare).
I think this happens because unused headers are not necessarily need to be included and compilers optimize this out.

We get a lot of warnings about this everytime we compile GPU code.

About the openmp flag, I have some configuration problem,.openmp flag shouldn't be normally needed

title: Arguments
description: Compiler argument flags
default: '-std=c++14 -fPIC -Wall -Wextra -O3 -march=native -ffast-math -Wno-unused-parameter -Wno-unused-label'
default: '-fopenmp -fPIC -Wall -Wextra -O3 -march=native -ffast-math -Wno-unused-parameter -Wno-unused-label'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, why is this needed? We already run with OMP.. right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my I have some config problem in my system, I forgot the flag... Consexpr flag should be default tho imho

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants