Skip to content

Conversation

@aschackmull
Copy link
Contributor

Unused so far, but this is intended as a step towards a unified SSA api, and a way to share more code by removing the need for individual languages to cache the SSA predicates and add wrapper classes.

@aschackmull aschackmull requested a review from a team as a code owner October 15, 2025 08:57
Copilot AI review requested due to automatic review settings October 15, 2025 08:57
@aschackmull aschackmull added the no-change-note-required This PR does not need a change note label Oct 15, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a shared signature for SSA (Static Single Assignment) and a module to implement it, as a step towards a unified SSA API. The change enables code sharing across languages by providing standardized signatures and implementations that eliminate the need for individual languages to cache SSA predicates and add wrapper classes.

Key changes:

  • Added a comprehensive SsaSig signature defining the complete SSA class hierarchy and predicates
  • Introduced a MakeSsa module that implements the SsaSig signature using existing SSA infrastructure
  • Updated the InputSig to use generic type signatures instead of specific BasicBlockSig

Copy link
Contributor

@hvitved hvitved left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, some comments.

Comment on lines +91 to +92
* For SSA definitions occurring at the beginning of a basic block, such as
* phi definitions, this will get the first control flow node of the basic block.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that is the case for existing languages. Would it be problematic for this predicate to not return a value for phi nodes (as well as SsaImplicitEntryDefinitions)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that this is a change for several languages, but Java depends on this in a couple of places, and I actually also made the change independently for C# over in #20558, since I needed it for nullness.
And in a vacuum I think it does make sense to always return a value - not returning a value for phis and entries present a bit of a gotcha, I think.

* variable would be visible.
*/
class SsaPhiDefinition extends SsaDefinition {
/** Holds if `inp` is an input to the phi definition along the edge originating in `bb`. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the phi -> this phi

* point in the flow graph where otherwise two or more definitions for the
* variable would be visible.
*/
class SsaPhiDefinition extends SsaDefinition {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest copying the example from

* A phi node. For example, in
*
* ```rb
* if b
* x = 0
* else
* x = 1
* end
* puts x
* ```
*
* a phi node for `x` is inserted just before the call `puts x`.
.

/** Holds if `inp` is an input to the phi definition along the edge originating in `bb`. */
predicate hasInputFromBlock(SsaDefinition inp, BasicBlock bb);

/** Gets an input of this phi definition. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy example from

final Definition getAnInput() { this.hasInputFromBlock(result, _) }
?

*/
Expr getValue();

/** Holds if this write is an initialization of a parameter. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a parameter -> parameter p.

* pseudo-read of the captured variable at the point of capture.
*/
predicate variableCapture(
SourceVariable captured, SourceVariable closureVar, BasicBlock bb, int i
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that in C# and Java, we tag local variables with a callable to account for captures, but I forgot why that is needed? In both Ruby and Rust, we simply have SourceVariable = LocalVariable, and that works perfectly fine, but would that work with the signature above (i.e., captured = closureVar)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to recall that it was necessary - at least at some point, but I also forgot exactly why. Looking at the shared SSA lib, I don't think it's necessary there. It might be necessary (or perhaps just convenient) in the language-specific portions for C# and Java.
This signature is indeed an example where it would break if SourceVariables were not tied to a callable - we could fix it by adding a column with e.g. the first BasicBlock of the closure to implicitly identify the callable. Do you think we should do that here? Otherwise we'll indeed require tagging if this is to be implemented with anything other than none().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was also thinking if we should replace closureVar with the callable representing the closure or, as you suggest, its entry basic block.

* phi nodes, this will get the first control flow node of the basic block.
*/
ControlFlowNode getControlFlowNode() {
exists(BasicBlock bb, int i | this.definesAt(_, bb, i) | result = bb.getNode(0.maximum(i)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I'm not convinced that we want 0.maximum(i).

*/
class SsaWriteDefinition extends SsaDefinition instanceof WriteDefinition { }

private predicate explicitWrite(WriteDefinition def, VariableWrite write) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would cache this.

Expr getValue() { result = this.getDefinition().getValue() }
}

private predicate parameterInit(WriteDefinition def, Parameter p) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would cache this.

SsaImplicitEntryDefinition() { this.definesAt(_, any(EntryBasicBlock bb), -1) }

/** Holds if this is a closure definition that captures the value of `capturedvar`. */
predicate captures(SsaDefinition capturedvar) { captures(this, capturedvar) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry that this predicate may give the wrong impression that this definition will always read the same value as capturedvar (I suspect it you included it, because that is actually the case in Java?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit back and forth on this. Firstly, I think it's useful to be able to get to the values of the outer scope in ad-hoc def-use chains like getAnUltimateDefinition, and Java certainly makes good use of it, even though it's technically a bit wrong for Kotlin (Kotlin allows post-capture modifications IIRC).

Another option would be to extend it to get all the possible inputs from the outer scope in case the variable is modified post-capture - that way the Java case would stay the same and other languages would see potentially multiple origins for a capture init - a bit like a phi node.

x = 0
f = () => { .. x .. } // currently sees x=0, could be extended to also see x=1
x = 1

What do you think? OTOH maybe that's silly, if we don't also include the following

x = 0
f () => { .. x .. } // can see x=0, but what about x=2 from g?
g () => { .. x=2; .. }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C# used to have rather complex logic for doing this kind of thing, but I got rid of it when switching over to the shared capture library. I think it is hard to make a good analysis for this kind of thing, without ultimately resolving to saying that all other SSA definitions could potentially be a source, which would make it useless.

So, if you still want it for Java, I think I would prefer to have that a Java-specific thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, then I think I'll just delete it from the shared class.

@aschackmull aschackmull merged commit 414e5ec into github:main Oct 21, 2025
36 checks passed
@aschackmull aschackmull deleted the ssa/ssa-sig branch October 21, 2025 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants