Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Reduce duplication from crement operations #14867

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

MathiasVP
Copy link
Contributor

This PR fixes a dataflow duplication issue caused by having multiple dataflow nodes return the same expression when the expression has tricky semantics that's difficult to model using just Exprs.

For example, consider the f(x++) operation. For this expression we generate IR equivalent to:

tmp = x
x = x + 1
f(tmp)

and the question is then: which dataflow node's asExpr() should return the x++ expression. Should it be the one corresponding to x = x + 1 (that is, the incremented value), or should it be tmp in f(tmp) (that is, the original value)?

The C/C++ standard would say that it should be the latter, but when writing queries people are less interested in what the standard demands, and more interesting in whatever fits their query logic, and often that ends up being the former 😂.

So C/C++ currently does both. So both the node corresponding to x = x + 1, and the node corresponding to tmp in f(tmp), return x++. This causes duplication when writing a query such as this (where we track flow from what source() points to, to what the argument of sink points to):

Screenshot 2023-11-20 at 15 34 42

This PR fixes this duplication by adding another dataflow node predicate called Node.asDefinition (and Node.asIndirectDefinition for the indirect case). In the context of the f(x++) example above, the Node.asDefinition predicate will have a result for the dataflow node corresponding to x = x + 1, and Node.asExpr will have a result only for the node corresponding to tmp.

This means that the query from before will only have 1 result:
Screenshot 2023-11-21 at 18 06 15

@github-actions github-actions bot added the C++ label Nov 21, 2023
@geoffw0
Copy link
Contributor

geoffw0 commented Nov 22, 2023

Seems like a reasonable direction. My main concern being whether people actually know / remember to use asDefinition() rather than asExpr() when they should. The explanation in the QLDoc on asDefinition() certainly helps.

@MathiasVP
Copy link
Contributor Author

MathiasVP commented Nov 22, 2023

Seems like a reasonable direction. My main concern being whether people actually know / remember to use asDefinition() rather than asExpr() when they should. The explanation in the QLDoc on asDefinition() certainly helps.

Thanks for the feedback! It should be noted that some other languages already have an asDefinition predicate on dataflow nodes, so it's hopefully not too unknown for people 🤞

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants