-
Notifications
You must be signed in to change notification settings - Fork 1.9k
C++: Remove bad self joins in cpp/toctou-race-condition.
#7517
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C++: Remove bad self joins in cpp/toctou-race-condition.
#7517
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes otherwise LGTM, I'd like to see DCA results or similar there is one, I'll take a look.
| ) | ||
| } | ||
|
|
||
| predicate checkUse(Expr check, Expr checkPath, FunctionCall use, Expr usePath) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This predicate is odd, there seems to be no association between check / checkPath and use / usePath. Should it be two predicates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an association: If check / checkPath is a check on a filename, then use / usePath is an operation on a filename. And if check / checkPath is another filename operation, then use / usePath looks like a sensitive operation on a filename. So I don't think we can split them up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looks like a cartesian product to me (or two smaller ones). All we care about is that the other half exists in the correct form, we don't care which one it is. So we could rewrite to something like:
predicate checkUse(Expr check, Expr checkPath) {
// `check` looks like a check on a filename
checksPath(check, checkPath) and
exists(FunctionCall use, Expr usePath |
use = filenameOperation(usePath)
)
or
// another filename operation (null pointers can indicate errors)
check = filenameOperation(checkPath) and
exists(FunctionCall use, Expr usePath |
use = sensitiveFilenameOperation(usePath)
)
}
plus the corresponding predicate for use / usePath. I guess maybe the optimizer is doing this already?
It also seems like a very strange thing to want. I'd be surprised if this is behaving exactly as we want it to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean wrt. the cartesian product. To satisfy my curiosity, I'll check what the compiler is actually doing here. I also agree that the logic itself is weird. I don't want to change the logic in this PR, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've refactored the predicates in 173cefd to include the missing piece of information (which was still in the select clause) to associates check and use.
|
DCA shows a significant speedup for |
|
173cefd fixes the review comments and makes the entire select clause a super crisp giant join now: Tuple counts for #select#cpe#135#fff/3@3ee072o8 after 0ms:
15 ~0% {3} r1 = JOIN TOCTOUFilesystemRace::isCheckedPath#ffbfbf WITH TOCTOUFilesystemRace::isUsedPath#ffffff ON FIRST 6 OUTPUT Lhs.3 'use', Lhs.4 'usePath', Lhs.0 'check'
return r1There's been a bunch of changes since I did the first DCA run. So I'll run another one to make sure I didn't destroy any performance on another project. |
|
DCA looks good. |
|
Except I don't see a |
Oh, yeah. That's a good point. Here's how I concluded that we still get the performance benefits in the new DCA run: In the first DCA run the query is mentioned in the
I think the missing 1.5% performance can be attributed to noise, so I think it's fair to conclude that we didn't lose any performance (especially considering the fact that we removed two cartesian products from the query). (In fact, the conclusion column being |
|
Thanks, I didn't know about this information in |
geoffw0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy with this, and I think all comments have been addressed.
This PR removes a bunch of bad self joins from the
cpp/toctou-race-conditionthat I observed onphp. For instance this one:The changes textual changes look quite scary, but it's really just simple refactorings following this "algorithm":
I recommend enabling "Hide whitespace" when reviewing the PR. That makes the changes really easy to review.