-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Ruby: Improvements to RegExpTracking
#13077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
59de393 to
1d723a4
Compare
b440db9 to
ddebf59
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your implementation looks good.
I'm used to "content" just being string-constants that directly correspond to property names, but I see that it's actually more complicated here.
You are using that string-constants gets converted to regular-expressions, but not the other way around, which makes the implementation conceptually simpler.
I should probably have done that from the start.
As said, the implementation is fine, but I don't think the readability is.
E.g. what is n in regFromString?
I know what the code did previously, which helped in reading it.
But I think the rest of the Ruby/Python team will find it harder.
If you can look through the code again and give some more descriptive names (and maybe some selective docstrings), that would help.
ddebf59 to
51087d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
But I think you also need a review from the language teams.
@github/codeql-python / @github/codeql-ruby: your turn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Python change is good, thanks :-)
As I now read the other parts also, I left some questions, but they are mostly for my own curiosity..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - nice refactor, I enjoyed reading that :-)
When #11879 switched from using global data flow to using type tracking, for identifying regular expressions, we lost flow through built-in string methods. This is because the conjunct
doesn't work in type tracking, where we don't have flow-through. Instead, we should be using
FlowSummaryImpl::Private::Steps::summaryThroughStepTaint, which this PR does.Moreover, this PR improves on performance by strengthening the initial forwards/backwards pruning, which also revealed a bug in the existing
TypeBackTracker::getACompatibleTypeTrackerpredicate.DCA shows a nice speedup on
artichoke__artichoke.