Skip to content

Conversation

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Mar 6, 2023

Model Sequence.withContiguousStorageIfAvailable. The intent is to model the whole set of container methods that take a closure and pass it a pointer of some kind to their contents, but I'm starting with just this one as a prototype.

There's an issue with taint being conflated between different calls to the function - namely between the parameter of the closure passed to the different calls.

TODO: once this is ready, it will need a change note as well.

@geoffw0 geoffw0 added no-change-note-required This PR does not need a change note Swift labels Mar 6, 2023
clean.withContiguousStorageIfAvailable({
ptr in
sink(arg: ptr)
sink(arg: ptr) // $ SPURIOUS: tainted=366
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no variable capturing going on here (so https://github.com/github/codeql-c-team/issues/1508 won't have any affect here). The model for withContiguousStorageIfAvailable says that there should be flow from the qualifier to the first parameter of the supplied callback function. So we have to resolve what the callback function actually calls. In this case we don't manage to distinguish between the two supplied callbacks, and we conclude that tainted on line 439 can flow to the ptr parameter in line 435.

So the fix for this would be to tell the dataflow library that certain calls would benefit from call contexts. This is done by implementing mayBenefitFromCallContext (to specify that a given call could be resolved more precisely if a call context was available) and viableImplInCallContext (to actually compute the target of the call given a call context).

I don't think we have an issue for implementing these predicates, so feel free to create one. For inspiration you can link to how we've implemented these predicates in IR dataflow for C/C++ (where we use call contexts to resolve calls to function pointers).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking my understanding here:

DataFlowCallable viableImplInCallContext(DataFlowCall call, DataFlowCall ctx)

is saying that call may target result. e.g. if we needed to do this for static calls, result would be call.getStaticTarget().

Is ctx the call to the enclosing function of call?

So in this case we want ctx to be the call to withContiguousStorageIfAvailable, call will be the (presumed*) call inside it to the closure parameter, and result will be the closure that was actually passed in for this particular ctx.

* - I'm not sure call will actually be in the database as withContiguousStorageIfAvailable is a summarized library function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is saying that call may target result. e.g. if we needed to do this for static calls, result would be call.getStaticTarget().

Correct.

Is ctx the call to the enclosing function of call?

Exactly. ctx is the magic sauce: it's call that caused us to enter the body that is call's enclosing function. So, for example, if we're trying to analyse what f may target in an example like:

void foo(void (*f)(int)) {
  f(42);
}

void myFunction(int) { ... }
...
foo(myFunction);

then call would be f(42), and ctx would be foo(myFunction).

So in this case we want ctx to be the call to withContiguousStorageIfAvailable, call will be the (presumed*) call inside it to the closure parameter, and result will be the closure that was actually passed in for this particular ctx.

Correct.

    • I'm not sure call will actually be in the database as withContiguousStorageIfAvailable is a summarized library function.

Hm, yes. I'm not actually sure what the story is regarding the interaction of call contexts and MaD 🤔. For that we might need to pull in either Schack or Tom.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This spurious result has magically disappeared (in the merge with main). I'm not aware of any changes being made to mayBenefitFromCallContext and viableImplInCallContext, but perhaps I missed them or something cleverer is happening now. I'm happy to accept this - but let me know if you think there's anything to discuss please.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware of any changes that would have affected this either.

MathiasVP
MathiasVP previously approved these changes Mar 7, 2023
Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also LGTM once we have a successful DCA run.

@EladiaPaviolitis4oe
Copy link

understand that we're trying to model the whole set of container methods that take a closure and pass it a pointer of some kind to their contents, but this issue is causing some roadblocks. I'm hoping that we can find a fix or at least take steps towards one soon.

@geoffw0 geoffw0 removed the no-change-note-required This PR does not need a change note label May 23, 2023
@geoffw0
Copy link
Contributor Author

geoffw0 commented Jul 24, 2023

I've just pushed some more changes:

  • merging main, which makes the spurious result go away. I don't really know why but there have been lots of data flow improvements between when I first made this branch and now, so I'm happy to accept it.
  • modelling return value propagation through Sequence.withContiguousStorageIfAvailable(_:).
  • added a change note.

Results should be better when we have #13795 as well. I'll update whichever PR isn't merged first with the overlap work.

I'm hoping that we can find a fix or at least take steps towards one soon.

Sorry for the delays, this issue got pushed to the back of the list somewhere. But it looks like it will be merged early this week and the other closure method models should follow soon after.

@geoffw0 geoffw0 marked this pull request as ready for review July 24, 2023 08:27
@geoffw0 geoffw0 requested a review from a team as a code owner July 24, 2023 08:27
Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as soon as we have a successful DCA run!

@geoffw0
Copy link
Contributor Author

geoffw0 commented Jul 25, 2023

DCA LGTM. Merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants