New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ruby: Restrict GraphQL remote flow sources #14216
base: main
Are you sure you want to change the base?
Conversation
Previously we considered any splat parameter in a graphql resolver to be a remote flow source. Now we limit that to reads of the parameter which yield scalar types (e.g. String), as defined by the GraphQL schema. This should reduce GraphQL false positives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this - only minor questions/suggestions.
| * A call to `argument` in a GraphQL InputObject class. | ||
| */ | ||
| class GraphqlInputObjectArgumentDefinitionCall extends DataFlow::CallNode { | ||
| GraphqlInputObjectArgumentDefinitionCall() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be restricted based on method name? I think this would also currently match things like the description call in
class Types::DateRangeInput < Types::BaseInputObject
description "Range of dates"
argument :min, Types::Date, "Minimum value of the range"
argument :max, Types::Date, "Maximum value of the range"
end| */ | ||
| GraphqlType getArgumentElementType() { | ||
| result = | ||
| any(ArrayLiteral lit | lit = this.getArgument(1) and lit.getNumberOfElements() = 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| any(ArrayLiteral lit | lit = this.getArgument(1) and lit.getNumberOfElements() = 1) | |
| any(ArrayLiteral lit | lit = this.getArgumentType() and lit.getNumberOfElements() = 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work, sadly, because GraphqlType extends ConstantAccess which is incompatible with ArrayLiteral
| } | ||
| } | ||
|
|
||
| private class GraphqlType extends ConstantAccess { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not really dug into this, but is it helpful that we drop to the AST level for this rather than using DataFlow::ConstantAccessNode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did try this, but it basically propagates the problem to all the other classes in this file. You end up changing basically everything to work at the dataflow level, and I think that's likely to be a breaking change? We would also need to add a bunch of new member preds to the various dataflow classes to match what we have at the AST level. This is arguably a good thing, but it might be worth tackling as a followup. What do you think?
| result instanceof HashSplatParameter // often you see `def field(**args)` | ||
| ) | ||
| argDefn = this.getDefinition().getAnArgumentCall() and | ||
| [argDefn.getArgumentType(), argDefn.getArgumentElementType()].isScalar() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that restricting to only scalar arguments within getARoutedParameter makes the logic simpler/more general for reasoning about which parameters are user-controlled? Is that why we check scalar-ness at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We add the isScalar check here because we don't want enum parameters to be marked as remote flow sources. That leaves the question of object parameters, which shouldn't themselves be tainted but do contain possibly tainted (nested) scalar fields. Because this interface only allows us to specify entire parameters which should be considered tainted, we have to look only at scalars.
It does raise a question, which is what to do about non-scalar, non-enum parameters that aren't in a hash splat, e.g.
class SomeObject < GraphQL::Schema::Object
field :bar, String
end
field foo, String do
argument :inner SomeObject, "An object"
end
def foo(inner:)
sink inner[:bar]
endI think we need to extend GraphqlParameterAccess to handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should also add that we don't want to push the isScalar check into parameterAccess because that will prevent it from following a chain of non-scalar nested fields to a final inner field which is a scalar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2214cae should address this
Previously we considered any parameter in a graphql resolver to be
a remote flow source. Now we limit that to parameters (and reads of hash splat parameters) which yield scalar types (e.g. String), as defined by the GraphQL schema.
This should reduce GraphQL false positives.
In general, the idea is to ignore sources which are constrained by GraphQL types. For example: