Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ruby: Restrict GraphQL remote flow sources #14216

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

hmac
Copy link
Contributor

@hmac hmac commented Sep 14, 2023

Previously we considered any parameter in a graphql resolver to be
a remote flow source. Now we limit that to parameters (and reads of hash splat parameters) which yield scalar types (e.g. String), as defined by the GraphQL schema.

This should reduce GraphQL false positives.

In general, the idea is to ignore sources which are constrained by GraphQL types. For example:

class class MediaCategory < GraphQL::Schema::Enum
  value "AUDIO", "An audio file, such as music or spoken word"
  value "IMAGE", "A still image, such as a photo or graphic"
  value "TEXT", "Written words"
  value "VIDEO", "Motion picture, may have audio"
end

class Post < GraphQL::Schema::Object
  field :title, String
  field :body, String
  field :media_category, Types::MediaCategory
end

class QueryType < GraphQL::Schema::Object
  field :my_field, String do
    argument :post, Post, "A post"
  end
  def my_field(**args)
    sink args[:post][:title] # NOT OK - remote input reaches sink
    sink args[:post][:media_category] # OK - input is constrained by the `MediaCategory` enum
  end
end

Previously we considered any splat parameter in a graphql resolver to be
a remote flow source. Now we limit that to reads of the parameter which
yield scalar types (e.g. String), as defined by the GraphQL schema.

This should reduce GraphQL false positives.
@github-actions github-actions bot added the Ruby label Sep 14, 2023
@hmac hmac marked this pull request as ready for review September 18, 2023 16:03
@hmac hmac requested a review from a team as a code owner September 18, 2023 16:03
Copy link
Contributor

@alexrford alexrford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this - only minor questions/suggestions.

* A call to `argument` in a GraphQL InputObject class.
*/
class GraphqlInputObjectArgumentDefinitionCall extends DataFlow::CallNode {
GraphqlInputObjectArgumentDefinitionCall() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be restricted based on method name? I think this would also currently match things like the description call in

class Types::DateRangeInput < Types::BaseInputObject
  description "Range of dates"
  argument :min, Types::Date, "Minimum value of the range"
  argument :max, Types::Date, "Maximum value of the range"
end

*/
GraphqlType getArgumentElementType() {
result =
any(ArrayLiteral lit | lit = this.getArgument(1) and lit.getNumberOfElements() = 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
any(ArrayLiteral lit | lit = this.getArgument(1) and lit.getNumberOfElements() = 1)
any(ArrayLiteral lit | lit = this.getArgumentType() and lit.getNumberOfElements() = 1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work, sadly, because GraphqlType extends ConstantAccess which is incompatible with ArrayLiteral

}
}

private class GraphqlType extends ConstantAccess {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not really dug into this, but is it helpful that we drop to the AST level for this rather than using DataFlow::ConstantAccessNode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did try this, but it basically propagates the problem to all the other classes in this file. You end up changing basically everything to work at the dataflow level, and I think that's likely to be a breaking change? We would also need to add a bunch of new member preds to the various dataflow classes to match what we have at the AST level. This is arguably a good thing, but it might be worth tackling as a followup. What do you think?

result instanceof HashSplatParameter // often you see `def field(**args)`
)
argDefn = this.getDefinition().getAnArgumentCall() and
[argDefn.getArgumentType(), argDefn.getArgumentElementType()].isScalar()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that restricting to only scalar arguments within getARoutedParameter makes the logic simpler/more general for reasoning about which parameters are user-controlled? Is that why we check scalar-ness at this point?

Copy link
Contributor Author

@hmac hmac Sep 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We add the isScalar check here because we don't want enum parameters to be marked as remote flow sources. That leaves the question of object parameters, which shouldn't themselves be tainted but do contain possibly tainted (nested) scalar fields. Because this interface only allows us to specify entire parameters which should be considered tainted, we have to look only at scalars.

It does raise a question, which is what to do about non-scalar, non-enum parameters that aren't in a hash splat, e.g.

class SomeObject < GraphQL::Schema::Object
  field :bar, String
end

field foo, String do
  argument :inner SomeObject, "An object"
end

def foo(inner:)
  sink inner[:bar]
end

I think we need to extend GraphqlParameterAccess to handle this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should also add that we don't want to push the isScalar check into parameterAccess because that will prevent it from following a chain of non-scalar nested fields to a final inner field which is a scalar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2214cae should address this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants