[Java] Data flow through an object #17069
As a newbie, I am trying to explore dataflow for the simple code as seen at https://docs.aws.amazon.com/AmazonS3/latest/userguide/example_s3_CopyObject_section.html
I want to get all data sources that contribute to the argument that is being passed to the copyObject function.

    module MyDataFlowConfig implements DataFlow::ConfigSig {
      predicate isSource(DataFlow::Node source) {
        exists(Expr sourceExpr | source.asExpr() = sourceExpr)
      }

      predicate isSink(DataFlow::Node sink) {
        exists(MethodCall call |
          call.getMethod().hasName("copyObject") and
          call.getMethod().getDeclaringType().getQualifiedName() = "software.amazon.awssdk.services.s3.S3Client" and
          sink.asExpr() = call.getArgument(_)
        )
      }
    }

    module MyFlow = TaintTracking::Global<MyDataFlowConfig>;

    from DataFlow::Node source, DataFlow::Node sink // [4]
    where
      MyFlow::flow(source, sink)
    select sink, source, sink, source.toString()

I was expecting to see the dataflow start with the sources arg[1..3], and sink to the copyObject's argument. In reality, the source shows up as the build() function in the builder; variables that went into the builder are not tracked/traced.

My question is :

Any insights highly appreciated
Replies: 1 comment 12 replies
Hi @sudeep-hypredge 👋🏻

Thanks for the question! For reference, I have pasted the part of the example code you seem to refer to below:

    CopyObjectRequest copyReq = CopyObjectRequest.builder()
        .sourceBucket(fromBucket)
        .sourceKey(objectKey)
        .destinationBucket(toBucket)
        .destinationKey(objectKey)
        .build();

In order for CodeQL to track how data flows through methods from a library, we need to understand how inputs to a method relate to its outputs. For example, CodeQL needs to know that destinationKey mutates the object it is called on with objectKey and returns the resulting object (as opposed to e.g. a new object, or the initial object). We have a collection of models which summari…

You can read about how to define your own models in our documentation at https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-java-and-kotlin/
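
To make the input/output relationship described in the reply concrete, here is a minimal Java sketch of a fluent builder. It is an assumption-laden stand-in, not the AWS SDK source: the class names BuilderFlowSketch and CopyRequest are hypothetical, and the real CopyObjectRequest.Builder may be implemented differently. It only illustrates the three relationships a library model would have to describe: argument to the object the method is called on, that object to the return value, and the builder to the result of build().

```java
// Hypothetical stand-in for a fluent builder such as CopyObjectRequest.Builder.
// This is NOT the AWS SDK implementation; it only illustrates the flows a model describes.
public class BuilderFlowSketch {

    static final class CopyRequest {
        private final String sourceKey;
        private final String destinationKey;

        private CopyRequest(Builder b) {
            this.sourceKey = b.sourceKey;
            this.destinationKey = b.destinationKey;
        }

        String sourceKey() { return sourceKey; }
        String destinationKey() { return destinationKey; }

        static Builder builder() { return new Builder(); }

        static final class Builder {
            private String sourceKey;
            private String destinationKey;

            // Flow 1: the argument is stored in the builder the method is called on.
            // Flow 2: the same builder (not a copy) is the return value.
            Builder sourceKey(String key) { this.sourceKey = key; return this; }
            Builder destinationKey(String key) { this.destinationKey = key; return this; }

            // Flow 3: state held by the builder ends up in the returned request.
            CopyRequest build() { return new CopyRequest(this); }
        }
    }

    public static void main(String[] args) {
        CopyRequest.Builder builder = CopyRequest.builder();
        CopyRequest.Builder returned = builder.destinationKey("report.csv");

        // The setter hands back the very object it was called on.
        System.out.println(returned == builder);

        // The value passed to the setter is readable from the built request.
        CopyRequest request = returned.sourceKey("input.csv").build();
        System.out.println(request.destinationKey());
    }
}
```

When the library's source is not part of the database, none of these three steps is visible to the analysis, which would be consistent with the flow appearing to start at build() in the question above.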