Skip to content

Conversation

@rangadi
Copy link

@rangadi rangadi commented Feb 14, 2023

What changes were proposed in this pull request?

This fixes how setting for limiting recursive depth is handled in Protobuf functions (recursive.fields.max.depth).
Original documentation says as setting it to '0' removes the recursive field. But we never did that. We allow at least once. E.g. schema for recursive message 'EventPerson' does not change between the settings '0' or '1'.

This fixes it by requiring the max depth to be at least 1. It also fixes how the recursion enfored.
Updated the tests and added an extra test with new protobuf 'EventPersonWrapper'.

I will annotate the diff inline pointing to main fixes.

Why are the changes needed?

This fixes a bug with enforcing recursive.fields.max.depth and clarifies more in the documentation.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Unit tests

@rangadi
Copy link
Author

rangadi commented Feb 14, 2023

@HeartSaVioR , @HyukjinKwon , @SandishKumarHN PTAL. I would like to get this fix asap to make it into 3.4.x in OSS and 12.2 RC cut in DBR.

throw QueryCompilationErrors.foundRecursionInProtobufSchema(fd.toString())
} else if (existingRecordNames.contains(recordName) &&
recursiveDepth > recursiveFieldMaxDepth) {
recursiveDepth >= recursiveFieldMaxDepth) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main bug fix:
We allowed recursive depth go over by one. This makes the setting of '0' does not work as documented (earlier documentation stated even the first occurrence would be removed it it was set to 0).
Removing the first occurrence does not make sense since we don't know if this field is recursive until the second occurrence.

}

test("Handle recursive fields in Protobuf schema, A->B->A") {
val schemaA = ProtobufUtils.buildDescriptor(testFileDesc, "recursiveA")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extra code to create actual records is not required since we are testing an AnalysisException. It does not need real data.

val df = Seq(employee.toByteArray).toDF("protoEvent")
val options = new java.util.HashMap[String, String]()
options.put("recursive.fields.max.depth", "1")
options.put("recursive.fields.max.depth", "2")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to increase the setting because this PR fixes off by one with enforcement.

@rangadi
Copy link
Author

rangadi commented Feb 14, 2023

cc: @gengliangwang

Copy link
Contributor

@SandishKumarHN SandishKumarHN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@gengliangwang gengliangwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for one comment

@gengliangwang
Copy link
Member

Thanks, merging to master/3.4

gengliangwang pushed a commit that referenced this pull request Feb 14, 2023
…tions

### What changes were proposed in this pull request?

This fixes how setting for limiting recursive depth is handled in Protobuf functions (`recursive.fields.max.depth`).
Original documentation says as setting it to '0' removes the recursive field. But we never did that. We allow at least once. E.g. schema for recursive message 'EventPerson' does not change between the settings '0' or '1'.

This fixes it by requiring the max depth to be at least 1. It also fixes how the recursion enfored.
Updated the tests and added an extra test with new protobuf 'EventPersonWrapper'.

I will annotate the diff inline pointing to main fixes.

### Why are the changes needed?

This fixes a bug with enforcing `recursive.fields.max.depth` and clarifies more in the documentation.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Unit tests

Closes #40011 from rangadi/recursive-depth.

Authored-by: Raghu Angadi <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
(cherry picked from commit 96516c8)
Signed-off-by: Gengliang Wang <[email protected]>
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
…tions

### What changes were proposed in this pull request?

This fixes how setting for limiting recursive depth is handled in Protobuf functions (`recursive.fields.max.depth`).
Original documentation says as setting it to '0' removes the recursive field. But we never did that. We allow at least once. E.g. schema for recursive message 'EventPerson' does not change between the settings '0' or '1'.

This fixes it by requiring the max depth to be at least 1. It also fixes how the recursion enfored.
Updated the tests and added an extra test with new protobuf 'EventPersonWrapper'.

I will annotate the diff inline pointing to main fixes.

### Why are the changes needed?

This fixes a bug with enforcing `recursive.fields.max.depth` and clarifies more in the documentation.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Unit tests

Closes apache#40011 from rangadi/recursive-depth.

Authored-by: Raghu Angadi <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
(cherry picked from commit 96516c8)
Signed-off-by: Gengliang Wang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants