Skip to content

Conversation

@JoshRosen
Copy link
Contributor

What changes were proposed in this pull request?

This PR fixes an issue where BitwiseCount / bit_count of boolean inputs would cause codegen to generate syntactically invalid Java code that does not compile, triggering errors like

 java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 11: Failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 11: Unexpected token "if" in primary

Even though this code has test cases in bitwise.sql via the query test framework, those existing test cases were insufficient to find this problem: I believe that is because the example queries were constant-folded using the interpreted path, leaving the codegen path without test coverage.

This PR fixes the codegen issue and adds explicit expression tests to ensure that the same tests run on both the codegen and interpreted paths.

Why are the changes needed?

Fix a rare codegen to interpreted fallback issue, which may harm query performance.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added new test cases to BitwiseExpressionsSuite.scala, copied from the existing bitwise.sql query test case file.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label May 4, 2024
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @JoshRosen .

dongjoon-hyun pushed a commit that referenced this pull request May 4, 2024
…en syntax error for boolean type inputs

### What changes were proposed in this pull request?

This PR fixes an issue where `BitwiseCount` / `bit_count` of boolean inputs would cause codegen to generate syntactically invalid Java code that does not compile, triggering errors like

```
 java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 11: Failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 11: Unexpected token "if" in primary
```

Even though this code has test cases in `bitwise.sql` via the query test framework, those existing test cases were insufficient to find this problem: I believe that is because the example queries were constant-folded using the interpreted path, leaving the codegen path without test coverage.

This PR fixes the codegen issue and adds explicit expression tests to ensure that the same tests run on both the codegen and interpreted paths.

### Why are the changes needed?

Fix a rare codegen to interpreted fallback issue, which may harm query performance.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added new test cases to BitwiseExpressionsSuite.scala, copied from the existing `bitwise.sql` query test case file.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46382 from JoshRosen/SPARK-48128-bit_count_codegen.

Authored-by: Josh Rosen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 96f65c9)
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request May 4, 2024
…en syntax error for boolean type inputs

### What changes were proposed in this pull request?

This PR fixes an issue where `BitwiseCount` / `bit_count` of boolean inputs would cause codegen to generate syntactically invalid Java code that does not compile, triggering errors like

```
 java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 11: Failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 11: Unexpected token "if" in primary
```

Even though this code has test cases in `bitwise.sql` via the query test framework, those existing test cases were insufficient to find this problem: I believe that is because the example queries were constant-folded using the interpreted path, leaving the codegen path without test coverage.

This PR fixes the codegen issue and adds explicit expression tests to ensure that the same tests run on both the codegen and interpreted paths.

### Why are the changes needed?

Fix a rare codegen to interpreted fallback issue, which may harm query performance.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added new test cases to BitwiseExpressionsSuite.scala, copied from the existing `bitwise.sql` query test case file.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46382 from JoshRosen/SPARK-48128-bit_count_codegen.

Authored-by: Josh Rosen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 96f65c9)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Copy link
Member

Merged to master/3.5/3.4.

sinaiamonkar-sai pushed a commit to sinaiamonkar-sai/spark that referenced this pull request May 5, 2024
…en syntax error for boolean type inputs

### What changes were proposed in this pull request?

This PR fixes an issue where `BitwiseCount` / `bit_count` of boolean inputs would cause codegen to generate syntactically invalid Java code that does not compile, triggering errors like

```
 java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 11: Failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 11: Unexpected token "if" in primary
```

Even though this code has test cases in `bitwise.sql` via the query test framework, those existing test cases were insufficient to find this problem: I believe that is because the example queries were constant-folded using the interpreted path, leaving the codegen path without test coverage.

This PR fixes the codegen issue and adds explicit expression tests to ensure that the same tests run on both the codegen and interpreted paths.

### Why are the changes needed?

Fix a rare codegen to interpreted fallback issue, which may harm query performance.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added new test cases to BitwiseExpressionsSuite.scala, copied from the existing `bitwise.sql` query test case file.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#46382 from JoshRosen/SPARK-48128-bit_count_codegen.

Authored-by: Josh Rosen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…en syntax error for boolean type inputs (apache#388)

### What changes were proposed in this pull request?

This PR fixes an issue where `BitwiseCount` / `bit_count` of boolean inputs would cause codegen to generate syntactically invalid Java code that does not compile, triggering errors like

```
 java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 11: Failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 11: Unexpected token "if" in primary
```

Even though this code has test cases in `bitwise.sql` via the query test framework, those existing test cases were insufficient to find this problem: I believe that is because the example queries were constant-folded using the interpreted path, leaving the codegen path without test coverage.

This PR fixes the codegen issue and adds explicit expression tests to ensure that the same tests run on both the codegen and interpreted paths.

### Why are the changes needed?

Fix a rare codegen to interpreted fallback issue, which may harm query performance.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added new test cases to BitwiseExpressionsSuite.scala, copied from the existing `bitwise.sql` query test case file.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#46382 from JoshRosen/SPARK-48128-bit_count_codegen.

Authored-by: Josh Rosen <[email protected]>

(cherry picked from commit 96f65c9)

Signed-off-by: Dongjoon Hyun <[email protected]>
Co-authored-by: Josh Rosen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants