Don't generate code for Parquet readers; Make batchreader code available and findable in the source repository #23699
@elharo has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
rschlussel
left a comment
Can you update the commit message description with some details about the motivation for this change and mention Parquet in the commit title (e.g. "Don't generate code for Parquet readers")?
I agree this is a good change. Let's also tag someone with expertise in the Parquet code for review.
I'd suggest keeping the old templates and adding a script that does the source code generation, for when the code needs to be regenerated in the future.
Good to put back the Parquet code. This is nice for debugging.
I believe that when @vkorukanti was adding the Parquet batch readers optimization, we were trying not to add duplicate code, so we used code gen.
I would prefer that we keep the code generator and templates but just disable them, so that if we decide to do code gen in the future, we can leverage the existing framework.
…ble and findable in the source repository
elharo
left a comment
PTAL. I restored the FreeMarker templates in case someone wants to use them again, and updated the commit message as requested.
Looks good. Thank you, @elharo.
Description
Actualize templated code. Someone was too clever by half when they committed this hack. Reuse in Java is provided by inheritance and generics, not by Rube Goldberg contraptions that search and replace code. This broke and wasted the time of at least two different developers in independent incidents in the last two weeks, and then broke a second time for me this morning. At least this time I recognized the problem, so it cost me minutes and not hours to fix, but that's still minutes I shouldn't have spent on this.
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." --Jamie Zawinski
The general antipattern in play here is making the build system (Maven in our case) a fundamental part of the code rather than a tool that is applied to the code. This tightly couples the project to one build tool and prevents the project from being built with other tools like IntelliJ or Buck. This fix is mandatory if Meta is ever to buckify its internal build. The same problems apply in Eclipse, Blaze, Gradle, and any other build tool that is more complex than simply shelling out to Maven. The current approach might not even work with Maven 4.
Motivation and Context
Impact
None
Test Plan
CI
mvn test
Contributor checklist
Release Notes