Skip to content

Comments

Don't generate code for Parquet readers; Make batchreader code available and findable in the source repository#23699

Merged
elharo merged 1 commit intomasterfrom
unmark
Oct 4, 2024
Merged

Don't generate code for Parquet readers; Make batchreader code available and findable in the source repository#23699
elharo merged 1 commit intomasterfrom
unmark

Conversation

@elharo
Copy link
Contributor

@elharo elharo commented Sep 22, 2024

Description

Actualize templated code. Someone was too clever by half when they committed this hack. Reuse in Java is provided by inheritance and generics, not by Rube Goldberg contraptions that search and replace code. This broke and wasted the time of at least two different developers in independent events in the last two weeks. And then a second time for me this morning. At least this time I recognized the problem so it cost me minutes and not hours to fix, but that's still minutes I shouldn't have spent on this.

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." --Jamie Zawinski,

The general antipattern in play here is making the build system (Maven in our case) a fundamental part of the code rather than a tool that is applied to the code. This tightly couples the project to one build tool and prevents the project from being built with other tools like IntelliJ or buck. This fix is mandatory if Meta is ever to buckify its internal build. The same problems apply in Eclipse, blaze, gradle, and any other build tool that is more complex than simply shelling out to Maven. The current approach might not even work with Maven 4.

Screenshot 2024-09-23 at 9 30 46 AM

Motivation and Context

  1. Better code search and browsing since the actual classes and code are present in the source repo.
  2. Easier integration and builds in IntelliJ, Eclipse, and every other tool that tries to understand what's going on here
  3. More standard build process, easier for developers to grok. Developers shouldn't have to learn Freemarker to work on presto.
  4. One less dependency that can introduce bugs and security issues into our build and CI.



Impact

None

Test Plan

CI
mvn test

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== NO RELEASE NOTE ==

@elharo elharo force-pushed the unmark branch 2 times, most recently from ccdf3ca to 56f76af Compare September 22, 2024 21:15
@elharo elharo changed the title Don't generate code Make batchreader code availabel and findable in the source repository Sep 22, 2024
@elharo elharo changed the title Make batchreader code availabel and findable in the source repository Make batchreader code available and findable in the source repository Sep 22, 2024
@elharo elharo marked this pull request as ready for review September 23, 2024 11:14
@elharo elharo requested review from a team and shangxinli as code owners September 23, 2024 11:14
@elharo elharo requested a review from presto-oss September 23, 2024 11:14
@facebook-github-bot
Copy link
Collaborator

@elharo has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Collaborator

@elharo has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@rschlussel rschlussel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the commit message description with some details about the motivation for this change and mention Parquet in the commit title (e.g. "Don't generate code for Parquet readers")?

I agree this is a good change. Let's also tag for review someone with expertise in the parquet code.

@rschlussel rschlussel requested a review from zhenxiao October 2, 2024 14:01
@sdruzkin
Copy link
Collaborator

sdruzkin commented Oct 2, 2024

I'd suggest to keep the old templates and add a script doing source code generation for future when the code should be regenerated.

zhenxiao
zhenxiao previously approved these changes Oct 2, 2024
Copy link
Collaborator

@zhenxiao zhenxiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to put back the Parquet code. This is nice for debugging
I believe when @vkorukanti was adding the Parquet batch readers optimization, we were trying not to add duplicate code, so use code gen.
I would prefer if we could keep the code generator and template, just disabling them, so that in the future, if we decide to do code gen, we could leverage the existing framework

@elharo elharo changed the title Make batchreader code available and findable in the source repository Don't generate code for Parquet readers; Make batchreader code available and findable in the source repository Oct 3, 2024
@facebook-github-bot
Copy link
Collaborator

@elharo has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor Author

@elharo elharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL. I restored the freemarker templates in case someone wants to use them again, and updated the commit message as requested.

@zhenxiao
Copy link
Collaborator

zhenxiao commented Oct 3, 2024

looks good. thank you, @elharo

@elharo elharo merged commit 7c814ae into master Oct 4, 2024
@elharo elharo deleted the unmark branch October 4, 2024 11:11
@ZacBlanco ZacBlanco mentioned this pull request Oct 7, 2024
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants