Skip to content

Enable only Hadoop 3.1 testing version for S3 Select JSON tests#13486

Merged
hashhar merged 1 commit intotrinodb:masterfrom
preethiratnam:s3-select-json-tests
Aug 22, 2022
Merged

Enable only Hadoop 3.1 testing version for S3 Select JSON tests#13486
hashhar merged 1 commit intotrinodb:masterfrom
preethiratnam:s3-select-json-tests

Conversation

@preethiratnam
Copy link
Copy Markdown
Contributor

@preethiratnam preethiratnam commented Aug 3, 2022

Description

Limit S3 Select JSON tests to be run only on the Hadoop 3.1 test image. This is because the JSONSerDe class required by these tests is not available in Hadoop 2.6.

This PR is a follow up to #13354, where we incorrectly forced all S3 tests to run only on Hadoop 3.1. This PR attempts to fix this by limiting just the S3 Select JSON tests to run in Hadoop 3.1.

Is this change a fix, improvement, new feature, refactoring, or other?

Fix.

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Trino Hive connector tests

How would you describe this change to a non-technical end user or system administrator?

Bug fix that improves test coverage for S3.

Related issues, pull requests, and links

#13354

Documentation

( ) No documentation is needed.
(x) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

(x) No release notes entries required.
( ) Release notes entries required.

@findepi
Copy link
Copy Markdown
Member

findepi commented Aug 4, 2022

@preethiratnam thank you for your PR!

can you please rebase on current master, to avoid merge commits?

@preethiratnam preethiratnam force-pushed the s3-select-json-tests branch 2 times, most recently from ed02aa2 to 760572f Compare August 4, 2022 14:49
@preethiratnam
Copy link
Copy Markdown
Contributor Author

Hi @nineinchnick, can you please review? I've removed the merge commit and all tests are now passing.

@preethiratnam preethiratnam force-pushed the s3-select-json-tests branch 4 times, most recently from c960eca to 92ff037 Compare August 8, 2022 12:20
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if: matrix.config == 'config-hdp3'
# JsonSerde class is only available on hdp3 which is needed for the S3 Select JSON tests.
if: matrix.config == 'config-hdp3'

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, thank you.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary complexity for ci.yml and quite an overhead if all we want is to run a test method under some configs only.

This perhaps could be done more simply

  • define a profile in pom.xml which would disable the JSON test
  • change hive-tests-${{ matrix.config }}.sh so they populate some config-related profile

would that work?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, not sure I follow - so in this PR, I've defined a new profile in pom.xml to run the JSON test:

<profile>
            <id>test-hive-hadoop2-s3-select-json</id>
            <build>
                <plugins>
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-surefire-plugin</artifactId>
                        <configuration>
                            <includes>
                                <include>**/TestHiveFileSystemS3SelectJsonPushdown.java</include>
                            </includes>
                        </configuration>
                    </plugin>
                </plugins>
            </build>
        </profile>

This test is now excluded from the Hive S3 tests profile.

In ci.yml, I initially had the following:

source plugin/trino-hive-hadoop2/conf/hive-tests-config-hdp3.sh &&
              plugin/trino-hive-hadoop2/bin/run_hive_s3_select_json_tests.sh

But switched it to the following based on Jan's feedback to be consistent with the other tests.

if: matrix.config == 'config-hdp3'
source plugin/trino-hive-hadoop2/conf/hive-tests-${{ matrix.config }}.sh &&
              plugin/trino-hive-hadoop2/bin/run_hive_s3_select_json_tests.sh

Are you referring to switching back to the first approach or something else?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source plugin/trino-hive-hadoop2/conf/hive-tests-${{ matrix.config }}.sh &&
plugin/trino-hive-hadoop2/bin/run_hive_s3_select_json_tests.sh

I am not convinced we need a separate run of tests, do we?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall I remove the Run Hive S3 Select JSON Tests step and add it to Run Hive S3 Tests step, like this:

- name: Run Hive S3 Tests
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESSKEY }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRETKEY }}
          S3_BUCKET: "presto-ci-test"
          S3_BUCKET_ENDPOINT: "s3.us-east-2.amazonaws.com"
        run: |
          if [ "${AWS_ACCESS_KEY_ID}" != "" ]; then
            source plugin/trino-hive-hadoop2/conf/hive-tests-${{ matrix.config }}.sh &&
              plugin/trino-hive-hadoop2/bin/run_hive_s3_tests.sh
          // if config is hdp3, run hive s3 select json tests
          fi

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would make sense to do.

@hashhar
Copy link
Copy Markdown
Member

hashhar commented Aug 19, 2022

Sorry for the delays here @preethiratnam.

This looks good to go % #13486 (comment).

I'll defer to Andrii for #13486 (comment)

@hashhar
Copy link
Copy Markdown
Member

hashhar commented Aug 22, 2022

Unrelated failures. Seen on other PR as well for example - #13532

@hashhar hashhar merged commit 98e5607 into trinodb:master Aug 22, 2022
@github-actions github-actions bot added this to the 394 milestone Aug 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants