
[HUDI-5269] Enhancing spark-sql write tests for some of the core user flows #7230

Merged
codope merged 3 commits into apache:master from jonvex:core_user_flow_spark_sql
Nov 29, 2022


@jonvex
Contributor

@jonvex jonvex commented Nov 17, 2022

Change Logs

We realized we don't have good test coverage for some of the core user flows w/ spark data source writes. This PR enhances the tests to cover those scenarios.

Test coverage added w/ spark-data source writes:

COW and MOR * (w/ and w/o metadata)
    Partitioned(BLOOM, SIMPLE, GLOBAL_BLOOM), non-partitioned(GLOBAL_BLOOM).
        Immutable data. pure bulk_insert row writing. 
        Immutable w/ file sizing. pure inserts. 
        initial bulk ingest, followed by updates.
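
To make the size of this matrix concrete, here is a minimal Python sketch that enumerates it. It assumes the list above is meant as a full cross product (table type x metadata on/off x layout/index pairing x flow); the names `build_matrix`, `LAYOUTS`, and `FLOWS` are hypothetical and are not taken from the Hudi test code.

```python
from itertools import product

# Dimensions as read from the list above. Treating them as a full
# cross product is an assumption, not something the PR states outright.
TABLE_TYPES = ["COW", "MOR"]
METADATA_ENABLED = [True, False]
LAYOUTS = [
    ("partitioned", "BLOOM"),
    ("partitioned", "SIMPLE"),
    ("partitioned", "GLOBAL_BLOOM"),
    ("non-partitioned", "GLOBAL_BLOOM"),
]
FLOWS = [
    "immutable data, pure bulk_insert row writing",
    "immutable data w/ file sizing, pure inserts",
    "initial bulk ingest, followed by updates",
]


def build_matrix():
    """Return one dict per combination of the four dimensions."""
    return [
        {
            "table_type": table_type,
            "metadata": metadata,
            "layout": layout,
            "index_type": index_type,
            "flow": flow,
        }
        for table_type, metadata, (layout, index_type), flow in product(
            TABLE_TYPES, METADATA_ENABLED, LAYOUTS, FLOWS
        )
    ]


if __name__ == "__main__":
    cases = build_matrix()
    print(len(cases))  # 2 * 2 * 4 * 3 = 48
```

Under that reading the matrix has 48 cases. The actual tests may prune or parameterize it differently.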

This PR is modeled after #7179.
It uses a different SQL schema that has only regular and struct fields, because the schema in the test above was too complicated to create in SQL.
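
As an illustration of what a schema with only regular and struct fields could look like, the sketch below builds a Hudi `create table` statement as a string. The column names (`id`, `name`, `fare`, `ts`, `dt`) and the helper `create_table_sql` are hypothetical and are not the schema used in this PR. Passing `hoodie.index.type` and `hoodie.metadata.enable` through `tblproperties` is likewise an assumption about how a test might set them.

```python
def create_table_sql(table_name, table_type, index_type, metadata_enabled, partitioned):
    """Build a Spark SQL DDL string for a Hudi table (illustrative only)."""
    # Hypothetical columns: plain scalar fields plus one struct field.
    columns = [
        "id int",
        "name string",
        "fare struct<amount: double, currency: string>",
        "ts bigint",
        "dt string",
    ]
    # 'type', 'primaryKey' and 'preCombineField' are standard Hudi table
    # properties; the two hoodie.* keys are assumed to be passed through.
    props = {
        "type": table_type,  # 'cow' or 'mor'
        "primaryKey": "id",
        "preCombineField": "ts",
        "hoodie.index.type": index_type,
        "hoodie.metadata.enable": str(metadata_enabled).lower(),
    }
    column_lines = ",\n".join(f"  {c}" for c in columns)
    prop_lines = ",\n".join(f"  {k} = '{v}'" for k, v in props.items())
    sql = (
        f"create table {table_name} (\n{column_lines}\n) using hudi\n"
        f"tblproperties (\n{prop_lines}\n)"
    )
    if partitioned:
        sql += "\npartitioned by (dt)"
    return sql


if __name__ == "__main__":
    print(create_table_sql("core_flow_tbl", "cow", "BLOOM", True, True))
```

The resulting string would be handed to `spark.sql(...)` in a real test. Here it is only printed, so the sketch runs without Spark.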

Impact

Will help catch any bugs around core user flows upfront.

Risk level (write none, low medium or high below)

low

Documentation Update

N/A

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@jonvex jonvex force-pushed the core_user_flow_spark_sql branch from a396524 to b52fb73 Compare November 18, 2022 01:55
@nsivabalan
Contributor

Please create a JIRA ticket and fill in the PR description.

@jonvex jonvex force-pushed the core_user_flow_spark_sql branch from 36b9c85 to 0d789c5 Compare November 22, 2022 15:48
@jonvex jonvex changed the title core flow tests working, but issues still to tackle and documentation… [HUDI-5269] Enhancing spark-sql write tests for some of the core user flows Nov 22, 2022
@jonvex jonvex requested a review from nsivabalan November 22, 2022 15:58
@jonvex jonvex marked this pull request as ready for review November 22, 2022 15:58
@hudi-bot
Collaborator

CI report:

Bot commands: @hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@codope codope merged commit bf2ca54 into apache:master Nov 29, 2022
satishkotha pushed a commit that referenced this pull request Dec 13, 2022
… flows (#7230)

Add good test coverage for some of the core user flows w/ spark data source writes.
fengjian428 pushed a commit to fengjian428/hudi that referenced this pull request Apr 5, 2023
… flows (apache#7230)

Add good test coverage for some of the core user flows w/ spark data source writes.