Conversation

@yikf yikf commented Sep 18, 2022

Why are the changes needed?

Fixes #3441.

This PR makes two changes:

  • Bump Spark 3.3.0 to Spark 3.3.1 in the spark-3.3 profile
  • Change the default Spark version to 3.3.1

How was this patch tested?

  • Add test cases that thoroughly check the changes, including negative and positive cases where possible

  • Add screenshots for manual tests if appropriate

  • Run tests locally before making a pull request

@github-actions github-actions bot added kind:build kind:infra license, community building, project builds, asf infra related, etc. labels Sep 18, 2022
yikf commented Sep 18, 2022

@pan3793, tracking from here

@yikf yikf changed the title [KYUUBI #3441] Change default Spark version to 3.3.0 [KYUUBI #3441] Change default Spark version to 3.3.1 Sep 18, 2022
@yikf yikf changed the title [KYUUBI #3441] Change default Spark version to 3.3.1 [KYUUBI #3441][WIP] Change default Spark version to 3.3.1 Sep 18, 2022
@cfmcgrady

FYI: Spark-3.3.1-rc2 is out.
https://www.mail-archive.com/[email protected]/msg29460.html

yikf commented Sep 28, 2022

@cfmcgrady Thanks, will test


pan3793 commented Sep 30, 2022

The test failed consistently, could you please take a look?


yikf commented Oct 1, 2022

> The test failed consistently, could you please take a look?

Will check.

run: >-
./build/mvn ${MVN_OPT} clean install
-Pflink-provided,hive-provided
-Pspark-3.2
@yikf yikf Oct 1, 2022


It looks like we currently support Spark 3.2 only, so we should specify the profile here to prevent incompatibilities (on K8s with client mode).

Member

It's tricky; what's the specific issue here?

Contributor Author

In Spark on K8s with client mode, task deserialization may fail due to inconsistent jars on the driver and executor sides.

Member

Oh sorry, I missed that we hardcoded the Spark image in the code. How about changing it to read from an ENV variable, so we can override it in tests?

Member

We can get the compiled Spark version with:

SPARK_VERSION=$("$MVN" help:evaluate -Dexpression=spark.version $@ 2>/dev/null\
    | grep -v "INFO"\
    | grep -v "WARNING"\
    | tail -n 1)
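As a minimal sketch of the ENV-override idea, the resolved version could then pick the test image, with an environment variable taking precedence. The names SPARK_VERSION_OVERRIDE and SPARK_IMAGE_OVERRIDE are hypothetical, not existing Kyuubi settings, and the "v"-prefixed tag format is illustrative only:

```shell
# Sketch: derive the Spark test image from the compiled Spark version,
# allowing an environment override (variable names are hypothetical).
# In the real build, SPARK_VERSION would come from the mvn help:evaluate
# call above rather than a literal default.
SPARK_VERSION="${SPARK_VERSION_OVERRIDE:-3.3.1}"
# Assumes apache/spark-style "v"-prefixed tags; tag naming is illustrative.
SPARK_IMAGE="${SPARK_IMAGE_OVERRIDE:-apache/spark:v${SPARK_VERSION}}"
echo "$SPARK_IMAGE"
```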

Contributor Author

Yes, I had the same idea: we could pass the Spark version of the current profile through the Maven plugin, so that the image version stays consistent with the profile's Spark version. But I wanted to do this later for two reasons:

  • It addresses the existing TODO and seems unrelated to the current PR
  • The official Spark image does not have a patch-version tag like 3.3.1 (https://hub.docker.com/r/apache/spark/tags); the Spark community is currently working on making the Spark image an official Docker image, so maybe we can resolve this TODO after that?

Member

Makes sense, thanks.

@pan3793 pan3793 left a comment

LGTM, it's in good shape now; let's wait for the next RC or GA version. Thanks @yikf

yikf commented Oct 17, 2022

Tested with Spark 3.3.1 (RC4).


pan3793 commented Oct 23, 2022

Spark 3.3.1 (RC4) vote passed.
https://www.mail-archive.com/[email protected]/msg29565.html

@pan3793 pan3793 added this to the v1.7.0 milestone Oct 23, 2022
@pan3793 pan3793 changed the title [KYUUBI #3441][WIP] Change default Spark version to 3.3.1 [KYUUBI #3441] Change default Spark version to 3.3.1 Oct 25, 2022

pan3793 commented Oct 25, 2022

Thanks @yikf, merging to master! Also thanks @wangyum for releasing Spark 3.3.1

@pan3793 pan3793 closed this in f7c08dc Oct 25, 2022

yikf commented Oct 25, 2022

Thanks all.

@yikf yikf deleted the pr/3471 branch October 25, 2022 12:47

Labels

kind:build kind:infra license, community building, project builds, asf infra related, etc. module:extensions module:spark


Development

Successfully merging this pull request may close these issues.

[Subtask] Change default Spark version to 3.3

3 participants