Closed
5 changes: 3 additions & 2 deletions .github/workflows/build_and_test.yml
@@ -147,8 +147,9 @@ jobs:
 mllib-local, mllib, graphx
 - >-
 streaming, sql-kafka-0-10, streaming-kafka-0-10, streaming-kinesis-asl,
-yarn, kubernetes, hadoop-cloud, spark-ganglia-lgpl,
-connect, protobuf
+kubernetes, hadoop-cloud, spark-ganglia-lgpl, protobuf
+- >-
+yarn, connect
Member

Hmm... Does separating the build help with stabilization? I am not against this change, but splitting the job will add about 30 minutes of build time. Although the jobs run in parallel, individual forks have resource limits.

Member Author

@dongjoon-hyun, Feb 15, 2024

Yeah, I was also reluctant because of that build time. Specifically, it takes 34 minutes of total run time. The second goal is to reduce the re-trigger time. Previously, the YARN/Connect/Kafka modules had a bad synergy because their failure rates compound when they run in the same job, and I had to re-trigger and wait over 70 to 90 minutes for this pipeline.

Screenshot 2024-02-14 at 16 39 19
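
The compounding effect described in this comment can be sketched with a small calculation. This is a rough illustration only, not something taken from the PR: the per-module pass rates are hypothetical, the function names are made up for this example, and it assumes module failures are independent and that a failed job is re-triggered until it passes.

```python
# Illustrative sketch only: the pass rates below are hypothetical values,
# not measurements from the Spark CI.

def combined_pass_rate(pass_rates):
    """Probability that every module in one job passes, assuming independence."""
    result = 1.0
    for p in pass_rates:
        result *= p
    return result


def expected_minutes(run_minutes, pass_rate):
    """Expected total run time when a failed job is re-triggered until it passes.

    The number of attempts follows a geometric distribution with mean 1 / pass_rate.
    """
    return run_minutes / pass_rate


# Hypothetical pass rates for three flaky module groups sharing one job.
hypothetical_rates = {"yarn": 0.9, "connect": 0.9, "kafka": 0.9}

together = combined_pass_rate(hypothetical_rates.values())
print(f"combined pass rate in one job: {together:.3f}")
print(f"expected minutes for an 80-minute job: {expected_minutes(80, together):.1f}")
```

Under these assumptions, a job that bundles several flaky modules passes less often than any one of them alone, and each failure costs a full re-run of the long job. Splitting the job means a flaky module only forces a re-run of the shorter job it belongs to.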

Member Author

Now the run time of the streaming, ... pipeline is down to 62 minutes, so the total added overhead is around 20 minutes.

Screenshot 2024-02-14 at 16 47 28

Member

Hmm... I think it would be better to actually fix the flakiness. I am fine with this as a temporary change.

Member Author

Thanks! I hoped so too. For now, the flakiness in SparkSessionE2ESuite and YarnClusterSuite seems to be abandoned.

Member Author

On my side, we don't use YARN or Spark Connect.

# Here, we split Hive and SQL tests into some of slow ones and the rest of them.
included-tags: [""]
excluded-tags: [""]