@gengliangwang
Member

What changes were proposed in this pull request?

Revert 4274fb8

Why are the changes needed?

I am starting a new project to store live UI data in a disk-based KVStore for higher UI retention (e.g., storing more than 1000 SQL queries in the live UI).
The new approach will be better than 4274fb8. The new configurations and listeners won't be needed. Thus, I am reverting it before starting my project.

Does this PR introduce any user-facing change?

No, it is not released yet.

How was this patch tested?

CI tests

@gengliangwang
Member Author

cc @linhongliu-db @cloud-fan

Member

@dongjoon-hyun dongjoon-hyun left a comment

If there is a consensus among @gengliangwang, @linhongliu-db, and @cloud-fan, this looks like a clean revert.

cc @mridulm too

@mridulm
Contributor

mridulm commented Nov 8, 2022

Do we have a prototype for the new approach? Before we revert, it would be good to understand what is replacing it.

@gengliangwang
Member Author

gengliangwang commented Nov 8, 2022

@mridulm There is no full prototype yet. I will send out a PR soon.
I ran some benchmarks, which showed that using Protobuf for serialization can speed up reads and writes by 3 to 4 times compared to the current implementation. Thus, we can use RocksDB as the storage layer, similar to the current InMemoryStore.
I have to revert this one to remove the duplicated diskStore in 4274fb8#diff-0cd94c36f45f50f513aeda6504fda0545270b05e105191131b601eb12b8a656bR42

@dongjoon-hyun
Member

dongjoon-hyun commented Nov 8, 2022

Hi, @gengliangwang. You can make a PR with two commits (this revert and your new change).
Since this is for Apache Spark 3.4, we have enough time. Let's not rush to merge this; we can wait, as @mridulm advised.

@gengliangwang
Member Author

@dongjoon-hyun sure, no problem

@LuciferYang
Contributor

> @mridulm There is no full prototype yet. I will send out a PR soon. I ran some benchmarks, which showed that using Protobuf for serialization can speed up reads and writes by 3 to 4 times compared to the current implementation. Thus, we can use RocksDB as the storage layer, similar to the current InMemoryStore. I have to revert this one to remove the duplicated diskStore in 4274fb8#diff-0cd94c36f45f50f513aeda6504fda0545270b05e105191131b601eb12b8a656bR42

@gengliangwang Have you compared Protobuf and Flatbuffers? From the official benchmark data, Flatbuffers may have better performance.

http://google.github.io/flatbuffers/flatbuffers_benchmarks.html

@gengliangwang
Member Author

@LuciferYang I didn't check that. Protobuf is mature (Spark Connect is also using Protobuf), and the result is good enough.

@LuciferYang
Contributor

> @LuciferYang I didn't check that. Protobuf is mature (Spark Connect is also using Protobuf), and the result is good enough.

Got it

@gengliangwang
Member Author

FYI, I have created #38567, which contains the revert changes from this PR.
Merging this revert first would make the review a lot easier.

@gengliangwang
Member Author

Since both @linhongliu-db and @cloud-fan agreed on the revert, I will merge this one to make the review of #38567 a lot easier.

SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
…debug information for live UI"

### What changes were proposed in this pull request?

Revert apache@4274fb8

### Why are the changes needed?

I am starting a new project to store live UI data in a disk-based KVStore for higher UI retention (e.g., storing more than 1000 SQL queries in the live UI).
The new approach will be better than apache@4274fb8. The new configurations and listeners won't be needed. Thus, I am reverting it before starting my project.

### Does this PR introduce _any_ user-facing change?

No, it is not released yet.

### How was this patch tested?

CI tests

Closes apache#38542 from gengliangwang/revertSPARK-38550.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>