Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvements #58

Merged
merged 4 commits into from
Jan 9, 2021

Conversation

rpopescu
Copy link
Contributor

@rpopescu rpopescu commented Oct 8, 2020

No description provided.

rpopescu and others added 3 commits October 8, 2020 11:55
Moved `name` and `type` vars back into loop body. Since moving them out doesn't provide any measurable performance improvement to justify odd look.
@Enmk
Copy link
Collaborator

Enmk commented Jan 6, 2021

Ok

@rpopescu
Copy link
Contributor Author

rpopescu commented Jan 7, 2021

@Enmk how does creating two strings in a loop not have any impact? In what test? In my real world use-case, it made a significant difference. It won't make one only if the strings can benefit from SSO, which is dependent on the values that they take at runtime (and having a SSO implementation of std::string).

@alexey-milovidov
Copy link
Member

+1 for @rpopescu

@Enmk
Copy link
Collaborator

Enmk commented Jan 7, 2021

@Enmk how does creating two strings in a loop not have any impact? In what test? In my real world use-case, it made a significant difference. It won't make one only if the strings can benefit from SSO, which is dependent on the values that they take at runtime (and having a SSO implementation of std::string).

@rpopescu I actually tested that against 9000-columns wide table and found no significant difference. But could you please share your real-life experience (and compiler\toolset version too)?

@alexey-milovidov
Copy link
Member

@Enmk you will always note the difference in tight loops if std::string is allocating memory. It is not a question... note that std::string also memsets the memory region.

@alexey-milovidov
Copy link
Member

But it may not be noticed due to overall substandard performance of clickhouse-cpp driver.

@alexey-milovidov
Copy link
Member

Or it's not a tight loop, then it's ok.

@Enmk
Copy link
Collaborator

Enmk commented Jan 9, 2021

@rpopescu @alexey-milovidov
My point:
This is hardly a tight loop, there are lots of things going on besides those std::string instances being constructed\destructed. Namely, creating and de-serializing a whole column. My estimation was that those std::string take less than 1% of execution time at best, and my tests unfortunately confirm this on reading a pretty wide (9000 columns) and pretty shallow (1 row) block.

BUT to make you both happy, I've moved those outside of the for loop. And please let's keep discussion civilized, and for performance that means numbers, not opinions.

But it may not be noticed due to overall substandard performance of clickhouse-cpp driver.

Agree it is not ideal and we are trying little by little to improve it, (specifically, ColumnString and ColumnFixedString got much better during 2020). I'm all in on improving performance, but unfortunately this simple trick doesn't help at all in this case.

@Enmk Enmk merged commit 3ea7e49 into ClickHouse:master Jan 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants