bug: simple aggregation does not return #17036
Comments
Possibly related (only in the 670 version): #17037
It was happening prior to 670; we upgraded to see whether it had been fixed.
I used version v1.2.672 with #17037; the query still does not return. Please see the revised logs attached.
@zhang2014 Any chance this could get some attention on the priority list? :) It is causing significant issues in workflows that involve any significant data size.
can you reproduce it via |
I will try this and report back.
I have tried that just now and it doesn't seem to make a difference.
Does the message below mean that it is waiting for a task for which no workers are assigned? |
Eventually aborted with these final logs:
|
We still can't figure out the cause. It would help if you could provide a case that reproduces the issue.
It is difficult to provide a reproduction case. I tried previously by extracting the data to a development machine, and the query appeared to run fine there. However, I will try again in case I get a different result and can demonstrate the failure. If not, is there any other useful information I can provide from this instance?
It appears to be caused by other nodes. Could you provide the logs from all nodes in this cluster?
Those are consolidated logs that contain entries from both query pods. Unfortunately, they do not identify the node for each query entry. I will gather the individual logs.
Hi, we may have found some code that could lead to this issue. You can try two different approaches:
|
Thanks @sundy-li, I will try Option 1 first as that is easy to try. Option 2 requires a little more work to get a build environment going. I don't suppose you could provide a link to a built Docker image of that PR?
See Makefile which has
|
Expanding to 3 nodes made the query work.
Apologies, I could not figure out how to make a custom query service Docker image from the PR. There seem to be many interactions in the GitHub workflows/actions.
Hi, we found that it's not related to that PR. We will submit another PR in the coming days and give you an image to test this case. Please keep the data unchanged; the bug is related to data distribution and is hard to reproduce.
Great, thanks for the update. We found another place where this happened and, again, increasing from 2 to 3 query pods worked. I will keep the test case and await a test image, thanks.
@rad-pat Please test with 2 nodes using PR #17245.
Thanks, I will test and report back 👍
@sundy-li I performed the following test: So, I think we can conclude that the revised image prevents the query failure in our case. |
Please wait for the next official nightly release, because we fixed a cluster memory leak bug in #17252.
Version
v1.2.670-nightly
What's Wrong?
A simple aggregation query does not return. If we remove just one field from the aggregation, it runs quickly. Extracting the data to another server and running the query there also seems to work, so I'm not sure why the query will not run where it is. I was advised to record logs at the INFO level, and they are attached.
Please see the attached logs; the query of interest has ID 1adebd66-611e-41eb-9f2c-3bbf93931d36.
databend-query-logs.txt
If you require any further details, please let me know.
How to Reproduce?
No response
Are you willing to submit PR?