-
Notifications
You must be signed in to change notification settings - Fork 527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
curvefs client : fix bug of getleader always fails causes stack overflow #1070
Conversation
recheck |
excutor->DoAsyncRPCTask(taskDone); | ||
TaskExecutorDone *taskDone = new TaskExecutorDone( | ||
excutor, done); | ||
done_guard.release(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems this done_guard
is useless
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
excutor->DoAsyncRPCTask(taskDone); | ||
TaskExecutorDone *taskDone = new TaskExecutorDone( | ||
excutor, done); | ||
done_guard.release(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ExcuteTask(channel.get(), done); | ||
done_guard.release(); | ||
return; | ||
return retCode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
retCode
is always -1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
bthread_usleep(opt_.retryIntervalUS); | ||
continue; | ||
} | ||
ExcuteTask(channel.get(), done); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ExcuteTask
has a return value, but you didn't check it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
excutor, done); | ||
done_guard.release(); | ||
brpc::ClosureGuard taskDone_guard(taskDone); | ||
int ret = excutor->DoAsyncRPCTask(taskDone); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DoAsyncRPCTask
will block forever if the target copyset doesn't have a valid leader, and in this case, caller's thread is also blocked, which means it's not actually asynchronous.
is this you expected? or make the whole process asynchronous
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if getleader failed, which means the cluster is abnomal, it is reasonable to stuck all the async task here.
recheck |
recheck |
1 similar comment
recheck |
@@ -586,8 +586,15 @@ void MetaServerClientImpl::UpdateInodeAsync(const Inode &inode, | |||
MetaServerOpType::UpdateInode, task, inode.fsid(), inode.inodeid()); | |||
auto excutor = std::make_shared<UpdateInodeExcutor>(opt_, | |||
metaCache_, channelManager_, taskCtx); | |||
TaskExecutorDone *taskDone = new TaskExecutorDone(excutor, done); | |||
excutor->DoAsyncRPCTask(taskDone); | |||
TaskExecutorDone *taskDone = new TaskExecutorDone( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please state the reason for stack overflow in issue or commit message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When getleader always fails, the function 'TaskExcutor::DoAsyncRpcTask' and the function 'TaskExcurorDone:Run' will call each other cyclically, resulting in stack overflow
@@ -303,7 +304,11 @@ void TaskExecutorDone::Run() { | |||
needRetry = excutor_->OnReturn(code_); | |||
if (needRetry) { | |||
excutor_->PreProcessBeforeRetry(code_); | |||
excutor_->DoAsyncRPCTask(this); | |||
code_ = excutor_->DoAsyncRPCTask(this); | |||
if (code_ < 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AsyncTask always return MetaStatusCode::OK
return MetaStatusCode::OK; |
return MetaStatusCode::OK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AsyncTask will return -1, when retryTimes runing out
What problem does this PR solve?
Issue Number: close #1067
Problem Summary:
What is changed and how it works?
What's Changed:
How it Works:
Side effects(Breaking backward compatibility? Performance regression?):
Check List