-
Notifications
You must be signed in to change notification settings - Fork 456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add StatusOr
for error handling in modern C++ style
#768
Conversation
This approach may be a more modern C++ approach, but I don't think it's very friendly for a rookie like me. RocksDB as the storage engine, also uses the |
First of all, I do not think contributors need to understand the implementation of Secondly, the idea of Finally, all use of |
In fact, I think This is because But |
To be honest, this change looks a bit complex at first glance. For myself, I would happy to see that we can use the modern way |
ASSERT_EQ(f(7).Msg(), "ok"); | ||
ASSERT_TRUE(f(6)); | ||
ASSERT_EQ(f(11).GetCode(), Status::NotOK); | ||
ASSERT_EQ(f(11).Msg(), "x large than 10"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what would happen if calling f(11).GetValue()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ShooterIT, if you look at the definition of StatusOr<T>::GetValue
:
value_type& GetValue() & {
CHECK(*this);
return getValue();
}
value_type&& GetValue() && {
CHECK(*this);
return std::move(getValue());
}
const value_type& GetValue() const& {
CHECK(*this);
return getValue();
}
You will find there is three different overload of this method via changing the qualifier of *this
.
Hence, in f(x).GetValue()
, obviously f(x)
is a prvalue since a temporary object is returned, and then the signature value_type&& GetValue() &&
is decided to be called after the overload resolution, and thus you get an xvalue (which is a special kind of rvalue) referenced to the underlying value of the returned StatusOr
.
So we should take care of the lifetime in this case, the rvalue will become invalid after the computation of the current statement. We can extend it via a call to a move ctor (T(T&&)
). This is not a defect of this type, but it is an idiom of C++, we should always care about lifetime of references to avoid dangling references in all standards of C++ (98/03/11/14/17/20 ...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, the classic usage of StatusOr<T>
we expected is:
// 1
StatusOr<T> someProcess(...); // as a return type
// 2
T val;
...
return val; // return val typed T to StatusOr<T>
// 3
return {Status::someErrCode, "msg..."}; // return an error status
// 4
auto res = someProcess(...); // get result typed StatusOr<T>
if(!res) return res; // forward it if it is an error status
doSomething(*res); // process the value inside the StatusOr
Use StatusOr
as an rvalue is totally (well) supported, but everyone should be aware of what they are doing, otherwise just follow the classic usage is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i want to describe 11
is greater than 10
, GetValue()
will return error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetValue()
will trigger a fatal error.
It is just like you call f
as Status f(int *output)
, but do not check the returned Status
and directly dereference the output
. It is not allowed, you must check status before dereference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that guarded by CHECK(*this)
, it can cause a fatal error.
I wonder if we can generate a compile time error instead of runtime fatal in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirm its failure as:
WARNING: Logging before InitGoogleLogging() is written to STDERR
F20220812 15:30:19.518088 1124890 status.h:202] Check failed: *this
*** Check failure stack trace: ***
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tisonkun, thanks for your review.
I wonder if we can generate a compile time error instead of runtime fatal in this case.
Good idea. We consider an example for this:
auto res = someProcess(...); // get result typed StatusOr<T>
// we cannot deref `res` here
if(!res) {
processError(res.GetCode(), res.Msg()); /* or */ return res; // we cannot deref `res` here
}
doSomething(*res); // we can deref it
In this example, only the doSomething
part can actually dereference the res
.
We cannot dereference it both before if(!res)
and inside if(!res)
.
So only in the execution path with the condition res.IsOk() == true
being satisfied, we can dereference res
,
It is hardly to express in compile time, since it is related to the control flow.
Like a raw/smart pointer, we should check it does point to a value (not null) before dereference it.
I have implemented a checker in Clang Static Analyzer (included in clang-tidy) for OneFlow that checks whether a type like StatusOr
is dereferenced after checking IsOk()
using symbolic execution, and I think it can be done in this way in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checks whether a type like StatusOr is dereferenced after checking IsOk() using symbolic execution
Sounds good to me.
Hi everyone, feel free to review and ask questions. If there is no further discussion, I will merge it. 🚀 |
Please wait a minute, i still need some time to put my thoughts |
To be honest, this is the first time to see this usage. I study your PR and https://abseil.io/docs/cpp/guides/status, i find it truly satisfies my requirement that i want to return status or value if ok when calling function, a bit like multi return value, such as
before you PR, our usage is like the following:
I think you may want to refactor these usages above, right? i am not a c++ language expert, i don't see this usage frequently, i think there are a few people accept quickly, so i am not sure we should merge this. Maybe i am lazy, and not want to learn new knowledge. |
Hi @ShooterIT, your understanding is generally correct. I'm still holding on to the hope that we'll be able to merge this PR, which I state below on various fronts. First, the original error handling method has several flaws:
Second, Lastly, I have no current plans to replace any part of the current project that use |
If using exception to report error instead of status, we don't need Status and this StatusOr at all. What is the better way to report error in C++? error code or exception? Here are same posts to discuss this question: |
Hi @wy-ei , a large part of C++ projects do not use exception as the mainly error handling method (but use it while some fatal or very accidental error occurred) since exception is relatively expensive in bad path (or say, failure path, maybe x10 - x20 slower than conditional jumping, but zero-cost on happy path)[1-2], like all Google products, LLVM projects, etc. So I don't think it's a good idea to use exceptions on some execution path that is not a very low probability error or is user controllable. Another defect is that C++ does not have checked exception, this makes exception flows not easy to maintain. In addition, it is not easy to ensure the exception safety of the program, which requires the programmer to have a deep background knowledge of C++. So I don't have a strong desire to use exceptions. Some PL like Rust make heavy use of sum types to express potentially error-prone results, and uses a functional approach to provide higher-order composition, which I think is easier to use than exceptions. [1] https://pspdfkit.com/blog/2020/performance-overhead-of-exceptions-in-cpp |
Throwing exception is more than 100 times slower than return error code, If the probability of throwing exception is less than 1/100, the cost of throwing exception can be ignored. I run the benchmark in [1] on my machine, here is the result:
In this benchmark, the exception was thrown every two function calls, but in real situation, exception unlikely be thrown this often. I change the benchmark code to make the exception be thrown every 200 function calls. (change the variable
Since throwing exception is a small part of the code, in real situation, I think the cost of exception is not a big problem.
If user input cause error, then error may occur with high probability, use exception would be inefficient. In this situation, some pre check code can be added to return error fast, and comparing with disk and network read and write, the cost of exception is not the main part.
C++ do not force user catch the exception, but if you miss an exception the process will crash. We can ignore an error status, and something weird happened, but we didn't known the reason. If we keep using RAII, the exception safety can almost be ensure. I mentioned exception here because I was think is there another way to deal with error instead Status. I think both exception and status are ok. But using exception in kvrocks will break the consistency of the function design (return Status and pass result as a pointer in last argument), I think this will be a big change, and I except this won't happed in the near future. I was off topic for a long time, Back to Status and StatusOr. In kvrocks, Status class is a large object which include a std::string in it. The Status class implemented in leveldb only take 8 bytes (only one member which is const char *state_). If status is ok, state_ is nullptr, if status is not ok, the state_ store the error code and error message. If we change the Status implementation in kvrocks to the leveldb way, the size of Status is not a problem. Another problem of Status for me is it occupy the place of the return value, we must pass a pointer to take the return value. Golang solve this problem by returning multi value. A sample and easily understand way is return a tuple or another Class will contain return value T and status. For StatusOr I have one question. StatusOr use a char[] to store T or Status, I don't known why, why don't use union? C++23 std::expected [2] is used to contains expected value or an error, does this a more generic way, Can we implement same thing like this. We can combine Status and the expected class do the things StatusOr does. [1] https://pspdfkit.com/blog/2020/performance-overhead-of-exceptions-in-cpp |
Hi @wy-ei , You can check Oneflow-Inc/oneflow/pull/6820 for more details. But obviously, the implementation of
The same problem is faced with the union implementation, you also need to evaluate some conditions before calling the ctor/dtor, so I don't see a difference. And you also need to manual call dtor and carefully handle lifetime in union for non-POD types.
I think it's only a little different from
This is actually a huge flaw in golang: golang has no sum type. So in golang, both the two values (the actual value and an error) should be returned instead of expressing an "either .. or .." relationship. Of course, interface and generics can achieve this, but it is obviously far more troublesome than sum type. |
To be honest, we didn't think carefully about this case before, because AFAIK the bottleneck
I guess the reason why no one mentioned this issue: call the chain of Kvrocks is simple engouh: |
I wonder why kvrocks' When I gothrough the code of project, I found that status code is widely used, introducing exception might be ok or not (some says it can help trace the error stack: apache/doris#8406 , and some doesn't like it because of performance lost, like scylladb/seastar#73 . But introducing exception will change the code style greatly, so I don't thinks it's a good idea to change to exception. Using |
Yes, I also think |
Agree with use
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good.
One comment is inline at https://github.com/apache/incubator-kvrocks/pull/768/files#r944171514.
Another comment is that where we use StatusOr
outside status.h
or you plan to do so? Otherwise, we trade off new code snippets for none or little usage which seems redundant.
UPDATE: I see this comment now.
We maximize the use of move semantics in C++ to eliminate redundant copies and optimize the storage, so that developers do not need to worry too much about its performance when using it.
Merging... |
In kvrocks we use a style like
Status process(input..., T* output)
to handle errors. It works well, but in C++ we can do better via constructing type-safe union types.In this PR, we propose a new class template
StatusOr<T>
, which contains EITHER a value typedT
OR an error status. Here is a simple example:We maximize the use of move semantics in C++ to eliminate redundant copies and optimize the storage, so that developers do not need to worry too much about its performance when using it.