Horribly inconsistent behavior between const/non-const reference in operator [] () #289
Comments
Neither version of
I thought about the semantics for quite a while as there is no similar I would be happy for ideas on how to improve this situation. |
Hello!
In that method (line 3616):
template<typename T>
const_reference operator[](T* key) const;
we could simply change this:
assert(m_value.object != nullptr);
assert(m_value.object->find(key) != m_value.object->end());
return m_value.object->find(key)->second;
to something like this:
assert(m_value.object != nullptr);
if (m_value.object->find(key) == m_value.object->end()) {
    throw std::domain_error("something");
}
return m_value.object->find(key)->second;
It works pretty well for me :-) |
That's exactly what nlohmann said he doesn't want to do
|
Well, I think this problem is not too hard, but it might require some API change. To see what the appropriate behavior is here, let's learn from how others do it. Since JSON originates from and is heavily used by high-level scripting languages, we consider Python's and JavaScript's handling of JSON here (these two languages even have JSON functionality built into the language, and it works closely with the built-in data structures). [JavaScript] Here is the behavior of JavaScript
[Python] Here is the behavior of Python
Now return to the C++ world. I think these two principles are worth following:
So from the above discussion, IMHO the best solution is to mimic Python's behavior, which is well tested, widely used and already familiar to many programmers. There's really no need to invent a new behavior. We can just define some exceptions like
So, to sum up, IMHO, I think we really should follow the behavior of Javascript or Python. I would just choose Python's. Or, we can implement both and provide a global config option to let the user choose which flavor to use. |
This is exactly why the test isn't there.
The assertions are removed when you build with NDEBUG, so there are no assertions when building for production. They're debugging aids. Therefore, ignore them. If the user of the library needs to know whether or not the desired field is there because it's reading from a const object constructed from an unknown source, then it is up to the user of the library to do the test. Adding this test makes the 99.99% slower so that the 0.01% can be exactly the same speed, but with the test in the library instead of outside the library. |
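The caller-side test described in this comment can be sketched as follows. This is only an illustration: it uses `std::map<std::string, int>` as a stand-in for a const JSON object, and the helper name `read_port` and its fallback parameter are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for a const JSON object built from an unknown source.
using Config = std::map<std::string, int>;

// Hypothetical helper: the caller performs the existence test itself,
// so the container's own access path does not need a check.
int read_port(const Config& cfg, int fallback) {
    const auto it = cfg.find("port");   // one lookup
    if (it == cfg.end()) {
        return fallback;                // key absent: the caller decides what to do
    }
    return it->second;                  // key present: safe to dereference
}
```

Calling `read_port(cfg, 80)` on a map without a `"port"` entry returns the fallback instead of touching a non-existent element.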
That is simply not possible. It would mean that you could not do this:
|
@gregmarr This is not what I mean. I mean 'the const part behavior' should match, of course. In our specific case, it means the behavior on reading. Removing the assertion with such a debug option is pointless, since what you get instead is undefined behavior; it's essentially the same. Failing with undefined behavior is even worse IMHO. Have you read through my post? I did not advocate adding tests at all. Where did you get that impression from? |
I confused your post with dka8's post. |
@gregmarr I propose to leave everything as is:
|
@kawing-chiu I am aware how other languages cope with JSON, and the motivation for this project was "What if JSON was part of Modern C++?". Where possible, I tried to make JSON feel like a part of the STL and basically combined the interfaces of
Option 1 makes no sense. There are so many Stack Overflow threads about people being confused that they cannot use
Option 2 makes more sense, but would always check for an element even if such a check is not needed. This, in my opinion, is not very C++ like.
Option 3 is documented as such. Just like dereferencing the past-the-end iterator or using an invalid vector index, the behavior is undefined. One could argue that calling
About assertions: I shall think about switching them off by default and only switching them on with a special preprocessor symbol. However, the current code only uses assertions in cases of otherwise undefined behavior. |
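For reference, the STL behavior that this comment compares against can be observed with `std::map` directly. The sketch below uses only the standard library; the function names are illustrative and not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

using Map = std::map<std::string, int>;

// Non-const operator[] default-constructs and inserts a missing key.
std::size_t size_after_subscript(Map m, const std::string& key) {
    m[key];              // inserts {key, 0} if key is absent
    return m.size();
}

// at() works on const maps and throws std::out_of_range for a missing key.
// Note: `m[key]` would not compile here, because std::map has no
// const overload of operator[].
bool at_throws(const Map& m, const std::string& key) {
    try {
        (void)m.at(key);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

So in the STL, reading a missing key through a const map is either a compile error (`operator[]`) or an exception (`at()`); it is never an assertion.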
There is no need for a special preprocessor symbol, assert comes with one as part of the standard, as I mentioned. I agree with leaving things as is. |
@gregmarr I agree. I think the main issue of @kawing-chiu is that they did not expect an assertion in this case. I do mention assertions in the README - I am not sure what I can do to avoid such confusions. |
@nlohmann OK, now I see why. Actually, I think Option 1 is the best of the three and it makes some sense. If we're to follow the STL's lead, then just follow it as closely as possible... Another possibility I can think of is to define a subclass of Either way, I think there should be a section dedicated to it in the README. |
I think providing a function in the flavor of There is currently a notes section in the README, reading
What would you like to add? |
@nlohmann Exceptions are not like checks. They are zero cost on the 'normal path'. And unlike assertions they can be caught and won't make your server crash. The problem is not at all whether we can turn assertions into undefined behavior by As an example: |
With
I meant that one would still need code like
if (m_object->find(key) == m_object->end())
    throw std::logic_error(std::string(key) + " not found");
Or did you mean to have a different assertion macro that throws, so that check would not be executed in production? |
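One way to read the "assertion macro that throws" idea is sketched below. The macro `SKETCH_CHECK` and the switch `SKETCH_NO_CHECKS` are hypothetical names invented for this illustration, and `std::map` stands in for the JSON object.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical macro, not part of any library: throws in checked builds,
// expands to nothing when SKETCH_NO_CHECKS is defined.
#ifndef SKETCH_NO_CHECKS
#define SKETCH_CHECK(cond, msg) \
    do { if (!(cond)) throw std::logic_error(msg); } while (false)
#else
#define SKETCH_CHECK(cond, msg) do { } while (false)
#endif

using Object = std::map<std::string, int>;

// Const lookup guarded by the macro. With checks disabled, a missing key
// dereferences end(), which is undefined behavior, just as with plain assert.
const int& get(const Object& obj, const std::string& key) {
    const auto it = obj.find(key);
    SKETCH_CHECK(it != obj.end(), key + " not found");
    return it->second;
}
```

In a checked build a missing key surfaces as a `std::logic_error` that the caller can catch; in an unchecked build the comparison is compiled out entirely.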
I don't understand the point. You already have it in your
So isn't it just the same in terms of overhead?? |
OK, so you meant this:
:) |
Ok, so you propose to replace the vanilla |
Well, not exactly. I mean we use the exception for both development and production, like But since there's
is not quite enough. 'There is a minefield'. But where are the mines? |
C++ developers are still likely to work with
At least the behavior of both operators is consistent as neither of them can be used for inserting new elements. |
I agree with @maksis . I think that is the exact reason why |
The assertions are now mentioned in the README and the documentation of the function. Furthermore, the release notes for 2.0.3 will also mention the existence of assertions in the project and how to disable them. |
I know I'm late to the discussion here but I wanted to add some thoughts about this. I understand that the idea is that since const
First of all, indexing into an array is fundamentally a very cheap operation, possibly a single processor instruction. Adding a bounds check to it makes the operation many times slower. By contrast, looking up an entry in a string map is already an expensive operation involving O(log n) string comparisons. Doing an extra
Second of all, the most common operation with input arrays is to iterate through them and access each element. With this usage pattern, it's easy to ensure that there is no out-of-bounds access. By contrast, with input maps, random access by key is more common than iteration. And with random access it's a lot more likely that out-of-bounds access is a real possibility that you have to deal with.
Third of all, everyone is already very used to the fact that indexing into arrays is unchecked, ever since the days of C. There is no corresponding expectation for maps.
I think that when people are writing code using json objects in C++, they might expect one of 2 things:
I really don't think that people's first thought would be that the behavior of
I don't agree with the concern that it violates "pay only for what you use". Like I mentioned earlier, the cost of the check is negligible compared to the existing cost of the indexing operation. It's a small price to pay for eliminating a source of undefined behavior which, in my opinion, an inexperienced user is very likely to run into.
Someone said that this check would make the 99.99% case slower for the sake of the 0.01% case. I strongly disagree with that analysis. How can it be that in 99.99% of cases, people are so sure about the source of their input that they're willing to invoke undefined behavior if they're wrong? In 99.99% of cases, the json that you're reading came from some external source, whether a config file or some other process. You really want to give that external source the power to cause a segfault in your application? I think in 99.99% of cases, people are going to want checked access. Or, if they're an inexperienced developer, they won't bother to do any checking thinking they don't need it, until the day that they're wrong and now they have a segfault to deal with.
In conclusion, what I want is for the const version of
While we're here, another possible behavior for const
In this case, since all the fields inside of "foo" are optional, we can make "foo" itself be optional. But probably the less surprising behavior for const |
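The cost argument made in this comment can be illustrated with a checked const accessor that performs a single lookup. This is a sketch under assumptions: `std::map` stands in for the JSON object, and `checked_get` is a hypothetical name, not a function of the library.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

using Object = std::map<std::string, int>;

// Hypothetical checked const accessor: find() does the O(log n) string
// comparisons once; the end() test afterwards is a single iterator compare.
const int& checked_get(const Object& obj, const std::string& key) {
    const auto it = obj.find(key);
    if (it == obj.end()) {
        throw std::out_of_range("key '" + key + "' not found");
    }
    return it->second;   // reuses the iterator, no second lookup
}
```

The only work added on top of the unchecked version is the comparison against `end()`, which is small next to the string comparisons that `find()` already performs.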
Actually, even for arrays it is only marginally slower even for cache-friendly sequential access on modern architectures (pipelining and all that), and is essentially free for not-cache-friendly random access. I have to agree that undefined behavior should not be introduced without really good reasons, and if it can be eliminated for extremely negligible cost, that should be made so. On that note, I'm a believer that
|
@eferreira The |
The value function returns a copy, right? The child function would return a const reference. I'd like a function that returns a reference to a child if it's there, otherwise it returns a reference to some dummy instance. Is it possible to achieve that with the value function? |
Right, it returns a copy. |
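The "reference to the child, or to a dummy instance" accessor asked about above could look roughly like this. It is a sketch on nested `std::map`s rather than on the library's types, and `child_or_empty` is a hypothetical name.

```cpp
#include <cassert>
#include <map>
#include <string>

using Fields = std::map<std::string, int>;        // inner object
using Document = std::map<std::string, Fields>;   // outer object

// Hypothetical helper: returns a reference to the child if it is present,
// otherwise a reference to a shared empty instance. No copy is made.
const Fields& child_or_empty(const Document& doc, const std::string& key) {
    static const Fields empty;   // lives for the whole program
    const auto it = doc.find(key);
    return it != doc.end() ? it->second : empty;
}
```

Because the fallback is a function-local static, the returned reference stays valid after the call, which a by-value default cannot offer.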
Things like
works nicely and as expected. However, if json is a const reference, the above code will fail by assertion. Why is it designed like so? Wouldn't it be better and more consistent to at least throw a domain_error instead of failing by assertion? It took me more than one hour to debug this issue when I passed the above json by const reference to another function to extract data...