-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Logic cleanups for basic_string
#3830
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
make the behavior in `operator=(const basic_string&::if (_Al != _Right_al)` normal
This comment was marked as outdated.
This comment was marked as outdated.
_Tidy_deallocate(); | ||
_Traits::copy(_Mypair._Myval2._Bx._Buf, _Right_ptr, _Right_size + 1); | ||
_Mypair._Myval2._Mysize = _Right_size; | ||
_Mypair._Myval2._Myres = _SMALL_STRING_CAPACITY; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think dropping _Memcpy_val_from
optimization (was in _Copy_assign_val_from_small
) is ok here.
Otherwise we have to deal with three cases:
right is in small mode(res) and in non-constexpr context (and !defined(_INSERT_STRING_ANNOTATION)
) -> can do _Memcpy_val_from
right is small(size) -> copy str without allocation
right is large(size) -> allocate and copy str
Hm... seems not hard to add the optimization back?
@@ -3165,21 +3149,24 @@ public: | |||
auto&& _Right_alproxy = _GET_PROXY_ALLOCATOR(_Alty, _Right_al); | |||
_Container_proxy_ptr<_Alty> _New_proxy(_Right_alproxy, _Leave_proxy_unbound{}); // throws | |||
|
|||
if (_Right._Mypair._Myval2._Large_string_engaged()) { | |||
const auto _New_size = _Right._Mypair._Myval2._Mysize; | |||
const auto _New_capacity = _Calculate_growth(_New_size, 0, _Right.max_size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part was broken. It is a misuse of _Large_string_engaged
, and can cause allocation smaller than BUF_SIZE
. (e.g. consider when large_string_engaged()
, but size() == 0
)
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
const _Elem* const _Right_ptr = _Right._Mypair._Myval2._Myptr(); | ||
if (_Entails_large_string(_Right_size)) { | ||
const auto _New_capacity = | ||
_Calculate_growth(_Right_size, _SMALL_STRING_CAPACITY, _Right.max_size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this fix is adopted, the second parameter of _Calculate_growth(x,y,z)
will be uniformly _SMALL_STRING_CAPACITY
and further rectification will be possible.
There is also a special use of _Calculate_growth(newsize)
in _Construct_from_iter
. According to impl logic it is performing capacity*=1.5 logic. What about replacing the grow logic here with _Reallocate_grow_by
or push_back
?
Line 2712 in 40640c6
const size_type _New_capacity = _Calculate_growth(_My_data._Mysize); |
stl/inc/xstring
Outdated
@@ -2242,10 +2242,10 @@ public: | |||
return _Result; | |||
} | |||
|
|||
_NODISCARD static _CONSTEXPR20 bool _Large_string_engaged(size_type _Strcap) noexcept { | |||
// decides whether string is in large mode; has no implication on largeness of string's size | |||
_STL_INTERNAL_CHECK(_Strcap >= _SMALL_STRING_CAPACITY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was this chec that were causing lots of test failures, as
_Myres
is(was) initialized to0
on construction.- In several places(like
_Construct
),_Large_string_engaged
is called early before assignment of_Myres
.
After making _Myres
initialized to _SMALL_STRING_CAPACITY
, I made some checks passed, however there were some errors in clang checks due to constexpr evaluation hit maximum step limit
.
Here I decide to remove this check to pass those clang tests. However, I'm uncertain wether this check is helpful. Does the check make sense? Should I add it back?
size_type _Myres = 0; // current storage reserved for string | ||
// invariant: _Myres >= _Mysize, and _Myres >= _SMALL_STRING_CAPACITY | ||
size_type _Mysize = 0; // current length of string (size) | ||
size_type _Myres = _SMALL_STRING_CAPACITY; // current storage reserved for string (capacity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am generally concerned about the magnitude of these changes; there are a lot of transformations happening here, and having them all in a single PR (without well-structured fine-grained commits) makes the validity of each transformation difficult to review. I think it's reviewable but I'll need some time. |
If you have time, that would be great! I started doing that, but I ran out of time today (and will need to work on a Patch Tuesday toolset update tomorrow before I can resume reviewing this PR). I pushed a conflict-free merge with You can see main...StephanTLavavej:STL:fine-grained for an idea of what commits would be easiest to review. When each commit is focused on a single transformation whose correctness can be verified at each location, or is focused on a single fix, then they can be reviewed sequentially much easier without the risk of missing anything. We usually discourage force-pushes for PRs that are under review, since GitHub doesn't make it easy to see what changed. In this case, because I've paused reviewing and the other maintainers are all busy, it's your call - you can either force-push this PR, or you can close this PR and open a new one. Either way, I recommend verifying that your commit restructuring isn't introducing unintentional changes - you can simply Thanks! 😻 Your changes look like a great refactoring, I just want to be really careful about reviewing it given how complicated |
Ok, I will do that. I'm going to close this pr and make a new one. |
(Updated)
string(const T*)
andstring(const T*, const Alloc&)
adjacent. They work as a whole to resolve tostring(const T*, const Alloc&={})
._SMALL_STRING_CAPACITY
, which equals_BUF_SIZE - 1
, and replace every_BUF_SIZE - 1
inbasic_string
with it._Entails_large_string(_Size/_Cap)
, which decides whether newly planned size/capacity requires large string mode._Large_string_engaged
logic, and fixes a relevant misuse inoperator=(const string&)
(which will break invariant, and should be_Entails_large_string(_Mysize)
instead)._LEAST_ALLOCATION_SIZE
for_Move_assign_from_buffer
and_Release_to_buffer
._BUF_SIZE
inbasic_string
._Allocate_for_capacity
, which does stl check and lifetime activation, and takes capacity as input. This will eliminate all(al.allocate(res+1) ... _Start_element_lifetimes)
pattern._Deallocate_for_capacity
counterpart.al.deallocate
, and only one wildal.allocate
(without lifetime startup) is left in the interaction part withbasic_stringbuf
.2 and 3 will make it easier to keep the following invariant:
capacity always== _SMALL_STRING_CAPACITY in small mode && capacity always> _SMALL_STRING_CAPACITY in large mode