-
Notifications
You must be signed in to change notification settings - Fork 173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return sets for enumerated property value data using CPT data #1608
Return sets for enumerated property value data using CPT data #1608
Conversation
…quiring Into<u32> on TrieValue
…ralCategory and Script to use CPT::get_range()
…umerated properties
ebec909
to
32fdc50
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
@sffc Can you take an initial look to make sure that I'm on the right track here? A few specifics:
cc @iainireland |
Co-authored-by: Shane F. Carr <[email protected]>
moving @nordzilla from reviewer list to cc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks generally good to me. I think we will need some extensions to handle GeneralCategory, but it's fine to land this and do those in a separate PR.
@@ -110,196 +110,3 @@ impl IterableDynProvider<UnicodePropertyV1Marker> for EnumeratedPropertyUnicodeS | |||
Ok(Box::new(core::iter::once(ResourceOptions::default()))) | |||
} | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm missing something, it looks like these tests are going away entirely. Is there any way to keep them around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved these tests over to the icu_properties
component, which has the "provider-aware" APIs that give you the data payload.
These tests will test the data loading side of things, both by getting the data from the provider and by testing the logical correctness of that data. So I suppose that you could make a case either way to put them here (provider_uprops
) or there (icu_properties
).
But when using the APIs, and when using the data provider in testdata
, the same tests can behave identically but can be written with less code. So that's why I moved the tests. These tests ended up at icu_properties::sets
.
//! let data_struct = payload.get(); | ||
//! let line_sep = &data_struct.inv_list; | ||
//! let gc = &data_struct.code_point_trie; | ||
//! let line_sep = gc.get_set_for_value(GeneralCategory::LineSeparator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this belongs in this patch or somewhere else, but the API we expose for this should take GeneralCategoryGroup
, not GeneralCategory
, to make it possible for people to ask for gc=Letter
and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method get_set_for_value()
is implemented on a CodePointTrie, so it exists for all enumerated properties.
The special method that exists for the GeneralCategoryGroup
case is over in icu_properties::sets
called get_for_general_category_group()
.
/// assert_eq!(start..=end, sip_range); | ||
/// | ||
/// assert!(sip_range_iter.next().is_none()); | ||
pub fn get_ranges_for_value(&self, value: T) -> impl Iterator<Item = RangeInclusive<u32>> + '_ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be easier to implement gc=Letter
if this takes a closure (to use in the .filter()
call) instead of a value. The simple cases would use |range| range.value == value
like here, but you could also, for example, check to see if range.value
is included in a GeneralCategoryGroup.
I'm okay landing this and adding the extended version in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's #1532. GeneralCategoryGroup
here is the second place we've seen where it would come in handy. The first was for Script_Extensions.
I believe the deep dive discussion answered my question originally in the issue on the impacts to FFI, code size, etc., but the discussion also surfaced a desire to add benchmarks before determining whether we feel it worth beginning the work for #1532.
+1 on making a better API for general category now that we are migrating to the new data model. See #1239 (comment) |
Closes #1577