Skip to content

Conversation

@vasil-pashov
Copy link
Collaborator

@vasil-pashov vasil-pashov commented Jul 24, 2023

Reference Issues/PRs

Resolves #639
Relates to: #540 #224

What does this implement/fix? Explain your changes.

Change the C++ to support a special undefined type that can be changed into any other type in a subsequent data key.

Any other comments?

Only the C++ layers is changed, it might need additional python layer integration.

@vasil-pashov vasil-pashov changed the title Dev/vasil.pashov/add none type Add none type Jul 24, 2023
Copy link
Collaborator

@mehertz mehertz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Don't leave blank please!

What is the NoneType?

Copy link
Collaborator

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on it, @vasil-pashov.

Here is a first review with a few questions and suggestions.

@mehertz mehertz dismissed their stale review July 27, 2023 09:01

PR description added

@vasil-pashov vasil-pashov force-pushed the dev/vasil.pashov/add-none-type branch from bfd4cf4 to 1157c3a Compare July 27, 2023 09:27
@vasil-pashov vasil-pashov force-pushed the dev/vasil.pashov/add-none-type branch from 47c2e05 to a138a55 Compare August 9, 2023 15:36
@vasil-pashov vasil-pashov force-pushed the dev/vasil.pashov/add-none-type branch 4 times, most recently from 1cd406b to 17666bd Compare August 10, 2023 15:04
@vasil-pashov vasil-pashov force-pushed the dev/vasil.pashov/add-none-type branch from 17666bd to a4f6658 Compare August 10, 2023 15:05
@vasil-pashov
Copy link
Collaborator Author

This is a list of functions which assume that if something is not string it's number of vice versa:

VariantData binary_membership(const ColumnWithStrings& column_with_strings, ValueSet& value_set, Func&& func)
VariantData binary_comparator(const Value& val, const ColumnWithStrings& column_with_strings, Func&& func)
VariantData binary_comparator(const ColumnWithStrings& left, const ColumnWithStrings& right, Func&& func)
VariantData binary_comparator(const ColumnWithStrings& column_with_strings, const Value& val, Func&& func)
SegmentInMemoryImpl::visit_field(Callable &&c)
SegmentInMemoryImpl::operator==(const SegmentInMemoryImpl& left, const SegmentInMemoryImpl& right)
FrameSliceMap::FrameSliceMap(const std::shared_ptr<PipelineContext>& context, bool dynamic_schema)
std::shared_ptr<SegmentInMemoryImpl> filter(const util::BitSet& filter_bitset,
                                                       bool filter_down_stringpool=false,
                                                       bool validate=false) const {
Value(const Value& other)
void SumAggregatorData::aggregate(const std::optional<ColumnWithStrings>& input_column, const std::vector<size_t>& groups, size_t unique_values)
void MinMaxAggregatorData::aggregate(const ColumnWithStrings& input_column)
void MaxOrMinAggregatorData::aggregate(const std::optional<ColumnWithStrings>& input_column, const std::vector<size_t>& groups, size_t unique_values)
VariantData binary_membership(const ColumnWithStrings& column_with_strings, ValueSet& value_set, Func&& func) {
VariantData binary_comparator(const Value& val, const ColumnWithStrings& column_with_strings, Func&& func) {
VariantData binary_comparator(const ColumnWithStrings& left, const ColumnWithStrings& right, Func&& func) {
VariantData binary_comparator(const ColumnWithStrings& column_with_strings, const Value& val, Func&& func) {
inline StreamId stream_id_from_segment(const SegmentInMemory &seg, ssize_t row) {
folly::Future<arcticdb::entity::VariantKey> write_frame_data(const TestDataFrame &data_frame, WriterType &writer) {

they all have contain if/else case similar to:

if(is_sequence_type(dt)) {
...
} else {
...
}

or

if(is_numeric_type(dt)) {
...
} else {
...
}

Probably for some of them it's fine to leave it as is, but we might have problems down the road when none type and other types are added.

Copy link
Collaborator

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @vasil-pashov.

Here are a few comments and questions.

William and you mentioned other PRs in the threads of reviews. Do you intend to submit subsequent PRs for the support of other types? 🙂

Comment on lines +172 to +173
} else if constexpr (!is_empty_type(dt)) {
static_assert(!sizeof(dt), "Unknown data type");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a else clause and raise if it's reached?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of doing the same thing isn't it? We should do nothing in case of empty type is passed and if dt is not empty then we have unhandled type and the code should not compile at all. This is a shorthand for:

...
else if constexpr (is_empty_type(dt)) {
} else {
            static_assert(!sizeof(dt), "Unknown data type");
}

and it's better than throwing since if we have type which is not handled we won't even compile.

@alexowens90
Copy link
Collaborator

This is a list of functions which assume that if something is not string it's number of vice versa:

VariantData binary_membership(const ColumnWithStrings& column_with_strings, ValueSet& value_set, Func&& func)
VariantData binary_comparator(const Value& val, const ColumnWithStrings& column_with_strings, Func&& func)
VariantData binary_comparator(const ColumnWithStrings& left, const ColumnWithStrings& right, Func&& func)
VariantData binary_comparator(const ColumnWithStrings& column_with_strings, const Value& val, Func&& func)
SegmentInMemoryImpl::visit_field(Callable &&c)
SegmentInMemoryImpl::operator==(const SegmentInMemoryImpl& left, const SegmentInMemoryImpl& right)
FrameSliceMap::FrameSliceMap(const std::shared_ptr<PipelineContext>& context, bool dynamic_schema)
std::shared_ptr<SegmentInMemoryImpl> filter(const util::BitSet& filter_bitset,
                                                       bool filter_down_stringpool=false,
                                                       bool validate=false) const {
Value(const Value& other)
void SumAggregatorData::aggregate(const std::optional<ColumnWithStrings>& input_column, const std::vector<size_t>& groups, size_t unique_values)
void MinMaxAggregatorData::aggregate(const ColumnWithStrings& input_column)
void MaxOrMinAggregatorData::aggregate(const std::optional<ColumnWithStrings>& input_column, const std::vector<size_t>& groups, size_t unique_values)
VariantData binary_membership(const ColumnWithStrings& column_with_strings, ValueSet& value_set, Func&& func) {
VariantData binary_comparator(const Value& val, const ColumnWithStrings& column_with_strings, Func&& func) {
VariantData binary_comparator(const ColumnWithStrings& left, const ColumnWithStrings& right, Func&& func) {
VariantData binary_comparator(const ColumnWithStrings& column_with_strings, const Value& val, Func&& func) {
inline StreamId stream_id_from_segment(const SegmentInMemory &seg, ssize_t row) {
folly::Future<arcticdb::entity::VariantKey> write_frame_data(const TestDataFrame &data_frame, WriterType &writer) {

they all have contain if/else case similar to:

if(is_sequence_type(dt)) {
...
} else {
...
}

or

if(is_numeric_type(dt)) {
...
} else {
...
}

Probably for some of them it's fine to leave it as is, but we might have problems down the road when none type and other types are added.

Can you add 2 tickets to deal with these. One for everything QueryBuilder related:

VariantData binary_membership(const ColumnWithStrings& column_with_strings, ValueSet& value_set, Func&& func)
VariantData binary_comparator(const Value& val, const ColumnWithStrings& column_with_strings, Func&& func)
VariantData binary_comparator(const ColumnWithStrings& left, const ColumnWithStrings& right, Func&& func)
VariantData binary_comparator(const ColumnWithStrings& column_with_strings, const Value& val, Func&& func)
std::shared_ptr<SegmentInMemoryImpl> filter(const util::BitSet& filter_bitset,
                                                       bool filter_down_stringpool=false,
                                                       bool validate=false) const {
void SumAggregatorData::aggregate(const std::optional<ColumnWithStrings>& input_column, const std::vector<size_t>& groups, size_t unique_values)
void MinMaxAggregatorData::aggregate(const ColumnWithStrings& input_column)
void MaxOrMinAggregatorData::aggregate(const std::optional<ColumnWithStrings>& input_column, const std::vector<size_t>& groups, size_t unique_values)
VariantData binary_membership(const ColumnWithStrings& column_with_strings, ValueSet& value_set, Func&& func) {
VariantData binary_comparator(const Value& val, const ColumnWithStrings& column_with_strings, Func&& func) {
VariantData binary_comparator(const ColumnWithStrings& left, const ColumnWithStrings& right, Func&& func) {
VariantData binary_comparator(const ColumnWithStrings& column_with_strings, const Value& val, Func&& func) {

and one for everything else. I can knock all the query builder ones off pretty quickly, I don't know off the top of my head about the others.

@vasil-pashov vasil-pashov merged commit 579c4f2 into master Aug 16, 2023
@vasil-pashov vasil-pashov deleted the dev/vasil.pashov/add-none-type branch August 16, 2023 11:53
vasil-pashov added a commit that referenced this pull request Oct 18, 2023
<!--
Thanks for contributing a Pull Request to ArcticDB! Please ensure you
have taken a look at:
- ArcticDB's Code of Conduct:
https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
- ArcticDB's Contribution Licensing:
https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
-->

Resolves #639
Relates to: #540 #224
<!--
Example: Fixes #1234. See also #3456.
Please use keywords (e.g., Fixes) to create link to the issues or pull
requests
you resolved, so that they will automatically be closed when your pull
request
is merged.

See:
https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue
-->

Change the C++ to support a special undefined type that can be changed
into any other type in a subsequent data key.

Only the C++ layers is changed, it might need additional python layer
integration.

---------

Co-authored-by: willdealtry <[email protected]>
Co-authored-by: Vasil Pashov <[email protected]>
Co-authored-by: Julien Jerphanion <[email protected]>
grusev pushed a commit that referenced this pull request Nov 25, 2024
<!--
Thanks for contributing a Pull Request to ArcticDB! Please ensure you
have taken a look at:
- ArcticDB's Code of Conduct:
https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
- ArcticDB's Contribution Licensing:
https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
-->

#### Reference Issues/PRs
Resolves #639
Relates to: #540 #224 
<!--
Example: Fixes #1234. See also #3456.
Please use keywords (e.g., Fixes) to create link to the issues or pull
requests
you resolved, so that they will automatically be closed when your pull
request
is merged.

See:
https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue
-->


#### What does this implement/fix? Explain your changes.
Change the C++ to support a special undefined type that can be changed
into any other type in a subsequent data key.

#### Any other comments?
Only the C++ layers is changed, it might need additional python layer
integration.

---------

Co-authored-by: willdealtry <[email protected]>
Co-authored-by: Vasil Pashov <[email protected]>
Co-authored-by: Julien Jerphanion <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add C++ support for None type

7 participants