Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement DataType::try_form(&str) #5994

Merged
merged 4 commits into from
Jul 3, 2024

Conversation

samuelcolvin
Copy link
Contributor

@samuelcolvin samuelcolvin commented Jul 3, 2024

Which issue does this PR close?

Closes #3821

Rationale for this change

I would like to use this functionality, currently it's private in datafusion.

What changes are included in this PR?

DataType::from_str is moved here from DataFusion.

Are there any user-facing changes?

DataType::from_str is now public.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jul 3, 2024

use crate::{DataType, ArrowError, TimeUnit, IntervalUnit, Field};

pub(crate) fn parse_data_type(val: &str) -> ArrowResult<DataType> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb we could make this public and return Result<DataType, String>. So we can use it in datafusion arrow_cast without having to consider the other members of the ArrowError enum.

Or we could leave this as is and in arrow_cast do

match arrow_error_from_parse_data_type {
    ArrowError:ParseError(e) => DatafusinoError:Plan(e),
    e => DatafusionError:ArrowError(e, None),
}

LMK what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we could leave this as is and in arrow_cast do

I think leaving the API surface area small (and hence not making this pub) is the better choice.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me @samuelcolvin ❤️

I got inspired to improve the DataType documentation as well based on this PR (to make the feature potentially more discoverable -- PR in #5997)


use crate::{DataType, ArrowError, TimeUnit, IntervalUnit, Field};

pub(crate) fn parse_data_type(val: &str) -> ArrowResult<DataType> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we could leave this as is and in arrow_cast do

I think leaving the API surface area small (and hence not making this pub) is the better choice.

@samuelcolvin
Copy link
Contributor Author

Great 👍

@alamb
Copy link
Contributor

alamb commented Jul 3, 2024

In terms of releases, I think this one will likely appear in arrow 52.2.0: #5998

#5905 was cut yesterday

@alamb alamb merged commit 1f0b000 into apache:master Jul 3, 2024
27 checks passed
@samuelcolvin
Copy link
Contributor Author

That's fine, thanks for the quick response.

If it becomes a problem, I can copy and paste datatype_parse.rs into our code base for a few weeks - it's pretty well self-contained.

@samuelcolvin samuelcolvin deleted the DataType_TryFrom_str branch July 3, 2024 15:43
@alamb
Copy link
Contributor

alamb commented Jul 3, 2024

FWIW I made apache/datafusion#11254 to use this function upstream in DataFusion ❤️

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement FromStr for DataType / Parse DataType description
2 participants