Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: strconv: Add Parse[T] #57975

Closed
icholy opened this issue Jan 24, 2023 · 25 comments
Closed

proposal: strconv: Add Parse[T] #57975

icholy opened this issue Jan 24, 2023 · 25 comments

Comments

@icholy
Copy link

icholy commented Jan 24, 2023

I'd like to propose a generic interface to the parsing functionality provided by strconv.

type Types interface {
	string | bool | int | int8 | int16 | int32 | int64 |
		uint | uint8 | uint16 | uint32 | uint64 | uintptr |
		float32 | float64 | complex64 | complex128
}

func Parse[T Types](s string) (T, error)

This would mainly be used from other generic functions. For example:

func QueryValue[T strconv.Types](q url.Values, key string) (T, error) {
	v := q.Get(key)
	if v == "" {
		var z T
		return z, fmt.Errorf("query value not found: %q", key)
	}
	return strconv.Parse[T](v)
}
@gopherbot gopherbot added this to the Proposal milestone Jan 24, 2023
@seankhliao
Copy link
Member

Wouldn't this have to use a big type switch internally?

How often does this come up that it needs to be in the standard library?

@icholy
Copy link
Author

icholy commented Jan 24, 2023

Wouldn't this have to use a big type switch internally?

Yes.

How often does this come up that it needs to be in the standard library?

I've needed it about 3 times now. Writing that big type switch is very tedious.

@DeedleFake
Copy link

I don't think a single Parse[T Types]() would work because there are varying options to set for each type. However, I'd really like variants to the existing Parse*() functions that does the conversion internally. My issue with them is generally that because they return two values, I wind up having to pollute the local namespace with an extra variable for each parsing unless I happen to want the one it happens to return. For example,

v64, err := strconv.ParseInt(str, 10, 0)
// Check error.
v := int(v64) // Can't do the conversion inline with the call above.

That's quite annoying, but it's relatively minor. If generics get default types or something, it would be nice to see those functions made generic so that I could just do v, err := strconv.ParseInt[int](str, 10, 0), or even var v int; v, err := strconv.ParseInt(str, 10, 0) with return-based inference.

@dsnet
Copy link
Member

dsnet commented Jan 24, 2023

\cc @josharian. Seems similar to his atox package.

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jan 24, 2023
@apparentlymart
Copy link

apparentlymart commented Jan 25, 2023

I tried a different implementation of Parse in terms of fmt.Sscanf here: https://go.dev/play/p/OLs0uTqBhCi

The most relevant part:

func Parse[T Parseable](str string) (T, error) {
	var result T
	_, err := fmt.Sscanf(str, "%v", &result)
	return result, err
}

I don't think it would be possible for this specific implementation to live in strconv because IIRC fmt relies on strconv, but given the existing division of labor between strconv and fmt -- the former deals with specific types and all of the finicky special options for each, while the latter "just does what you mean" -- perhaps this utility would be more at home in fmt anyway.

It feels perhaps like what fmt.Scan might've been if fmt had been designed for a generics-capable language. (Although that function supports more than one argument.)

With all of that said, I'm also not feeling super convinced about this being in stdlib. This fmt-based implementation is relatively short to copy-paste into a package that needs it, and when I want something more involved than this I tend to want more control over exactly how each type would be handled anyway and so I probably wouldn't reach for an all-purpose stdlib function like this one. And the aforementioned josharian/atox seems to already have the slightly longer switch-statement-based version.

@icholy
Copy link
Author

icholy commented Jan 25, 2023

I had no idea you could use %v in the scan functions.

@earthboundkid
Copy link
Contributor

I am against “useless uses of generics.” The signature should be Parse(s string, ptr any) error instead. The calling code will be just as long either way because you need to add [type] with the faux-generic version.

@apparentlymart
Copy link

Hmm indeed and further to @carlmjohnson's point, if a proposal like #57644 were accepted then it could become possible to use a sealed type set for an interface value.

If the implementation is going to immediately wrap the given value in any and use dynamic dispatch for it anyway, perhaps better for the signature to be honest about that.

With that said, I can see the attraction of a form that returns a value rather than writing through a pointer. It tends to lead to (subjectively) more "normal-looking" code. But for a case like this I'm not sure it is worth it, especially since "write through a typed pointer" is already the typical API style for unmarshal-like functions in Go.

@icholy
Copy link
Author

icholy commented Jan 25, 2023

@apparentlymart The assumption is that the overhead will go away once we have something like #45380

@earthboundkid
Copy link
Contributor

If #45380 existed, this would not be a "useless use of generics", but for now at least, it doesn't and best practices documents specifically call out not using generics for deserialization. Also, maybe I'm idiosyncratic, but I try to arrange my generic functions so that the type inference means I rarely if ever need to explicitly use a type specifier. That means I prefer Parse(s, &x) over x = Parse[int](s).

All that said, I too have written a function like this before, so it's clearly something users want, whether we should or not. :-)

@icholy
Copy link
Author

icholy commented Jan 25, 2023

best practices documents specifically call out not using generics for deserialization

Link? (not disagreeing, just curious).

@earthboundkid
Copy link
Contributor

https://go.dev/blog/when-generics

If some operation has to support even types that don’t have methods (so that interface types don’t help), and if the operation is different for each type (so that type parameters aren’t appropriate), use reflection.

An example of this is the encoding/json package. We don’t want to require that every type that we encode have a MarshalJSON method, so we can’t use interface types. But encoding an interface type is nothing like encoding a struct type, so we shouldn’t use type parameters. Instead, the package uses reflection. The code is not simple, but it works.

@adonovan
Copy link
Member

Any time I've ever needed a function like Parse[T], it's been in a problem domain where the set of literals and types doesn't actually match that of Go, so I've written out the switch myself. Some of the cases delegate to Go primitives, others are bespoke. It's not difficult to implement it correctly and efficiently yourself.

I don't think the need for this exact function is general enough to warrant inclusion in the standard library.

@icholy
Copy link
Author

icholy commented Feb 22, 2023

Given the discussion, I think I'd like to alter the proposed function to something like this:

func Parse(s string, v any) error

This would be functionally equivalent to:

func Parse(s string, v any)  error {
	_, err := fmt.Sscanf(str, "%v", v)
	return err
}

@dsnet
Copy link
Member

dsnet commented Feb 22, 2023

strconv currently lacks a dependency on reflect, and I don't think we should indirectly add one through a direct dependency on fmt.

@DeedleFake
Copy link

fmt currently depends on strconv, so simply adding a dependency on fmt without refactoring a whole bunch of stuff into a third package wouldn't work anyways.

@rsc
Copy link
Contributor

rsc commented Mar 15, 2023

This really seems like an unnecessary use of generics. Unifying parsing of quoted strings and floating point numbers into a single function makes no sense.

@tdakkota
Copy link

For numeric types, such function would be useful. It may help to avoid some overflow/conversion bugs like this

// Example from https://codeql.github.com/codeql-query-help/go/go-incorrect-integer-conversion/#id1
func overflowBug(wanted string) int32 {
	parsed, err := strconv.ParseInt(wanted, 10, 64)
	if err != nil {
		panic(err)
	}
	return int32(parsed) 
}

func useParse(wanted string) int32 {
	parsed, err := strconv.Parse[int32](wanted, 10)
	if err != nil {
		panic(err)
	}
	return parsed
}

@earthboundkid
Copy link
Contributor

I think a convenience function like ParseIntOf(string, any) error would be fine. Equivalently it could be slightly more type safe as ParseIntOf[N int|int64|int32|int16|int8|](string, *N) error. Mixing parsing ints, floats, quotes, etc. into one function feels like too much though.

@josharian
Copy link
Contributor

Having an int-generic strconv might also help with 128 bit ints: #9455 (comment)

@josharian
Copy link
Contributor

This really seems like an unnecessary use of generics.

From a pure API perspective, yes. But as I wrote in the atox readme: "It is mainly helpful when you are writing other generic code that needs to parse number out of strings." Generics can be viral in this way, and using them with the entirely non-generic standard library can cause a frustrating impedance mismatch, which is how atox started: I wanted to write that repetitive, uninteresting, error-prone reflect switch wrapper code only once.

Unifying parsing of quoted strings and floating point numbers into a single function makes no sense.

Fair. I didn't put too much thought into the atox API, I just solved the problems I had. Again, though, I suspect you'll end up with people writing repetitive reflect-based wrappers around it anyway, which is the pain that this is trying to prevent.

@leaxoy
Copy link

leaxoy commented Mar 18, 2023

Rust support trait based polymorphism, for example: std::convert::Into, it support a type implements a trait multiply times, the difference each time is the type parameter.

It is regrettable, in go, no indication that the core team is interested in this, although it's useful in many places and improves code readability.

@rsc
Copy link
Contributor

rsc commented Mar 29, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Mar 29, 2023
@rsc
Copy link
Contributor

rsc commented Apr 12, 2023

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Decline in Proposals Apr 12, 2023
@rsc
Copy link
Contributor

rsc commented Apr 19, 2023

No change in consensus, so declined.
— rsc for the proposal review group

@rsc rsc moved this from Likely Decline to Declined in Proposals Apr 19, 2023
@rsc rsc closed this as completed Apr 19, 2023
@golang golang locked and limited conversation to collaborators Apr 18, 2024
@rsc rsc removed this from Proposals Apr 18, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests