Encapsulate encoding information in string conversion #955
Comments
In my experience, the most popular pattern is to have the environment information accessed through an object named “context”. |
Thanks @tt4g, that sounds like a good idea. Perhaps the I guess then if we want to add more optional members to the context, we can just default-construct those. For my other question, about what the default value for the encoding group should be... I think we need an enum value for "unknown." That way, the few conversions that care about the encoding can just check the encoding and throw an error if the conversion is dangerous. Frankly, if the entire SJIS/GB/BIG5 families of encodings had been deprecated in favour of UTF-8 a decade ago, that would probably make my life a lot easier! |
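To make the "context plus unknown" idea concrete, here is a minimal sketch. Every name in it (`encoding_group`, `conversion_context`, `int_to_text`, `find_separator`) is hypothetical and chosen for illustration; none of this is libpqxx's actual API. It assumes the context is a plain struct whose members default-construct, and that an encoding-sensitive conversion throws when the encoding was never supplied.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical names throughout; this is not libpqxx's real API.
enum class encoding_group
{
  unknown,    // Default: the caller never said which encoding is in use.
  ascii_safe, // E.g. UTF-8: bytes of multibyte characters never look like ASCII.
};

// Environment information for conversions.  Further optional members could be
// added later, each with a default value, without breaking existing callers.
struct conversion_context
{
  encoding_group enc = encoding_group::unknown;
};

// An encoding-agnostic conversion: it accepts a context but never inspects it.
inline std::string int_to_text(int value, conversion_context const & = {})
{
  return std::to_string(value);
}

// An encoding-sensitive operation: searching for a separator byte is only
// safe once we know how multibyte characters are laid out.
inline std::size_t
find_separator(std::string_view text, conversion_context const &ctx = {})
{
  if (ctx.enc == encoding_group::unknown)
    throw std::invalid_argument{
      "This conversion needs to know the client encoding."};
  // A plain byte search is correct for the ascii_safe group only.  A fuller
  // implementation would dispatch on the encoding group here.
  return text.find(',');
}
```

With this shape, callers that never touch encoding-sensitive conversions pay nothing and pass nothing, while the few that do get an explicit error instead of silently mis-parsing multibyte text.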
I think the encoding gotten by EDIT: I didn't write why I think |
@tt4g does libpq support any encodings that libpqxx does not? (Bearing in mind of course that libpqxx really only cares about encoding groups — a custom abstraction that lets us treat encodings with the same encoding mechanism as identical for our purposes.) I'd like to avoid the call to obtain the client encoding where possible, since there's going to be a small performance cost to calling out to the C library. Very very few conversions will need the information, and we do so many conversions that I want them to be really fast. Caching could be an option but then we need to worry about invalidation. |
A list is available on https://www.postgresql.org/docs/17/multibyte.html I am not familiar with all encodings, but perhaps the following encoding support is missing:
Tip
libpqxx/include/pqxx/internal/encoding_group.hxx Lines 18 to 38 in 6af956b
It may also need to support behavior compatible with
|
@tt4g it looks to me like libpqxx currently supports all encodings in that list! Most of them fall under |
@jtv That's good. |
Oh and by the way @tt4g I'm using a "context" struct for the new string conversion API, like you suggested. |
@jtv Yeah. I saw it added in a new commit. Glad to see my opinion was adopted. |
For array conversions in particular, the string conversion API may need to know the client encoding group.
I'd prefer not to expose encoding groups, since they're not set in stone and we may later discover that we need more detailed information; and also, I'm not completely confident that they correspond exactly to something that the client would want to know. They're more of an internal data type. So, it would seem to make sense to encapsulate the information in a small class.
We then need to pass an instance of that class to some of the string conversions (only to_buf() and into_buf(), I think), which means a further extension of the conversion API.

Unresolved question: is there a way to pass this only when needed? It seems like a stupid question, but there's no such thing as a default encoding really. Perhaps there should be a way of expressing "don't care" as an encoding group, ideally at compile time.
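One possible shape for such a class, as a rough sketch only: the internal enum stays private, a default-constructed object means "don't care", and conversions take it as a defaulted parameter so most call sites never mention it. The names `encoding_info`, `detail::group`, `specified()` and `bool_to_text()` are invented for this illustration and do not exist in libpqxx.

```cpp
#include <string>

namespace detail
{
// Stand-in for the internal encoding-group enum.  Client code never names it.
enum class group
{
  unspecified,
  utf8,
};
} // namespace detail

// Small opaque wrapper.  What it stores can change later without affecting
// callers, because they can only obtain values through the named factories.
class encoding_info
{
public:
  // Default construction expresses "don't care".
  constexpr encoding_info() noexcept = default;

  // Hypothetical factory for one concrete encoding.
  static constexpr encoding_info utf8() noexcept
  {
    return encoding_info{detail::group::utf8};
  }

  // True if the caller supplied an actual encoding.
  constexpr bool specified() const noexcept
  {
    return m_group != detail::group::unspecified;
  }

private:
  constexpr explicit encoding_info(detail::group g) noexcept : m_group{g} {}

  detail::group m_group = detail::group::unspecified;
};

// "Don't care" is known at compile time.
static_assert(not encoding_info{}.specified());
static_assert(encoding_info::utf8().specified());

// A conversion that has no use for the encoding.  The defaulted parameter
// lets callers omit it entirely.
inline std::string bool_to_text(bool value, encoding_info = {})
{
  return value ? "true" : "false";
}
```

A defaulted parameter is only one answer to the "pass this only when needed" question. A separate overload set or a template parameter would be alternatives with different trade-offs.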