Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: New string conversion API #951

Open
wants to merge 46 commits into
base: start-8
Choose a base branch
from
Open

WIP: New string conversion API #951

wants to merge 46 commits into from

Conversation

jtv
Copy link
Owner

@jtv jtv commented Feb 16, 2025

This changes the string conversion API, but also introduces a mechanism for maintaining backward compatibility for existing string conversions, so as to minimise the trouble for users who implement their own.

It works like this: string conversion for a type is defined by its specialisation of pqxx::string_traits. This type contains a few conversion functions: to_buf(), into_buf(), from_string(). The signatures for these functions are changing, but in ways that we can reasonably "translate" both ways.

So these new generic functions outside of the traits types hide the difference. They expose only the new-style API, but they each can call either an old-style implementation or a new-style implementation (preferring the new-style one, of course). The use of function concepts for this avoids niggling incompatibilities due to optional extra parameters, or slight differences in types. If the call works, the API matches.

The new API reduces the use of raw pointers, in favour of std::span<char> buffers. It also introduces an optional std::source_location argument. And, in the near future, I hope to add some encapsulation of client encoding information so that we can parse some things that aren't safe to parse right now.

jtv added 10 commits February 16, 2025 20:56
Replaces a lot of pointers with `std::span`, adds `source_location`.

Keeping the new API backwards-compatible by checking at compile time
which API each of a traits struct's conversion functions implement.
These versions of the functions live _outside_ the traits types.  They
help hide the changes in the string conversion API in libpqxx 7.
@jtv
Copy link
Owner Author

jtv commented Feb 19, 2025

I'm working to improve memory safety and verifiability, and part of that is reducing the use of raw pointers. As it stands, the new conversion API still has one pointer in it — pqxx::string_traits<TYPE>::into_buf() returns a char *. I'd like to change that.

Question is: should the function return a view on the string it wrote into the buffer? Or should it return the span of remaining buffer space, keeping it convenient to append another value? It's not a completely trivial question because there's a terminating zero between the two. Returning the pointer is elegant in that it's the only piece of information that the caller wants and doesn't already have. I could also just return an index instead of a pointer, but I feel that would just complicate subsetting operations, and invite subtle bugs.

I may simply have to try each of the alternatives and compare them.

@jtv
Copy link
Owner Author

jtv commented Feb 20, 2025

I'm working to improve memory safety and verifiability, and part of that is reducing the use of raw pointers. As it stands, the new conversion API still has one pointer in it — pqxx::string_traits<TYPE>::into_buf() returns a char *. I'd like to change that.

Question is: should the function return a view on the string it wrote into the buffer? Or should it return the span of remaining buffer space, keeping it convenient to append another value? It's not a completely trivial question because there's a terminating zero between the two. Returning the pointer is elegant in that it's the only piece of information that the caller wants and doesn't already have. I could also just return an index instead of a pointer, but I feel that would just complicate subsetting operations, and invite subtle bugs.

I may simply have to try each of the alternatives and compare them.

Actually, now that I've looked at the uses... returning an index into the buffer does look like the solution that best fits the callers' needs! And that means it's a simple solution, which in turn probably makes it less error-prone.

@jtv jtv added the 8.0 label Feb 21, 2025
jtv added 19 commits March 2, 2025 02:24
The new string conversion API will include encoding information.  We
need that for array/composite parsing.

I was thinking to hide this away in a class, but...  it's starting to
sound like too much overhead, both for the hardware and for humans.
We've had this "encoding groups" system for years now, and haven't
seen much need to change it.

There are _some_ changes: I just added an `UNKNOWN` value to the enum
so we can deal at least somewhat gracefully with cases where we don't
have the information.  And I'm looking forward in the 8.0 release to
retiring a few of the encoding groups.  Perhaps there are ones we'll
want to add as well, but adding is cheap.

Compilers and tooling will warn about unhandled cases in a `switch`,
and that gives me more confidence that people won't be surprised by
unexpected new groups.  At least, nobody who cares about keeping their
code correct, enables compiler warnings, and pays attention.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant