WIP: New string conversion API #951

Open
wants to merge 46 commits into
base: start-8
Commits
e7158cb
Start sketching out 8.0 string conversion API.
jtv Feb 16, 2025
a226d90
Introduce generic `to_buf()`/`into_buf()`.
jtv Feb 16, 2025
1a0ce30
Format.
jtv Feb 16, 2025
28228b1
Guard against bad types.
jtv Feb 16, 2025
493e662
Add `source_location` param to `from_string()`.
jtv Feb 16, 2025
dcbf61b
Use `span` for `esc_bin()`/`unesc_bin()`.
jtv Feb 16, 2025
4950dca
Simplify `separated_list()` using `if constexpr`.
jtv Feb 17, 2025
6966d53
Use `std::format()`.
jtv Feb 17, 2025
471eae3
Retire `concat()` and `cat2()`.
jtv Feb 17, 2025
fe213ee
Convert some funcs to the new API.
jtv Feb 17, 2025
fef27d2
Return `std::size_t` from `into_buf()`.
jtv Feb 23, 2025
9bde7a6
Update doc.
jtv Feb 23, 2025
249dcb8
Documentation.
jtv Feb 25, 2025
aa1f627
Use `apt-get` in scripts, not `apt`.
jtv Feb 25, 2025
4d69e19
Don't need debhelper.
jtv Feb 25, 2025
4d414df
Set nonteractive Debain frontend.
jtv Feb 25, 2025
679f0c5
Use UTC timezone as well.
jtv Feb 25, 2025
ce0fd9c
Don't install cmake.
jtv Feb 25, 2025
79c03f5
Typo.
jtv Feb 25, 2025
77509ab
Bunch of detailed `to_buf` tests.
jtv Feb 28, 2025
877d756
Use `std::format()`.
jtv Mar 1, 2025
0e49aa9
More `std::format()`.
jtv Mar 1, 2025
1265729
Test `to_buf()` for arrays, dates, and ranges.
jtv Mar 1, 2025
0890a79
Thoroughly test `to_buf()` & `into_buf()`.
jtv Mar 2, 2025
f436ff8
In CircleCI, run `apt-get ugprade`.
jtv Mar 2, 2025
659ec11
`-y`.
jtv Mar 2, 2025
5203283
Trying to work around CircleCI OS problem.
jtv Mar 2, 2025
1f6601c
Trying to work around CircleCI OS problem.
jtv Mar 2, 2025
0a855ae
What, no exim4?
jtv Mar 2, 2025
0d0947c
Ah, exim4-base.
jtv Mar 2, 2025
6f8ea83
Nope.
jtv Mar 2, 2025
4eb27e8
Retire comment.
jtv Mar 2, 2025
5165abb
Notes.
jtv Mar 2, 2025
44f5f94
Represent `source_location` as text.
jtv Mar 5, 2025
934a068
Try Debian `unstable`.
jtv Mar 5, 2025
122f1a1
`libtoolize` was missing in CI.
jtv Mar 5, 2025
024f796
Pass `std::source_location` in a few more places.
jtv Mar 5, 2025
a9f6848
Introduce `UNKNOWN` encoding group.
jtv Mar 7, 2025
6e40d3d
Move `encoding_group` into the public API.
jtv Mar 7, 2025
f8ffc76
Introduce `conversion_context`.
jtv Mar 7, 2025
8317c6f
More context. More encoding group.
jtv Mar 8, 2025
ea12673
Remove some unneeded specialisations.
jtv Mar 8, 2025
a27f9eb
A public member function helps.
jtv Mar 8, 2025
e653e42
Rename function, avoid name clash.
jtv Mar 8, 2025
2463572
Comment.
jtv Mar 8, 2025
239314b
Check for unknown encoding in one more place.
jtv Mar 8, 2025
12 changes: 6 additions & 6 deletions .circleci/config.yml
@@ -3,20 +3,20 @@ version: 2
jobs:
build:
docker:
- image: debian:unstable
- image: debian:testing
environment:
- PGHOST: "/tmp"
steps:
- checkout
- run:
name: Configure apt archives
command: apt update
command: apt-get update && apt-get -y upgrade
- run:
name: Install
command: apt install -y lsb-release python3 cmake postgresql libpq-dev
postgresql-server-dev-all build-essential autoconf dh-autoreconf
autoconf-archive automake cppcheck clang shellcheck
python3-virtualenv
command: DEBIAN_FRONTEND=noninteractive TZ=UTC apt-get install -y
lsb-release python3 postgresql libpq-dev postgresql-server-dev-all
build-essential autoconf autoconf-archive automake cppcheck clang
shellcheck python3-virtualenv libtool
- run:
name: Identify
command: lsb_release -a && c++ --version && clang++ --version
2 changes: 1 addition & 1 deletion config/Makefile.in
@@ -124,7 +124,7 @@ am__can_run_installinfo = \
esac
am__tagged_files = $(HEADERS) $(SOURCES) $(TAGS_FILES) $(LISP)
am__DIST_COMMON = $(srcdir)/Makefile.in compile config.guess \
config.sub depcomp install-sh ltmain.sh missing mkinstalldirs
config.sub install-sh ltmain.sh missing mkinstalldirs
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
ACLOCAL = @ACLOCAL@
AMTAR = @AMTAR@
1 change: 1 addition & 0 deletions include/CMakeLists.txt
@@ -24,6 +24,7 @@ install(
PATTERN connection
PATTERN cursor
PATTERN dbtransaction
PATTERN encoding_group
PATTERN errorhandler
PATTERN except
PATTERN field
3 changes: 1 addition & 2 deletions include/Makefile.am
@@ -7,6 +7,7 @@ nobase_include_HEADERS= pqxx/pqxx \
pqxx/connection pqxx/connection.hxx \
pqxx/cursor pqxx/cursor.hxx \
pqxx/dbtransaction pqxx/dbtransaction.hxx \
pqxx/encoding_group pqxx/encoding_group.hxx \
pqxx/errorhandler pqxx/errorhandler.hxx \
pqxx/except pqxx/except.hxx \
pqxx/field pqxx/field.hxx \
@@ -37,9 +38,7 @@ nobase_include_HEADERS= pqxx/pqxx \
pqxx/version pqxx/version.hxx \
pqxx/internal/array-composite.hxx \
pqxx/internal/callgate.hxx \
pqxx/internal/concat.hxx \
pqxx/internal/conversions.hxx \
pqxx/internal/encoding_group.hxx \
pqxx/internal/encodings.hxx \
pqxx/internal/header-pre.hxx \
pqxx/internal/header-post.hxx \
3 changes: 1 addition & 2 deletions include/Makefile.in
@@ -355,6 +355,7 @@ nobase_include_HEADERS = pqxx/pqxx \
pqxx/connection pqxx/connection.hxx \
pqxx/cursor pqxx/cursor.hxx \
pqxx/dbtransaction pqxx/dbtransaction.hxx \
pqxx/encoding_group pqxx/encoding_group.hxx \
pqxx/errorhandler pqxx/errorhandler.hxx \
pqxx/except pqxx/except.hxx \
pqxx/field pqxx/field.hxx \
@@ -385,9 +386,7 @@ nobase_include_HEADERS = pqxx/pqxx \
pqxx/version pqxx/version.hxx \
pqxx/internal/array-composite.hxx \
pqxx/internal/callgate.hxx \
pqxx/internal/concat.hxx \
pqxx/internal/conversions.hxx \
pqxx/internal/encoding_group.hxx \
pqxx/internal/encodings.hxx \
pqxx/internal/header-pre.hxx \
pqxx/internal/header-post.hxx \
102 changes: 55 additions & 47 deletions include/pqxx/array.hxx
@@ -17,15 +17,16 @@

#include <algorithm>
#include <cassert>
#include <format>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

#include "pqxx/connection.hxx"
#include "pqxx/encoding_group.hxx"
#include "pqxx/internal/array-composite.hxx"
#include "pqxx/internal/encoding_group.hxx"
#include "pqxx/internal/encodings.hxx"


@@ -66,7 +67,7 @@ public:
* `ELEMENT` type does not support null values.
*/
array(std::string_view data, connection const &cx, sl loc = sl::current()) :
array{data, pqxx::internal::enc_group(cx.encoding_id(loc), loc), loc}
array{data, cx.get_encoding_group(loc), loc}
{}

/// How many dimensions does this array have?
@@ -84,7 +85,6 @@ public:
return m_extents;
}

// TODO: How can we pass std::source_location here?
template<std::integral... INDEX> ELEMENT const &at(INDEX... index) const
{
static_assert(sizeof...(index) == DIMENSIONS);
@@ -169,14 +169,14 @@ private:
* walking through the entire array sequentially, and identifying all the
* character boundaries. The main parsing routine detects that one.
*/
void check_dims(std::string_view data, sl loc = sl::current())
void check_dims(std::string_view data, sl loc)
{
auto sz{std::size(data)};
if (sz < DIMENSIONS * 2)
throw conversion_error{
pqxx::internal::concat(
"Trying to parse a ", DIMENSIONS, "-dimensional array out of '",
data, "'."),
std::format(
"Trying to parse a {}-dimensional array out of '{}'.", DIMENSIONS,
data),
loc};

// Making some assumptions here:
@@ -193,15 +193,15 @@ private:
for (std::size_t i{0}; i < DIMENSIONS; ++i)
if (data[i] != '{')
throw conversion_error{
pqxx::internal::concat(
"Expecting ", DIMENSIONS, "-dimensional array, but found ", i,
"."),
std::format(
"Expecting {}-dimensional array, but found {}.", DIMENSIONS, i),
loc};
if (data[DIMENSIONS] == '{')
throw conversion_error{
pqxx::internal::concat(
"Tried to parse ", DIMENSIONS,
"-dimensional array from array data that has more dimensions."),
std::format(
"Tried to parse {}-dimensional array from array data that has more "
"dimensions.",
DIMENSIONS),
loc};
for (std::size_t i{0}; i < DIMENSIONS; ++i)
if (data[sz - 1 - i] != '}')
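
Reviewer note: a minimal sketch of the brace check that `check_dims()` performs in the hunk above, pulled out into a free function so it can be read on its own. The names `check_dims_sketch` and `has_dims` are hypothetical and not part of libpqxx. The sketch throws `std::invalid_argument` and builds its messages by plain concatenation instead of using `conversion_error` and `std::format()`; both substitutions are assumptions made only to keep the example standalone, and the message texts are illustrative.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical standalone version of the check; not part of libpqxx.
template<std::size_t DIMENSIONS>
void check_dims_sketch(std::string_view data)
{
  static_assert(DIMENSIONS > 0, "Need at least one dimension.");
  std::size_t const sz{data.size()};

  // An N-dimensional array needs at least N opening and N closing braces.
  if (sz < DIMENSIONS * 2)
    throw std::invalid_argument{
      "Too short for a " + std::to_string(DIMENSIONS) +
      "-dimensional array."};

  // The text must open with DIMENSIONS braces...
  for (std::size_t i{0}; i < DIMENSIONS; ++i)
    if (data[i] != '{')
      throw std::invalid_argument{
        "Expected " + std::to_string(DIMENSIONS) +
        " dimensions, but found " + std::to_string(i) + "."};

  // ...and not one more, which would mean a deeper array.
  if (data[DIMENSIONS] == '{')
    throw std::invalid_argument{
      "Array data has more dimensions than expected."};

  // It must also close with DIMENSIONS braces.
  for (std::size_t i{0}; i < DIMENSIONS; ++i)
    if (data[sz - 1 - i] != '}')
      throw std::invalid_argument{"Malformed array: missing closing brace."};
}

// Convenience wrapper (hypothetical): true if the text passes the check.
template<std::size_t DIMENSIONS> bool has_dims(std::string_view data)
{
  try
  {
    check_dims_sketch<DIMENSIONS>(data);
    return true;
  }
  catch (std::invalid_argument const &)
  {
    return false;
  }
}
```

With this sketch, `has_dims<2>("{{1,2},{3,4}}")` holds while `has_dims<1>` rejects the same text. Like the original, the check only looks at the leading and trailing braces and leaves everything in between to the main parser.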
@@ -213,11 +213,15 @@ private:
// Couldn't make this work through a call gate, thanks to the templating.
friend class ::pqxx::field;

array(std::string_view data, pqxx::internal::encoding_group enc, sl loc)
array(std::string_view data, encoding_group enc, sl loc) : m_ctx{enc, loc}
{
using group = pqxx::internal::encoding_group;
using group = encoding_group;
switch (enc)
{
case group::UNKNOWN:
throw usage_error{
"Tried to parse array without knowing its encoding.", loc};

case group::MONOBYTE: parse<group::MONOBYTE>(data, loc); break;
case group::BIG5: parse<group::BIG5>(data, loc); break;
case group::EUC_CN: parse<group::EUC_CN>(data, loc); break;
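
Reviewer note: the constructor above turns a runtime `encoding_group` value into a compile-time template argument, and the new `UNKNOWN` case fails early instead of guessing. A minimal sketch of that dispatch pattern, assuming a cut-down enum `group_sketch` and a toy glyph counter in place of the real `parse<ENC>()`; every name in it is hypothetical and none of it is libpqxx API.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string_view>

// Hypothetical, cut-down stand-in for pqxx::encoding_group.
enum class group_sketch
{
  UNKNOWN,
  MONOBYTE,
  UTF8
};

// Count glyphs, specialised at compile time for one encoding group.
template<group_sketch ENC> std::size_t count_glyphs_in(std::string_view text)
{
  if constexpr (ENC == group_sketch::MONOBYTE)
  {
    // One byte per glyph.
    return text.size();
  }
  else
  {
    static_assert(ENC == group_sketch::UTF8, "Unsupported group in sketch.");
    // Count every byte that is not a UTF-8 continuation byte (10xxxxxx).
    std::size_t n{0};
    for (char c : text)
      if ((static_cast<unsigned char>(c) & 0xC0u) != 0x80u)
        ++n;
    return n;
  }
}

// Runtime entry point: pick the specialisation, or refuse if unknown.
std::size_t count_glyphs(std::string_view text, group_sketch enc)
{
  switch (enc)
  {
  case group_sketch::UNKNOWN:
    throw std::logic_error{
      "Tried to count glyphs without knowing the encoding."};
  case group_sketch::MONOBYTE:
    return count_glyphs_in<group_sketch::MONOBYTE>(text);
  case group_sketch::UTF8:
    return count_glyphs_in<group_sketch::UTF8>(text);
  }
  throw std::logic_error{"Unrecognised encoding group."};
}
```

The `switch` runs once per call, so the per-byte loop inside each specialisation carries no encoding test of its own.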
@@ -262,10 +266,10 @@ private:
case '}': break;
default:
throw conversion_error{
pqxx::internal::concat(
"Unexpected character in array: ",
static_cast<unsigned>(static_cast<unsigned char>(data[here])),
" where separator or closing brace expected."),
std::format(
"Unexpected character in array: {} where separator or closing "
"brace expected.",
static_cast<unsigned>(static_cast<unsigned char>(data[here]))),
loc};
}
return here;
@@ -290,12 +294,12 @@ private:
return static_cast<std::size_t>(separators + 1);
}

template<pqxx::internal::encoding_group ENC>
void parse(std::string_view data, sl loc)
template<encoding_group ENC> void parse(std::string_view data, sl loc)
{
static_assert(DIMENSIONS > 0u, "Can't create a zero-dimensional array.");
conversion_context const c{m_ctx.enc, loc};
auto const sz{std::size(data)};
check_dims(data);
check_dims(data, loc);

m_elts.reserve(estimate_elements(data));

@@ -407,7 +411,7 @@ private:
std::string const buf{
pqxx::internal::parse_double_quoted_string<ENC>(
std::data(data), end, here, loc)};
m_elts.emplace_back(from_string<ELEMENT>(buf));
m_elts.emplace_back(from_string<ELEMENT>(buf, c));
}
break;
default: {
@@ -424,14 +428,14 @@ private:
m_elts.emplace_back(nullness<ELEMENT>::null());
else
throw unexpected_null{
pqxx::internal::concat(
"Array contains a null ", type_name<ELEMENT>,
". Consider making it an array of std::optional<",
type_name<ELEMENT>, "> instead."),
std::format(
"Array contains a null {}. Consider making it an array of "
"std::optional<{}> instead.",
type_name<ELEMENT>, type_name<ELEMENT>),
loc};
}
else
m_elts.emplace_back(from_string<ELEMENT>(field));
m_elts.emplace_back(from_string<ELEMENT>(field, c));
}
}
here = end;
@@ -471,9 +475,8 @@ private:
template<typename OUTER, typename... INDEX>
constexpr std::size_t add_index(OUTER outer, INDEX... indexes) const noexcept
{
sl loc{sl::current()};
std::size_t const first{
check_cast<std::size_t>(outer, "array index"sv, loc)};
check_cast<std::size_t>(outer, "array index"sv, m_ctx.loc)};
if constexpr (sizeof...(indexes) == 0)
{
return first;
@@ -488,24 +491,24 @@ private:
}
}

// TODO: How can we pass std::source_location here?
/// Check that indexes are within bounds.
/** @throw pqxx::range_error if not.
*/
template<typename OUTER, std::integral... INDEX>
constexpr void check_bounds(OUTER outer, INDEX... indexes) const
{
sl loc{sl::current()};
std::size_t const first{
check_cast<std::size_t>(outer, "array index"sv, loc)};
check_cast<std::size_t>(outer, "array index"sv, m_ctx.loc)};
static_assert(sizeof...(indexes) < DIMENSIONS);
// (Offset by 1 here because the outer dimension is not in there.)
constexpr auto dimension{DIMENSIONS - (sizeof...(indexes) + 1)};
static_assert(dimension < DIMENSIONS);
if (first >= m_extents[dimension])
throw range_error{pqxx::internal::concat(
"Array index for dimension ", dimension, " is out of bounds: ", first,
" >= ", m_extents[dimension])};
throw range_error{
std::format(
"Array index for dimension {} is out of bounds: {} >= {}.",
dimension, first, m_extents[dimension]),
m_ctx.loc};

// Now check the rest of the indexes, if any.
if constexpr (sizeof...(indexes) > 0)
@@ -527,6 +530,13 @@ private:
* multiply by that number.
*/
std::array<std::size_t, DIMENSIONS - 1> m_factors;

/// Conversion context representing the construction point.
/** It's not always possible to pass a context, e.g. in overloaded operators
* or functions that take parameter packs (at least not nicely). In those
* situations, we use the construction point.
*/
conversion_context m_ctx;
};
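
Reviewer note: the `m_ctx` comment above says that overloaded operators and parameter-pack functions cannot easily accept a context, so the array falls back on the point where it was constructed. A minimal sketch of that idea, assuming a plain struct `where_sketch` in place of the `sl` location type the PR passes around, and a toy container `checked_values` in place of `pqxx::array`. All names are hypothetical; the substitution only keeps the example free of libpqxx and of `sl::current()`.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a source location; the PR uses `sl` instead.
struct where_sketch
{
  char const *file;
  unsigned line;
};

// Toy container that remembers where it was constructed.
class checked_values
{
public:
  checked_values(std::vector<int> values, where_sketch ctx) :
          m_values(std::move(values)), m_ctx{ctx}
  {}

  // An overloaded operator cannot take an extra location argument, so an
  // error here reports the construction point instead of the call site.
  int operator[](std::size_t i) const
  {
    if (i >= m_values.size())
      throw std::out_of_range{
        "Index " + std::to_string(i) +
        " out of bounds (container created at " + m_ctx.file + ":" +
        std::to_string(m_ctx.line) + ")."};
    return m_values[i];
  }

private:
  std::vector<int> m_values;
  where_sketch m_ctx;
};

// Demonstration helper: the error text for index `i`, or "" if it is valid.
std::string failure_text(checked_values const &v, std::size_t i)
{
  try
  {
    (void)v[i];
  }
  catch (std::out_of_range const &e)
  {
    return e.what();
  }
  return "";
}
```

The trade-off is that the reported location names the place where the object was built, not the line that indexed it, which matches the "construction point" wording in the comment.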


@@ -576,8 +586,7 @@ public:
*/
[[deprecated("Use pqxx::array instead.")]]
explicit array_parser(
std::string_view input,
internal::encoding_group = internal::encoding_group::MONOBYTE);
std::string_view input, encoding_group = encoding_group::MONOBYTE);

/// Parse the next step in the array.
/** Returns what it found. If the juncture is @ref juncture::string_value,
@@ -607,30 +616,29 @@ private:
std::pair<juncture, std::string> (array_parser::*)(sl);

/// Pick the `implementation` for `enc`.
static implementation
specialize_for_encoding(pqxx::internal::encoding_group enc, sl loc);
static implementation specialize_for_encoding(encoding_group enc, sl loc);

/// Our implementation of `parse_array_step`, specialised for our encoding.
implementation m_impl;

/// Perform one step of array parsing.
template<pqxx::internal::encoding_group>
template<encoding_group>
std::pair<juncture, std::string> parse_array_step(sl loc);

template<pqxx::internal::encoding_group>
template<encoding_group>
std::string::size_type scan_double_quoted_string(sl loc) const;
template<pqxx::internal::encoding_group>
template<encoding_group>
std::string
parse_double_quoted_string(std::string::size_type end, sl loc) const;
template<pqxx::internal::encoding_group>
template<encoding_group>
std::string::size_type scan_unquoted_string(sl loc) const;
template<pqxx::internal::encoding_group>
template<encoding_group>
std::string_view
parse_unquoted_string(std::string::size_type end, sl loc) const;

template<pqxx::internal::encoding_group>
template<encoding_group>
std::string::size_type scan_glyph(std::string::size_type pos, sl loc) const;
template<pqxx::internal::encoding_group>
template<encoding_group>
std::string::size_type scan_glyph(
std::string::size_type pos, std::string::size_type end, sl loc) const;
};
3 changes: 0 additions & 3 deletions include/pqxx/blob.hxx
@@ -85,7 +85,6 @@ public:
*/
static constexpr std::size_t chunk_limit = 0x7fffffff;

// XXX: Can we build a generic version of this?
/// Read up to `size` bytes of the object into `buf`.
/** Uses a buffer that you provide, resizing it as needed. If it suits you,
* this lets you allocate the buffer once and then re-use it multiple times.
@@ -218,15 +217,13 @@ public:
*/
static oid from_file(dbtransaction &, zview path, oid, sl = sl::current());

// XXX: Can we build a generic version of this?
/// Convenience function: Read up to `max_size` bytes from blob with `id`.
/** You could easily do this yourself using the @ref open_r and @ref read
* functions, but it can save you a bit of code to do it this way.
*/
static void to_buf(
dbtransaction &, oid, bytes &, std::size_t max_size, sl = sl::current());

// XXX: Can we build a generic version of this?
/// Read part of the binary large object with `id`, and append it to `buf`.
/** Use this to break up a large read from one binary large object into one
* massive buffer. Just keep calling this function until it returns zero.