refactor: Use re2::StringPiece for RE2::FullMatch() #15259

Closed
kou wants to merge 5 commits into facebookincubator:main from kou:re2-full-match-absl-string-view

@kou
Contributor

@kou kou commented Oct 23, 2025

RE2::FullMatch() uses absl::string_view not std::string_view:

https://github.com/google/re2/blob/61c4644171ee6b480540bf9e569cba06d9090b4b/re2/re2.h#L411

absl::string_view may not be an alias of std::string_view. In that case, the following error is reported:

In file included from velox/functions/prestosql/registration/DateTimeFunctionsRegistration.cpp:18:
velox/functions/prestosql/DateTimeFunctions.h:1939:10: error: no matching function for call to 'FullMatch'
 1939 |     if (!RE2::FullMatch(
      |          ^~~~~~~~~~~~~~
/include/re2/re2.h:411:15: note: candidate function template not viable: no known conversion from 'std::string_view' (aka 'basic_string_view<char>') to 'absl::string_view' for 1st argument
  411 |   static bool FullMatch(absl::string_view text, const RE2& re, A&&... a) {
      |               ^         ~~~~~~~~~~~~~~~~~~~~~~

The old RE2 provided by CentOS Stream 9 doesn't accept absl::string_view.

Old RE2 uses re2::StringPiece for RE2::FullMatch(), and new RE2 provides re2::StringPiece as an alias of absl::string_view. So we can use re2::StringPiece with both old and new RE2.

We could drop support for old RE2 and always use absl::string_view, but we use re2::StringPiece for now. It seems that RE2 will eventually use std::string_view instead of absl::string_view. For example, google/re2@2a029e2 is a commit that migrates from absl::optional to std::optional.

We can revisit this after RE2 migrates to std::string_view.
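
The portability argument above can be sketched in isolation. This is a minimal illustration, not the code from this PR: `PieceLike`, `consume`, and `toPiece` are hypothetical stand-ins, not RE2 or Velox APIs. The only assumption carried over from the discussion is that the view type can be built from a pointer and a length, which holds for both the old re2::StringPiece and absl::string_view.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a view type that can only be built from a
// pointer and a length (it has no constructor taking std::string_view).
class PieceLike {
 public:
  constexpr PieceLike(const char* data, std::size_t size)
      : data_(data), size_(size) {}
  constexpr const char* data() const { return data_; }
  constexpr std::size_t size() const { return size_; }

 private:
  const char* data_;
  std::size_t size_;
};

// Hypothetical stand-in for an API that takes the view by value,
// in the way RE2::FullMatch() takes its text argument.
inline std::string consume(PieceLike text) {
  return std::string(text.data(), text.size());
}

// Builds the view from data() and size(), so it does not rely on any
// implicit conversion between view types.
template <typename Source>
PieceLike toPiece(const Source& source) {
  return PieceLike(source.data(), source.size());
}
```

With real RE2, the analogous expression would presumably be `re2::StringPiece(amountUnit.data(), amountUnit.size())`, which avoids depending on whether std::string_view converts to the library's view type.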

Related: GH-15124

@netlify

netlify bot commented Oct 23, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit fc49726
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/68ff98529d738b00088c5e33

@meta-cla meta-cla bot added the CLA Signed label Oct 23, 2025
@kou kou force-pushed the re2-full-match-absl-string-view branch from 3958f22 to fbfc654 October 23, 2025 05:46
@kou
Contributor Author

kou commented Oct 23, 2025

@abhinavmuk04 Could you take a look at this? This is a follow-up PR for your PR GH-15124.

@@ -1937,7 +1937,10 @@ struct ParseDurationFunction {
     re2::StringPiece valueStr;
     re2::StringPiece unit;
     if (!RE2::FullMatch(
-            std::string_view{amountUnit}, *durationRegex_, &valueStr, &unit)) {
+            absl::string_view(amountUnit.data(), amountUnit.size()),
Contributor Author

Oh, the old RE2 used by our CentOS Stream 9 based image uses re2::StringPiece, not absl::string_view/std::string_view...

Contributor Author

Let's use re2::StringPiece instead, for both old RE2 and new RE2.

@kou kou force-pushed the re2-full-match-absl-string-view branch from 2584f29 to cd6ca16 October 23, 2025 06:06
@kou kou changed the title from "refactor: Use absl::string_view for RE2::FullMatch()" to "refactor: Use re2::StringPiece for RE2::FullMatch()" Oct 23, 2025
@kou kou force-pushed the re2-full-match-absl-string-view branch from cd6ca16 to e827d9e October 23, 2025 06:09
@@ -1937,7 +1937,12 @@ struct ParseDurationFunction {
     re2::StringPiece valueStr;
     re2::StringPiece unit;
     if (!RE2::FullMatch(
-            std::string_view{amountUnit}, *durationRegex_, &valueStr, &unit)) {
+            // We can use absl::string_view() once we require RE2
Contributor

Can we not use std::string_view? CC: @pedroerp

Contributor

I'm surprised these libraries don't take a std::string_view. Is this because we are using old versions?

In general, for now we need this explicit cast (velox::StringView->std::string_view), but we are working on making this conversion implicit.

Contributor Author

We can use std::string_view when:

  • We use old RE2 (because re2::StringPiece provides an implicit conversion from std::string_view) or
  • We use new RE2 with an Abseil build in which absl::string_view is an alias of std::string_view (because absl::string_view itself does not provide an implicit conversion from std::string_view)

If we use new RE2 with an Abseil build in which absl::string_view is not an alias of std::string_view (e.g. Debian's Abseil package), we cannot use std::string_view. We need to use absl::string_view. FYI: re2::StringPiece is an alias of absl::string_view in new RE2.
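
The two Abseil configurations described above can be mimicked without Abseil. This is only a sketch under stated assumptions: the `demo` namespace, the `DEMO_USES_STD_STRING_VIEW` macro, `kAcceptsStdStringView`, and `toDemoView` are all hypothetical names. The macro plays the role that `ABSL_USES_STD_STRING_VIEW` plays for Abseil.

```cpp
#include <cstddef>
#include <string_view>
#include <type_traits>

namespace demo {
#ifdef DEMO_USES_STD_STRING_VIEW
// Configuration 1: the library's view is just an alias of std::string_view.
using string_view = std::string_view;
#else
// Configuration 2: the library ships its own class, with no constructor
// that takes std::string_view.
class string_view {
 public:
  constexpr string_view(const char* data, std::size_t size)
      : data_(data), size_(size) {}
  constexpr const char* data() const { return data_; }
  constexpr std::size_t size() const { return size_; }

 private:
  const char* data_;
  std::size_t size_;
};
#endif
}  // namespace demo

// True only when a std::string_view argument can be passed where a
// demo::string_view parameter is expected.
constexpr bool kAcceptsStdStringView =
    std::is_convertible_v<std::string_view, demo::string_view>;

// Works in both configurations because it only uses data() and size().
constexpr demo::string_view toDemoView(std::string_view text) {
  return demo::string_view(text.data(), text.size());
}
```

Compiled as written (macro undefined), `kAcceptsStdStringView` is false, which corresponds to the build that produced the error in the description. Defining the macro flips it to true, while `toDemoView` compiles either way.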

FYI: This works with new RE2:

diff --git a/velox/functions/prestosql/DateTimeFunctions.h b/velox/functions/prestosql/DateTimeFunctions.h
index 09200e41f..aa86d196a 100644
--- a/velox/functions/prestosql/DateTimeFunctions.h
+++ b/velox/functions/prestosql/DateTimeFunctions.h
@@ -1936,8 +1936,7 @@ struct ParseDurationFunction {
       const arg_type<Varchar>& amountUnit) {
     re2::StringPiece valueStr;
     re2::StringPiece unit;
-    if (!RE2::FullMatch(
-            std::string_view{amountUnit}, *durationRegex_, &valueStr, &unit)) {
+    if (!RE2::FullMatch(amountUnit, *durationRegex_, &valueStr, &unit)) {
       VELOX_USER_FAIL(
           "Input duration is not a valid data duration string: {}", amountUnit);
     }
diff --git a/velox/type/StringView.h b/velox/type/StringView.h
index 5c1bc00a0..6613882fd 100644
--- a/velox/type/StringView.h
+++ b/velox/type/StringView.h
@@ -25,6 +25,8 @@
 #include <folly/Range.h>
 #include <folly/dynamic.h>
 
+#include <absl/strings/string_view.h>
+
 #include <fmt/format.h>
 
 #include "velox/common/base/BitUtil.h"
@@ -213,10 +215,17 @@ struct StringView {
   }
 
   operator std::string_view() && = delete;
-  explicit operator std::string_view() const& {
+  operator std::string_view() const& {
     return std::string_view(data(), size());
   }
 
+#ifndef ABSL_USES_STD_STRING_VIEW
+  operator absl::string_view() && = delete;
+  operator absl::string_view() const& {
+    return absl::string_view(data(), size());
+  }
+#endif
+
   const char* begin() && = delete;
   const char* begin() const& {
     return data();
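
The ref-qualified conversion operators in the diff above can be illustrated in isolation. The following is a minimal sketch, assuming a hypothetical `TinyView` class and `lengthOf` function that are not part of Velox. It shows only the language mechanism: an implicit conversion that is allowed for named objects and rejected for temporaries, presumably to avoid views that outlive a temporary.

```cpp
#include <cstddef>
#include <string_view>
#include <type_traits>

// Hypothetical non-owning view, loosely modeled on the operators in the diff.
class TinyView {
 public:
  constexpr TinyView(const char* data, std::size_t size)
      : data_(data), size_(size) {}

  // Converting a temporary is rejected at compile time.
  operator std::string_view() && = delete;
  // Converting a named (lvalue) object is implicit.
  operator std::string_view() const& {
    return std::string_view(data_, size_);
  }

 private:
  const char* data_;
  std::size_t size_;
};

// Stand-in for a function that accepts std::string_view.
inline std::size_t lengthOf(std::string_view text) {
  return text.size();
}

// Lvalues convert implicitly; rvalues select the deleted overload.
static_assert(std::is_convertible_v<const TinyView&, std::string_view>);
static_assert(!std::is_convertible_v<TinyView&&, std::string_view>);
```

With this shape, `lengthOf(view)` compiles for a named `TinyView view`, while `lengthOf(TinyView("abc", 3))` does not.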

@mbasmanova
Contributor

> old RE2 that is used by our CentOS Stream 9 based image

Can we update the image instead? CC: @assignUser @kgpai

@assignUser
Collaborator

Yeah, using a more recent re2 seems like the better solution imo.

@pedroerp
Contributor

I'm surprised there is no implicit conversion between std::string_view and absl's or RE2's. Is it just because we are using old versions of these libraries?

@kou
Contributor Author

kou commented Oct 24, 2025

> I'm surprised there is no implicit conversion between std::string_view and absl's or RE2's. Is it just because we are using old versions of these libraries?

No. The new RE2::FullMatch() uses absl::string_view, and absl::string_view doesn't provide an implicit or explicit conversion from std::string_view. If the new Abseil is built to be C++11 compatible, as Debian's Abseil package is, std::string_view doesn't work.

@kou kou force-pushed the re2-full-match-absl-string-view branch from e827d9e to 2fa054a October 24, 2025 01:08
Contributor

@pedroerp pedroerp left a comment

This is indeed sort of ugly, but looks like there is no better way. Thanks @kou

@pedroerp pedroerp added the ready-to-merge label Oct 24, 2025
@@ -1937,7 +1937,10 @@ struct ParseDurationFunction {
     re2::StringPiece valueStr;
     re2::StringPiece unit;
     if (!RE2::FullMatch(
-            std::string_view{amountUnit}, *durationRegex_, &valueStr, &unit)) {
+            absl::string_view(amountUnit.data(), amountUnit.size()),
Contributor

PR title says "Use re2::StringPiece", but the code uses "absl::string_view". Other code (2 lines above) uses re2::StringPiece. It would be nice to limit the number of different string views used in this code.

Contributor Author

Sorry. I forgot to update PR title/description.

I've replaced all re2::StringPiece with absl::string_view.

Contributor

I see more usages of re2::StringPiece in the repo. Should we replace all of them with absl::string_view? Might be simpler to just use re2::StringPiece everywhere, no?

Contributor Author

Oh... Let's use re2::StringPiece for now because:

  • RE2 will use std::string_view not absl::string_view eventually because RE2 migrated to std::optional from absl::optional: google/re2@2a029e2
  • re2::StringPiece works with old RE2 and new RE2

We can revisit this when RE2 migrates to std::string_view from absl::string_view.

@@ -70,7 +70,7 @@ function install_build_prerequisites {
 # Install dependencies from the package managers.
 function install_velox_deps_from_dnf {
   dnf_install libevent-devel \
-    openssl-devel re2-devel libzstd-devel lz4-devel double-conversion-devel \
+    openssl-devel libzstd-devel lz4-devel double-conversion-devel \
Contributor

Please, document this change in the PR description.

Contributor Author

Sorry. I forgot to do it. Done.

@mbasmanova
Copy link
Copy Markdown
Contributor

CI is red.

@mbasmanova mbasmanova removed the ready-to-merge label Oct 24, 2025
@kou kou changed the title from "refactor: Use re2::StringPiece for RE2::FullMatch()" to "refactor: Use absl::string_view for RE2::FullMatch()" Oct 24, 2025
@kou kou force-pushed the re2-full-match-absl-string-view branch from c7012c2 to f41e6b1 October 24, 2025 11:24
@kou
Contributor Author

kou commented Oct 24, 2025

The "Linux release with adapters" failure was fixed by #15272. I've rebased.

The "Fuzzer Jobs" failure occurs because the setup-centos9.sh change isn't used. I think that this is a limitation of the current CI configuration. (I can work on improving it if needed.)

@kou kou force-pushed the re2-full-match-absl-string-view branch from f41e6b1 to a3b9271 October 27, 2025 01:37
@kou kou changed the title from "refactor: Use absl::string_view for RE2::FullMatch()" to "refactor: Use re2::StringPiece for RE2::FullMatch()" Oct 27, 2025
Contributor

@mbasmanova mbasmanova left a comment

Thanks.

@mbasmanova mbasmanova added the ready-to-merge label Oct 27, 2025
@mbasmanova
Contributor

> Old RE2 that is provided by CentOS Stream 9 doesn't accept absl::string_view.

Do you expect CentOS Stream 9 to be supported going forward? If so, we may need to add CI for it. CC: @assignUser @kgpai

@meta-codesync

meta-codesync bot commented Oct 27, 2025

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this in D85539525.

kou added 5 commits October 27, 2025 12:05
@mbasmanova mbasmanova force-pushed the re2-full-match-absl-string-view branch from a3b9271 to fc49726 October 27, 2025 16:05
@meta-codesync meta-codesync bot closed this in 9f94be0 Oct 27, 2025
@meta-codesync

meta-codesync bot commented Oct 27, 2025

@mbasmanova merged this pull request in 9f94be0.

@MBkkt
Collaborator

MBkkt commented Oct 27, 2025

PR #15134 has already existed for more than two weeks,
and it fixed the same issue back then...

mhaseeb123 pushed a commit to mhaseeb123/velox that referenced this pull request Oct 27, 2025
…bator#15259)

Summary:
`RE2::FullMatch()` uses `absl::string_view` not `std::string_view`:

https://github.com/google/re2/blob/61c4644171ee6b480540bf9e569cba06d9090b4b/re2/re2.h#L411

`absl::string_view` may not be an alias of `std::string_view`. In that case, the following error is reported:

```text
In file included from velox/functions/prestosql/registration/DateTimeFunctionsRegistration.cpp:18:
velox/functions/prestosql/DateTimeFunctions.h:1939:10: error: no matching function for call to 'FullMatch'
 1939 |     if (!RE2::FullMatch(
      |          ^~~~~~~~~~~~~~
/include/re2/re2.h:411:15: note: candidate function template not viable: no known conversion from 'std::string_view' (aka 'basic_string_view<char>') to 'absl::string_view' for 1st argument
  411 |   static bool FullMatch(absl::string_view text, const RE2& re, A&&... a) {
      |               ^         ~~~~~~~~~~~~~~~~~~~~~~
```

The old RE2 provided by CentOS Stream 9 doesn't accept `absl::string_view`.

Old RE2 uses `re2::StringPiece` for `RE2::FullMatch()`, and new RE2 provides `re2::StringPiece` as an alias of `absl::string_view`. So we can use `re2::StringPiece` with both old and new RE2.

We could drop support for old RE2 and always use `absl::string_view`, but we use `re2::StringPiece` for now. It seems that RE2 will eventually use `std::string_view` instead of `absl::string_view`. For example, google/re2@2a029e2 is a commit that migrates from `absl::optional` to `std::optional`.

We can revisit this after RE2 migrates to `std::string_view`.

Related: GH-15124

Pull Request resolved: facebookincubator#15259

Reviewed By: kevinwilfong

Differential Revision: D85539525

Pulled By: mbasmanova

fbshipit-source-id: 1dde1c47d7a337d220488aa64b5efa3408876d1e
@kou kou deleted the re2-full-match-absl-string-view branch October 28, 2025 03:09
@kou
Contributor Author

kou commented Oct 28, 2025

Oh, sorry. I missed the PR.
