ARROW-15841: [R] Implement SafeCallIntoR to safely call the R API from another thread #12558

paleolimbot · 2022-03-03T21:05:38Z

This is a very WIP draft that currently just sketches a few things related to calling into R from other threads. Some code to get started:

arrow:::TestSafeCallIntoR(
  list(
    function() "string one",
    function() "string two"
  )
)
#> [1] "string one" "string two"

arrow:::TestSafeCallIntoR(
  list(
    function() stop("This is an error!")
  )
)
#> Error in (function () : This is an error!

github-actions · 2022-03-03T21:05:56Z

https://issues.apache.org/jira/browse/ARROW-15841

github-actions · 2022-03-03T21:05:58Z

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

paleolimbot · 2022-03-04T02:10:13Z

(See also westonpace#10)

paleolimbot · 2022-03-25T17:03:35Z

Redoing this with an eye towards where I would actually like to use it! I think that it does need a synchronous Status<cpp_type> SafeCallIntoR<cpp_type>([]() { return r_api_call(); }), even if all the synchronous version does is error when it's not safe to execute R code. I think I have this working from other threads too but I'm too new to this to know exactly what I should be testing.

The places where I would prefer to use this in some other PRs:

Some sketch examples:

arrow:::TestSafeCallIntoR(
  function() "string one!",
  opt = "on_main_thread"
)
#> [1] "string one!"

arrow:::TestSafeCallIntoR(
  function() stop("This is an error"),
  opt = "on_main_thread"
)
#> Error in (function () : This is an error

arrow:::TestSafeCallIntoR(
  function() "string one!",
  opt = "async_with_executor"
)
#> [1] "string one!"

# This runs with the expected error, but causes subsequent segfaults, probably related
# to the error_token_ (maybe having to do with the copy-constructor?)

# arrow:::TestSafeCallIntoR(
#   function() stop("This is an error"),
#   opt = "async_with_executor"
# )

arrow:::TestSafeCallIntoR(
  function() "string one!",
  opt = "async_without_executor"
)
#> Error: NotImplemented: Call to R from a non-R thread without an event loop
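The "on_main_thread" and "async_without_executor" cases above can be approximated in plain C++ without R or Arrow. This is only a rough sketch under stated assumptions: `StringResult`, `InitializeMainThread()`, `SafeCall()` and `SafeCallFromOtherThread()` are hypothetical names standing in for the real Arrow types and for the functions in this PR, and the executor/event-loop path is left out entirely.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical stand-in for a Result-like type; NOT the Arrow class.
struct StringResult {
  bool ok;
  std::string value;  // meaningful when ok is true
  std::string error;  // meaningful when ok is false
};

// Storage for the id of the thread on which calls are considered safe.
// (Illustrative only; the PR keeps this state elsewhere.)
inline std::thread::id& MainThreadId() {
  static std::thread::id id;
  return id;
}

// Record the current thread as the "main" thread (assumed to be called once at load).
void InitializeMainThread() { MainThreadId() = std::this_thread::get_id(); }

// Synchronous sketch: run `fun` only when called from the recorded thread,
// otherwise return an error instead of touching the (imaginary) R API.
StringResult SafeCall(const std::function<std::string()>& fun) {
  // Precondition: InitializeMainThread() must have been called first.
  assert(MainThreadId() != std::thread::id());
  if (std::this_thread::get_id() != MainThreadId()) {
    return {false, "",
            "NotImplemented: Call to R from a non-R thread without an event loop"};
  }
  try {
    return {true, fun(), ""};
  } catch (const std::exception& e) {
    // Convert the failure into an error value rather than letting it propagate.
    return {false, "", e.what()};
  }
}

// Helper that invokes SafeCall() from a freshly spawned thread.
StringResult SafeCallFromOtherThread(const std::function<std::string()>& fun) {
  StringResult out{false, "", ""};
  std::thread worker([&out, &fun] { out = SafeCall(fun); });
  worker.join();
  return out;
}
```

Called from the recorded thread, `SafeCall()` returns the callable's string or its error message; called through `SafeCallFromOtherThread()`, it returns the NotImplemented error without running the callable.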

westonpace

This will be an awesome capability. A few nits and thoughts but overall I think this is the right direction.

paleolimbot · 2022-03-29T19:25:42Z

r/src/safe-call-into-r-impl.cpp

+// [[arrow::export]]
+std::string TestSafeCallIntoR(cpp11::function r_fun_that_returns_a_string,


We don't have a precedent for this in the Arrow R package (a place to test C++ code from C++ that is hard to test from R). We probably don't want something like this running on CRAN, but I'm not sure what the best way is to fence this off / keep it from compiling anywhere except CI?

I haven't dug into the code too much yet, but is this resolved with the new commits, or do we still need to find a way to gate this?

Neal took a quick look and said it's fine as long as there's a note as to where TestSafeCallIntoR is defined (there are some ALTREP tests that do this, too).

westonpace

Very clean and easier to understand now. Thanks for figuring this out.

westonpace · 2022-04-04T23:36:43Z

r/R/arrow-package.R

 .onLoad <- function(...) {
+  if (arrow_available()) {
+    # Make sure C++ knows on which thread it is safe to call the R API
+    InitializeMainRThread()


Do we know for a fact that the R thread never changes? For example, in JS, there is always "one thread" but the actual thread id can change from iteration to iteration of the event loop.

I asked in the r-lib Slack channel and nobody seems to feel that this will be a problem. They did advise checking parallel::mclapply(), since it forks the process, but a quick check indicates that the value of std::this_thread::get_id() is stable even if somebody does that:

cpp11::cpp_source(code = '
  #include "cpp11.hpp"
  #include <thread>
  #include <sstream>

  [[cpp11::register]]
  std::string thread_id() {
    std::thread::id id = std::this_thread::get_id();
    std::stringstream ss;
    ss << id;
    return ss.str();
  }
')

thread_id()
#> [1] "0x100e33d40"

unique(lapply(1:1e3, function(x) thread_id()))
#> [[1]]
#> [1] "0x100e33d40"

unique(parallel::mclapply(1:1e3, function(x) thread_id(), mc.cores = 8))
#> [[1]]
#> [1] "0x100e33d40"
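For contrast with the check above, a standalone C++ sketch (no cpp11, no forking) shows the property that the main-thread check relies on: the id is the same on every call from one thread and differs on a newly spawned thread. `ThreadIdString()` and `ThreadIdStringFromNewThread()` are hypothetical helper names, not part of the PR.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <thread>

// Same idea as the thread_id() helper above, minus the cpp11 registration.
std::string ThreadIdString() {
  std::stringstream ss;
  ss << std::this_thread::get_id();
  return ss.str();
}

// Returns the id string as observed from inside a newly spawned thread.
std::string ThreadIdStringFromNewThread() {
  std::string out;
  std::thread worker([&out] { out = ThreadIdString(); });
  worker.join();
  // Sanity check: the worker must have written something before we return.
  assert(!out.empty());
  return out;
}
```

The printed form of a `std::thread::id` is implementation-defined, so only equality of the strings is meaningful, not their contents.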

westonpace · 2022-04-04T23:43:17Z

r/src/safe-call-into-r-impl.cpp

+        });
+
+    thread_ptr->join();
+    delete thread_ptr;


So this is probably fine but you could wrap thread_ptr in a unique_ptr. For example:

thread_ptr = std::unique_ptr<std::thread>(new std::thread(...));

It gets rid of the delete call and guards you against very unlikely things like ->join() throwing an exception and the memory never getting cleaned up (not that such a thing would really matter in test code).

I can't get this to work without a crash!

Odd. If you want to create a commit (doesn't have to be part of any PR) then I'd be happy to take a look and see what was going on. Otherwise, like I said, it isn't very important, so let's not worry too much about it.
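A minimal standalone sketch of the unique_ptr suggestion, outside of the R package and therefore not a reproduction of the crash mentioned above, might look like the following. `RunOnOwnedThread()` is a hypothetical name and the lambda body is a placeholder for whatever the test thread actually does.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <thread>

// Runs a trivial task on a heap-allocated thread owned by a unique_ptr,
// so no explicit delete is needed.
std::string RunOnOwnedThread() {
  std::string result;
  std::unique_ptr<std::thread> thread_ptr(
      new std::thread([&result] { result = "ran on worker"; }));
  // The thread has been started and not yet joined.
  assert(thread_ptr->joinable());
  // join() must still be called explicitly before the pointer is destroyed.
  thread_ptr->join();
  return result;
}
```

The unique_ptr only takes care of freeing the memory: destroying a `std::thread` that is still joinable calls `std::terminate`, so the explicit `join()` remains necessary.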

jonkeane · 2022-04-07T13:42:03Z

r/tests/testthat/test-safe-call-into-r.R

+# under the License.
+
+# Note that TestSafeCallIntoR is defined in safe-call-into-r-impl.cpp
+


Suggested change

skip_on_cran()

Is this sufficient to make sure we don't test this on CRAN?

(I added them inside the test_that() blocks as mostly a stylistic choice!)

jonkeane

I'll merge when CI is green. Thank you!

ursabot · 2022-04-08T11:31:08Z

Benchmark runs are scheduled for baseline = 76d064c and contender = e110eac. e110eac is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.13% ⬆️0.04%] test-mac-arm
[Failed ⬇️0.71% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.09% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/468| e110eac7 ec2-t3-xlarge-us-east-2>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/453| e110eac7 test-mac-arm>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/454| e110eac7 ursa-i9-9960x>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/463| e110eac7 ursa-thinkcentre-m75q>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/467| 76d064c7 ec2-t3-xlarge-us-east-2>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/452| 76d064c7 test-mac-arm>
[Failed] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/453| 76d064c7 ursa-i9-9960x>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/462| 76d064c7 ursa-thinkcentre-m75q>
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

compiling sketch

d08b946

github-actions bot added the Component: R label Mar 3, 2022

paleolimbot added 3 commits March 4, 2022 08:11

remove event loop indirection

5671a13

license

e3dbdd3

redo safe call into r

7941d8d

paleolimbot added 8 commits March 25, 2022 14:06

cpp lint

a9a6c3c

don't evaluate C++ when arrow not available

d17dcad

with theoretical support for an executor

07be4fb

shuffle tests to make room for the async with executor bit

b1d360f

even more test simplifying

21518f0

insert test with event loop

ed3ea45

clang-format

7c03527

maybe fix cpp linter, add tests

9750f72

paleolimbot marked this pull request as ready for review March 25, 2022 19:29

maybe fix arrow-without-arrow

f361ad5

westonpace reviewed Mar 25, 2022

paleolimbot added 10 commits March 29, 2022 11:45

change location of main_r_thread

431d7b0

fix comments and error messages to align with behaviour

b373fe6

don't copy fun

6455c98

RunWithCaptureR(task) -> RunWithCapturedR(make_arrow_call)

65801f3

simplify Future/Status usage

be22a4d

fix one more lying comment

9665226

RunTask returns a Result<>

c1f62b2

MainRThread::RunTask() returns a Future

9cabbdc

Implement SafeCallIntoRAsync() and make SafeCallIntoR() use it

50fd564

handle errors that occur during event loop execution

8202e22

remove unused class, use ResetError()

20867dd

paleolimbot commented Mar 29, 2022

remove comment that is no longer relevant

48845fe

westonpace self-requested a review April 4, 2022 16:58

westonpace approved these changes Apr 4, 2022

paleolimbot added 2 commits April 6, 2022 09:31

reviews

6a9915a

make a note of where TestSafeCallIntoR is defined

c795d20

jonkeane reviewed Apr 7, 2022

skip custom C++ checks on CRAN

c28d2b3

jonkeane approved these changes Apr 7, 2022

jonkeane closed this in e110eac Apr 7, 2022

paleolimbot deleted the r-safe-call-into branch December 9, 2022 16:38
