-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-15841: [R] Implement SafeCallIntoR to safely call the R API from another thread #12558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
|
(See also westonpace#10) |
|
Redoing this with an eye towards where I would actually like to use it! I think that it does need a synchronous The places where I would prefer to use this in some other PRs:
Some sketch examples: arrow:::TestSafeCallIntoR(
function() "string one!",
opt = "on_main_thread"
)
#> [1] "string one!"
arrow:::TestSafeCallIntoR(
function() stop("This is an error"),
opt = "on_main_thread"
)
#> Error in (function () : This is an error
arrow:::TestSafeCallIntoR(
function() "string one!",
opt = "async_with_executor"
)
#> [1] "string one!"
# This runs with the expected error, but causes subsequent segfaults, probably related
# to the error_token_ (maybe having to do with the copy-constructor?)
# arrow:::TestSafeCallIntoR(
# function() stop("This is an error"),
# opt = "async_with_executor"
# )
arrow:::TestSafeCallIntoR(
function() "string one!",
opt = "async_without_executor"
)
#> Error: NotImplemented: Call to R from a non-R thread without an event loop |
westonpace
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be an awesome capability. A few nits and thoughts but overall I think this is the right direction.
| // [[arrow::export]] | ||
| std::string TestSafeCallIntoR(cpp11::function r_fun_that_returns_a_string, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a precedent for this in the Arrow R package (a place to test C++ code from C++ that is hard to test from R). We probably don't want something like this running on CRAN, but I'm not sure what the best way is to fence this off / keep it from compiling anywhere except CI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't dug in too much too the code yet, but is this resolved with new commits, or do we still need to find a way to gate this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neal took a quick look and said it it's fine as long as there's a note as to where TestSafeCallIntoR is defined (there's some Altrep tests that do this, too)
westonpace
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very clean and easier to understand now. Thanks for figuring this out.
| .onLoad <- function(...) { | ||
| if (arrow_available()) { | ||
| # Make sure C++ knows on which thread it is safe to call the R API | ||
| InitializeMainRThread() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know for a fact that the R thread never changes? For example, in JS, there is always "one thread" but the actual thread id can change from iteration to iteration of the event loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked in the r-lib slack channel and nobody seems to feel that this will be a problem. They did advise to check parallel::mclapply() since this creates a fork of the process, but a check seems to indicate that the value of std::this_thread::get_id() seems to be stable if somebody does happen to do that:
cpp11::cpp_source(code = '
#include "cpp11.hpp"
#include <thread>
#include <sstream>
[[cpp11::register]]
std::string thread_id() {
std::thread::id id = std::this_thread::get_id();
std::stringstream ss;
ss << id;
return ss.str();
}
')
thread_id()
#> [1] "0x100e33d40"
unique(lapply(1:1e3, function(x) thread_id()))
#> [[1]]
#> [1] "0x100e33d40"
unique(parallel::mclapply(1:1e3, function(x) thread_id(), mc.cores = 8))
#> [[1]]
#> [1] "0x100e33d40"| }); | ||
|
|
||
| thread_ptr->join(); | ||
| delete thread_ptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is probably fine but you could wrap thread_ptr in a unique_ptr. For example:
thread_ptr = std::unique_ptr<std::thread>(new std::thread(...));
It gets rid of the delete call and guards you against very unlikely things like ->join() throwing an exception and the memory never getting cleaned up (not that such a thing would really matter in test code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't get this to work without a crash!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Odd. If you want to create a commit (doesn't have to be part of any PR) then I'd be happy to take a look and see what was going on. Otherwise, like I said, it isn't very important, so let's not worry too much about it.
| # under the License. | ||
|
|
||
| # Note that TestSafeCallIntoR is defined in safe-call-into-r-impl.cpp | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| skip_on_cran() |
Is this sufficient to make sure we don't test this on cran?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I added them inside the test_that() blocks as mostly a stylistic choice!)
jonkeane
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll merge when CI is green. Thank you!
|
Benchmark runs are scheduled for baseline = 76d064c and contender = e110eac. e110eac is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
This is a very WIP draft that currently just sketches a few things related to calling into R from other threads. Some code to get started: