feat: Add `get_or_infer_runner_type` to support getting runner type from context #4810

plotor · 2025-07-21T12:04:36Z

Changes Made

We found that in some scenarios users need to obtain Daft's Runner type inside a UDF, but currently it can only be obtained through daft.context.get_context()._runner.name. The problem is that a UDF running on a Ray worker gets None when it calls daft.context.get_context()._runner, so this PR adds a daft.context.get_context().get_runner_type() method. The method works as follows:

Use daft.context.get_context()._runner to determine the Runner type whenever it is set;
If daft.context.get_context()._runner is None, call the detect_ray_state method to determine whether the code is currently running on Ray. If so, the Runner type is considered to be ray; otherwise it is native.
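
The two steps above can be sketched in plain Python. This is only an illustration of the lookup order, not the code in this PR: `infer_runner_type` and its parameters are hypothetical names, and the Ray detection result is passed in as a boolean instead of calling detect_ray_state.

```python
from typing import Optional


def infer_runner_type(runner_name: Optional[str], running_on_ray: bool) -> str:
    """Hypothetical stand-in for the lookup order described above."""
    # Step 1: a runner that is already set on the context always wins.
    if runner_name is not None:
        return runner_name
    # Step 2: no runner on the context (e.g. inside a UDF on a Ray worker),
    # so fall back to whether a Ray environment was detected.
    return "ray" if running_on_ray else "native"


print(infer_runner_type("native", running_on_ray=True))
print(infer_runner_type(None, running_on_ray=True))
print(infer_runner_type(None, running_on_ray=False))
```

Passing the detection result in as an argument keeps the sketch free of any Ray or Daft dependency.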

In addition, I found that when the DAFT_RUNNER env is inconsistent with set_runner_xxx, Daft prioritizes the set_runner_xxx setting, so I added warning logs to alert users to the mismatch.
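
As a rough sketch of that behavior (assumptions: `resolve_runner`, the logger name, and the message wording are all made up for illustration and are not what the PR emits), the explicit setting wins and a mismatch with the env value only produces a warning:

```python
import logging
from typing import Optional

logger = logging.getLogger("runner_sketch")  # hypothetical logger name


def resolve_runner(explicit: str, env_value: Optional[str]) -> str:
    """Hypothetical: the explicit set_runner_xxx choice wins over DAFT_RUNNER."""
    if env_value is not None and env_value.lower() != explicit:
        # The mismatch is reported, but it does not change the outcome.
        logger.warning(
            "DAFT_RUNNER=%s differs from the explicitly set runner %r; using %r",
            env_value,
            explicit,
            explicit,
        )
    return explicit


print(resolve_runner("ray", "native"))
print(resolve_runner("native", None))
```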

Related Issues

No issue

Checklist

Documented in API Docs (if applicable)
Documented in User Guide (if applicable)
If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)

codecov · 2025-07-21T12:37:36Z

Codecov Report

❌ Patch coverage is 50.94340% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.21%. Comparing base (296a129) to head (79721be).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
src/daft-context/src/python.rs	0.00%	12 Missing ⚠️
src/daft-context/src/lib.rs	66.66%	11 Missing ⚠️
src/daft-py-runners/src/lib.rs	0.00%	2 Missing ⚠️
daft/context.py	75.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4810      +/-   ##
==========================================
+ Coverage   78.81%   79.21%   +0.40%     
==========================================
  Files         893      893              
  Lines      124507   124159     -348     
==========================================
+ Hits        98128    98357     +229     
+ Misses      26379    25802     -577

Files with missing lines	Coverage Δ
daft/utils.py	`90.16% <100.00%> (ø)`
daft/context.py	`81.57% <75.00%> (-0.37%)`	⬇️
src/daft-py-runners/src/lib.rs	`81.55% <0.00%> (ø)`
src/daft-context/src/lib.rs	`77.39% <66.66%> (-2.42%)`	⬇️
src/daft-context/src/python.rs	`62.26% <0.00%> (-7.95%)`	⬇️

... and 28 files with indirect coverage changes

colin-ho

I'm a little worried that this API could confuse users, given that the result of get_runner_type can easily change depending on the order of operations and the execution environment. I think it would be more appropriate to call it get_or_infer_runner. Additionally, I'd prefer this to be a standalone function instead of a method on DaftContext, to keep it simple.

To know which runner is executing the UDF, though, it would be more foolproof to pass this information into the UDF itself. That would be a larger change in scope and needs some design.

Lastly, could you please add tests for this in tests/test_context.py?

colin-ho · 2025-07-25T07:29:32Z

src/daft-context/src/python.rs

@@ -45,6 +46,24 @@ impl PyDaftContext {
            }
        }
    }
+
+    pub fn get_runner_type(&self, py: Python) -> PyResult<PyObject> {
+        let runner_type = self.inner.runner().map_or_else(


This should call get_runner_config_from_env to check the env as well.

plotor · 2025-07-25

I don't quite understand why get_runner_config_from_env needs to be called here. The DAFT_RUNNER env setting may be inconsistent with set_runner_xxx, and the latter has higher priority, so the result obtained from get_runner_config_from_env may not be accurate.

colin-ho · 2025-07-25

get_runner_type() can already be inaccurate depending on whether it is called before or after set_runner_xxx. However, if set_runner_xxx is never called, it can also be inaccurate because of the DAFT_RUNNER env.

plotor · 2025-07-28

Got it. Because native is Daft's default runner, if get_runner_config_from_env is not called, get_runner_type will always return native as long as set_runner_ray has not been called. Now the so-called "inaccurate" result only occurs when both of the following conditions are met:

get_runner_type is called before set_runner_xxx.

set_runner_xxx is inconsistent with the DAFT_RUNNER env setting.

But now, when set_runner_xxx is inconsistent with the DAFT_RUNNER env setting, a warning log is printed to alert the user, so get_runner_type will not confuse the user.

Good suggestions, thanks.
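
The order-of-operations case from this thread can be reproduced with a small stand-in. `ContextSketch` is hypothetical and is not Daft's DaftContext; it assumes the env value is only consulted when no runner has been set, followed by Ray detection and then the native default.

```python
from typing import Mapping, Optional


class ContextSketch:
    """Hypothetical stand-in for a context; not Daft's real class."""

    def __init__(self, env: Mapping[str, str], running_on_ray: bool = False):
        self._env = env
        self._running_on_ray = running_on_ray
        self._runner: Optional[str] = None

    def set_runner(self, name: str) -> None:
        self._runner = name

    def get_runner_type(self) -> str:
        # An explicitly set runner has the highest priority.
        if self._runner is not None:
            return self._runner
        # Otherwise the DAFT_RUNNER value is used if present (assumed order).
        env_value = self._env.get("DAFT_RUNNER")
        if env_value is not None:
            return env_value.lower()
        # Finally fall back to Ray detection, then the native default.
        return "ray" if self._running_on_ray else "native"


# Both conditions from the thread: the env says native, the code later sets ray.
ctx = ContextSketch(env={"DAFT_RUNNER": "native"})
before = ctx.get_runner_type()
ctx.set_runner("ray")
after = ctx.get_runner_type()
print(before, after)
```

The two calls disagree only because the env and the explicit setting disagree, which is the case the warning log is meant to flag.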

plotor · 2025-07-28T09:57:23Z

> I'm a little worried that this API could confuse users, given that the result of get_runner_type can easily change depending on the order of operations and the execution environment. I think it would be more appropriate to call it get_or_infer_runner. Additionally, I'd prefer this to be a standalone function instead of a method on DaftContext, to keep it simple.
>
> To know which runner is executing the UDF, though, it would be more foolproof to pass this information into the UDF itself. That would be a larger change in scope and needs some design.
>
> Lastly, could you please add tests for this in tests/test_context.py?

Thanks for taking the time to review. The tests have been added to test_context.py; please take another look.

colin-ho

One last thing, otherwise it looks good

colin-ho · 2025-07-29T18:03:39Z

daft/context.py

@@ -47,6 +47,21 @@ def __init__(self, ctx: PyDaftContext | None = None):
        else:
            self._ctx = PyDaftContext()

+    def get_runner_type(self) -> str:


Rename to get_or_infer_runner_type to make it clearer that this will not create the runner.

plotor · 2025-07-30

Makes sense, modified.
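
A minimal sketch of the distinction behind the rename, with assumptions labeled: `RunnerHolder` and its methods are hypothetical, and "native" stands in for whatever default would be inferred. Inferring reports a type without touching state, while creating stores a runner as a side effect.

```python
from typing import Optional


class RunnerHolder:
    """Hypothetical holder used only to contrast 'infer' with 'create'."""

    def __init__(self) -> None:
        self.runner: Optional[str] = None

    def get_or_create_runner(self) -> str:
        # Side effect: a runner is stored if none exists yet.
        if self.runner is None:
            self.runner = "native"
        return self.runner

    def get_or_infer_runner_type(self) -> str:
        # No side effect: the stored runner is left exactly as it was.
        return self.runner if self.runner is not None else "native"


holder = RunnerHolder()
inferred = holder.get_or_infer_runner_type()
runner_after_infer = holder.runner
created = holder.get_or_create_runner()
runner_after_create = holder.runner
print(inferred, runner_after_infer, created, runner_after_create)
```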

github-actions bot added the fix label Jul 21, 2025

plotor force-pushed the zhenchao-context-runner-20250721 branch 2 times, most recently from c8e282c to 5a176e7 Compare July 23, 2025 10:04

plotor changed the title ~~fix: Getting the Runner type in UDF return None~~ feat: Support get runner type in context by get_runner_type Jul 23, 2025

github-actions bot added feat and removed fix labels Jul 23, 2025

plotor marked this pull request as ready for review July 23, 2025 10:05

plotor force-pushed the zhenchao-context-runner-20250721 branch from 5a176e7 to aa012c3 Compare July 23, 2025 10:49

plotor changed the title ~~feat: Support get runner type in context by get_runner_type~~ feat: Add get_runner_type method to support getting the currently used Runner type Jul 23, 2025

plotor force-pushed the zhenchao-context-runner-20250721 branch from aa012c3 to 713779f Compare July 24, 2025 02:02

colin-ho reviewed Jul 25, 2025

plotor force-pushed the zhenchao-context-runner-20250721 branch 3 times, most recently from ea9988b to 3823999 Compare July 28, 2025 09:52

colin-ho approved these changes Jul 29, 2025

plotor force-pushed the zhenchao-context-runner-20250721 branch from 3823999 to ec7ce13 Compare July 30, 2025 02:13

feat: Add get_or_infer_runner_type to support getting runner type f…

79721be

…rom context Signed-off-by: plotor <[email protected]>

plotor changed the title ~~feat: Add get_runner_type method to support getting the currently used Runner type~~ feat: Add get_or_infer_runner_type to support getting runner type from context Jul 30, 2025

plotor force-pushed the zhenchao-context-runner-20250721 branch from ec7ce13 to 79721be Compare July 30, 2025 02:33

colin-ho merged commit f4b0f15 into Eventual-Inc:main Jul 30, 2025
45 of 46 checks passed
