[python-package] avoid data_has_header check in predict() #5970

jameslamb · 2023-07-11T04:03:39Z

In the Python package, you can generate predictions on data stored in a delimited text file by passing a filepath to Booster.predict().

That calls LGBM_BoosterPredictForFile() in the C API, which takes a boolean argument indicating whether or not the file's first row is a header with feature names.

That argument, data_has_header, is a boolean in Booster.predict()'s interface but an int in LGBM_BoosterPredictForFile(), leading to this conversion:

LightGBM/python-package/lightgbm/basic.py

Line 964 in 7140396

int_data_has_header = 1 if data_has_header else 0
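For context, here is a minimal sketch of why a bool gets turned into an int before crossing into C. It assumes only the standard-library `ctypes` module; the helper name `to_c_int_flag` is hypothetical and not part of LightGBM.

```python
import ctypes


def to_c_int_flag(flag: bool) -> ctypes.c_int:
    """Convert a Python bool into a C int (1 or 0) for a C API argument.

    Hypothetical helper for illustration only; not LightGBM code.
    """
    return ctypes.c_int(1 if flag else 0)
```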

This PR proposes moving that conversion down into the if-else block corresponding to "data is a file", so its cost is avoided on all other prediction paths.
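Roughly, the idea looks like the sketch below. This is a simplified illustration, not the actual code in basic.py: the function name `predict_sketch` and its string return values are made up, and the real method dispatches on many more input types.

```python
from pathlib import Path
from typing import Any


def predict_sketch(data: Any, data_has_header: bool = False) -> str:
    """Hypothetical stand-in for a predict() that dispatches on input type."""
    if isinstance(data, (str, Path)):
        # File path: this is the only branch that needs the int flag,
        # so the bool -> int conversion happens here and nowhere else.
        int_data_has_header = 1 if data_has_header else 0
        return f"file(header={int_data_has_header})"
    if isinstance(data, list):
        # In-memory data: data_has_header is never touched on this path.
        return "in-memory"
    raise TypeError(f"unsupported data type: {type(data).__name__}")
```

With this layout, a call such as `predict_sketch([[1.0, 2.0]])` never evaluates the conversion at all.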

I'm sure the cost of that one check is very very very very very small, but Booster.predict() is a latency-sensitive part of the API since it's used in model serving.

python-package/lightgbm/basic.py

jmoralez

Just a small suggestion in case you want to go for it.

Co-authored-by: José Morales <[email protected]>

github-actions · 2023-10-11T00:19:05Z

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

[python-package] avoid data_has_header check in predict()

3ea7426

jameslamb added the maintenance label Jul 11, 2023

jameslamb requested review from StrikerRUS, shiyu1994 and jmoralez as code owners July 11, 2023 04:03

jameslamb added the awaiting review label Jul 11, 2023

jmoralez reviewed Jul 12, 2023


python-package/lightgbm/basic.py (outdated, resolved)

jmoralez approved these changes Jul 12, 2023


jmoralez removed the awaiting review label Jul 12, 2023

Update python-package/lightgbm/basic.py

451bec3

Co-authored-by: José Morales <[email protected]>

jameslamb merged commit 84c657e into master Jul 12, 2023
39 checks passed

jameslamb deleted the header-check branch July 12, 2023 14:38

github-actions bot locked as resolved and limited conversation to collaborators Oct 11, 2023
