Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python-package] Allow to pass Arrow table and array as init scores #6167

Merged
merged 82 commits into from
Dec 4, 2023
Merged

[python-package] Allow to pass Arrow table and array as init scores #6167

merged 82 commits into from
Dec 4, 2023

Conversation

borchero
Copy link
Collaborator

Motivation

This is part of a series of PRs towards #6022 and depends on #6166.

For a legible diff, see borchero/LightGBM@arrow-support-groups...arrow-support-init-scores.

@borchero borchero changed the title WIP: [python-package] Allow to pass Arrow table and array as init scores [python-package] Allow to pass Arrow table and array as init scores Nov 22, 2023
@borchero borchero marked this pull request as ready for review November 22, 2023 22:46
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! I support it overall.

I think it should be possible to provide 2D pyarrow.Array and pyarrow.ChunkedArray objects for init_score as well. Are there some limitations there that I'm missing, that led you to only designate pyarrow.Table as "for multi-class" in the docs + not add tests on 2D pyarrow.Array / pyarrow.ChunkedArray?

If that was just an oversight, please add some tests on providing 2D data for init_score via the pyarrow array types.

python-package/lightgbm/basic.py Outdated Show resolved Hide resolved
python-package/lightgbm/basic.py Show resolved Hide resolved
tests/python_package_test/test_arrow.py Outdated Show resolved Hide resolved
python-package/lightgbm/basic.py Show resolved Hide resolved
@borchero
Copy link
Collaborator Author

borchero commented Nov 22, 2023

I think it should be possible to provide 2D pyarrow.Array and pyarrow.ChunkedArray objects

I'm not sure I mistook something in the Arrow specification but aren't Arrow arrays 1D by design? In my mind, there's no n-d Arrow arrays (like you have NumPy arrays), an Arrow array is really just a vector.

@jameslamb
Copy link
Collaborator

aren't Arrow arrays 1D by design

Oh I guess they are: https://blog.djnavarro.net/posts/2022-05-25_arrays-and-tables-in-arrow/#data-objects

(the author was working at Voltron on Arrow when that was written, I believe)

Sorry, I'm not that familiar with Arrow... trying to get caught up to review these PRs. I think this was the first of these PRs that included both the Array types and Table, so the first place I really thought about it (e.g. the absence of Array never occurred to me in #6034).

Thanks for your patience and quick answers.

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, thanks!

@shiyu1994 or @guolinke could you help with a review as well?

Copy link
Collaborator

@guolinke guolinke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@borchero
Copy link
Collaborator Author

Can we continue here @jameslamb? 🙏🏼 would love to get Arrow support in a LightGBM release before Christmas 😄

@jameslamb
Copy link
Collaborator

I dont want to merge this until #6216 is merged, as that fixes an issue causing CI to fail deterministically. That is just waiting for a review. Sorry for the delay.

@jameslamb
Copy link
Collaborator

Ok #6216 is merged, thanks to @guolinke !

And thanks to @shiyu1994 , I could merge master into this by clicking a button on my phone 😁

Once CI passes, I'll merge this and we can move on to the next one. Thanks as always for your patience and help @borchero 😊

@borchero
Copy link
Collaborator Author

borchero commented Dec 1, 2023

What's up with the CUDA CI, is this a problem across this repo? 👀

@jameslamb
Copy link
Collaborator

is this a problem across the repo?

It seems like it, yes.

You can see all runs of GitHub Actions in a project at the Actions tab. Here are the recent runs of CUDA across all commits:

https://github.com/microsoft/LightGBM/actions/workflows/cuda.yml

Looks to me like many of them failed yesterday (around 15-20 hours ago) with the same types of errors.

The job running on runner nv6-01 has exceeded the maximum execution time of 60 minutes.
The operation was canceled.

(example build not from this PR)

These run on a dedicated VM Microsoft pays for, not GitHub's fleet of hosted runners, so there can only be one job running at a time across the entire project. That makes it easy for them to get backed up.

I do see a task got picked up and is running right now for another PR: https://github.com/microsoft/LightGBM/actions/runs/7061296005/job/19222636772

Let's see if that succeeds. Sorry for the inconvenience.

@jameslamb
Copy link
Collaborator

Yeah the CUDA CI runs are all timing out at this point:

docker run --env-file docker.env -v "$GITHUB_WORKSPACE":"$ROOT_DOCKER_FOLDER" --rm --gpus all "$docker_img" /bin/bash $ROOT_DOCKER_FOLDER/docker-script.sh

@shiyu1994 can you please look into it? Maybe try shelling into that box and running the steps from the CI configuration manually, to see if you can get more logs?

I suspect it might be something like an interactive dialog that's being triggered, and then timing out waiting for that.

@jameslamb
Copy link
Collaborator

@shiyu1994 can you please help with the CUDA machine or give me access so I can do it?

The most recent manually-triggered re-run for this PR has been stuck for almost 24 hours.

image

Requested labels: self-hosted, linux
Job defined at: microsoft/LightGBM/.github/workflows/cuda.yml@refs/pull/6167/merge
Waiting for a runner to pick up this job...

(build link)

Thank you.

@shiyu1994
Copy link
Collaborator

The agent for CUDA action simply doesn't respond with and command in the shell. I've restarted the VM and it should work now.

@jameslamb
Copy link
Collaborator

Thanks so much @shiyu1994 ! I just restarted the builds for this PR.

@jameslamb
Copy link
Collaborator

@jameslamb jameslamb merged commit f5b6bd6 into microsoft:master Dec 4, 2023
41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants