LightGBM stuck on fitting with libomp 15.0.7 on new Apple M2 if n_jobs != 1 #5764

Open
Zahlii opened this issue Mar 3, 2023 · 10 comments

@Zahlii

Zahlii commented Mar 3, 2023

from lightgbm import LGBMRegressor
import numpy as np

x = np.random.random((100, 10))
y = x.dot(np.random.random((10,)))
l = LGBMRegressor(n_estimators=1)  # hangs with the default n_jobs; works with n_jobs=1
l.fit(x, y)

Darwin PC0455 22.3.0 Darwin Kernel Version 22.3.0: Mon Jan 30 20:39:46 PST 2023; root:xnu-8792.81.3~2/RELEASE_ARM64_T6020 arm64

Name: lightgbm
Version: 3.3.5

==> libomp: stable 15.0.7 (bottled) [keg-only]

macOS Ventura 13.2.1

@jameslamb
Collaborator

Thanks for using LightGBM. Can you please be more specific about what "doesn't work" means?

Do you get an exception, process crash, something else? Are there any logs you can report?

@Zahlii
Author

Zahlii commented Mar 3, 2023

Hi, sorry for not being clearer. If I omit n_jobs=1, it hangs indefinitely on the fit line and I have to SIGKILL the process. When running LightGBM as part of a pytest test suite, I sometimes get a Python segmentation fault around the time the LGBM fit occurs.
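
For anyone trying to reproduce this without having to kill the interpreter by hand, one option is to run the snippet in a child process with a timeout. The following is only a sketch: the helper name run_with_timeout and the 60-second limit are made up for illustration, and it uses nothing beyond the standard library subprocess module.

```python
import subprocess
import sys

# The reproduction from the first comment, run in a separate interpreter.
REPRO = """
import numpy as np
from lightgbm import LGBMRegressor

x = np.random.random((100, 10))
y = x.dot(np.random.random((10,)))
LGBMRegressor(n_estimators=1).fit(x, y)
"""


def run_with_timeout(code, timeout_s):
    """Run `code` in a child Python process and describe how it ended."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            timeout=timeout_s,
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so nothing is left behind.
        return "hung"
    if proc.returncode < 0:
        # On POSIX a negative return code means the child died from a signal
        # (for example 11 for a segmentation fault).
        return f"killed by signal {-proc.returncode}"
    if proc.returncode != 0:
        return f"exited with code {proc.returncode}"
    return "finished"


if __name__ == "__main__":
    print(run_with_timeout(REPRO, timeout_s=60))
```

This distinguishes the two symptoms described above: a hang shows up as "hung", while a segmentation fault shows up as "killed by signal ...".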

@jameslamb
Collaborator

It's ok. In the future, please provide all the information asked for in the issue template.

How did you install LightGBM? Please be as specific as possible.

@Zahlii
Author

Zahlii commented Mar 3, 2023

brew install miniforge
brew install cmake
brew install gcc
brew install libomp
conda create -n venv-3.9-conda python=3.9.14 -y
conda activate venv-3.9-conda
pip install lightgbm

Some more info: it seems to get stuck here when constructing a Booster (called from hyperopt):
https://github.com/microsoft/LightGBM/blob/v3.3.5/python-package/lightgbm/basic.py#L2639

Setting OMP_NUM_THREADS=1 fixes both issues (the hang and the segmentation fault).
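
If exporting the variable in the shell is inconvenient (for example in a notebook), it can also be set from Python. This is a sketch under one assumption: the OpenMP runtime reads OMP_NUM_THREADS when it initializes, so the variable has to be set before lightgbm (or any other OpenMP-using library) is imported. The helper name limit_openmp_threads is hypothetical.

```python
import os


def limit_openmp_threads(n=1):
    """Set OMP_NUM_THREADS for the current process and return the value set."""
    os.environ["OMP_NUM_THREADS"] = str(n)
    return os.environ["OMP_NUM_THREADS"]


# Call this at the very top of the script or notebook.
limit_openmp_threads(1)

# Only import lightgbm after this point, so that the OpenMP runtime
# sees the variable when it starts up (assumption, see the note above).
```

If another library has already loaded an OpenMP runtime earlier in the process, this may have no effect; in that case setting the variable in the shell before starting Python is the safer route.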

@Zahlii
Author

Zahlii commented Mar 6, 2023

Older, somewhat related thread: #4229

Downgrading libomp via Homebrew is not possible, as older libomp versions are not compatible with M2.

@Zahlii
Author

Zahlii commented Apr 27, 2023

The issue still persists with libomp 16.0.2.

@tszyan-bain

I'm facing the same problem: I have to set num_threads=1, otherwise the kernel dies shortly after the training job starts. Interestingly, when I set a value greater than 1 (for example 2 or 3), the kernel dies after a few seconds, while if I do not set a value at all (so it defaults to 0), it dies almost immediately.

I originally used Homebrew to install LightGBM but, because of the error, switched to a build from GitHub (here) before I found this thread and set num_threads=1.

I suspect the Homebrew installation would also work with this workaround, but I haven't tested it.
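
For the native training API, the same workaround amounts to passing num_threads=1 in the params dict. Below is a small sketch of a helper that forces this; the name single_threaded is made up for illustration. It also drops the aliases LightGBM documents for num_threads (num_thread, nthread, nthreads, n_jobs), so that a stray alias elsewhere in the params cannot conflict with the forced value.

```python
# Parameter names that LightGBM documents as aliases of num_threads.
THREAD_ALIASES = ("num_threads", "num_thread", "nthread", "nthreads", "n_jobs")


def single_threaded(params):
    """Return a copy of `params` with the thread count forced to 1."""
    cleaned = {k: v for k, v in params.items() if k not in THREAD_ALIASES}
    cleaned["num_threads"] = 1
    return cleaned


if __name__ == "__main__":
    params = {"objective": "regression", "n_jobs": 4}
    print(single_threaded(params))
```

The result can then be passed to training as usual, for example lgb.train(single_threaded(params), lgb.Dataset(x, label=y), num_boost_round=1). This only works around the hang; it does not address whatever causes it.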

@jameslamb
Collaborator

jameslamb commented Jun 15, 2024

@Zahlii we just released LightGBM v4.4.0, with some fixes to macOS support. Could you please check again and see if that resolves the issue?

pip install 'lightgbm>=4.4.0'

I just ran the example you provided and it worked well for me.

  • macOS: 14.4.1 (23E224)
  • chip: M2
  • Python: 3.11.9
  • Python libraries: lightgbm==4.4.0, numpy==1.26.4, scikit-learn==1.5.0
  • OpenMP: libomp: stable 18.1.7 (bottled) [keg-only]
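
To make reports like the one above easier to compare, most of these details can be collected from Python. This is a sketch using only the standard library; the function name environment_report is hypothetical, and it does not cover the libomp version, which still has to come from brew info libomp.

```python
import platform
from importlib import metadata


def environment_report(packages=("lightgbm", "numpy", "scikit-learn")):
    """Collect OS, CPU architecture, Python and package versions into a dict."""
    report = {
        "os": platform.platform(),
        "machine": platform.machine(),  # architecture the interpreter runs as
        "python": platform.python_version(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```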

@trantrikien239

@jameslamb the problem is not fixed for me.

brew info libomp
==> libomp: stable 18.1.8 (bottled) [keg-only]
LLVM's OpenMP runtime library
https://openmp.llvm.org/
Installed
/usr/local/Cellar/libomp/18.1.8 (9 files, 1.7MB)
  Poured from bottle using the formulae.brew.sh API on 2024-06-27 at 01:50:02
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/lib/libomp.rb
License: MIT
==> Dependencies
Build: cmake ✘, lit ✘
==> Caveats
libomp is keg-only, which means it was not symlinked into /usr/local,
because it can override GCC headers and result in broken builds.

For compilers to find libomp you may need to set:
  export LDFLAGS="-L/usr/local/opt/libomp/lib"
  export CPPFLAGS="-I/usr/local/opt/libomp/include"
==> Analytics
install: 60,255 (30 days), 181,591 (90 days), 515,596 (365 days)
install-on-request: 11,742 (30 days), 35,590 (90 days), 103,479 (365 days)
build-error: 1 (30 days)

My system is almost exactly the same as yours (M2 and everything).

@jameslamb
Collaborator

sad 😭

ok thank you for letting us know, we'll try to investigate soon
