Do you have the plan to support Java like xgboost4j? #909

Closed
liulhdarks opened this issue Sep 12, 2017 · 13 comments

@liulhdarks

Do you have the plan to support Java like xgboost4j?

@ghost

ghost commented Oct 11, 2017

I need this too, for a highly important Microsoft customer.
predict4j does not work with the current version (it fails because default_values is missing from the 2.0.7 model).
PMML conversion does not work either; it fails on a double -> int cast somewhere in the model.

@guolinke
Collaborator

The Spark version will be released in MMLSpark (https://github.com/Azure/mmlspark) soon.

@guolinke
Collaborator

Ping @rmhasan
The PMML conversion seems to be broken for the newer version. Could you help bring it back?

@vruusmann

Exporting LightGBM to PMML, and then scoring using a Java PMML engine should count as a viable option in the meantime.

I've updated my JPMML-LightGBM exporter library to be fully compatible with the latest LightGBM v2.0.7 (including the handling of categorical/binary features and missing values). Better yet, it provides some custom functionality such as limiting the number of trees (similar to the num_iteration parameter of LightGBM's Scikit-Learn API), and compacting individual trees.

Tree compaction involves 1) expanding LightGBM-style binary splits into PMML-style multi-way splits, 2) eliminating half terminal nodes (aka leaves), and 3) eliminating redundant tree splitting predicates. It leads to a >50% reduction in PMML file size.
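To make step 3 a bit more concrete, here is a minimal Java sketch of one way a redundant splitting predicate can be removed. It is not the JPMML-LightGBM implementation. The `Node` class, the `compact` method, and the example tree are all hypothetical names invented for illustration, and the sketch assumes numeric `feature <= threshold` splits with no missing values. The idea is to carry the bounds implied by ancestor splits down the tree and drop any split whose outcome those bounds already decide.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the JPMML-LightGBM code.
final class TreeCompactionSketch {

    /** Binary tree node: a leaf holding a score, or a split "feature <= threshold". */
    static final class Node {
        final String feature;   // null for a leaf
        final double threshold; // split point; NaN for a leaf
        final Node left;        // branch taken when feature <= threshold
        final Node right;       // branch taken when feature > threshold
        final double value;     // leaf score; NaN for a split

        Node(double value) {
            this(null, Double.NaN, null, null, value);
        }

        Node(String feature, double threshold, Node left, Node right) {
            this(feature, threshold, left, right, Double.NaN);
        }

        private Node(String feature, double threshold, Node left, Node right, double value) {
            this.feature = feature;
            this.threshold = threshold;
            this.left = left;
            this.right = right;
            this.value = value;
        }

        boolean isLeaf() {
            return feature == null;
        }
    }

    /** Returns a copy of the tree without splits that ancestors already decide. */
    static Node compact(Node root) {
        return compact(root, new HashMap<>(), new HashMap<>());
    }

    // lower: feature -> largest t with "feature > t" known on this path
    // upper: feature -> smallest t with "feature <= t" known on this path
    private static Node compact(Node node, Map<String, Double> lower, Map<String, Double> upper) {
        if (node.isLeaf()) {
            return node;
        }
        double lo = lower.getOrDefault(node.feature, Double.NEGATIVE_INFINITY);
        double hi = upper.getOrDefault(node.feature, Double.POSITIVE_INFINITY);
        if (hi <= node.threshold) {
            return compact(node.left, lower, upper);  // split is always true here
        }
        if (lo >= node.threshold) {
            return compact(node.right, lower, upper); // split is always false here
        }
        Map<String, Double> leftUpper = new HashMap<>(upper);
        leftUpper.put(node.feature, node.threshold);
        Map<String, Double> rightLower = new HashMap<>(lower);
        rightLower.put(node.feature, node.threshold);
        return new Node(node.feature, node.threshold,
                compact(node.left, lower, leftUpper),
                compact(node.right, rightLower, upper));
    }

    static int countNodes(Node n) {
        return n.isLeaf() ? 1 : 1 + countNodes(n.left) + countNodes(n.right);
    }

    static double predict(Node n, Map<String, Double> row) {
        while (!n.isLeaf()) {
            n = row.get(n.feature) <= n.threshold ? n.left : n.right;
        }
        return n.value;
    }

    /** Made-up tree: both second-level splits on "x" are decided by the root split. */
    static Node exampleTree() {
        return new Node("x", 5.0,
                new Node("x", 7.0, new Node(1.0), new Node(2.0)),
                new Node("x", 3.0, new Node(3.0), new Node(4.0)));
    }

    public static void main(String[] args) {
        Node original = exampleTree();
        Node compacted = compact(original);
        System.out.println(countNodes(original) + " nodes before, "
                + countNodes(compacted) + " nodes after");
    }
}
```

In the made-up example, the seven-node tree shrinks to three nodes while every input still reaches a leaf with the same score. A real exporter also has to handle categorical splits and missing-value routing, which this sketch leaves out.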

@guolinke
Collaborator

cf: microsoft/SynapseML#173

@imatiach-msft
Contributor

@vladimir-vilinski @liulhdarks @vruusmann I've checked the code that generates SWIG Java wrappers into the LightGBM repo.
To build, you just need to run:

mkdir build ; cd build
cmake -DUSE_SWIG=ON ..
make -j4

The jar file is also available on Maven Central:
https://repo.maven.apache.org/maven2/com/microsoft/ml/lightgbm/lightgbmlib/
You can import it with sbt via:
"com.microsoft.ml.lightgbm" % "lightgbmlib" % "2.0.120"

I also have a PR open to add LightGBM to MMLSpark, a package for the Apache Spark distributed data processing framework:
microsoft/SynapseML#235

If you have any suggestions for how to improve the SWIG wrappers or have any general questions please let me know.

@guolinke
Collaborator

Thanks @imatiach-msft so much!

@chivee chivee reopened this Feb 13, 2018
@chivee
Collaborator

chivee commented Feb 13, 2018

@guolinke, let's keep this open for further discussion.

@imatiach-msft
Contributor

@chivee @guolinke - @drdarshan suggested that the Java bindings could be improved by using SWIG typemaps. This would mean more customized code, but it would remove the need for developers to deal with SWIG pointer types. I think this is an improvement that we could add in the future for developers who use the Java bindings directly (and not our Spark-based learners).

@spkaplan

spkaplan commented Dec 20, 2018

@imatiach-msft
Contributor

imatiach-msft commented Dec 20, 2018

@spkaplan Yes, that is the package that I am maintaining. However, the Java interface still needs more improvement, as I mentioned above. I am open to suggestions from the community. Right now the autogenerated wrappers are mainly used in MMLSpark's Scala code, but anyone can use the package.

@spkaplan

spkaplan commented Dec 20, 2018

@imatiach-msft Thank you for the quick reply! I had accidentally overlooked your previous comment regarding the jar available in maven central. Thank you for pointing that out!

@StrikerRUS
Collaborator

Closed in favor of tracking this in #2302. We decided to keep all feature requests in one place.

Contributions of this feature are welcome! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing this feature.

@lock lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020