This repository has been archived by the owner on Oct 8, 2019. It is now read-only.
Add training and test functions to integrate the native XGBoost library #281
Merged
43 commits
8882b74 Add training & test functions by using the native XGBoost library (maropu)
fe3f535 Add xgboost4j.jar in core/lib (maropu)
14fa18f Add NativeLibLoader to load a custom-compiled xgboost library (maropu)
3f46afa Update .gitignore (maropu)
ffdd020 Add a script to make a custom-built xgboost binary (maropu)
5d393a5 Use HadoopUtils.getTaskId() to generate unique ids (maropu)
f5f8c47 Remove unnecessary functions in HiveUtils (maropu)
f8e1779 Add entries in define-all.hive (maropu)
31b8ca6 Add XGBoostUDTF to provide common functionality for XGBoost (maropu)
b145bf4 Add XGBoostBinaryClassifierUDTF for binary classification (maropu)
2a6e03b Add a `hivemall.xgboost.lib` property for loading user-defined native… (maropu)
83e2a8d Add XGBoostMulticlassClassifierUDTF for multiclass classification (maropu)
29e7a46 Fix bugs in define-all-as-permanent.hive (maropu)
b016ad3 Rename an illegal file name (maropu)
dff1765 Support XGBoost functions on DataFrame/Spark (maropu)
56a1269 Update the XGBoost library (maropu)
25ab0ec Remove system scope in core/pom.xml (maropu)
7c345cb Update import-packages.spark (maropu)
9f55332 Add tests for train_xgboost_regr and train_xgboost_classifier (maropu)
84c92ae Update bin/build_xgboost.sh (maropu)
6eb4def Add tests for train_xgboost_multiclass_classifier (maropu)
79c4479 Update .travis.yml (maropu)
3414fb9 Move xgboost functions into a xgboost submodule (maropu)
1352a40 Update bin/build_xgboost.sh (maropu)
4e85fe5 Apply review comments (maropu)
60f65f9 Add -q options in .travis.yml (maropu)
1754a81 Add a xgboost binary for Linux/x86_64 (maropu)
2c5e969 Fix bugs in spark/*/pom.xml (maropu)
6be073e Add notations for XGBoost functions in HivemallOps (maropu)
7f98d12 Update .travis.yml to reduce # of executed tests (maropu)
fc9210f Brush up exception handling (maropu)
308ba87 Update compilation options for xgboost (maropu)
29b5883 Add more tests for xgboost (maropu)
5a043ac Remove unnecessary dependencies in pom.xml (maropu)
2a2c0d0 Build a jar for xgboost with portable binaries (maropu)
67b03d2 Remove static-links for libgcc and libstdc++ (maropu)
73d8090 Add an option to enable static links in bin/build_xgboost.sh (maropu)
0ad666f Move the property of scala.version into topdir/pom.xml (maropu)
7bf055f Fix bugs in bin/build_xgboost.sh (maropu)
9b9d440 Fix version numbers for xgboost (maropu)
a8f4cf2 Add activeByDefault in pom.xml (maropu)
826b390 Fix version numbers for spark modules (maropu)
e6889dc Add a profile to compile xgboost (maropu)
@@ -15,3 +15,5 @@ scalastyle-output.xml
 scalastyle.txt
 derby.log
 spark/bin/zinc-*
+*.dylib
+*.so
@@ -0,0 +1,64 @@
+#!/bin/bash
+
+# Hivemall: Hive scalable Machine Learning Library
+#
+# Copyright (C) 2015 Makoto YUI
+# Copyright (C) 2013-2015 National Institute of Advanced Industrial Science and Technology (AIST)
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+set -eu
+set -o pipefail
+
+# Target commit hash value
+XGBOOST_HASHVAL='85443403310e90bd8a90a1f817841520838b4ac7'
+
+# Move to a top directory
+if [ "$HIVEMALL_HOME" == "" ]; then
+  if [ -e ../bin/${0##*/} ]; then
+    HIVEMALL_HOME=".."
+  elif [ -e ./bin/${0##*/} ]; then
+    HIVEMALL_HOME="."
+  else
+    echo "env HIVEMALL_HOME not defined"
+    exit 1
+  fi
+fi
+
+cd $HIVEMALL_HOME
+
+# Final output dir for a custom-compiled xgboost binary
+HIVEMALL_LIB_DIR="$HIVEMALL_HOME/xgboost/src/main/resources/lib/"
+rm -rf $HIVEMALL_LIB_DIR >> /dev/null
+mkdir -p $HIVEMALL_LIB_DIR
+
+# Move to an output directory
+XGBOOST_OUT="$HIVEMALL_HOME/target/xgboost-$XGBOOST_HASHVAL"
+rm -rf $XGBOOST_OUT >> /dev/null
+mkdir -p $XGBOOST_OUT
+cd $XGBOOST_OUT
+
+# Fetch xgboost sources
+git clone --progress https://github.com/maropu/xgboost.git
+cd xgboost
+git checkout $XGBOOST_HASHVAL
+
+# Resolve dependent sources
+git submodule init
+git submodule update
+
+# Copy a built binary to the output
+cd jvm-packages
+ENABLE_STATIC_LINKS=1 ./create_jni.sh
+cp ./lib/libxgboost4j.* "$HIVEMALL_LIB_DIR"
+
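A note on the directory handling in the script above: it enables `set -u` before reading `$HIVEMALL_HOME`, so an unset variable aborts the run before the fallback branch is reached, and the fallback values (`..` or `.`) are relative paths that stop pointing at the top directory once the script has changed directories. The following is only a sketch of one defensive alternative, not code from this pull request. `resolve_hivemall_home` is a hypothetical helper name, and the sketch assumes the script is stored in `<top>/bin/`.

```shell
#!/bin/bash
set -eu
set -o pipefail

# Hypothetical helper (not part of the PR): print an absolute path for the
# Hivemall top directory.
#   $1: path of the running script (normally "$0")
# Assumption: the script is stored in <top>/bin/.
resolve_hivemall_home() {
  local script_path="$1"
  local home_dir
  # "${HIVEMALL_HOME:-}" expands to an empty string when the variable is
  # unset, so this check does not trip `set -u`.
  if [ -n "${HIVEMALL_HOME:-}" ]; then
    home_dir="$HIVEMALL_HOME"
  else
    home_dir="$(dirname "$script_path")/.."
  fi
  # `cd ... && pwd` in a subshell converts a relative path into an absolute
  # one without changing the caller's working directory.
  (cd "$home_dir" && pwd)
}

# Possible usage inside a build script:
#   HIVEMALL_HOME="$(resolve_hivemall_home "$0")"
#   cd "$HIVEMALL_HOME"
```

Because the result is absolute, paths derived from it (such as the library and output directories) would remain valid after later `cd` calls. Quoting the variable expansions also keeps the commands safe when a path contains spaces.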
I think this should be tested on both Spark 1.6 and Spark 2.0.
The tests ran successfully before this change was committed. Is this change required?
Yeah, I think so.
IIUC, the first test covers the Hivemall core modules (e.g., core, nlp, and mixserv) and the spark-2.0 module. The other test covers only the spark-1.6 module, because the Hivemall core modules have already been tested in the first test.