IJCAI17 Customer Flow Forecasts on Koubei.com
- Rank of season 1 : 1088
- Rank of season 2 : 537
- [Done] Add weather data as additional features
- [Done] Cluster the data and use the cluster ID as a new feature (using PCA and KMeans; GMM may be too slow on this volume of data, > 60 million records)
- [Done] Add LightGBM for faster training and optimization
- Optimize Gradient Boosting Regression if time permits (computation takes a long time); deferred
- Switch from the sklearn implementation to a TensorFlow version (sklearn is a little slow with more data and more estimators); deferred
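
A minimal sketch of the cluster-as-feature idea from the list above, assuming scikit-learn's `PCA` and `KMeans`. The function name `add_cluster_feature`, the component and cluster counts, and the synthetic data are illustrative assumptions, not the code used in this repository.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def add_cluster_feature(features, n_components=2, n_clusters=3, seed=0):
    """Return one cluster id per row, computed on PCA-reduced features."""
    # Scale first so no single column dominates the principal components.
    scaled = StandardScaler().fit_transform(features)
    # Reduce dimensionality before clustering to keep KMeans cheap.
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(scaled)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit_predict(reduced)


# Synthetic stand-in for per-shop features (hypothetical, not the contest data).
rng = np.random.default_rng(0)
centers = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [10.0, 10.0, 10.0, 10.0],
    [-10.0, 10.0, -10.0, 10.0],
])
shop_features = np.vstack([c + rng.normal(scale=0.5, size=(50, 4)) for c in centers])

# One integer label per row; it can be appended as an extra column for the regressor.
cluster_ids = add_cluster_feature(shop_features)
```

On tens of millions of rows, fitting on a sample and then calling `predict` on the rest may be more practical than a single `fit_predict`; that trade-off is a suggestion, not something the original notes confirm.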
jupyter notebook run.ipynb
or
- python preprocess_data.py
- sed -i -e 's/,,/,/g' result.txt
- python process_result.py
- predict.csv is the final result
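
The `sed` cleanup step above can also be done in Python, which may help where `sed -i` behaves differently (for example BSD/macOS `sed`). This is a rough equivalent under that assumption; the helper names `collapse_double_commas` and `clean_file` are hypothetical and not part of the repository.

```python
from pathlib import Path


def collapse_double_commas(text):
    """Replace each non-overlapping ",," with ",", like sed 's/,,/,/g'."""
    return text.replace(",,", ",")


def clean_file(path):
    """Rewrite the file in place with doubled commas collapsed."""
    p = Path(path)
    p.write_text(collapse_double_commas(p.read_text()))
```

Calling `clean_file("result.txt")` between `preprocess_data.py` and `process_result.py` would take the place of the `sed` line. Like the `sed` command, this is a single pass, so a run of three commas becomes two rather than one.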