Skip to content

Commit

Permalink
Add the HIST and IGMTF model on Alpha360 (microsoft#1040)
Browse files Browse the repository at this point in the history
* Commit the code of HIST and IGMTF on Alpha360

* add stock index

* Update README.md

* delete useless code

* fix the bug of code format with black

* fix pylint bugs

* fix the bugs of pylint

* fix pylint bugs

* fix flake8
  • Loading branch information
Wentao-Xu authored Apr 13, 2022
1 parent aab18de commit a211698
Show file tree
Hide file tree
Showing 11 changed files with 1,149 additions and 0 deletions.
3 changes: 3 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
Recent released features
| Feature | Status |
| -- | ------ |
| HIST and IGMTF models | :chart_with_upwards_trend: [Released](https://github.com/microsoft/qlib/pull/1040) on Apr 10, 2022 |
| Qlib notebook tutorial | 📖 [Released](https://github.com/microsoft/qlib/pull/1037) on Apr 7, 2022 |
| Ibovespa index data | :rice: [Released](https://github.com/microsoft/qlib/pull/990) on Apr 6, 2022 |
| Point-in-Time database | :hammer: [Released](https://github.com/microsoft/qlib/pull/343) on Mar 10, 2022 |
Expand Down Expand Up @@ -339,6 +340,8 @@ Here is a list of models built on `Qlib`.
- [TCN based on pytorch (Shaojie Bai, et al. 2018)](examples/benchmarks/TCN/)
- [ADARNN based on pytorch (YunTao Du, et al. 2021)](examples/benchmarks/ADARNN/)
- [ADD based on pytorch (Hongshun Tang, et al.2020)](examples/benchmarks/ADD/)
- [IGMTF based on pytorch (Wentao Xu, et al.2021)](examples/benchmarks/IGMTF/)
- [HIST based on pytorch (Wentao Xu, et al.2021)](examples/benchmarks/HIST/)
Your PR of new Quant models is highly welcomed.
Expand Down
3 changes: 3 additions & 0 deletions examples/benchmarks/HIST/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
# HIST
* Code: [https://github.com/Wentao-Xu/HIST](https://github.com/Wentao-Xu/HIST)
* Paper: [HIST: A Graph-based Framework for Stock Trend Forecasting via Mining Concept-Oriented Shared InformationAdaRNN: Adaptive Learning and Forecasting for Time Series](https://arxiv.org/abs/2110.13716).
Binary file not shown.
4 changes: 4 additions & 0 deletions examples/benchmarks/HIST/requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
pandas==1.1.2
numpy==1.21.0
scikit_learn==0.23.2
torch==1.7.0
92 changes: 92 additions & 0 deletions examples/benchmarks/HIST/workflow_config_hist_Alpha360.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
qlib_init:
provider_uri: "~/.qlib/qlib_data/cn_data"
region: cn
market: &market csi300
benchmark: &benchmark SH000300
data_handler_config: &data_handler_config
start_time: 2008-01-01
end_time: 2020-08-01
fit_start_time: 2008-01-01
fit_end_time: 2014-12-31
instruments: *market
infer_processors:
- class: RobustZScoreNorm
kwargs:
fields_group: feature
clip_outlier: true
- class: Fillna
kwargs:
fields_group: feature
learn_processors:
- class: DropnaLabel
- class: CSRankNorm
kwargs:
fields_group: label
label: ["Ref($close, -2) / Ref($close, -1) - 1"]
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: HIST
module_path: qlib.contrib.model.pytorch_hist
kwargs:
d_feat: 6
hidden_size: 64
num_layers: 2
dropout: 0
n_epochs: 200
lr: 1e-4
early_stop: 20
metric: ic
loss: mse
base_model: LSTM
model_path: "benchmarks/LSTM/model_lstm_csi300.pkl"
stock2concept: "benchmarks/HIST/qlib_csi300_stock2concept.npy"
stock_index: "benchmarks/HIST/qlib_csi300_stock_index.npy"
GPU: 0
dataset:
class: DatasetH
module_path: qlib.data.dataset
kwargs:
handler:
class: Alpha360
module_path: qlib.contrib.data.handler
kwargs: *data_handler_config
segments:
train: [2008-01-01, 2014-12-31]
valid: [2015-01-01, 2016-12-31]
test: [2017-01-01, 2020-08-01]
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
ana_long_short: False
ann_scaler: 252
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
4 changes: 4 additions & 0 deletions examples/benchmarks/IGMTF/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# IGMTF
* Code: [https://github.com/Wentao-Xu/IGMTF](https://github.com/Wentao-Xu/IGMTF)
* Paper: [IGMTF: An Instance-wise Graph-based Framework for
Multivariate Time Series Forecasting](https://arxiv.org/abs/2109.06489).
4 changes: 4 additions & 0 deletions examples/benchmarks/IGMTF/requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
pandas==1.1.2
numpy==1.21.0
scikit_learn==0.23.2
torch==1.7.0
89 changes: 89 additions & 0 deletions examples/benchmarks/IGMTF/workflow_config_igmtf_Alpha360.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
qlib_init:
provider_uri: "~/.qlib/qlib_data/cn_data"
region: cn
market: &market csi300
benchmark: &benchmark SH000300
data_handler_config: &data_handler_config
start_time: 2008-01-01
end_time: 2020-08-01
fit_start_time: 2008-01-01
fit_end_time: 2014-12-31
instruments: *market
infer_processors:
- class: RobustZScoreNorm
kwargs:
fields_group: feature
clip_outlier: true
- class: Fillna
kwargs:
fields_group: feature
learn_processors:
- class: DropnaLabel
- class: CSRankNorm
kwargs:
fields_group: label
label: ["Ref($close, -2) / Ref($close, -1) - 1"]
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy
kwargs:
model: <MODEL>
dataset: <DATASET>
topk: 50
n_drop: 5
backtest:
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: IGMTF
module_path: qlib.contrib.model.pytorch_igmtf
kwargs:
d_feat: 6
hidden_size: 64
num_layers: 2
dropout: 0
n_epochs: 200
lr: 1e-4
early_stop: 20
metric: ic
loss: mse
base_model: LSTM
model_path: "benchmarks/LSTM/model_lstm_csi300.pkl"
GPU: 0
dataset:
class: DatasetH
module_path: qlib.data.dataset
kwargs:
handler:
class: Alpha360
module_path: qlib.contrib.data.handler
kwargs: *data_handler_config
segments:
train: [2008-01-01, 2014-12-31]
valid: [2015-01-01, 2016-12-31]
test: [2017-01-01, 2020-08-01]
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
ana_long_short: False
ann_scaler: 252
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
3 changes: 3 additions & 0 deletions examples/benchmarks/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,9 @@ The numbers shown below demonstrate the performance of the entire `workflow` of
| GATs (Petar Velickovic, et al.) | Alpha360 | 0.0476±0.00 | 0.3508±0.02 | 0.0598±0.00 | 0.4604±0.01 | 0.0824±0.02 | 1.1079±0.26 | -0.0894±0.03 |
| TCTS(Xueqing Wu, et al.) | Alpha360 | 0.0508±0.00 | 0.3931±0.04 | 0.0599±0.00 | 0.4756±0.03 | 0.0893±0.03 | 1.2256±0.36 | -0.0857±0.02 |
| TRA(Hengxu Lin, et al.) | Alpha360 | 0.0485±0.00 | 0.3787±0.03 | 0.0587±0.00 | 0.4756±0.03 | 0.0920±0.03 | 1.2789±0.42 | -0.0834±0.02 |
| IGMTF(Wentao Xu, et al.) | Alpha360 | 0.0480±0.00 | 0.3589±0.02 | 0.0606±0.00 | 0.4773±0.01 | 0.0946±0.02 | 1.3509±0.25 | -0.0716±0.02 |
| HIST(Wentao Xu, et al.) | Alpha360 | 0.0522±0.00 | 0.3530±0.01 | 0.0667±0.00 | 0.4576±0.01 | 0.0987±0.02 | 1.3726±0.27 | -0.0681±0.01 |


- The selected 20 features are based on the feature importance of a lightgbm-based model.
- The base model of DoubleEnsemble is LGBM.
Expand Down
Loading

0 comments on commit a211698

Please sign in to comment.