official websit is https://dc.cloud.alipay.com/index#/topic/intro?id=3 I did some clean up, but the code is exactly what I submitted in the competition! Only one type of model used, with different input(word embedding and char embedding) you should be able to get >0.623 locally with 0.05 VP for ensemble.
-
the data was for season1 only(for some security reason, season2 data can not be shared)
-
to run the char model(avg score:0.5729852440408627,prd_thr:0.21000000000000002):
python model.py -m train_model -ms "ELMOEM128_CHAR"
-
to run the word model:(avg score:0.5582690659811482,prd_thr:0.18000000000000002)
python model.py -m train_model -ms "ELMOEM128"
-
to run ensemble:
python model.py -m train_model -ms "ELMOEM128_21,ELMOEM128_CHAR_21"
-
to debug:
python model.py -m train_model -ms "ELMOEM128_CHAR"
-d -
with different validation size:
python model.py -m train_model -ms "ELMOEM128_CHAR"
-vp 0.05
Deep contextualized word representations
Injecting Relational Structural Representation in Neural Networks for Question Similarit