InsCD: A Modularized, Comprehensive and User-Friendly Toolkit for Machine Learning Empowered Cognitive Diagnosis
Shanghai Institute of AI Education, School of Computer Science and Technology
East China Normal University
InsCD, namely Instant Cognitive Diagnosis (Chinese: 时诊), is a highly modularized Python library for cognitive diagnosis in intelligent education systems. This library incorporates both traditional methods (e.g., solving IRT via statistics) and deep learning-based methods (e.g., modelling students and exercises via graph neural networks).
- [2024.8.31] InsCD toolkit v1.2 is released. What's New: We implement two new models: the symbolic cognitive diagnosis model (SymbolCD) and the hypergraph cognitive diagnosis model (HyperCD).
- [2024.7.14] InsCD toolkit v1.1 is released and available for downloading.
- [2024.4.20] InsCD toolkit v1.0 is released.
Clone the repository and install with pip:
git clone https://github.com/ECNU-ILOG/inscd.git
cd <path of code>
pip install .
Or install the library from PyPI:
pip install inscd
The following code is a simple example of cognitive diagnosis with inscd. We load a built-in dataset, create a cognitive diagnosis model, train it, and report its performance:
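from inscd import listener
from inscd.datahub import NeurIPS20
from inscd.models.neural import NCDM
listener.activate()
# Load the built-in NeurIPS20 dataset
datahub = NeurIPS20()
# Split the data, then split the "valid" part again into "valid" and "test"
datahub.random_split()
datahub.random_split(source="valid", to=["valid", "test"])
# Create the NCDM model and build it for this dataset
ncdm = NCDM()
ncdm.build(datahub)
# Train on the "train" split and validate on the "valid" split
ncdm.train(datahub, "train", "valid")
# Evaluate on the "test" split with the "acc" and "doa" metrics
test_results = ncdm.score(datahub, "test", metrics=["acc", "doa"])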
For more details, please refer to InsCD Documentation.
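The two `random_split` calls in the example above can be read as a two-stage split: the data is first divided into two parts, and the second part is divided again into validation and test sets. The following is a minimal, library-independent sketch of that idea. The helper `split_once`, the toy response logs, and the 0.8 and 0.5 ratios are all assumptions made for illustration; they are not InsCD's API or its documented defaults.

```python
import random


def split_once(records, ratio, seed=0):
    """Shuffle a copy of `records` and cut it into two parts at `ratio`.

    Hypothetical helper for illustration; not part of inscd.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * ratio)
    return shuffled[:cut], shuffled[cut:]


# Toy response logs: (student_id, exercise_id, correct)
logs = [(s, e, (s + e) % 2) for s in range(10) for e in range(10)]

# Stage 1: all logs -> train / held-out (ratio is an assumption)
train, held_out = split_once(logs, ratio=0.8)

# Stage 2: held-out -> valid / test (ratio is an assumption)
valid, test = split_once(held_out, ratio=0.5, seed=1)

print(len(train), len(valid), len(test))
```

Every log ends up in exactly one of the three splits, so the test set stays unseen during training and validation.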
We incorporate classical, well-known and state-of-the-art methods published or accepted by leading journals and conferences in the fields of psychometrics, machine learning and data mining. We call this toolkit "modularized" because we not only provide each model as a whole, but also divide it into two parts (i.e., an extractor and an interaction function), which makes it possible to design new models by recombining them (e.g., the extractor of the Hypergraph model with the interaction function of KaNCD). To evaluate the models, we also provide various open-source datasets from online and offline scenarios.
Model | Release | Paper |
---|---|---|
Item Response Theory (IRT) | 1952 | Frederic Lord. A Theory of Test Scores. Psychometric Monographs. |
Multidimensional Item Response Theory (MIRT) | 2009 | Mark D. Reckase. Multidimensional Item Response Theory Models. |
Neural Cognitive Diagnosis Model (NCDM) | 2020 | Fei Wang et al. Neural Cognitive Diagnosis for Intelligent Education Systems. AAAI'20. |
Relation Map-driven Cognitive Diagnosis Model (RCD) | 2021 | Weibo Gao et al. RCD: Relation Map Driven Cognitive Diagnosis for Intelligent Education Systems. SIGIR'21 |
Knowledge-association Neural Cognitive Diagnosis (KaNCD) | 2022 | Fei Wang et al. NeuralCD: A General Framework for Cognitive Diagnosis. TKDE. |
Knowledge-sensed Cognitive Diagnosis Model (KSCD) | 2022 | Haiping Ma et al. Knowledge-Sensed Cognitive Diagnosis for Intelligent Education Platforms. CIKM'22. |
Cognitive Diagnosis Model Focusing on Knowledge Concepts (CDMFKC) | 2022 | Sheng Li et al. Cognitive Diagnosis Focusing on Knowledge Concepts. CIKM'22 |
Q-augmented Causal Cognitive Diagnosis Model (QCCDM) | 2023 | Shuo Liu et al. QCCDM: A Q-Augmented Causal Cognitive Diagnosis Model for Student Learning. ECAI'23. |
Self-supervised Cognitive Diagnosis Model (SCD) | 2023 | Shanshan Wang et al. Self-Supervised Graph Learning for Long-Tailed Cognitive Diagnosis. AAAI'23. |
Symbolic Cognitive Diagnosis Model (SymbolCD) | 2024 | Junhao Shen et al. Symbolic Cognitive Diagnosis via Hybrid Optimization for Intelligent Education Systems |
Oversmoothing-Resistant Cognitive Diagnosis Framework (ORCDF) | 2024 | Shuo Liu et al. ORCDF: An Oversmoothing-Resistant Cognitive Diagnosis Framework for Student Learning in Online Education Systems. KDD'24. |
Hypergraph Cognitive Diagnosis Model (HyperCDM) | 2024 | Junhao Shen et al. Capturing Homogeneous Influence among Students: Hypergraph Cognitive Diagnosis for Intelligent Education Systems. KDD'24 |
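The split into an extractor and an interaction function described above can be illustrated with a small standalone sketch. It does not use InsCD's classes: `LookupExtractor`, `irt_interaction`, `multidim_interaction` and `CognitiveDiagnosisModel` are hypothetical names, the trait values are made up, and the logistic form is a commonly used IRT-style response function chosen for simplicity. The point is only that the same extractor can be paired with different interaction functions.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


@dataclass
class LookupExtractor:
    """Extractor: maps student and exercise IDs to latent traits.

    Here it is a plain lookup table; a learned embedding or a graph
    encoder would play the same role.
    """
    student_traits: Dict[int, Vector]
    exercise_difficulty: Dict[int, Vector]
    exercise_discrimination: Dict[int, float]

    def extract(self, student_id: int, exercise_id: int) -> Tuple[Vector, Vector, float]:
        return (
            self.student_traits[student_id],
            self.exercise_difficulty[exercise_id],
            self.exercise_discrimination[exercise_id],
        )


def irt_interaction(theta: Vector, b: Vector, a: float) -> float:
    """Unidimensional logistic form: sigmoid(a * (theta - b)), first dimension only."""
    return sigmoid(a * (theta[0] - b[0]))


def multidim_interaction(theta: Vector, b: Vector, a: float) -> float:
    """Illustrative multidimensional variant: sums the per-dimension gaps."""
    return sigmoid(a * sum(t - d for t, d in zip(theta, b)))


@dataclass
class CognitiveDiagnosisModel:
    """A model is an extractor plus an interaction function."""
    extractor: LookupExtractor
    interaction: Callable[[Vector, Vector, float], float]

    def predict(self, student_id: int, exercise_id: int) -> float:
        theta, b, a = self.extractor.extract(student_id, exercise_id)
        return self.interaction(theta, b, a)


# One extractor, two interaction functions (all values are made up).
extractor = LookupExtractor(
    student_traits={0: [1.0, -0.5]},
    exercise_difficulty={0: [0.0, 0.5]},
    exercise_discrimination={0: 1.0},
)
p_irt = CognitiveDiagnosisModel(extractor, irt_interaction).predict(0, 0)
p_multi = CognitiveDiagnosisModel(extractor, multidim_interaction).predict(0, 0)
print(round(p_irt, 4), round(p_multi, 4))
```

Swapping the `interaction` argument changes the predicted probability without touching the extractor, which is the kind of recombination the modular design is meant to allow.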
Dataset | Release | Source |
---|---|---|
inscd.datahub.Assist17 | 2018 | https://sites.google.com/view/assistmentsdatamining/dataset |
inscd.datahub.FracSub | 2015 | http://staff.ustc.edu.cn/%7Eqiliuql/data/math2015.rar |
inscd.datahub.Junyi734 | 2015 | https://www.educationaldatamining.org/EDM2015/proceedings/short532-535.pdf |
inscd.datahub.Math1 | 2015 | http://staff.ustc.edu.cn/%7Eqiliuql/data/math2015.rar |
inscd.datahub.Math2 | 2015 | http://staff.ustc.edu.cn/%7Eqiliuql/data/math2015.rar |
inscd.datahub.Matmat | 2019 | https://github.com/adaptive-learning/matmat-web |
inscd.datahub.NeurIPS20 | 2020 | https://eedi.com/projects/neurips-education-challenge |
inscd.datahub.XES3G5M | 2023 | https://github.com/ai4ed/XES3G5M |
Note that we preprocess these datasets and filter invalid response logs. We will continuously update preprocessed datasets to foster the community.
Why can't I download the dataset when using a built-in dataset class (e.g., NeurIPS20 in inscd.datahub)?
Since these datasets are stored on Google Drive, they may not be available in some countries and regions. You can use a proxy by adding the following code before using the built-in datasets.
import os
os.environ['http_proxy'] = 'http://<IP address of proxy>:<Port of proxy>'
os.environ['https_proxy'] = 'http://<IP address of proxy>:<Port of proxy>'
os.environ['all_proxy'] = 'socks5://<IP address of proxy>:<Port of proxy>'
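If you would rather not leave the proxy settings in place for the rest of the process, one option is to scope them with a context manager. The sketch below is an assumption-laden illustration, not part of InsCD: `use_proxy` is a hypothetical helper, the address `127.0.0.1:7890` is a placeholder, and only `http_proxy` and `https_proxy` are handled for brevity.

```python
import os
from contextlib import contextmanager


@contextmanager
def use_proxy(host, port, scheme="http"):
    """Temporarily set proxy environment variables, then restore them.

    Hypothetical helper for illustration; not provided by inscd.
    """
    proxy_url = f"{scheme}://{host}:{port}"
    keys = ("http_proxy", "https_proxy")
    previous = {key: os.environ.get(key) for key in keys}
    for key in keys:
        os.environ[key] = proxy_url
    try:
        yield proxy_url
    finally:
        # Put back whatever was set before (or remove the variable).
        for key, value in previous.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value


before = os.environ.get("https_proxy")

# Placeholder address; replace with your own proxy.
with use_proxy("127.0.0.1", 7890) as url:
    # Create the built-in dataset object here, while the proxy is active.
    active = os.environ["https_proxy"]

after = os.environ.get("https_proxy")
```

Inside the `with` block the variables point at the proxy; once the block exits they return to their previous values.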
Contributors are arranged in alphabetical order by first name. We welcome more contributors to help maintain the toolkit and grow the intelligent education community.
Junhao Shen, Mingjia Li, Shuo Liu, Xin An, Yuanhao Liu
If this toolkit is helpful and can inspire you in your research or applications, please kindly cite as follows.