# Large Language Models are Students at Various Levels: Zero-shot Question Difficulty Estimation

This repository contains the official implementation of "Large Language Models are Students at Various Levels: Zero-shot Question Difficulty Estimation".

Jae-Woo Park<sup>1</sup>\*, Seong-Jin Park<sup>1</sup>\*, Hyun-Sik Won<sup>1</sup>, Kang-Min Kim<sup>1</sup>†

<sup>1</sup> The Catholic University of Korea

\* These authors contributed equally to this work. † Corresponding author.
This repository includes:
- LLaSA Setup.
- Question-Solving using Various LLMs.
- Question Difficulty Estimation using LLaSA and Zero-shot LLaSA.

## Table of Contents

- [Project Structure](#project-structure)
- [LLaSA Setup](#llasa-setup)
- [Question-Solving using Various LLMs](#question-solving-using-various-llms)
- [Question Difficulty Estimation (QDE)](#question-difficulty-estimation-qde)
- [Citation](#citation)

## Project Structure

```
├── config                  # Configurations, API keys, and constants.
│   ├── __init__.py
│   ├── constants.py
│   └── api_keys.py
├── data                    # User-provided raw data and generated processed data.
│   ├── processed           # [Will be generated] Processed files.
│   │   ├── dk_test_ability.csv
│   │   ├── dk_test_difficulty.csv
│   │   ├── dk_test_question.json
│   │   ├── dk_train_ability.csv
│   │   ├── dk_train_difficulty.csv
│   │   ├── dk_train_question.json
│   │   └── dk_whole_question.json
│   └── raw                 # [User-provided] Raw data provided by the user.
│       ├── test_question.json
│       ├── test_transaction.csv
│       ├── train_question.json
│       └── train_transaction.csv
├── logs                    # [Will be generated] Log files and experiment results.
│   ├── llasa               # LLaSA result logs.
│   │   └── …
│   └── question_solving    # Question-solving result logs.
│       ├── …
│       ├── model_answer_log.csv
│       └── total_results.csv
├── data_setting            # Scripts for data processing.
│   └── …
├── llasa                   # LLaSA and Zero-shot LLaSA frameworks.
│   └── …
├── question_solving        # Scripts for question-solving using LLMs.
│   └── …
└── shells                  # Shell scripts for running modules.
    └── …
```

## LLaSA Setup

### R Environment

To install the R packages for Item Response Theory (IRT) on Ubuntu, run:

```bash
sudo apt-get update
sudo apt-get install r-base
cd llms-are-students-of-various-levels
Rscript requirements.r
```

After installation, type `R` in the terminal to start the R environment.

### Python Environment

Set up your Python environment:

```bash
pip install torch
pip install -r requirements.txt
```

Make sure to install the PyTorch build that matches your system (OS and CUDA version).

### Configuration

Configure `config/constants.py` and set your API keys in `config/api_keys.py`.

### Datasets

We conducted Question Difficulty Estimation (QDE) using two datasets. Any dataset containing questions, answers, and students' question-solving records can be used for this task.

Note that LLaSA requires a sufficiently large transaction dataset: IRT parameters cannot be estimated when a question has only a single response record, or when a single model contributes only one response record.
Make sure your dataset follows this structure:

```
├── data
│   ├── raw
│   │   ├── train_transaction.csv
│   │   ├── train_question.json
│   │   ├── test_transaction.csv
│   │   └── test_question.json
```

#### Dataset Structure Details

Below are examples of `train_transaction.csv` and `train_question.json`. Please prepare `test_transaction.csv` and `test_question.json` in the same format.
`train_transaction.csv`:

| question_id | S1 | S2 | ... | SN |
|-------------|----|----|-----|----|
| Q1          | 1  | 1  | ... | 1  |
| Q2          | 0  | 1  | ... | 1  |
`train_question.json`:

```json
{
    "question_text": "Choose the correct ...",
    "question_id": 1,
    "choices": ["10", "20", "30", "40"],
    "answer": ["10"]
}
```
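
As a quick sanity check, the sketch below loads both files and flags transaction rows that lack question metadata. It assumes `pandas` is available and that `train_question.json` holds a JSON array of objects like the one above; the file paths follow the tree shown earlier.

```python
import json

import pandas as pd

# Response matrix: one row per question, one 0/1 column per student.
transactions = pd.read_csv("data/raw/train_transaction.csv")

# Question metadata; assumed here to be a JSON array of question objects.
with open("data/raw/train_question.json") as f:
    questions = json.load(f)

print(f"{len(transactions)} questions, {transactions.shape[1] - 1} students")

# Cross-check: every transaction row should have matching metadata.
known_ids = {q["question_id"] for q in questions}
missing = set(transactions["question_id"]) - known_ids
if missing:
    print("Transactions without question metadata:", sorted(missing))
```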

### Data Setting

Run the following command to estimate student abilities and question difficulties via IRT:

```bash
sh shells/data_setting/run_irt_setting.sh
```
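
The shell script performs the IRT fit with the R tooling installed above. Purely for intuition, here is a minimal NumPy sketch of the same idea under a Rasch (1PL) model, fit by gradient ascent on a binary response matrix; the function name, training loop, and toy data are illustrative, not the repository's implementation.

```python
import numpy as np

def fit_rasch(responses, n_iter=500, lr=0.05):
    """Rasch (1PL) model: P(correct) = sigmoid(ability - difficulty).

    responses: (n_questions, n_students) array of 0/1 outcomes.
    Returns per-student abilities and per-question difficulties.
    """
    n_q, n_s = responses.shape
    ability = np.zeros(n_s)
    difficulty = np.zeros(n_q)
    for _ in range(n_iter):
        logits = ability[None, :] - difficulty[:, None]
        p = 1.0 / (1.0 + np.exp(-logits))  # predicted P(correct)
        err = responses - p                # d(log-likelihood)/d(logit)
        ability += lr * err.sum(axis=0) / n_q
        difficulty -= lr * err.sum(axis=1) / n_s
    return ability, difficulty

# Toy example: 3 questions x 4 students.
R = np.array([[1, 1, 1, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 0]])
theta, b = fit_rasch(R)
print("abilities:", theta.round(2))
print("difficulties:", b.round(2))
```

A single response per question cannot pin down both an ability and a difficulty, which is why the transaction data must be large (see the note in [Datasets](#datasets)).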

Generate hints using the GPT API:

```bash
sh shells/data_setting/run_hint_setting.sh
```

Merge the train and test sets for question-solving:

```bash
sh shells/data_setting/run_merge_setting.sh
```

## Question-Solving using Various LLMs

In the question-solving step, LLMs solve the questions directly so that their question-solving records can be extracted. The code was developed with reference to *Leveraging Large Language Models for Multiple Choice Question Answering*.
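
The scripts below handle this at scale across providers. For intuition only, here is a sketch of a single question-solving call using the OpenAI Python SDK; the model name, prompt format, and the assumption that the API key is exported in the environment are all illustrative (the repository reads keys from `config/api_keys.py`).

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def solve(question_text, choices):
    """Ask the model to pick one choice; return its raw answer string."""
    prompt = question_text + "\n" + "\n".join(
        f"{chr(65 + i)}. {c}" for i, c in enumerate(choices)
    ) + "\nAnswer with a single letter."
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()

answer = solve("Choose the correct ...", ["10", "20", "30", "40"])
print(answer)  # compare against the gold answer to build a 0/1 record
```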
Run these scripts to get question-solving records from different LLMs:

```bash
sh shells/question_solving/run_local_models.sh
sh shells/question_solving/run_anthropic_models.sh
sh shells/question_solving/run_gpt_models.sh
```

Analyze the results and integrate them into a unified dataset:

```bash
sh shells/question_solving/run_analyze.sh
sh shells/question_solving/run_integrate.sh
```

## Question Difficulty Estimation (QDE)

Run LLaSA without LLMDA:

```bash
sh shells/llasa/run_llasa_without_llmda.sh
```

Run LLaSA with LLMDA:

```bash
sh shells/llasa/run_llasa_with_llmda.sh
```

Run Zero-shot LLaSA using intuitive inputs for student levels:

```bash
sh shells/llasa/run_zeroshot_llasa.sh
```
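
Conceptually, the framework treats LLMs as students at various levels and aggregates their question-solving records into a difficulty estimate. As a loose illustration of the zero-shot setting (not the paper's exact formulation), one could weight each model's failure on a question by an intuitively assigned student level; all names and values below are hypothetical.

```python
# Illustrative only: weight each LLM's correctness by an intuitive
# "student level" in [0, 1] and read difficulty off the failure rate.
records = {          # hypothetical 0/1 outcomes per model on one question
    "strong_llm": 1,
    "mid_llm":    1,
    "weak_llm":   0,
}
levels = {"strong_llm": 0.9, "mid_llm": 0.5, "weak_llm": 0.2}

# Failures by high-level "students" push the estimate toward "hard".
num = sum(levels[m] * (1 - r) for m, r in records.items())
den = sum(levels.values())
difficulty = num / den
print(f"estimated difficulty: {difficulty:.2f}")
```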

Check the results of LLaSA and Zero-shot LLaSA:

```bash
sh shells/llasa/run_report_1.sh  # LLaSA without LLMDA
sh shells/llasa/run_report_2.sh  # LLaSA with LLMDA
sh shells/llasa/run_report_3.sh  # Zero-shot LLaSA
```

## Citation

```bibtex
@inproceedings{anonymous2024large,
    title={Large Language Models are Students at Various Levels: Zero-shot Question Difficulty Estimation},
    author={Anonymous},
    booktitle={Submitted to ACL Rolling Review - June 2024},
    year={2024},
    url={https://openreview.net/forum?id=whRJT6j4EM},
    note={under review}
}
```