This repo contains the LLM-derived dataset of TVShowGuess.
TVShowGuess was proposed as a benchmark of language models' ability to understand fictional characters in narrative stories.
Original TVSG dataset: https://github.com/YisiSang/TVSHOWGUESS/tree/main/dataset
The following data are all generated by ChatGPT (GPT-3.5-turbo-0301).
Data: ./character_dic/*
Generation process:
The source of the summary and characterization entries should be double-checked with Mo. Mo sent me these files last December; they might be sourced directly from Fandom or rewritten by humans.
Data format: The following is a sample of character fandom entry:
"Sheldon Cooper":{
"summary":"Dr. Sheldon Lee Cooper, B.Sc., M.Sc., M.A., Ph.D., Sc.D., is a pathetic Caltech theoretical physicist. Next to his best friend Leonard Hofstadter, he’s the main protagonist of The Big Bang Theory and the titular protagonist of Young Sheldon ...",
"characterization":"Aside from his characteristic idiosyncrasies, unpragmatic obsessions and extreme narcissism, Sheldon believes humans are illogical and attempts to be logical himself. [Even though, in reality, he's actually significantly more illogical than most people in so many ways (i.e.; fear of change, fear of birds, expecting others to change for him, throwing childish tantrums, being immature etc.)] He frequently states that he possesses an eidetic memory although the correct term for this type of recall is hyperthymesia (highly superior autobiographical memory) He also states that he has an IQ of 187, though he claims his IQ cannot be accurately measured by normal tests (further confirming his egotism). Sheldon has a ..."
},
Data: ./persona.csv
Generation process: A bio is a ChatGPT-generated personality summary. The user message given to ChatGPT is as follows; no system message was given:
# https://github.com/Oaklight/persona-guessing/blob/e23aa33a3e713619120ea443af88799c17f03ad9/src/bio.py#L9
prompt = f'Please write a bio for the character {charName} from TV series "{showName}" in one paragraph'
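The single-user-message setup above can be sketched as a small message builder. `build_bio_request` is a hypothetical helper name (the actual code lives in `src/bio.py` of the linked repo); it only makes the "user message, no system message" structure concrete:

```python
def build_bio_request(char_name: str, show_name: str) -> list:
    """Assemble the chat messages for a persona bio request:
    a single user message and no system message."""
    prompt = (
        f'Please write a bio for the character {char_name} '
        f'from TV series "{show_name}" in one paragraph'
    )
    return [{"role": "user", "content": prompt}]

messages = build_bio_request("Sheldon Cooper", "The Big Bang Theory")
```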
Data: ./plot_summ_files/*
Generation process: A scene follows the same definition as in the TVSG paper. Each scene script is length-limited to 2500 words, then passed to ChatGPT with a heuristic summary length of 100 words and the following system message and user message:
# https://github.com/Oaklight/persona-guessing/blob/e23aa33a3e713619120ea443af88799c17f03ad9/src/plotsumm.py#L52
message = f"You are good at plot briefing. You will be present with a long plot script. Please summarize the given plot with less than {heuristic_len} words."
prompt = f"{long_plot}\\nPlease summarize it with less than {heuristic_len} words:"
Data format:
Each entry conforms to the format: "scene_id": "plot summary by ChatGPT"
{
"3057": "Sheldon and Leonard go to a high IQ sperm bank to donate sperm for extra money, but Sheldon backs out at the last minute, feeling guilty about potentially committing genetic fraud. They leave without donating.",
"3058": "Two roommates, Leonard and Sheldon, meet their new neighbor, Penny. Leonard is immediately interested in her and invites her over for lunch. Sheldon protests, but they eventually agree to have her over. During the invitation, Sheldon struggles with social cues and inappropriately mentions bowel movements. The scene ends with Penny accepting the invitation and asking what they do for fun.The plot involves a man who discovers a lucrative business opportunity by performing a sexual act for money, which he continues to do in secret until the credits roll.",
...
}
Data: ./memory_files/*
Generation process: For every primary character speaking in the scene script at "scene_id", the scene script was length-limited to 2500 words, then passed to ChatGPT for a 1st-person-perspective personal memory, with the following system message and user message:
# https://github.com/Oaklight/persona-guessing/blob/e23aa33a3e713619120ea443af88799c17f03ad9/src/memory.py#L95
message = f"You are playing a imitation game, where you are a specific person and try to concisely reiterate a conversation"
prompt = f"Given a conversation:\n\n{scene_script}\n\nNow, speak as you are {who}, describing in first-person perspective of what you experienced, with no more than {heuristic_len} words."
scene_script is the scene script at "scene_id". who is the name of the speaking character, e.g. "sheldon". heuristic_len is 100.
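The per-character memory generation can be sketched as one request per speaking character, reusing the prompts quoted above (the prompt strings are copied verbatim, including their original phrasing; `build_memory_request` is a hypothetical helper name):

```python
def build_memory_request(scene_script: str, who: str,
                         heuristic_len: int = 100) -> list:
    """Build the system + user messages for one character's
    1st-person memory of a scene."""
    system = (
        "You are playing a imitation game, where you are a specific person "
        "and try to concisely reiterate a conversation"
    )
    user = (
        f"Given a conversation:\n\n{scene_script}\n\nNow, speak as you are "
        f"{who}, describing in first-person perspective of what you "
        f"experienced, with no more than {heuristic_len} words."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

# one request per primary character speaking in the scene
requests = {who: build_memory_request("Sheldon: Hi.\nRaj: Hello.", who)
            for who in ["sheldon", "raj"]}
```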
Data format: Each entry follows the format:
scene_id: {
    "character_name": {
        "long": "1st-person-perspective memory of scene@scene_id",
        "short": ""
    },
    ...
}
The long entry is the 1st-person-perspective memory of the scene at "scene_id". The short entry was reserved for a memory summary, but it is discarded.
{
"4861": {
"raj": {
"long": "I was talking to Sheldon about why I can't find a woman to be with. He suggested my fear of being alone is the problem. We also talked about my dating history, including a threesome with a Sailor Moon fan. Sheldon jokingly suggested chemical castration before we said goodnight.",
"short": ""
},
"sheldon": {
"long": "I recall a conversation with Raj in our apartment where he expressed his frustration with women not wanting to be with him. I suggested that his inability to be alone might be the issue, and he mentioned having had dates with eleven women, including a threesome with Howard and a Sailor Moon cosplayer. I hinted at chemical castration, but he decided to work on his fear of being alone instead.",
"short": ""
}
},
...
}
Data: ./memory_summ_files/*
Generation process:
During the experiments, we proposed using multiple pieces of memory to guess the TV show. To this end, we first form a mega_memory from at most 20 previous scenes in which the character of interest (who) is one of the primary characters. Then we pass the mega memory, length-limited to 2500 words, to ChatGPT for a memory summary. The following system message and user message are used:
# https://github.com/Oaklight/persona-guessing/blob/e23aa33a3e713619120ea443af88799c17f03ad9/src/memory.py#L281
message = f"You are good at memory briefing. You will be present with a recent memory from someone. Please summarize the given memory with less than {heuristic_len} words, in the first-person perspective."
prompt = f"[{who}]: {long_memory}\nPlease summarize it with less than {heuristic_len} words:"
Each mega memory is MD5-hashed to form a unique uuid, used as the key of the memory summary entry.
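A sketch of how such a key could be derived, assuming the MD5 hex digest of the UTF-8 bytes is simply regrouped into UUID shape (8-4-4-4-12); check `src/memory.py` in the linked repo for the exact scheme:

```python
import hashlib

def mega_memory_uuid(mega_memory: str) -> str:
    """MD5-hash the mega memory and format the 32-hex-char digest
    in UUID shape (8-4-4-4-12)."""
    h = hashlib.md5(mega_memory.encode("utf-8")).hexdigest()
    return f"{h[:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:]}"
```

Because the key is a content hash, the same mega memory always maps to the same entry.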
Data format: The following is an example of memory summary:
"02c74d55-b8ba-de37-8d42-821391514a26": {
"who": "howard",
"uuid": "02c74d55-b8ba-de37-8d42-821391514a26",
"mega_memory": "I suggested to Stuart that he could earn some money by getting humiliated verbally. Raj was going to let Stuart stay the night but has to cancel with Emily. Stuart also complained about everyone sounding like insurance companies, police, firemen or therapists.\nI helped Stu take care of Mrs Wolowitz and we're leaving now. Stu loves her and even calls her Debbie. Something feels weird about it but I'm not sure why.\nRaj thanked me for the ride, joking about the car windows. I teased him back, but then we realized I wasn't taking him to work. I explained my mom's situation with Stuart, and Raj made a joke. Inside, Stuart and my mom were together, which surprised me. I confronted Stuart about not telling me, and we argued. Raj mentioned communication, we left. In the car, we discussed Stuart living with my mom, debated, made a joke, and I called my mom. Later, tension with Stuart, and I felt frustrated.\nSo, Sheldon brought up Stalin trying to make supersoldiers with gorillas, and we all had some interesting animal suggestions. Then, Bernardette was trying to push Penny to study for her new job, but Penny wasn't having it. Oh, and there was some awkwardness.\nRaj was trying to come up with a cute couple's nickname, while Sheldon didn't enjoy being made to teach a class, despite Leonard pointing out its advantages.\nI remember trying to convince Sheldon that I was smart enough to take his graduate-level physics class, but he kept throwing difficult questions at me. Raj was there with cookies and Leonard and Howard were watching.\nSheldon and I were in a classroom, about to start a class. I told him that if he intends to make this class difficult, I'm out. If not, I'm willing to give it a shot. He agreed, and we began with the Brachistochrone problem and Euler-Lagrange theorems. Sheldon teased me when I got stumped, but then said he'd grade on a curve. I started singing when I realized he wasn't a good teacher, and even made a spitball to shoot at him. 
Accidentally hit him in the mouth with it.\nI argued with Sheldon about dropping his class and violating the sanctity of his mouth. We quizzed each other on technicalities; I gave him a hard time, but he managed to keep up.",
"summ_memory": "I helped Stuart take care of Mrs. Wolowitz, but felt weird about their close relationship. Raj and I got into a mix-up with the car ride, leading to an argument with Stuart. Sheldon brought up an interesting topic about gorillas and supersoldiers. There was tension between Penny and Bernadette about studying for a new job. Sheldon didn't enjoy teaching a class, but I tried convincing him to let me join. We ended up arguing, and I accidentally hit him with a spitball. Despite the conflict, we quizzed each other and he kept up."
},
For ease of training supervised models using our summary data and previous-scene data (just the previous scene entry), we processed them into merged files and created split files for train, dev, and test.
test_season_dict = {
"FRIENDS": 9,
"The_Big_Bang_Theory": 8,
"Frasier": 10,
"Gilmore_Girls": 5,
"The_Office": 8,
}
For each show show_name, any scene from a season before test_season_dict[show_name] is treated as a train sample. The first half of the test-season scenes is used as the test set, and the second half as the dev set.
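The split rule can be restated as a small function. `assign_split` is a hypothetical helper (the actual logic is in `src/split.py`); a 0-based `scene_index` within the test season is an assumption:

```python
test_season_dict = {
    "FRIENDS": 9,
    "The_Big_Bang_Theory": 8,
    "Frasier": 10,
    "Gilmore_Girls": 5,
    "The_Office": 8,
}

def assign_split(show_name: str, season: int, scene_index: int,
                 n_test_season_scenes: int) -> str:
    """Scenes before the test season are 'train'; the first half of the
    test season's scenes are 'test', the second half 'dev'."""
    if season < test_season_dict[show_name]:
        return "train"
    half = n_test_season_scenes // 2
    return "test" if scene_index < half else "dev"
```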
- unzip ./tvsg_original/merged.zip into ./tvsg_original/merged
- cd to the project root dir
- python src/merge.py
- python src/split.py