In ShareGPT, why is the conversation from the human accumulated? #11

Open
qmpham opened this issue Apr 5, 2023 · 5 comments

Comments

@qmpham

qmpham commented Apr 5, 2023

[Screenshot: line 329 of data_loading.py]

@chiayewken
Collaborator

Hi, to train the model to generate GPT-like responses, we set the target sequence to the GPT response and the input/source sequence to the previous dialog history.
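
For illustration, here is a minimal sketch of that pairing, assuming ShareGPT-style from/value fields (this is not the repository's exact code):

def make_examples(conversation: list) -> list:
    # Hypothetical helper: each GPT turn becomes one training example whose
    # source is the accumulated dialog so far and whose target is the GPT reply.
    examples = []
    history = ""
    for turn in conversation:
        if turn["from"] == "human":
            history += "Human: " + turn["value"] + "\n"
        else:
            # The reply is conditioned on all previous turns, hence the accumulation.
            examples.append((history, turn["value"]))
            history += "Assistant: " + turn["value"] + "\n"
    return examples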

@qmpham
Author

qmpham commented Apr 6, 2023

But LLaMA has a maximum input length of only 2048 tokens.

@chiayewken
Collaborator

This can be handled by the data loader/tokenizer. For example, we truncate the input on the left side if it exceeds the max length:

def __getitem__(self, i: int) -> dict:
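
A rough sketch of what left-side truncation inside __getitem__ could look like; the attribute names (self.examples, self.tokenizer, self.max_length) and the -100 label masking are assumptions for illustration, not the actual data_loading.py code:

def __getitem__(self, i: int) -> dict:
    source, target = self.examples[i]
    source_ids = self.tokenizer(source, add_special_tokens=False)["input_ids"]
    target_ids = self.tokenizer(target, add_special_tokens=False)["input_ids"]
    # Reserve room for the target, then drop the oldest source tokens (left side).
    budget = max(self.max_length - len(target_ids), 0)
    if len(source_ids) > budget:
        source_ids = source_ids[-budget:]
    input_ids = source_ids + target_ids
    # Mask the source so only the target tokens contribute to the loss.
    labels = [-100] * len(source_ids) + target_ids
    return {"input_ids": input_ids, "labels": labels}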

@qmpham
Author

qmpham commented Apr 6, 2023

Yes, I understand. But why keep such a long input when the model's capacity is only 2048 tokens? You risk truncating the question that the target response addresses.

@chiayewken
Collaborator

That's true, the dialog commonly exceeds the maximum sequence length during training. However, we can mitigate this by truncating inputs on the left side, so that the most recent dialog history on the right is preserved:
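
For example (illustrative only; the checkpoint name and setup are assumptions, not this repository's configuration), a Hugging Face tokenizer can be told to truncate from the left:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")
tokenizer.truncation_side = "left"  # drop the oldest tokens first

dialog_history = "Human: ...\nAssistant: ...\nHuman: latest question"
encoded = tokenizer(dialog_history, truncation=True, max_length=2048)
# encoded["input_ids"] keeps only the most recent 2048 tokens of the history.

This way the latest user question stays intact while older turns are dropped.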
