Killed when using long list of training data, How to solve it ? #154

housebaby · 2024-10-15T09:52:38Z

System Info

What does `split` mean here? There seems to be no difference between train and the other splits.

    self.data_list = []
    if split == "train":
        with open(dataset_config.train_data_path, encoding='utf-8') as fin:
            for line in fin:
                data_dict = json.loads(line.strip())
                self.data_list.append(data_dict)
    else:
        with open(dataset_config.val_data_path, encoding='utf-8') as fin:
            for line in fin:
                data_dict = json.loads(line.strip())
                self.data_list.append(data_dict)

Information

The official example scripts
My own modified scripts

🐛 Describe the bug

Error logs

Expected behavior

If I replace the long training list with a small one, it works.
How can I adapt this to a large training dataset?


ddlBoJack · 2024-10-27T10:59:29Z

Here, `split` indicates which set is being loaded: train, val, or test.
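On the memory side, the thread has no logs, so the following assumes the process is killed because appending every parsed `data_dict` to `self.data_list` exhausts RAM. Under that assumption, one possible workaround is to store only the byte offset of each line and parse a record when it is requested. This is a minimal sketch, not code from the repository: the class name `LazyJsonlList` is hypothetical, and it assumes the data file is JSON Lines, as the loop in the question suggests.

```python
import json
from array import array


class LazyJsonlList:
    """Hypothetical helper: indexes a JSONL file by byte offset, parses lines on demand."""

    def __init__(self, path):
        self.path = path
        # One unsigned 64-bit offset per non-empty line; far smaller than parsed dicts.
        self.offsets = array("Q")
        with open(path, "rb") as fin:
            pos = 0
            for line in fin:
                if line.strip():
                    self.offsets.append(pos)
                pos += len(line)
        # Opened on first access, so a copy made before any read has no open handle.
        self._fin = None

    def __len__(self):
        return len(self.offsets)

    def __getitem__(self, index):
        if self._fin is None:
            self._fin = open(self.path, "rb")
        self._fin.seek(self.offsets[index])
        return json.loads(self._fin.readline().decode("utf-8"))
```

Assuming the rest of the dataset class only calls `len(self.data_list)` and `self.data_list[i]`, something like `self.data_list = LazyJsonlList(dataset_config.train_data_path)` could stand in for the loop. Code that slices or shuffles the list in place would need further changes.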
