Splitting the data is wrong !? #194

Amrusama · 2024-07-02T12:06:57Z

I generated a non-iid version of FashionMNIST for 15 clients using the following command:
python generate_FashionMNIST.py noniid - pat
The output of the data distribution on the terminal is as follows:

I printed the number of data points for each class in every client and plotted each client's class distribution. Neither matched the output on the terminal.

TsingZ0 · 2024-07-03T11:44:09Z

Please differentiate between the "entire set" and the "training set." The training set is almost 75% of the entire set for each client.
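
To illustrate the difference, here is a minimal sketch of a per-client train/test split. It is not the library's actual code: the helper names `split_client` and `class_counts`, the hypothetical client data, and the exact ratio of 0.75 are assumptions made for illustration. It shows that class counts computed on a client's training split are smaller than the counts for that client's entire set, which is one way the two outputs can disagree.

```python
import numpy as np

def split_client(labels, train_ratio=0.75, seed=0):
    """Shuffle one client's sample indices and split them into train/test parts.

    train_ratio=0.75 is an assumption based on the "almost 75%" figure above.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(labels))
    n_train = int(train_ratio * len(labels))
    return idx[:n_train], idx[n_train:]

def class_counts(labels, num_classes=10):
    """Count how many samples of each class appear in `labels`."""
    return np.bincount(labels, minlength=num_classes)

# Hypothetical client holding 400 samples of class 0 and 600 of class 1
client_labels = np.array([0] * 400 + [1] * 600)
train_idx, test_idx = split_client(client_labels)

entire = class_counts(client_labels)            # counts over the client's entire set
train = class_counts(client_labels[train_idx])  # counts over the training split only
test = class_counts(client_labels[test_idx])    # counts over the test split only

print("entire set  :", entire)
print("training set:", train)
print("test set    :", test)
```

When comparing a plot against the terminal output, it helps to check which of these three count vectors each one was computed from; the training and test counts always add up to the entire-set counts.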

Amrusama · 2024-07-04T11:02:44Z

@TsingZ0 Thank you for the explanation. I suggest including this information in the dataset generation section. For instance, FashionMNIST comprises 70,000 samples in total (60,000 training and 10,000 test). My intention was for PFLlib to provide a non-iid version of that entire set.

TsingZ0 · 2024-07-11T02:57:05Z

The test set can be reshuffled after it has been split.
