missing MAX_SEN_LEN and EPOCH #9

luckystar1992 · 2019-12-10T11:11:50Z

Compared to the original C code released by Google, MAX_SEN_LEN and EPOCH are missing, which causes the following two problems.

[1] During training, each worker process reads lines from its own range of the input file, from start to end. If the input file consists of a single line (the text8 corpus, for example), the following snippet misbehaves:

    while fi.tell() < end:
        line = fi.readline().strip()
        # Skip blank lines
        if not line:
            continue

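Because the file contains no newline, line = fi.readline().strip() reads every token from start to the end of the file and ignores end. Without MAX_SEN_LEN, nothing caps the length of that single "sentence".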
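One possible way to avoid this is to stop relying on newlines and instead cut the worker's byte range into sentences of at most MAX_SEN_LEN tokens. The sketch below is only an illustration under assumptions: the function name iter_sentences, the buf_size parameter and the default of 1000 are not from this repository, and the file is assumed to be opened in binary mode so that start and end are byte offsets. A token that straddles a chunk boundary is assigned to the chunk in which it begins.

```python
import io

MAX_SEN_LEN = 1000  # assumed default, chosen to resemble the cap in the C code


def iter_sentences(fi, start, end, max_sen_len=MAX_SEN_LEN, buf_size=1 << 16):
    """Yield lists of at most max_sen_len tokens from bytes [start, end) of fi.

    fi must be a binary file object. Hypothetical helper, not the repo's API.
    """
    fi.seek(max(start - 1, 0))
    if start > 0 and not fi.read(1).isspace():
        # start falls inside a token owned by the previous chunk: skip it.
        while True:
            b = fi.read(1)
            if not b or b.isspace():
                break

    sentence, tail = [], b""
    while fi.tell() < end:
        block = fi.read(min(buf_size, end - fi.tell()))
        if not block:  # reached end of file before end
            break
        data = tail + block
        tokens = data.split()
        # If the block stops mid-token, keep the fragment for the next block.
        if tokens and not data[-1:].isspace():
            tail = tokens.pop()
        else:
            tail = b""
        for tok in tokens:
            sentence.append(tok.decode("utf-8"))
            if len(sentence) >= max_sen_len:
                yield sentence
                sentence = []

    if tail:
        # Finish the last token even if it runs past end.
        while True:
            b = fi.read(1)
            if not b or b.isspace():
                break
            tail += b
        sentence.append(tail.decode("utf-8"))
    if sentence:
        yield sentence
```

Each worker would then loop over iter_sentences(fi, start, end) instead of calling fi.readline(), so a one-line corpus such as text8 is still split evenly between processes.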

[2] An EPOCH setting would allow several passes over the corpus, so the embeddings would be trained on more samples.
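As a rough sketch of what an EPOCH setting could look like, the loop below repeats the pass over the corpus and decays the learning rate linearly over the total number of words across all epochs, a schedule modelled on the one in the original C code. All names here (learning_rate, run_epochs, sentences_fn, update) are hypothetical and not part of this repository, and the rate is recomputed once per sentence for simplicity.

```python
def learning_rate(words_done, train_words, epochs, starting_alpha=0.025):
    """Linear decay over all epochs, floored at starting_alpha * 0.0001."""
    alpha = starting_alpha * (1 - words_done / (epochs * train_words + 1))
    return max(alpha, starting_alpha * 0.0001)


def run_epochs(sentences_fn, epochs, train_words, update, starting_alpha=0.025):
    """Call update(sentence, alpha) for every sentence, epochs times over.

    sentences_fn: hypothetical callable returning a fresh iterable of
                  token lists on each call (one pass over the corpus).
    train_words:  number of tokens in one pass over the corpus.
    update:       hypothetical callable performing the actual training step.
    """
    words_done = 0
    for _ in range(epochs):
        for sentence in sentences_fn():
            alpha = learning_rate(words_done, train_words, epochs, starting_alpha)
            update(sentence, alpha)
            words_done += len(sentence)
    return words_done
```

With epochs=1 this reduces to a single pass over the corpus; larger values present every sentence to the model that many times.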


luckystar1992 mentioned this issue Dec 10, 2019

About Multiprocessing #6

Open
