
Conversation

@thomasw21
Member

This is linked to the problem of increasing memory usage when preprocessing a dataset. I believe the issue is that imap produces samples much faster than the single-threaded write consumes them, so samples accumulate in memory and usage keeps growing. In this PR, we suggest using a global semaphore that limits the number of samples held in memory, i.e., we wait for the consumer to process X samples before allowing the generator to produce more.

Same PR, just rebased on the correct branch.

@TevenLeScao
Collaborator

Hey, as discussed, this is significantly slower than just running with fewer workers. We'll stick with that for now so as not to complicate the code.

