
[Windows] Use multiprocessing.dummy on Windows for parallelism (plus adding Azure test pipeline.)#251

Closed
seanyen wants to merge 3 commits into ros-infrastructure:master from seanyen:windows_port_multiprocessing

Conversation

@seanyen
Contributor

@seanyen seanyen commented Feb 9, 2019

This is an attempt to address the discussion in #250.

The proposal is to use multiprocessing.dummy (a thread-based wrapper around the multiprocessing API) on Windows, to avoid the "safe importing of the main module" restriction (which would require downstream tools to be updated to work). This way we keep parallelism without requiring downstream tools to be modified.

In addition, this change adds an Azure Pipelines YAML file to exercise the test suite on Windows machines.

@seanyen
Contributor Author

seanyen commented Feb 9, 2019

# This restriction requires downstream tools to be modified to
# work. Instead, let's fallback to threading wrapper to get
# around it and still keep the parallelism on Windows.
import multiprocessing.dummy as multiprocessing
Member


What is the impact of using threading for this on Windows? Is the performance still better than without?

Contributor Author


Well... I ran some benchmarks today with a test case of 1000 packages (each with 1000 dummy <build_depend> nodes), and the results do not look good for the threading wrapper on Windows in general:

On CPython 2.7:

  • Disable parallel: 26 sec
  • Using multiprocessing: 7.5 sec
  • Using multiprocessing.dummy: 98.5 sec

On Anaconda Python 3.7:

  • Disable parallel: 12.8 sec
  • Using multiprocessing: 4.8 sec
  • Using multiprocessing.dummy: 14.5 sec

(My environment is using Intel® Xeon® Processor E5-1650 v4 6-cores CPU.)
(My test case is test_find_packages_with_large_amount_packages in https://github.com/seanyen-msft/catkin_pkg/blob/windows_port_multiprocessing_disabled/test/test_packages.py)
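The gap between the two pools in the numbers above can be reproduced with a small CPU-bound benchmark along these lines (a rough sketch, not the actual test case; `parse_dummy_package` is a made-up stand-in for manifest parsing):

```python
import time
from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool


def parse_dummy_package(_):
    # Stand-in for parsing a package.xml manifest: pure-Python,
    # CPU-bound work, so threads contend on the GIL while
    # separate processes run truly in parallel.
    return sum(i * i for i in range(20000))


def timed(pool_cls, n_jobs=200, workers=4):
    """Return the wall-clock time to map the workload over a pool."""
    start = time.time()
    with pool_cls(workers) as pool:
        pool.map(parse_dummy_package, range(n_jobs))
    return time.time() - start


if __name__ == '__main__':
    print('processes: %.2fs' % timed(Pool))
    print('threads:   %.2fs' % timed(ThreadPool))
```

For this kind of workload the thread pool is typically no faster than serial execution (and can be slower due to contention), which is consistent with the multiprocessing.dummy numbers reported above.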

Member

@dirk-thomas dirk-thomas Feb 12, 2019


The results don't look like this should be merged as is.

Contributor Author


That makes sense. Now that we've explored some approaches, any pointers on what to investigate next?

@seanyen
Contributor Author

seanyen commented Feb 25, 2019

Closing this, because multiprocessing.dummy is very likely not going to be an acceptable solution.


2 participants