Skip to content

Enabled filtering of paths in the CPATH variable generated for the wrapper#1436

Merged
boegel merged 7 commits intoeasybuilders:developfrom
damianam:tensorflow_filter_cpath
Jul 5, 2018
Merged

Enabled filtering of paths in the CPATH variable generated for the wrapper#1436
boegel merged 7 commits intoeasybuilders:developfrom
damianam:tensorflow_filter_cpath

Conversation

@damianam
Copy link
Member

@damianam damianam commented Jun 13, 2018

As we know, bazel pulls a lot of stuff on its own. In the existing setup, these versions might conflict with packages already installed and included in the CPATH variable generated for the icc wrapper. This PR enables package filtering there. What this does not do is filtering CPATH when using gcc-based toolchains. Input? Should we extend the filter to work also under other toolchains where a wrapper is not generated?

@damianam damianam added this to the 3.6.2 milestone Jun 13, 2018
# filter out paths from CPATH
wrapper_filter = self.cfg['wrapper_filter']
cpath = os.getenv('CPATH').split(':')
filtered_cpath = [path for fil in wrapper_filter for path in cpath if not fil in path]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test for membership should be 'not in'

@boegel
Copy link
Member

boegel commented Jun 13, 2018

@damianam Do you have an example of how you use this, and which exact problem/errors it fixes?

@damianam
Copy link
Member Author

We have protobuf pulled in as a dependency in our Python module. The version that tensorflow pulls in is incompatible with ours (slightly different API). So via CPATH our protobuf leaks into the bazel setup and compilation fails. Setting wrapper_filter = ['protobuf'] in the easyconfig fixes it.

@boegel
Copy link
Member

boegel commented Jun 14, 2018

@akesandgren Thoughts on this?

@akesandgren
Copy link
Contributor

It should do the same regardless of toolchain used or people will get confused
But wouldn't you need to filter LD_LIBRARY_PATHS too then?

@damianam
Copy link
Member Author

Agreed. We should filter CPATH in gcc toolchains too. Seems logical.

Regarding LD_LIBRARY_PATH, I don't think so. That is a runtime environment variable. LIBRARY_PATH on the other hand......

icc_wrapper_txt = INTEL_COMPILER_WRAPPER % {
'compiler_path': which('icc'),
'cpath': os.getenv('CPATH'),
'cpath': cpath,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undefined name 'cpath'

package_filter = self.cfg['package_filter']
for var in ['CPATH', 'LIBRARY_PATH']:
path = os.getenv(var).split(':')
filtered_path = [p for fil in package_filter for p in path if fil not in p]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@damianam Please log the value of the environment variable before and after filtering.

extra_vars = {
# see https://developer.nvidia.com/cuda-gpus
'cuda_compute_capabilities': [[], "List of CUDA compute capabilities to build with", CUSTOM],
'package_filter': [[], "List of paths to filter out", CUSTOM],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually a list of filter patterns for paths (since you're checking with if fil not in p), please clarify that and maybe also rename to cpath_libpath_filter_patterns to make it more explicit?


tmpdir = tempfile.mkdtemp(suffix='-bazel-configure')

# filter out paths from CPATH and LIBRARY_PATH
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add an extra comment line or two explaining why this may be needed

for var in ['CPATH', 'LIBRARY_PATH']:
path = os.getenv(var).split(':')
filtered_path = [p for fil in package_filter for p in path if fil not in p]
os.environ[var] = ':'.join(filtered_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use setvar from easybuild.tools.environment (which logs), and use os.pathsep

setvar(var, os.pathsep.join(filtered_path))

self.log.info("$%s old value was %s" % (var, path))
filtered_path = os.pathsep.join([p for fil in path_filter for p in path if fil not in p])
self.log.info("$%s new value is %s" % (var, filtered_path))
setvar(var, filtered_path)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undefined name 'setvar'

@damianam damianam changed the title Enabled filtering of paths in the CPATH variable generated for the wrapper (WIP) Enabled filtering of paths in the CPATH variable generated for the wrapper Jul 4, 2018
@damianam
Copy link
Member Author

damianam commented Jul 4, 2018

@boegel @akesandgren please review again.

@akesandgren
Copy link
Contributor

Code looks sane, though I don't know what one would put in the filter. (Literally that is)

@damianam
Copy link
Member Author

damianam commented Jul 4, 2018

Our easyconfig has this: package_filter = ['protobuf']. Without it the protobuf package we have included as part of the other stuff that gets pulled in by Python leaks in and breaks the compilation.

# might conflict with dependencies on the system and/or installed with EB. For example: protobuf
path_filter = self.cfg['path_filter']
if len(path_filter > 0):
self.log.info("Filtering $CPATH and $LIBRARY_PATH")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@damianam Please change to:

if path_filter:
    self.log.info("Filtering $CPATH and $LIBRARY_PATH with path filter %s", path_filter)

path = os.getenv(var).split(':')
self.log.info("$%s old value was %s" % (var, path))
filtered_path = os.pathsep.join([p for fil in path_filter for p in path if fil not in p])
self.log.info("$%s new value is %s" % (var, filtered_path))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@damianam env.setvar already logs both new & old value, so this extra logging is overkill?

if len(path_filter > 0):
self.log.info("Filtering $CPATH and $LIBRARY_PATH")
for var in ['CPATH', 'LIBRARY_PATH']:
path = os.getenv(var).split(':')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also use os.pathsep here, like you do below:

path = os.getenv(var).split(os.pathsep)

@boegel
Copy link
Member

boegel commented Jul 5, 2018

Reviewed & tested with an existing TensorFlow easyconfig, lgtm, thanks @damianam!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants