Transform won't yield memory in tfx after transform and it takes up total memory #227

axelning · 2021-03-08T10:15:18Z


System information

- Have I specified the code to reproduce the issue (Yes/No): yes
- Environment in which the code is executed (e.g., Local (Linux/MacOS/Windows), Interactive Notebook, Google Cloud, etc):
- TensorFlow version (you are using): 2.3.2
- TFX Version: 0.26.1
- Python version: 3.6.7

Describe the current behavior
The TFX Transform module calls into tensorflow_transform/beam/impl.py:1058:

schema = schema_inference.infer_feature_schema_v2(
      structured_outputs,
      metadata_fn.get_concrete_function(),
      evaluate_schema_overrides=False)

This calls infer_feature_schema_v2 in schema_inference.py:163.

In this function, tf2_utils.supply_missing_inputs(structured_inputs, batch_size=1) at line 195 tries to convert the inputs to tensors and does not release the GPU memory when it finishes. By default this operation takes 7715 MB on my single Tesla P40.

I then run into OOM when the subsequent training step starts requesting GPU memory. If I stop the whole process and resume, training succeeds because the transform output has already been saved, which means this part does not need to stay on the GPU once it has finished.


arghyaganguly · 2021-03-09T10:41:49Z

duplicate tfx#3343

arghyaganguly · 2021-03-09T10:43:51Z

@zoyahav, shall we track this issue here or in tfx#3343?

zoyahav · 2021-03-10T16:46:53Z

Let's keep it here for now.

@axelning are you able to check if the issue occurs with CPU as well?

axelning · 2021-04-06T12:20:12Z

> Let's keep it here for now.
>
> @axelning are you able to check if the issue occurs with CPU as well?

Setting the memory growth limit and limiting the number of workers circumvents this issue. On CPU I could not reproduce it while running, simply because I have 32 GB of memory.

Still, GPU memory management problems keep coming up; some architectural optimization may be needed.
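The comment above does not say which settings were used, so the following is only a minimal sketch of one way such a workaround might look. It assumes the "growth limitation" refers to TensorFlow's `TF_FORCE_GPU_ALLOW_GROWTH` environment variable and the worker limit refers to Beam's `--direct_num_workers` flag for the DirectRunner; the helper name `limited_beam_args` is hypothetical.

```python
import os

# Assumption: ask TensorFlow to allocate GPU memory on demand instead of
# reserving nearly all of it up front. This must be set before TensorFlow
# initializes the GPU in the process.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


def limited_beam_args(num_workers=1):
    """Hypothetical helper: build Beam DirectRunner args that cap the worker count."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return ["--direct_num_workers={}".format(num_workers)]


# Intended to be passed as `beam_pipeline_args` when constructing the TFX pipeline.
beam_pipeline_args = limited_beam_args(1)
```

Allowing growth does not make TensorFlow return memory it has already claimed; it only keeps the initial reservation small, which may leave enough room for the training step that follows.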

arghyaganguly self-assigned this Mar 9, 2021

arghyaganguly assigned zoyahav and unassigned arghyaganguly Mar 9, 2021

arghyaganguly added stat:awaiting tensorflower type:bug labels Mar 9, 2021

zoyahav assigned varshaan Mar 10, 2021

arghyaganguly added stat:awaiting response and removed stat:awaiting tensorflower labels Mar 17, 2021

arghyaganguly added stat:awaiting tensorflower and removed stat:awaiting response labels Apr 6, 2021
