Use updated symbolic_helper.check_training_mode #11900

jingyanwangms · 2022-06-18T00:27:43Z

Description:
symbolic_helper._training_mode no longer exists. According to https://github.com/pytorch/pytorch/blob/master/torch/onnx/symbolic_helper.py#L62, the new API is symbolic_helper.check_training_mode.
Motivation and Context
The current code fails with: module 'torch.onnx.symbolic_helper' has no attribute '_training_mode'

justinchuby · 2022-06-18T00:32:32Z

orttraining/orttraining/python/training/ortmodule/_custom_autograd_function_exporter.py

            )
        inplace = kwargs["inplace"]
-        training_mode = symbolic_helper._training_mode
+        training_mode = symbolic_helper.check_training_mode


_training_mode is a private field not exposed to users. Do you need access to the current training mode during export?

(check_training_mode is a function which returns None. It doesn't seem fitting to be used here)
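To make the concern concrete, here is a small sketch that does not import PyTorch. The `check_training_mode` below is a stand-in written for illustration, assuming only what the comment above states (a function that returns None); its parameter names are assumptions as well.

```python
def check_training_mode(op_train_mode: int, op_name: str) -> None:
    """Stand-in for a checker: it validates an op's mode and reports nothing back."""
    # A real checker would compare op_train_mode against the exporter state
    # and possibly warn. This stand-in only mirrors the "returns None" shape.
    return None


# What the first revision of the diff does: bind the function object itself.
training_mode = check_training_mode

# The name now refers to a callable, not to a training-mode value.
is_function = callable(training_mode)

# Calling the checker does not help either, because it yields None.
call_result = check_training_mode(1, "dropout")

print(is_function, call_result)
```

Either way, no training-mode value is obtained, which is why a different access path is discussed in the rest of the thread.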

Same question here, does check_training_mode return the training mode originally returned by _training_mode?

Do you need access to the current training mode during export?
I think so. How do I access the current training mode? I can give that a try.

+1, this is blocking the ROCm test, which is also using torch 1.12 at the moment. What's the right workaround?

Created pytorch/pytorch#79950 cc @BowenBao

_training_mode is a private field not exposed to users. Do you need access to the current training mode during export?

(check_training_mode is a function which returns None. It doesn't seem fitting to be used here)

One alternative is to simply convert it to a public field/property :)

I would like to be cautious about exposing internal states. Once we do that, changes will be very hard. I am hoping we remove more globals in the future.

@justinchuby @ytaous
FYI, it's torch.onnx._globals.GLOBALS.training_mode, not torch.onnx._globals.GLOBAL.training_mode.
I'll update this PR

Good catch, sorry about that.

No worries. The PR is updated now


pengwa · 2022-06-24T05:19:49Z

orttraining/orttraining/python/training/ortmodule/_custom_autograd_function_exporter.py

        inplace = kwargs["inplace"]
-        training_mode = symbolic_helper._training_mode
+        # TODO move to public API once exporter team exposes that
+        training_mode = _globals.GLOBALS.training_mode


Just to confirm, is this compatible with older versions of PyTorch? If not, should we add a torch version check here?

It was introduced recently. I would just extract it into a function (something like get_exporter_training_mode) and do a try/except there.
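A minimal sketch of that suggestion, assuming the two locations named in this thread: `torch.onnx._globals.GLOBALS.training_mode` on newer PyTorch and `symbolic_helper._training_mode` on older versions. The helper name comes from the comment above. Returning None when neither location is available is an assumption made for illustration, not the behavior of the merged change.

```python
import importlib


def get_exporter_training_mode():
    """Return the exporter training mode from whichever location is available.

    Hypothetical helper: tries the newer location first, then the older
    private field, and returns None if neither can be read.
    """
    try:
        # Newer layout discussed in this PR.
        _globals = importlib.import_module("torch.onnx._globals")
        return _globals.GLOBALS.training_mode
    except (ImportError, AttributeError):
        pass

    try:
        # Older layout: private field on symbolic_helper.
        symbolic_helper = importlib.import_module("torch.onnx.symbolic_helper")
        return symbolic_helper._training_mode
    except (ImportError, AttributeError):
        # Assumption for this sketch: signal "unknown" instead of raising.
        return None
```

The call site would then read `training_mode = get_exporter_training_mode()`, which keeps the version-specific lookups in one place.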

lgtm-com · 2022-07-12T20:03:04Z

This pull request introduces 1 alert and fixes 1 when merging 9d3ce75 into 6e05101 - view on LGTM.com

new alerts:

1 for Unused import

fixed alerts:

1 for Unused import

ytaous · 2022-07-12T21:04:50Z

orttraining/orttraining/python/training/ortmodule/_custom_autograd_function_exporter.py

+        else:
+            from torch.onnx import _globals
+
+            training_mode = _globals.GLOBALS.training_mode


Per discussion, let's use 1.11 as the version threshold and swap the logic, thx.
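One way to picture the requested version check is the sketch below. The helper name `uses_legacy_training_mode` is hypothetical, and comparing only the major and minor components is an assumption of this sketch. It does not reproduce the merged code.

```python
import re


def uses_legacy_training_mode(torch_version: str) -> bool:
    """Return True when the version is <= 1.11, i.e. the old private field applies.

    Only the leading "major.minor" part is compared, so suffixes such as
    "+cu113" or ".dev2022" do not affect the result.
    """
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) <= (1, 11)


# Sample inputs on both sides of the threshold.
for candidate in ("1.10.2", "1.11.0", "1.12.0+cu113", "2.0.1"):
    print(candidate, uses_legacy_training_mode(candidate))
```

With the logic swapped as requested, the `<= 1.11` branch would read `symbolic_helper._training_mode` and the `else` branch would use `_globals.GLOBALS.training_mode`, as in the diff above.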

lgtm-com · 2022-07-12T21:24:55Z

This pull request introduces 1 alert and fixes 1 when merging f707a07 into a6fd1a3 - view on LGTM.com

new alerts:

1 for Unused import

fixed alerts:

1 for Unused import

titaiwangms · 2022-07-12T23:41:38Z

Following up on this.

Co-authored-by: Jingyan Wang, Baiju Meswani

* support optimizer opt for deepspeed 0.5.9
* resolve comments
* resolve comments
* FP16_Optimizer Support for more Deepspeed Versions (#12046)
* fp16_optimizer for more ds versions
* change ds version
* bugfix
* fix bug
* Fix unused function warning for decodeMIDR(). (#12069) Changed from static function defined in header to function declared in header and defined in separate .cc file.
* pin protobuf version to be compatible with onnx (#12132) Co-authored-by: Ashwini Khade
* RoiAlign CPU EP add warning for max mode with samples != 1 (#12136)
* RoiAlign add warning about incorrect max summation when sample size not 1
* include coreml_provider_factory.h in macos build instead of coreml_ex… (#12138) include coreml_provider_factory.h in macos build instead of coreml_execution_provider.h
* List 3.10 as supported python version and remove 3.6 (#12141) list 3.10 as supported python version and remove 3.6 Co-authored-by: Randy Shuai
* Use updated symbolic_helper.check_training_mode (#11900) Co-authored-by: Jingyan Wang, Baiju Meswani
* Fix GH issue 12151 by using inverse perms for updating DQ axis attribute (#12158)
* Fix GH issue 12151. Need to use inverse perms for updating that axis to what is used for transposing the input. This only applies if the DQ node is doing per-axis dequantization.
* fixing positions for beam search gpt2 (#12156)
* fixing positions for beam search gpt2 Co-authored-by: Tianlei Wu
* remove wrong placed libs (#12201)
* Add file mapping for windows platform. (#12183)
* Add file mapping for windows platform.
* Add unit test for file mapping for windows. Also add an error message for mis-aligned offset
* Add unit test for file mapping for windows. Also add an error message for mis-aligned offset
* Update data type to avoid warnings
* Compitable data type to avoid warnings. Update CreatFileMapping2 condition for winml compiling.
* Add type conversion to avoid warnings for X86 release build. Co-authored-by: Ting Cao
* Fix bug where onnxruntime_USE_NCCL flag would default to ON (#12195) Fix bug where onnxruntime_USE_NCCL flag would default to ON, causing ORT to not build properly. New functionality: flag is ON when training is enabled and NCCL is not disabled. Flag is OFF otherwise

Co-authored-by: zhijxu
Co-authored-by: Vincent Wang
Co-authored-by: Edward Chen
Co-authored-by: Ashwini Khade
Co-authored-by: Dwayne Robinson
Co-authored-by: Carson Swope
Co-authored-by: Randy Shuai
Co-authored-by: jingyanwangms
Co-authored-by: Scott McKay
Co-authored-by: Viswanath Boga
Co-authored-by: leqiao-1
Co-authored-by: caoting-dotcom
Co-authored-by: Ting Cao
Co-authored-by: Sean Murray

jingyanwangms requested review from justinchuby and pengwa June 18, 2022 00:28

justinchuby reviewed Jun 18, 2022


This was referenced Jun 18, 2022

Remove reference to internals in torch.onnx #11901

Closed

[ONNX] Expose export training mode to public pytorch/pytorch#79950

Closed

ytaous reviewed Jun 21, 2022


pengwa reviewed Jun 24, 2022


Jingyan Wang added 2 commits July 12, 2022 19:47

use udpated symbolic_helper api

3e4d3e0

Use _globals.GLOBALS.training_mode

cc5cb48

baijumeswani force-pushed the jingywa/fix-symbolic-helper-api branch from 95948fb to 9d3ce75 Compare July 12, 2022 19:48

ytaous reviewed Jul 12, 2022


Version check to support torch <= 1.11 and > 1.11

f707a07

baijumeswani force-pushed the jingywa/fix-symbolic-helper-api branch from 9d3ce75 to f707a07 Compare July 12, 2022 21:06

ytaous approved these changes Jul 12, 2022


baijumeswani added the release:1.12 label Jul 12, 2022

baijumeswani merged commit a9d0d33 into master Jul 13, 2022

baijumeswani deleted the jingywa/fix-symbolic-helper-api branch July 13, 2022 00:26

RandySheriffH pushed a commit that referenced this pull request Jul 18, 2022

Use updated symbolic_helper.check_training_mode (#11900)

4de96ad

Co-authored-by: Jingyan Wang, Baiju Meswani
