How can I confirm that the deterministic environment variable is working? #2

cklsoft · 2019-09-17T13:31:03Z

I'm using the newest NGC container and set os.environ['TF_DETERMINISTIC_OPS'] = '1' at the beginning of my main entry point, but I don't know whether the environment variable is taking effect. I set the TF log level to debug and didn't find any log messages related to TF determinism.


duncanriach · 2019-09-17T22:12:59Z

Hi @cklsoft, there is currently nothing printed to the logs by TensorFlow that confirms that the environment variable has been acted upon. I have made a note to potentially add this in the future.

cklsoft · 2019-09-18T03:13:28Z

> there is currently nothing printed to the logs by TensorFlow that confirms that the environment variable has been acted upon. I have made a note to potentially add this in the future.

Will setting TF_DETERMINISTIC_OPS increase the number of graph nodes?

duncanriach · 2019-09-18T18:05:12Z

That's a great idea.

Yes. With the current implementation (NGC 19.06, NGC 19.07, and stock TF 1.14), if you count the graph nodes with TF_DETERMINISTIC_OPS not set (or set to '0' or 'false') and then count them again with TF_DETERMINISTIC_OPS set to '1' or 'true', you should see the number of graph nodes increase.
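One way to make that comparison is to count the entries in the graph's GraphDef. The following is only a minimal sketch, assuming TF 1.x-style graph mode and that it is called after the whole training graph has been built; the helper names `count_graph_nodes` and `report_default_graph_size` are made up for illustration and are not part of TensorFlow.

```python
def count_graph_nodes(graph):
    """Return the number of nodes in a TensorFlow graph's GraphDef."""
    return len(graph.as_graph_def().node)


def report_default_graph_size():
    # Assumption: the model and its training ops already live in the
    # default graph when this is called (TF 1.x graph mode).
    import tensorflow as tf  # imported lazily so the helper above stays TF-free
    n = count_graph_nodes(tf.compat.v1.get_default_graph())
    print("Graph nodes: %d" % n)
    return n
```

Calling `report_default_graph_size()` once in a process with the variable unset and once in a separate process with it set gives the two numbers to compare. A difference is expected only on the versions listed above; other versions may behave differently.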

Note that TF_DETERMINISTIC_OPS is sticky within the Python process: TensorFlow queries it the first time it's needed and then caches the value. So, to run without it, you need to start a fresh Python process.
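Because the value is cached per process, each configuration is best run in its own process. The sketch below uses only the standard library to launch a command with the variable set or removed; `run_in_fresh_process` is a hypothetical helper, and `train.py` in the usage note stands in for your own script.

```python
import os
import subprocess
import sys


def run_in_fresh_process(args, deterministic):
    """Run a command in a new process with TF_DETERMINISTIC_OPS set or unset."""
    env = dict(os.environ)  # copy, so the parent environment is untouched
    if deterministic:
        env["TF_DETERMINISTIC_OPS"] = "1"
    else:
        env.pop("TF_DETERMINISTIC_OPS", None)
    result = subprocess.run(args, env=env, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout
```

For example, `run_in_fresh_process([sys.executable, "train.py"], deterministic=True)` would run the (hypothetical) training script with the variable set and return whatever it printed, such as a node count or a weight summary.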

duncanriach · 2019-10-31T04:15:58Z

The ultimate test is whether your weights at the end of training change from run to run.

For Keras models, you can call the following at the end of training, and make sure it produces the same result on two consecutive runs:

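def summarize_keras_weights(model):
  # get_weights() returns a list of NumPy arrays, one per weight tensor
  weights = model.get_weights()
  # Sum every element of every array into a single checksum value
  summary = sum(map(lambda x: x.sum(), weights))
  print("Summary of weights: %.13f" % summary)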

If you're not using Keras, it would look something like this:

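import tensorflow as tf  # TF 1.x API

def summarize_weights(session):
  # Some session wrappers expose the underlying tf.Session via raw_session()
  if hasattr(session, 'raw_session'):
    session = session.raw_session()
  weights = session.run(tf.trainable_variables())
  # Sum every element of every trainable variable into a single checksum value
  summary = sum(map(lambda x: x.sum(), weights))
  print("Summary of weights: %.13f" % summary)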

It's also good to confirm that your weights are the same, on both runs, before training starts.

Please note that while the above code is based on code I've used, the code as given above has not been tested. It may contain bugs and/or may not work on more recent versions of TensorFlow or Keras.
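A summed checksum can stay the same when differences cancel out or when values are merely reordered. If bit-exact equality is what you want to check, a hash over the raw bytes is a stricter alternative. This is a minimal sketch rather than anything from the thread: `digest_weights` is a hypothetical helper, and it assumes each weight is a NumPy array (or any object with a `tobytes()` method).

```python
import hashlib


def digest_weights(weights):
    """Return a SHA-256 hex digest over the raw bytes of each weight array."""
    h = hashlib.sha256()
    for w in weights:
        h.update(w.tobytes())  # NumPy arrays provide tobytes()
    return h.hexdigest()
```

With Keras this could be called as `print(digest_weights(model.get_weights()))` before and after training; identical digests across two runs indicate bit-identical weights.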

duncanriach · 2020-01-17T22:51:37Z

This question has been answered, and there is nothing else to be done here. Closing.

duncanriach closed this as completed Jan 17, 2020

duncanriach changed the title ~~How to confirm determinism environ is work?~~ [question] How can I confirm that the deterministic environment variable is working? Jan 17, 2020

duncanriach added the question Further information is requested label Jan 17, 2020

duncanriach changed the title ~~[question] How can I confirm that the deterministic environment variable is working?~~ How can I confirm that the deterministic environment variable is working? Jan 17, 2020
