
Conversation

@phamthuonghai
Owner

No description provided.

afrozenator and others added 30 commits April 18, 2019 09:56
…ons.

This does slow down collect, because now we are calling the policy over different shapes each time: at step t (from 1 to T) the batch of observations is shaped (b, t) + OBS, and t increases every iteration.

But this seems to be unavoidable, unless we know we just want the last time-step's predictions.

PiperOrigin-RevId: 244204123
…ice evals; correct input pipeline for that.

PiperOrigin-RevId: 244224385
PiperOrigin-RevId: 244259605
This amounts to running a recurrence relation to compute rewards_to_go and
gae_advantages in reverse.

Added a test which fails on the current code.

PiperOrigin-RevId: 244288574
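The reverse recurrence mentioned above can be sketched as follows. This is a minimal NumPy illustration of the standard rewards-to-go / GAE computation; the function name and exact signature are assumptions, not the commit's actual code:

```python
import numpy as np

def rewards_to_go_and_gae(rewards, values, gamma=0.99, lam=0.95):
    """Compute rewards-to-go and GAE advantages by iterating in reverse.

    Walking t from T-1 down to 0 turns both quantities into simple
    running accumulators instead of nested sums.
    """
    T = len(rewards)
    rtg = np.zeros(T)
    adv = np.zeros(T)
    running_rtg = 0.0
    running_adv = 0.0
    for t in reversed(range(T)):
        # rewards_to_go[t] = r_t + gamma * rewards_to_go[t+1]
        running_rtg = rewards[t] + gamma * running_rtg
        rtg[t] = running_rtg
        # GAE: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t),
        # adv[t] = delta_t + gamma * lam * adv[t+1]
        next_value = values[t + 1] if t + 1 < T else 0.0
        delta = rewards[t] + gamma * next_value - values[t]
        running_adv = delta + gamma * lam * running_adv
        adv[t] = running_adv
    return rtg, adv
```

A recurrence like this is exactly the kind of code where an off-by-one at the boundary only shows up in a targeted test, which matches the note that a failing test was added.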
…feed forward layer; code readability improvements.

PiperOrigin-RevId: 244346833
PiperOrigin-RevId: 244393153
…ove now-redundant tpu config.

PiperOrigin-RevId: 244393232
…ff before crashing.

PiperOrigin-RevId: 244393243
…lags.

TODO(afrozm): Use gin :)

PiperOrigin-RevId: 244398248
NOTE: This isn't used anywhere right now though.
PiperOrigin-RevId: 244402893
We may re-add some of the parts we are removing now, but it will be easier to refactor with fewer parts.

All models (MLP, ResNet, TransformerLM) work fine, other Transformers need refactoring anyway.

Keeping the stax Shared layer so as to use it as the base layer later
(we will want to enable sharing weights by object, as that's very natural).

PiperOrigin-RevId: 244427175
interactively and in other places if need be.

PiperOrigin-RevId: 244438577
PiperOrigin-RevId: 244439471
PiperOrigin-RevId: 244441999
PiperOrigin-RevId: 244689120
…ace.

 * Errors out while computing gradients even with a constant small learning rate (1e-4) and cutting trajectories at 100 steps only (this early stopping is not in this change).
 * This happens with both SGD and Adam.
   * Any ideas why?
   * It goes slightly further with SGD than Adam, but this is possibly random.

PiperOrigin-RevId: 244697209
 - Additional logging in ppo.py for min/max/avg rewards.

PiperOrigin-RevId: 244756017
…stead of (apply, init) = net.

PiperOrigin-RevId: 244769076
… with stax any more.

PiperOrigin-RevId: 244798179
Rename hparam_sets for mixture transformer
Bugfix: fix tf.squeeze call to handle the case where only one element is present in the batch
Change implementation to add mixture embeddings to original vocab embedding
matrix, and use bottom method to retrieve them
Change implementation to add mixture embeddings directly to decoder_input
Added new problem for multi-spelling dataset

PiperOrigin-RevId: 244915958
PiperOrigin-RevId: 244926626
    - Previous tests pass observations as (B, T) + OBS.
    - But they should actually be (B, T+1) + OBS.
 - More shape checking in the code.
    - We now assert on those shapes in the code.
 - More logging in collect.
    - Mainly for timing; this is at vlog 2, the rest of the logging is at vlog 1.

PiperOrigin-RevId: 244947311
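The (B, T+1) + OBS convention above follows from collecting T transitions: the observation after the final action is also stored, giving one extra time step. A hedged sketch of the kind of shape assertion being added (the helper name is hypothetical, not the commit's actual code):

```python
import numpy as np

def check_observation_shape(observations, batch_size, time_steps, obs_shape):
    """Assert that collected observations are shaped (B, T+1) + OBS.

    Collecting T transitions yields T+1 observations, because the
    observation reached after the final action is stored as well.
    """
    expected = (batch_size, time_steps + 1) + obs_shape
    assert observations.shape == expected, (
        "expected shape %s, got %s" % (expected, observations.shape))
```

For example, a batch of 8 trajectories of 100 steps over 84x84 frames should arrive as an array of shape (8, 101, 84, 84).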
Oscar Ramirez and others added 29 commits June 17, 2019 10:59
PiperOrigin-RevId: 253615467
PiperOrigin-RevId: 253669173
- Dropout and attention masking no longer store constants with the same shape as the activations in global memory
- layers.one_hot no longer stores a large intermediate quantity in global memory

PiperOrigin-RevId: 253851993
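One common way to avoid a large one_hot intermediate is to encode by broadcast comparison instead of indexing a materialized identity matrix. A minimal NumPy sketch of that idea, under the assumption that the naive version looked like `np.eye(n)[x]` (the actual trax change may differ):

```python
import numpy as np

def one_hot(x, n_classes, dtype=np.float32):
    """One-hot encode integer labels by broadcast comparison.

    Unlike np.eye(n_classes)[x], this never materializes an
    (n_classes, n_classes) identity matrix as an intermediate;
    the only allocation is the output itself.
    """
    return (x[..., None] == np.arange(n_classes)).astype(dtype)
```

For large vocabularies the identity-matrix intermediate is quadratic in the number of classes, which is exactly the kind of global-memory cost the commit is removing.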
PiperOrigin-RevId: 253874773
PiperOrigin-RevId: 253881109
…ce optionally type_ids in transformer_layers.transformer_prepare_encoder.

PiperOrigin-RevId: 253905794
PiperOrigin-RevId: 254044123
…aid workaround.

PiperOrigin-RevId: 254105522
…ng fast decoding. (#1602)

* Using partial targets at inference time.

* Saving attention history to Transformer's cache during fast decoding.
PiperOrigin-RevId: 254119136
PiperOrigin-RevId: 254135712
  - Explicitly track n_inputs and n_outputs for each layer.
  - Implement stack semantics via Serial combinator.
  - Define/implement variable-width sublayer semantics for Parallel op.
  - Remove lingering dependencies on non-sequence structure of args.
  - Remove Select and Branch ops.
  - Update all models to use the modified/restricted set of ops.

PiperOrigin-RevId: 254217127
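The stack semantics described in the list above can be sketched in a few lines: each sublayer pops its n_inputs off the top of a data stack and pushes its n_outputs back on, and the combinator derives its own signature from its sublayers. This is an illustrative toy, not trax's actual Layer/Serial implementation:

```python
class Layer:
    """Minimal layer with explicitly tracked n_inputs / n_outputs."""

    def __init__(self, fn, n_inputs=1, n_outputs=1):
        self.fn = fn
        self.n_inputs = n_inputs
        self.n_outputs = n_outputs

    def __call__(self, *args):
        out = self.fn(*args)
        return out if isinstance(out, tuple) else (out,)


class Serial(Layer):
    """Combinator implementing stack semantics over its sublayers."""

    def __init__(self, *sublayers):
        self.sublayers = sublayers
        # Derive this combinator's own n_inputs / n_outputs.
        n_in, n_avail = 0, 0
        for layer in sublayers:
            if n_avail < layer.n_inputs:
                # Stack too shallow: this layer needs extra outside inputs.
                n_in += layer.n_inputs - n_avail
                n_avail = layer.n_inputs
            n_avail += layer.n_outputs - layer.n_inputs
        self.n_inputs, self.n_outputs = n_in, n_avail

    def __call__(self, *stack):
        stack = list(stack)
        for layer in self.sublayers:
            # Pop n_inputs from the top, push n_outputs back on.
            outputs = layer(*stack[:layer.n_inputs])
            stack = list(outputs) + stack[layer.n_inputs:]
        return tuple(stack)
```

For example, a duplicate layer (1 in, 2 out) followed by an add layer (2 in, 1 out) composes into a Serial with 1 input and 1 output that doubles its argument.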
`probs_parameter`, `logits_parameter`. In the future, the properties `probs`
and `logits` will return `None` unless that's how the distribution was parameterized.

PiperOrigin-RevId: 254318793
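The accessor pattern above can be illustrated with a toy Bernoulli: the distribution may be parameterized by either probs or logits, the `*_parameter()` methods always compute a concrete value, and the plain attributes only echo what was passed in. A hedged sketch of the pattern, not TensorFlow Probability's actual implementation:

```python
import math

class Bernoulli:
    """Toy illustration of the probs_parameter / logits_parameter pattern."""

    def __init__(self, probs=None, logits=None):
        # Exactly one of probs / logits is expected; the other stays None.
        self.probs = probs
        self.logits = logits

    def probs_parameter(self):
        """Return probs, computing sigmoid(logits) if needed."""
        if self.probs is not None:
            return self.probs
        return 1.0 / (1.0 + math.exp(-self.logits))

    def logits_parameter(self):
        """Return logits, computing logit(probs) if needed."""
        if self.logits is not None:
            return self.logits
        return math.log(self.probs / (1.0 - self.probs))
```

So `Bernoulli(logits=0.0).probs` is `None`, while `probs_parameter()` still yields 0.5; callers that need a value migrate to the methods.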
PiperOrigin-RevId: 254322989
…ocstring and code.

* Modify github function docstring problem to generate samples with an "embed_code" feature, where 0 indicates that the input is a docstring and 1 indicates code.

PiperOrigin-RevId: 254522155
conv2dflipout/denseflipout follow the original source code. lstmcellflipout is mostly a copy-paste of tf.keras.layers.lstmcell, but with flipout perturbations.

PiperOrigin-RevId: 254569516
Also, for Attention layers, make n_heads have default value 1.

PiperOrigin-RevId: 254820673
PiperOrigin-RevId: 254869334
PiperOrigin-RevId: 255015841
PiperOrigin-RevId: 255117361
PiperOrigin-RevId: 255208944
PiperOrigin-RevId: 255212377
PiperOrigin-RevId: 255214277
PiperOrigin-RevId: 255279570
… hit an error if we were unable to complete writing the model file for any reason, so fall back on the previous model file if one exists and pick up from there.

PiperOrigin-RevId: 255335440
@phamthuonghai phamthuonghai merged commit 0d149ea into phamthuonghai:master Jun 27, 2019
