
Conversation

@phamthuonghai
Owner

No description provided.

afrozenator and others added 30 commits April 18, 2019 09:56
…ons.

This does slow down collect, because now we are calling the policy over different shapes each time: at step t (from 1 to T) the batch of observations is shaped (b, t) + OBS, and t increases every iteration.

But this seems to be unavoidable, unless we know we just want the last time-step's predictions.

PiperOrigin-RevId: 244204123
…ice evals; correct input pipeline for that.

PiperOrigin-RevId: 244224385
PiperOrigin-RevId: 244259605
This amounts to running a recurrence relation to compute rewards_to_go and
gae_advantages in reverse.

Added a test which fails on the current code.

PiperOrigin-RevId: 244288574
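The reverse recurrence mentioned above can be sketched as follows. This is a minimal NumPy illustration of the standard rewards-to-go / GAE computation; the function name and exact signature are assumptions, not the commit's actual code:

```python
import numpy as np

def rewards_to_go_and_gae(rewards, values, gamma=0.99, lam=0.95):
    """Compute rewards-to-go and GAE advantages by iterating in reverse.

    Walking t from T-1 down to 0 turns both quantities into simple
    running accumulators instead of nested sums.
    """
    T = len(rewards)
    rtg = np.zeros(T)
    adv = np.zeros(T)
    running_rtg = 0.0
    running_adv = 0.0
    for t in reversed(range(T)):
        # rewards_to_go[t] = r_t + gamma * rewards_to_go[t+1]
        running_rtg = rewards[t] + gamma * running_rtg
        rtg[t] = running_rtg
        # GAE: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t),
        # adv[t] = delta_t + gamma * lam * adv[t+1]
        next_value = values[t + 1] if t + 1 < T else 0.0
        delta = rewards[t] + gamma * next_value - values[t]
        running_adv = delta + gamma * lam * running_adv
        adv[t] = running_adv
    return rtg, adv
```

A recurrence like this is exactly the kind of code where an off-by-one at the boundary only shows up in a targeted test, which matches the note that a failing test was added.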
…feed forward layer; code readability improvements.

PiperOrigin-RevId: 244346833
PiperOrigin-RevId: 244393153
…ove now-redundant tpu config.

PiperOrigin-RevId: 244393232
…ff before crashing.

PiperOrigin-RevId: 244393243
…lags.

TODO(afrozm): Use gin :)

PiperOrigin-RevId: 244398248
NOTE: This isn't used anywhere right now though.
PiperOrigin-RevId: 244402893
We may re-add some of the parts we are removing now, but it will be easier to refactor with fewer parts.

All models (MLP, ResNet, TransformerLM) work fine, other Transformers need refactoring anyway.

Keeping the stax Shared layer so as to use it as the base layer later
(we will want to enable sharing weights by object, as that's very natural).

PiperOrigin-RevId: 244427175
interactively and in other places if need be.

PiperOrigin-RevId: 244438577
PiperOrigin-RevId: 244439471
PiperOrigin-RevId: 244441999
PiperOrigin-RevId: 244689120
…ace.

 * Errors out while computing gradients even with a constant small learning rate (1e-4) and cutting trajectories at 100 steps only (this early stopping is not in this change).
 * This happens with both SGD and Adam.
   * Any ideas why?
   * It goes slightly further with SGD than Adam, but this is possibly random.

PiperOrigin-RevId: 244697209
 - Additional logging in ppo.py for min/max/avg rewards.

PiperOrigin-RevId: 244756017
…stead of (apply, init) = net.

PiperOrigin-RevId: 244769076
… with stax any more.

PiperOrigin-RevId: 244798179
Rename hparam_sets for mixture transformer
Bugfix: fix tf.squeeze call to handle the case where only one element is present in the batch
Change implementation to add mixture embeddings to original vocab embedding
matrix, and use bottom method to retrieve them
Change implementation to add mixture embeddings directly to decoder_input
Added new problem for multi-spelling dataset

PiperOrigin-RevId: 244915958
PiperOrigin-RevId: 244926626
    - Previous tests pass observations as (B, T) + OBS.
    - But they should actually be (B, T+1) + OBS.
 - More shape checking in the code.
    - We now assert on those shapes in the code.
 - More logging in collect.
    - Mainly for timing; this is at vlog 2, the rest of the logging is at vlog 1.

PiperOrigin-RevId: 244947311
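The (B, T+1) + OBS convention above follows from collecting T transitions: the observation after the final action is also stored, giving one extra time step. A hedged sketch of the kind of shape assertion being added (the helper name is hypothetical, not the commit's actual code):

```python
import numpy as np

def check_observation_shape(observations, batch_size, time_steps, obs_shape):
    """Assert that collected observations are shaped (B, T+1) + OBS.

    Collecting T transitions yields T+1 observations, because the
    observation reached after the final action is stored as well.
    """
    expected = (batch_size, time_steps + 1) + obs_shape
    assert observations.shape == expected, (
        "expected shape %s, got %s" % (expected, observations.shape))
```

For example, a batch of 8 trajectories of 100 steps over 84x84 frames should arrive as an array of shape (8, 101, 84, 84).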
Oscar Ramirez and others added 29 commits June 17, 2019 10:59
PiperOrigin-RevId: 253615467
PiperOrigin-RevId: 253669173
- Dropout and attention masking no longer store constants with the same shape as the activations in global memory
- layers.one_hot no longer stores a large intermediate quantity in global memory

PiperOrigin-RevId: 253851993
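One common way to avoid a large one_hot intermediate is to encode by broadcast comparison instead of indexing a materialized identity matrix. A minimal NumPy sketch of that idea, under the assumption that the naive version looked like `np.eye(n)[x]` (the actual trax change may differ):

```python
import numpy as np

def one_hot(x, n_classes, dtype=np.float32):
    """One-hot encode integer labels by broadcast comparison.

    Unlike np.eye(n_classes)[x], this never materializes an
    (n_classes, n_classes) identity matrix as an intermediate;
    the only allocation is the output itself.
    """
    return (x[..., None] == np.arange(n_classes)).astype(dtype)
```

For large vocabularies the identity-matrix intermediate is quadratic in the number of classes, which is exactly the kind of global-memory cost the commit is removing.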
PiperOrigin-RevId: 253874773
PiperOrigin-RevId: 253881109
…ce optionally type_ids in transformer_layers.transformer_prepare_encoder.

PiperOrigin-RevId: 253905794
PiperOrigin-RevId: 254044123
…aid workaround.

PiperOrigin-RevId: 254105522
…ng fast decoding. (#1602)

* Using partial targets at inference time.

* Saving attention history to Transformer's cache during fast decoding.
PiperOrigin-RevId: 254119136
PiperOrigin-RevId: 254135712
  - Explicitly track n_inputs and n_outputs for each layer.
  - Implement stack semantics via Serial combinator.
  - Define/implement variable-width sublayer semantics for Parallel op.
  - Remove lingering dependencies on non-sequence structure of args.
  - Remove Select and Branch ops.
  - Update all models to use the modified/restricted set of ops.

PiperOrigin-RevId: 254217127
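The stack semantics described in the list above can be sketched in a few lines: each sublayer pops its n_inputs off the top of a data stack and pushes its n_outputs back on, and the combinator derives its own signature from its sublayers. This is an illustrative toy, not trax's actual Layer/Serial implementation:

```python
class Layer:
    """Minimal layer with explicitly tracked n_inputs / n_outputs."""

    def __init__(self, fn, n_inputs=1, n_outputs=1):
        self.fn = fn
        self.n_inputs = n_inputs
        self.n_outputs = n_outputs

    def __call__(self, *args):
        out = self.fn(*args)
        return out if isinstance(out, tuple) else (out,)


class Serial(Layer):
    """Combinator implementing stack semantics over its sublayers."""

    def __init__(self, *sublayers):
        self.sublayers = sublayers
        # Derive this combinator's own n_inputs / n_outputs.
        n_in, n_avail = 0, 0
        for layer in sublayers:
            if n_avail < layer.n_inputs:
                # Stack too shallow: this layer needs extra outside inputs.
                n_in += layer.n_inputs - n_avail
                n_avail = layer.n_inputs
            n_avail += layer.n_outputs - layer.n_inputs
        self.n_inputs, self.n_outputs = n_in, n_avail

    def __call__(self, *stack):
        stack = list(stack)
        for layer in self.sublayers:
            # Pop n_inputs from the top, push n_outputs back on.
            outputs = layer(*stack[:layer.n_inputs])
            stack = list(outputs) + stack[layer.n_inputs:]
        return tuple(stack)
```

For example, a duplicate layer (1 in, 2 out) followed by an add layer (2 in, 1 out) composes into a Serial with 1 input and 1 output that doubles its argument.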
`probs_parameter`, `logits_parameter`. In the future, the properties `probs`
and `logits` will return `None` unless that's how the distribution was parameterized.

PiperOrigin-RevId: 254318793
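The accessor pattern above can be illustrated with a toy Bernoulli: the distribution may be parameterized by either probs or logits, the `*_parameter()` methods always compute a concrete value, and the plain attributes only echo what was passed in. A hedged sketch of the pattern, not TensorFlow Probability's actual implementation:

```python
import math

class Bernoulli:
    """Toy illustration of the probs_parameter / logits_parameter pattern."""

    def __init__(self, probs=None, logits=None):
        # Exactly one of probs / logits is expected; the other stays None.
        self.probs = probs
        self.logits = logits

    def probs_parameter(self):
        """Return probs, computing sigmoid(logits) if needed."""
        if self.probs is not None:
            return self.probs
        return 1.0 / (1.0 + math.exp(-self.logits))

    def logits_parameter(self):
        """Return logits, computing logit(probs) if needed."""
        if self.logits is not None:
            return self.logits
        return math.log(self.probs / (1.0 - self.probs))
```

So `Bernoulli(logits=0.0).probs` is `None`, while `probs_parameter()` still yields 0.5; callers that need a value migrate to the methods.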
PiperOrigin-RevId: 254322989
…ocstring and code.

* Modify github function docstring problem to generate samples with an "embed_code" feature, where 0 indicates that the input is a docstring and 1 indicates code.

PiperOrigin-RevId: 254522155
conv2dflipout/denseflipout follow the original source code. lstmcellflipout is mostly a copy-paste of tf.keras.layers.lstmcell, but with flipout perturbations.

PiperOrigin-RevId: 254569516
Also, for Attention layers, make n_heads have default value 1.

PiperOrigin-RevId: 254820673
PiperOrigin-RevId: 254869334
PiperOrigin-RevId: 255015841
PiperOrigin-RevId: 255117361
PiperOrigin-RevId: 255208944
PiperOrigin-RevId: 255212377
PiperOrigin-RevId: 255214277
PiperOrigin-RevId: 255279570
… hit an error if we were unable to complete writing the model file for any reason, so fall back on the previous model file if one exists and pick up from there.

PiperOrigin-RevId: 255335440
@phamthuonghai phamthuonghai merged commit 0d149ea into phamthuonghai:master Jun 27, 2019
