
modify the zero-size values in KJT for dynamic shape compatibility #2250

Closed
wants to merge 3 commits

Conversation

TroyGarden (Contributor)

Summary:

context

  • dynamic shape usually has a minimum value requirement: dynamic_shape >= 2
  • however, in reality, the actual KJT._values could be empty
  • this issue was described in D57998381

Running the local-default test every few times, one can get an error about some dynamic_shape being zero.
This is because in some corner cases (not all that rare), a dynamic_shape dim of the sample_input can be zero,
and a 0-size dynamic shape is handled differently during torch.export. Basically it assumes this dynamic shape should always be zero (a toy sketch of this behavior follows the error log below).

  • error log: P1462233278
[rank0]:   - Not all values of vlen5 = L['args'][0][0].event_id_list_features_seqs['marketplace']._values.size()[0] in the specified range are valid because vlen5 was inferred to be a constant (0).
[rank0]: Suggested fixes:
[rank0]:   vlen5 = 0
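A toy sketch of the constraint (hypothetical module and sizes, not the PR's actual repro): torch.export keeps a dim dynamic when the sample input meets the declared minimum, but a 0-size sample gets specialized to the constant 0, as in the log above.

```python
import torch

class Toy(torch.nn.Module):
    def forward(self, values: torch.Tensor) -> torch.Tensor:
        return values * 2

# Dynamic dims come with a minimum requirement (>= 2 in practice).
vlen = torch.export.Dim("vlen", min=2)

# Non-empty sample input: export succeeds and dim 0 stays dynamic.
ep = torch.export.export(
    Toy(), (torch.randn(4),), dynamic_shapes={"values": {0: vlen}}
)

# 0-size sample input: torch.export instead infers the dim to be a
# constant (0) and suggests pinning it, e.g. "vlen = 0", mirroring the
# error log above.
# torch.export.export(
#     Toy(), (torch.randn(0),), dynamic_shapes={"values": {0: vlen}}
# )
```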

method

  • pad kjt._values to the minimum required size (2,)
  • in the case of empty values, kjt._lengths and kjt._offsets should all be zeros
  • this doesn't affect the true logical/mathematical values of the kjt (see the sketch after this list)
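A minimal sketch of the padding, with a hypothetical helper name (a weighted KJT would need the same treatment for its _weights):

```python
import torch
from torchrec.sparse.jagged_tensor import KeyedJaggedTensor

def pad_empty_kjt_values(kjt: KeyedJaggedTensor) -> KeyedJaggedTensor:
    # If _values is already non-empty, nothing to do.
    if kjt.values().numel() > 0:
        return kjt
    # Pad values to the minimum size (2,) required by dynamic shapes.
    # Lengths stay all-zero, so no offset ever points into the padding
    # and the logical content of the KJT is unchanged.
    return KeyedJaggedTensor(
        keys=kjt.keys(),
        values=torch.zeros(2, dtype=kjt.values().dtype),
        lengths=kjt.lengths(),
    )
```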

issues

  1. exported_program.module can't take empty-value input.
  2. the deserialized unflattened model can't take empty-value input, which can occur in real data.
  3. the deserialized unflattened model can't take altered input, which could be a potential workaround if issue 2 can't be resolved.

NOTE: Please check the in-line comments in the test file for details

Other Concerns

  1. the inconsistency in the KJT (lengths are all zeros, but values is non-empty) might be incompatible with some downstream functions/operators; more tests are needed to confirm (see the illustration below).
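To make the concern concrete (hypothetical values; this assumes the KJT constructor does not eagerly validate the invariant): the padding breaks the usual values.numel() == lengths.sum() relationship that downstream ops may rely on.

```python
import torch
from torchrec.sparse.jagged_tensor import KeyedJaggedTensor

# A padded-but-logically-empty KJT: 2 keys, batch_size 2, all lengths zero.
kjt = KeyedJaggedTensor(
    keys=["f1", "f2"],
    values=torch.zeros(2, dtype=torch.int64),   # padded to size (2,)
    lengths=torch.zeros(4, dtype=torch.int64),  # logically empty
)

# The usual invariant no longer holds: 2 values vs. sum(lengths) == 0.
assert kjt.values().numel() != int(kjt.lengths().sum())
```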

Differential Revision: D45410437

Huanyu He added 2 commits July 26, 2024 19:28
Summary:
* Intention
1. The retention time for the `unagi_newsie_delivery` and `unagi_row_newsie_delivery` tables is too short for monitoring traffic/volume abnormalities.
2. Use ODS to log volume for the following columns: `usecase`, `medium`, `policy`, `decision`, `reason`

* Changes
1. add toODS for the following columns: `usecase`, `medium`, `policy`, `decision`, and `reason`
2. Fix a typo in the previous ODS

Differential Revision: D53558783
Summary:
Pull Request resolved: pytorch#2202

# context
* Discussed ctx.new_dynamic_size() with Angela; she told me it just registers a brand-new dynamic shape for the batch_size. So I traced back why registering batch_size as a dynamic shape caused so many problems.
* All the complexity really comes from this line in serializer.py: batch_size = id_list_features.stride()
* We get the batch_size from this function, which is derived either from "_maybe_compute_stride_kjt" or from other sources.
* In particular, the line stride = lengths.numel() // len(keys) explains a lot of the error messages I encountered before.
* This correlation between lengths.numel() and the batch_size really complicates the dynamic-shape guards; to cope with it I made up many artificial workarounds, e.g. in utils.py. (A sketch of the derivation follows this list.)
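An illustrative paraphrase of that derivation (not the exact source): with no explicit stride stored, the batch size is recovered from the flat lengths tensor, which is what ties batch_size to lengths.numel().

```python
import torch

keys = ["f1", "f2"]
# Flat per-key lengths for a batch of 3: 2 keys x 3 batch entries.
lengths = torch.tensor([1, 0, 2, 1, 0, 3])

# The derivation called out above: batch size inferred from lengths.
stride = lengths.numel() // len(keys)
assert stride == 3
```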

# details
* We still declare/register the dynamic shape in the mark_dynamic_kjt function,
* but pass the _dynamic_batch_size to the kjt and retrieve it from the serializer's meta_forward (see the sketch after this list).
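A sketch of the idea only; the real mark_dynamic_kjt and the serializer's meta_forward live in torchrec.ir and have different signatures. The point is to stash the registered dynamic batch size on the KJT and read it back, instead of re-deriving it from lengths via stride().

```python
from typing import Optional

from torchrec.sparse.jagged_tensor import KeyedJaggedTensor

def attach_dynamic_batch_size(kjt: KeyedJaggedTensor, batch_size) -> None:
    # Hypothetical attribute mirroring the PR's _dynamic_batch_size.
    kjt._dynamic_batch_size = batch_size

def resolve_batch_size(kjt: KeyedJaggedTensor) -> int:
    # Prefer the explicitly registered symbol; fall back to the old path,
    # which couples batch_size to lengths.numel() and complicates guards.
    explicit: Optional[int] = getattr(kjt, "_dynamic_batch_size", None)
    return explicit if explicit is not None else kjt.stride()
```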

Differential Revision: D59201188
facebook-github-bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Jul 27, 2024
facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D45410437

TroyGarden deleted the export-D45410437 branch on August 8, 2024 22:13