Conversation

@dan-zheng dan-zheng commented Aug 17, 2019

Canonicalize JVPs/VJPs to return maximally abstracted linear map functions
with `@in_guaranteed` parameters and an `@out` result.

This is a necessary step towards re-enabling LoadableByAddress:
the linear map type is no longer computed from the original function type.

Summary of changes:

  • SILGen
    • Extra reabstraction logic for JVPs/VJPs and differentials/pullbacks to
      handle maximally indirect linear maps.
    • Related forum discussion.
  • Differentiation transform
    • Use only tangent/adjoint buffers in differential/pullback generation.

See TF-11 for more info regarding LoadableByAddress.
See TF-625 for more info regarding maximally abstracted linear maps.
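
As a sketch of the convention (the function names and types here are illustrative, not taken from the patch): for an original function of lowered type `(Float) -> Float`, a maximally abstracted VJP and pullback would pass every parameter `@in_guaranteed` and return every result `@out`, even though `Float` is loadable:

```sil
// Hypothetical sketch of the maximally indirect convention.
// Pullback: all parameters @in_guaranteed, all results @out.
sil @pullback : $@convention(thin) (@in_guaranteed Float) -> @out Float

// VJP: returns the original result indirectly, plus the pullback closure.
sil @vjp : $@convention(thin) (@in_guaranteed Float)
  -> (@out Float, @owned @callee_guaranteed (@in_guaranteed Float) -> @out Float)
```

The intent is that this type no longer depends on whether the original parameters and results are loadable, so LoadableByAddress rewrites cannot invalidate it.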


Initial patch by @rxwei; the original commits were lost because resolving conflicts got too complicated, sorry.

@dan-zheng dan-zheng added the tensorflow This is for "tensorflow" branch PRs. label Aug 17, 2019
@dan-zheng dan-zheng force-pushed the ad-all-indirect branch 3 times, most recently from a690449 to bf6b528 Compare September 21, 2019 03:03
- Change differential generation to use only tangent buffers.
- Change pullback visitors to use only adjoint buffers.
- Change adjoint of active value propagation to use only adjoint buffers.
- Mark all tangent/adjoint value helpers as `[[deprecated]]`.
  - They are not deleted because helpers may become useful after
    SIL opaque values are introduced.
Will re-add in a follow-up for separation of concerns.
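
The buffer-only approach in the commits above can be sketched as follows (the SIL is illustrative; register names and types are hypothetical):

```sil
// Hypothetical sketch: the pullback keeps one adjoint buffer per active
// value and accumulates contributions in place, instead of materializing
// AdjointValue objects.
%adj_x = alloc_stack $Float               // adjoint buffer for active value %x
copy_addr %seed to [init] %adj_x : $*Float
// ...later contributions are accumulated into %adj_x in place...
dealloc_stack %adj_x : $*Float
```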
@dan-zheng dan-zheng force-pushed the ad-all-indirect branch 2 times, most recently from 9dbb1d6 to 361ae93 Compare September 21, 2019 03:32
@dan-zheng

@swift-ci Please test tensorflow

@dan-zheng

@swift-ci Please test tensorflow

@dan-zheng

It would be good to test the performance impact of this patch. Using adjoint buffers instead of adjoint values may incur a performance penalty.

Remove unused `AdjointValue` utilities for materialization and addition.
`AdjointValue` itself is preserved to implement symbolic zero buffer optimization.
Address review feedback.
Standardize variable naming.
`tuple_extract` is no longer testable after Differentiation has been moved
before OwnershipModelEliminator. Old reproducers for `tuple_extract` now
generate `destructure_tuple`.

Deleting untestable code is prudent.
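
For reference, the shape of the difference in SIL (a sketch with hypothetical values):

```sil
// Non-ownership form, produced after OwnershipModelEliminator has run:
%elt = tuple_extract %tup : $(Float, Float), 0

// Ownership-correct form, seen now that Differentiation runs earlier:
(%fst, %snd) = destructure_tuple %tup : $(Float, Float)
```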
Avoid unnecessary local tangent struct allocation.
Generate `struct_element_addr` into existing struct tangent buffer.
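
A sketch of the intended pattern (struct and field names are hypothetical): project directly into the existing tangent buffer rather than allocating a fresh local tangent struct:

```sil
// Project the field's tangent address out of the existing tangent buffer,
// avoiding an alloc_stack + copy of a local tangent struct.
%x_adr = struct_element_addr %tan_buf : $*Point.TangentVector,
                                        #Point.TangentVector.x
```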
dan-zheng and others added 4 commits September 25, 2019 18:25
There are no known tests for `tuple_extract` visitors.
Re-adding `tuple_extract` visitors may be necessary when differentiation
supports inout parameters.

This Gist shows cases where SILGen produces `struct_extract`:
https://gist.github.com/dan-zheng/1343673d2d4d20d403306283b42d522b
Some cleanup will be moved to a separate patch.
@dan-zheng

A less invasive approach for re-enabling LoadableByAddress was found: #27923

Maximal indirection will not be pursued further: the differentiation transform should not be forced to generate maximally indirect code to work around issues caused by LoadableByAddress.

@dan-zheng dan-zheng closed this Dec 3, 2019