This repository was archived by the owner on Jul 1, 2023. It is now read-only.

Conversation

@eaplatanios
Contributor

@eaplatanios eaplatanios commented Apr 21, 2019

Moves all of the TensorFlow module here. All tests pass locally.

Friend PR: swiftlang/swift#24452.

@rxwei
Contributor

rxwei commented Apr 21, 2019

Could you create a PR that contains all the changes? I reverted #70 and #106 in order to maintain a working build in this repo.

@rxwei
Contributor

rxwei commented Apr 21, 2019

Sorry for the inconvenience!

@eaplatanios
Contributor Author

@rxwei No worries. I hope I did the merge correctly :) The only error that remains when I try to compile this now is the ambiguous subscript.

@eaplatanios
Contributor Author

@rxwei I tried removing the subscript by putting it in a #if COMPILING_TENSORFLOW_MODULE/#endif block, but now we get a crash in swift::SILFunctionBuilder::addFunctionAttributes(swift::SILFunction*, swift::DeclAttributes&, swift::SILModule&, swift::SILDeclRef), which I've also seen before with ambiguous functions. I believe this is because all of the extension functions are ambiguous in this case.

@eaplatanios
Contributor Author

Also, shouldn't we add @_semantics("autodiff.nonvarying") to all functions that we know to be non-varying? E.g., .>, .<, .==, etc. It would also be nice to expose this to users using a nicer name. Is that possible?

@rxwei
Contributor

rxwei commented Apr 21, 2019

Also, shouldn't we add @_semantics("autodiff.nonvarying") to all functions that we know to be non-varying? E.g., .>, .<, .==, etc. It would also be nice to expose this to users using a nicer name. Is that possible?

I think it's better not to do that because we want non-differentiable functions to produce an error instead of silently giving a zero gradient at runtime. This will help ensure a predictable programming model. The @_semantics("autodiff.nonvarying") attribute can be considered as equivalent to @noDerivative, and I think they should only apply to properties, not functions. Also, adding an attribute on everything often implies that it should be defaulted, but we've tried zero-gradient-by-default and some initial users didn't like it.

My preference is that we explore adding a good API first before considering changing language defaults or library behavior. I see no problem adding a NoDerivative type today, because eventually we can just add a @propertyDelegate attribute on it when that's available. I'm interested in hearing what you think!
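
A rough illustration of that idea (all names and behavior here are hypothetical sketches, not an actual API that existed at the time) might look like this:

```swift
// Hypothetical sketch only: a wrapper type that marks a stored value as
// non-varying with respect to differentiation. Once @propertyDelegate
// (later renamed "property wrappers") becomes available, the same type
// could be attached as an attribute instead of being used explicitly.
struct NoDerivative<Value> {
    var value: Value
    init(_ value: Value) { self.value = value }
}

struct Model {
    var weight: Float                  // differentiable state
    var stepCount = NoDerivative(0)    // bookkeeping; carries no derivative
}
```

The appeal of starting with a plain type is exactly what's described above: it requires no language change today, and it can be promoted to attribute syntax later without breaking callers.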

@eaplatanios
Contributor Author

I see your point and I agree. I'll look into the GPU error as soon as I can.

In the meantime, it turns out it's quite hard to debug TF errors in S4TF because no stack trace is ever printed. Oftentimes I get a fatal error about mismatching shapes, etc., but with no other information, which makes it very hard to debug. Is there a way to have S4TF print the whole stack trace whenever a fatal error occurs?

@rxwei
Contributor

rxwei commented Apr 22, 2019

Yeah, that is a known pain point. I think that @pschuh has more context on the current status of pretty-printing TensorFlow runtime errors.

@eaplatanios
Contributor Author

Even just a plain stack trace dump would be super helpful at this point, to be honest; right now it's hard to know which part of the user's code is causing the error.

@rxwei
Contributor

rxwei commented Apr 22, 2019

You can step through the code using LLDB.

@eaplatanios
Contributor Author

Yes, that is what I'm doing currently, but I feel that won't be an option for a lot of ML researchers using S4TF.

@Shashi456
Contributor

I'm sorry to divert the thread, @eaplatanios, but can the VJP of a function generally be thought of as returning the dual for automatic differentiation? So if I wanted to write a VJP for a function that squares a number, would I write the VJP so that it returns 2*x? I was just looking over your min/max VJP function definitions, and I'm still trying to wrap my head around writing VJPs.

@rxwei
Contributor

rxwei commented Apr 22, 2019

Yes, that is what I'm doing currently, but I feel that won't be an option for a lot of ML researchers using S4TF.

The team and I really agree, and it's definitely on our TODO list.

@rxwei
Contributor

rxwei commented Apr 22, 2019

I'm sorry to divert the thread, @eaplatanios, but can the VJP of a function generally be thought of as returning the dual for automatic differentiation?

A VJP function represents an efficient derivative function that takes the original arguments and computes both

  • the original result, and
  • a closure (the pullback) that takes a vector and chains it with the function's Jacobian, producing vector-Jacobian products, i.e., the gradient along that vector.

So if I wanted to write a VJP for a function that squares a number, would I write the VJP so that it returns 2*x?

Say you have a square(_:) function:

func square(_ x: Float) -> Float {
    return x * x
}

You can define a derivative as a function marked with the @differentiating attribute:

@differentiating(square)
func squareDerivative(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
    return (x * x, { v in v * 2 * x })
}

Note here that it's not just 2x, but a linear combination function { v in v * 2 * x }. v is the "vector", which you can interpret as whatever is being chained with the current derivative. For scalar operations, the chaining is multiplication or element-wise multiplication; for matrix operations (e.g. TensorFlow's matmul(_:_:)), it's matrix multiplication.
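
To make the chaining concrete, here is a small usage sketch. It assumes the square/squareDerivative definitions above and a toolchain of that era supporting @differentiating, so it won't compile on stock Swift:

```swift
// Evaluate the VJP by hand.
let (value, pullback) = squareDerivative(3)
// value is 9 (the original result).
// Seeding the pullback with v = 1 yields dy/dx = 2 * 3 = 6.
let dydx = pullback(1)
// Seeding with a different vector scales the result accordingly:
// pullback(10) gives 60, which is what chaining with an upstream
// derivative of 10 looks like.
```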

@Shashi456
Contributor

@rxwei thanks a lot for the explanation.

@eaplatanios
Contributor Author

@rxwei I added a couple of separate PRs: #136 for the tests, #137 for the bindings fetching script, and swiftlang/swift#25091 for replacing InitTensorFlowRuntime with a Swift implementation. I believe that once these are merged, we may be able to merge this one directly, since it mostly consists of moving everything over from apple/swift. I'm not sure how a partial move would work without adding a significant amount of work to properly separate things for each PR. Please let me know what you think.

@rxwei
Contributor

rxwei commented May 28, 2019

Thanks! @bgogul and I will help you merge these PRs this week.

@eaplatanios
Contributor Author

eaplatanios commented May 30, 2019

@rxwei @bgogul All tests pass locally for me and I have started the CI tests. The tests will probably run successfully with the current toolchain, but I tested locally with one built from the corresponding PR, swiftlang/swift#24452.

Given the discussion and issues with #137, I believe we should just go ahead and merge this PR directly, as partial moves cause more issues that may be hard to work around, whereas this PR already passes all tests.

@eaplatanios
Contributor Author

The CI tests seem to fail and I can't tell why. Is there a way to test with the toolchain from swiftlang/swift#24452 directly?

@rxwei
Contributor

rxwei commented May 30, 2019

The CI tests seem to fail and I can't tell why. Is there a way to test with the toolchain from apple/swift#24452 directly?

There's no way to do that until the PRs have been merged. Let me build a toolchain locally and test things (~20 minutes).

@eaplatanios
Contributor Author

Sounds good, thanks! One minor thing I just noticed. There is a test in LayerTests that fails due to a difference in some numbers. I'm not sure which one is correct and why the numbers are different now, but I pushed the changed numbers (lines 203-212 in Tests/TensorFlowTests/LayerTests.swift) so you can take a look.

@rxwei
Contributor

rxwei commented May 30, 2019

Sounds good, thanks! One minor thing I just noticed. There is a test in LayerTests that fails due to a difference in some numbers. I'm not sure which one is correct and why the numbers are different now, but I pushed the changed numbers (lines 203-212 in Tests/TensorFlowTests/LayerTests.swift) so you can take a look.

I think verifying the numeric results against Python TF is the best way to know.

@rxwei
Contributor

rxwei commented May 30, 2019

I'm getting a bunch of unresolved identifier errors when building a toolchain using this PR and the corresponding apple/swift PR.

/usr/local/src/swift-build/tensorflow-swift-apis/Sources/DeepLearning/Optimizer.swift:175:60: error: use of unresolved identifier 'Tensor'
        for kp in alpha.recursivelyAllWritableKeyPaths(to: Tensor<Float>.self) {
                                                           ^~~~~~

@Shashi456
Contributor

@eaplatanios @rxwei Sorry to suddenly jump into the conversation. The RNN layer tests seem to have some sort of variability in their output, and with #130 we did change those values again. Could we just comment them out for now and look into it later?

@eaplatanios
Contributor Author

@rxwei that’s because you’re trying to build it with the old checkout of swift-apis. You’d need to check out this PR in the tensorflow-swift-apis directory.

@eaplatanios
Contributor Author

@Shashi456 I don’t see why there should be any variability. Thanks for that information. I’ll try to look into this tomorrow.

@rxwei
Contributor

rxwei commented May 30, 2019

@rxwei that’s because you’re trying to build it with the old checkout of swift-apis. You’d need to check out this PR in the tensorflow-swift-apis directory.

Oops, just figured that out. I fetched the PR but didn't check it out.

@rxwei
Contributor

rxwei commented May 30, 2019

All tests are passing. I'm about to merge this PR, update the commit hash in the other PR, merge it, and start a new draft PR to test CI once a new toolchain is built.

@rxwei rxwei merged commit 1d484e1 into tensorflow:master May 30, 2019
@Shashi456
Contributor

@rxwei I'm sorry for asking this again, but what exactly does this PR do, and why was it needed in the long run?

rxwei pushed a commit to swiftlang/swift that referenced this pull request May 30, 2019
Moves the entirety of the `TensorFlow` module to tensorflow/swift-apis.

Friend PR: tensorflow/swift-apis#109.
@eaplatanios
Contributor Author

@Shashi456 This PR mainly moves all of the TensorFlow module in the stdlib from apple/swift (i.e., the compiler repository) to this repository. There are also some other minor changes, support for a couple of new ops and VJPs, and a restructuring of the Operators.swift file into a directory.

This move allows us to make changes to the stdlib and test them without having to recompile the Swift compiler (which is expensive), and it also separates the compiler code from the stdlib code more clearly.
