[TF] TensorFlow/TensorFlowCore Refactoring #109
|
Sorry for the inconvenience! |
|
@rxwei No worries. I hope I did the merge correctly :) The only error that remains when I try to compile this now is the ambiguous subscript. |
|
@rxwei I tried removing the subscript by putting it in a |
|
Also, shouldn't we add |
I think it's better not to do that because we want non-differentiable functions to produce an error instead of silently giving a zero gradient at runtime. This will help ensure a predictable programming model. The My preference is that we explore adding a good API first before considering changing language defaults or library behavior. I see no problem adding a |
|
I see your point and I agree. I'll look into the GPU error as soon as I can. In the meantime, it turns out it's quite hard to debug TF errors in S4TF because no stack trace is ever printed. Oftentimes I get a |
|
Yeah, that is a known pain point. I think that @pschuh has more context on the current status of pretty-printing TensorFlow runtime errors. |
|
Even just a plain stack trace dump would be super helpful at this point to be honest. Because right now it's hard to know which part of the user's code is causing the error. |
|
You can step through the code using LLDB. |
|
Yes, that is what I'm doing currently, but I feel that won't be an option for a lot of ML researchers using S4TF. |
|
I'm sorry to divert the thread, @eaplatanios, but can the VJP for a function generally be thought of as returning the dual for automatic differentiation?
The team and I really agree, and it's definitely on our TODO list.
A VJP function represents an efficient derivative function that takes original arguments and computes both
Say if you have a

    func square(_ x: Float) -> Float {
        return x * x
    }

You can define a derivative in the form of a function that has a

    @differentiating(square)
    func squareDerivative(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
        return (x * x, { v in v * 2 * x })
    }

Note here that it's not just |
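
To make the shape of the returned tuple concrete, here is a minimal sketch of how the `square` example behaves when called. It uses plain Swift only, so it does not depend on the `@differentiating` attribute or any differentiable-programming toolchain support. The name `squareVJP` is hypothetical and chosen for illustration; it is not part of any library.

```swift
// Original function.
func square(_ x: Float) -> Float {
    return x * x
}

// Hypothetical hand-written VJP for `square` (illustrative name, not a library API).
// It returns the original result together with a pullback closure that scales
// an incoming value `v` by the derivative 2 * x.
func squareVJP(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
    return (x * x, { v in v * 2 * x })
}

let (value, pullback) = squareVJP(3)
print(value)          // 9.0, same as square(3)
print(pullback(1))    // 6.0, i.e. 1 * 2 * 3
print(pullback(0.5))  // 3.0, i.e. 0.5 * 2 * 3
```

The pullback captures `x` from the original call, so the original result and the derivative information come out of a single call.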
|
@rxwei thanks a lot for the explanation. |
|
@rxwei I added a couple separate PRs, #136 for the tests, #137 for the bindings fetching script, and swiftlang/swift#25091 for replacing |
|
Thanks! @bgogul and I will help you merge these PRs this week. |
|
@rxwei @bgogul All tests pass locally for me and I have started the CI tests. Tests will probably run successfully with the current toolchain but I tested locally with one built from the corresponding PR swiftlang/swift#24452. Given the discussion and issues with #137 I believe we should just go ahead with merging this PR directly as partial moves cause more issues that may be hard to work around, whereas this seems to pass all tests fine already. |
|
The CI tests seem to fail and I can't tell why. Is there a way to test with the toolchain from swiftlang/swift#24452 directly? |
There's no way to do that until the PRs have been merged. Let me build a toolchain locally and test things (~20 minutes). |
|
Sounds good, thanks! One minor thing I just noticed. There is a test in |
I think verifying the numeric results against Python TF is the best way to know. |
|
I'm getting a bunch of undefined identifier errors when building a toolchain using this PR and the corresponding apple/swift PR.

    /usr/local/src/swift-build/tensorflow-swift-apis/Sources/DeepLearning/Optimizer.swift:175:60: error: use of unresolved identifier 'Tensor'
            for kp in alpha.recursivelyAllWritableKeyPaths(to: Tensor<Float>.self) {
                                                               ^~~~~~
|
@eaplatanios @rxwei Sorry to suddenly jump into the convo. So the RNN layer tests seem to have some sort of variability in output, and with #130 we did change those values again. Could we just comment it out for now and look into it later? |
|
@rxwei that’s because you’re trying to build it with the old checkout of swift-apis. You’d need to check out this PR in the tensorflow-swift-apis directory. |
|
@Shashi456 I don’t see why there should be any variability. Thanks for that information. I’ll try to look into this tomorrow. |
Oops, just figured that. I fetched the PR but didn't |
|
All tests are passing. I'm about to merge this PR, update the commit hash in the other PR, merge it, and start a new draft PR to test CI once a new toolchain is built. |
|
@rxwei I'm sorry for asking this again, but what exactly does this PR do, and why is it needed in the long run? |
Moves the entirety of the `TensorFlow` module to tensorflow/swift-apis. Friend PR: tensorflow/swift-apis#109.
|
@Shashi456 This PR mainly moves all of the This move allows us to make changes to the stdlib and test them without requiring us to recompile the Swift compiler (which is expensive) and it also separates the compiler code from the stdlib code more clearly. |
Moves all of the `TensorFlow` module here. All tests pass locally. Friend PR: swiftlang/swift#24452.