Table not initialized when serving model #237
What is the version of TFT being used?
apache-beam[gcp]==2.28.0
Re: "We noticed that the exported model/assets directory does not include the intermediate vocabulary used by the above BM25 transformation": is this the model exported post training or the output of TFT? If the former, could you clarify whether the file exists in the transform output?
Hi @varshaan! Thank you for looking into this. The TFT Dataflow job does export the assets: I see the vocab file under transform_fn/assets/needle_vocabulary. However, these vocab files do not appear in the trained model's model/assets/ directory. Both the TFT job and the training job succeeded. We only noticed the error when attempting to reload the model and run inference with it. I also managed to reproduce the issue using this transformation:
In both cases (bm25 and tfidf), it seems to fail at prediction time on the
Since the table does exist in the Transform output, do you mind sharing the code snippet for how the trained model is being exported? In particular, is the tft_layer assigned to an attribute of the exported model [1]? I am assuming this is a Keras model from the stacktrace.
Yep, it's a Keras model. The TFT layer is attached as an attribute of the Keras model:
This is the bit of code where we export the model: https://gist.github.com/awadalaa/bcafb5da46ced7d9373f0d51ce389aa3#file-gistfile1-txt-L24
Hi @varshaan, I put together a small example repository that consistently reproduces the issue, based on the census example you linked: https://github.com/awadalaa/TFTReproduceIssue
You can clone the repo and run this to reproduce the problem:
Hi, that repro has two Keras models. The "full_model" [1] does not track the TFT layer. Adding
[1] https://github.com/awadalaa/TFTReproduceIssue/blob/main/trainer/model.py#L69
@awadalaa Does that fix the problem? If so then we should close this issue.
Thank you @rcrowe-google and @varshaan! Attaching the tft_layer to the full_model does unblock us! I'm not sure if the issue should be closed though. It was unexpected because the
My understanding is that Keras expects all resources that need to be tracked to be tracked by the main object that is being saved (in this case the full_model). I suspect it isn't common for the signatures to be on a model different from the one being saved. I will try to verify this and get back to you.
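To illustrate the tracking point, here is a minimal sketch that does not use TFT or Keras at all. It assumes a plain `tf.Module` and a `tf.lookup.StaticHashTable` as a stand-in for the vocabulary table inside the TFT layer; `ServingModule`, `build_vocab_table`, `save_and_reload`, and the `vocab.txt` contents are hypothetical names made up for this example. The table is assigned as an attribute of the object passed to `tf.saved_model.save`, which is what should let the saver track it and copy its vocabulary file into the export.

```python
import os
import tempfile

import tensorflow as tf


def build_vocab_table(vocab_path):
    """Build a lookup table from a vocabulary file with one token per line."""
    initializer = tf.lookup.TextFileInitializer(
        vocab_path,
        key_dtype=tf.string,
        key_index=tf.lookup.TextFileIndex.WHOLE_LINE,
        value_dtype=tf.int64,
        value_index=tf.lookup.TextFileIndex.LINE_NUMBER,
    )
    return tf.lookup.StaticHashTable(initializer, default_value=-1)


class ServingModule(tf.Module):
    """Hypothetical stand-in for the model that gets saved."""

    def __init__(self, table):
        super().__init__()
        # Assigning the table as an attribute makes it a tracked dependency
        # of the object being saved, so its asset file travels with the export.
        self.table = table

    @tf.function(input_signature=[tf.TensorSpec([None], tf.string)])
    def serve(self, tokens):
        return {"ids": self.table.lookup(tokens)}


def save_and_reload(export_dir, vocab_path):
    """Save the module with a serving signature, then load it back."""
    module = ServingModule(build_vocab_table(vocab_path))
    tf.saved_model.save(
        module, export_dir, signatures={"serving_default": module.serve}
    )
    return tf.saved_model.load(export_dir)


workdir = tempfile.mkdtemp()
vocab_path = os.path.join(workdir, "vocab.txt")
with open(vocab_path, "w") as f:
    f.write("needle\nhaystack\n")

export_dir = os.path.join(workdir, "model")
reloaded = save_and_reload(export_dir, vocab_path)

# The reloaded signature initializes the table from the exported asset.
result = reloaded.signatures["serving_default"](
    tokens=tf.constant(["haystack", "unknown"])
)
ids = result["ids"].numpy().tolist()
exported_assets = os.listdir(os.path.join(export_dir, "assets"))
print(ids, exported_assets)
```

In the repro from this thread, the analogous step is attaching the tft_layer to the full_model before saving it. What happens when the table is left untracked seems to depend on the TensorFlow version, so this sketch only shows the working case.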
Posting for @awadalaa
We are blocked on experimenting with a new TensorFlow model in production because it fails at inference with this error:
tensorflow.python.framework.errors_impl.FailedPreconditionError: Table not initialized.
We have narrowed the issue down to the part of our code that applies a BM25 transformation in a TensorFlow Transform job. That transformation learns and applies a vocabulary; however, when we run inference with the model, it fails to initialize the table from that vocabulary file on this line. Here is the BM25 code we are using and the line where it fails:
https://gist.github.com/awadalaa/e9290cf6674884d8e197fe315ed7d832#file-gistfile1-txt-L176-L177
More background:
We run a TensorFlow Transform Beam/Dataflow job that executes this transformation and saves the transform graph. Later, when we train our model, we save it with a signature that applies the TFT layer: transformed_features = model.tft_layer(parsed_features). We noticed that the exported model/assets directory does not include the intermediate vocabulary used by the above BM25 transformation, although it does include every other vocabulary file learned in the TFT job. Any ideas why the above transformation would fail to export the vocabulary assets for a saved model?
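For anyone who wants to check the same thing on their own export, a rough sketch of the comparison is below. It assumes the two asset directories are readable as local paths (for example transform_fn/assets and model/assets copied down from the job output) and that asset files keep the same file name in both places; `missing_assets` is a hypothetical helper name, not part of TFT.

```python
import os


def missing_assets(transform_assets_dir, model_assets_dir):
    """Return asset file names found in the transform output but not in the model export."""
    transform_files = set(os.listdir(transform_assets_dir))
    # A model exported without any assets may have no assets directory at all.
    if os.path.isdir(model_assets_dir):
        model_files = set(os.listdir(model_assets_dir))
    else:
        model_files = set()
    return sorted(transform_files - model_files)
```

Calling `missing_assets("transform_fn/assets", "model/assets")` would list the vocabulary files that did not make it into the saved model, under the file-name assumption above.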
Stack trace here:
Function call stack:
signature_wrapper