Add tfio style test for tensorflow_io/BigTable #171
Comments
@henrytansetiawan 👍 The PubSub emulator is similar; I could help with the BigTable emulator if needed.
Hello @yongtang, may I work on adding tests for gcloud Bigtable using the Bigtable emulator, if that is not already being taken up?
@ricky3dec That would be great! 👍 Let me know if you run into any issues. For PubSub, at one point I ran into an issue with the credential file on macOS, so that test is disabled at the moment. We definitely plan to re-enable it very soon. Help would be appreciated there as well.
Sure @yongtang, please assign the issue to me. Could you please point me to where "original contrib.BigTable has its own internal test"? I would like to know what kind of tests were written earlier. I tried to find them, but it looks like the structure has changed over the last few releases. Thanks!
Thanks @ricky3dec! It seems that Google Bigtable has an emulator (similar to PubSub). I think that might make our life a little easier. I am not sure about the original tensorflow.contrib.bigtable's internal test. From my past experience, though, I am not sure we need to implement a C++ mock, as it may not exactly match the Google Cloud behavior and is (unnecessarily) harder to implement. I would suggest looking into the Bigtable emulator and adding a Python test against it. This might be a better way to test Bigtable.
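A minimal sketch, in Python, of how a test could detect the emulator before running against it. It assumes the emulator address is exported through the `BIGTABLE_EMULATOR_HOST` environment variable (the variable the Google Cloud client libraries read) and that port 8086 applies when no port is given. The helper names `emulator_address` and `emulator_reachable` are hypothetical and are not part of tensorflow_io.

```python
import os
import socket


def emulator_address(default_port=8086):
    """Return (host, port) parsed from BIGTABLE_EMULATOR_HOST, or None if unset.

    Assumption: the value looks like "host:port" or just "host".
    """
    value = os.environ.get("BIGTABLE_EMULATOR_HOST", "").strip()
    if not value:
        return None
    host, sep, port = value.rpartition(":")
    if not sep:
        # No port in the value; fall back to the assumed default port.
        return value, default_port
    return host, int(port)


def emulator_reachable(timeout=1.0):
    """Return True if a TCP connection to the emulator address succeeds."""
    address = emulator_address()
    if address is None:
        return False
    try:
        with socket.create_connection(address, timeout=timeout):
            return True
    except OSError:
        return False
```

A test module could then guard its emulator-backed cases with something like `@unittest.skipUnless(emulator_reachable(), "Bigtable emulator not running")`, so that the suite still passes on machines where the emulator has not been started.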
Yes @yongtang, I had gone through the emulator part, as you originally suggested above. I just wanted to know whether there are already some baseline tests that we should have at a minimum. Considering your comments, I will try to write tests that cover basic usage of the operation that interfaces with "BigTable", along with some exception scenarios. I will send out the initial review soon. Thanks again!
Thanks @ricky3dec! We have ongoing discussions about standardizing tests in #65. You can also suggest any input to make the tests more consistent. There was some confusion about operations in tf.data.Dataset, but it is now mostly sorted out.
I would suggest starting with one (maybe test 1, the inference-compatible test); then we could expand to support more features with more tests. You can submit multiple PRs to make the process easier.
Sure @yongtang, I will go through the guidelines above, and yes, it makes sense to have reviews in mini-batch mode :-)
@ricky3dec I added PR #667 to split tests into different feature-support groups. It should now be easier to add additional tests (for different feature compatibilities). You can take a look and see if it makes sense. You can also add parametrized tests on top of it one by one (for Bigtable, etc.).
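The idea of one test group that runs the same check over several sources could be sketched as follows. This is only an illustration under stated assumptions: it uses the standard library's `unittest.subTest` in place of whatever parametrization the repository uses, plain Python lists stand in for real datasets, and the names `CASES`, `IterationGroupTest`, and `run_group` are hypothetical.

```python
import unittest

# Hypothetical registry: source name -> (factory, expected elements).
# Plain lists stand in for real datasets; a Bigtable-backed entry would be
# added here as one more row once it exists.
CASES = {
    "range": (lambda: list(range(3)), [0, 1, 2]),
    "letters": (lambda: list("ab"), ["a", "b"]),
}


class IterationGroupTest(unittest.TestCase):
    """One feature group: every registered source must iterate as expected."""

    def test_iteration(self):
        for name, (factory, expected) in CASES.items():
            # subTest reports each source separately if it fails.
            with self.subTest(source=name):
                self.assertEqual(list(factory()), expected)


def run_group():
    """Run the group programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(IterationGroupTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With this layout, supporting a new source in a feature group means adding one entry to the registry instead of writing a new test function.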
Sure @yongtang, thanks for following up on this and guiding me. 🔖
@ricky3dec The PubSub tests have been re-enabled.
Thanks @yongtang. I hope to send the initial review soon; I didn't get a chance to work on this until recently.
Hello @yongtang, I finally got a chance to work on this :-) I need some clarification. I was adding a simple test using the Bigtable emulator. I have already created the script for starting a Docker instance of the Bigtable emulator and was adding a test in "test_io_dataset_eager.py", as you suggested earlier. However, I observe that most of the tests in this file are for operations under "tensorflow_io/core/python/experimental", and all of them have their stream methods defined, e.g. "tfio.experimental.IODataset.stream().from_video_capture". So I wanted to know whether I have to create a similar invocation "from_bigtable" and, if yes, whether tensorflow_io/core/python/ops/io_dataset.py is the right place. Please let me know if I am missing anything. Thanks
@ricky3dec I think On a separate note (not immediately related to Bigtable now, but soon to be), I am reconsidering the grouping of APIs. At the beginning of the project, we started with one op per directory/module, so we ended up with many modules (tensorflow_io.kafka, tensorflow_io.kinesis, tensorflow_io.azfs, tensorflow_io.lmdb, ...), which quickly grew into an unmanageable situation. Since then, we have tried to consolidate the API to group dataset into @ricky3dec For BigTable, I think stay with
Sure, thanks @yongtang!
Hello @yongtang, I raised an initial PR for review only. Could you help with some clarifications I asked for in the PR? Thanks
The original contrib.BigTable has its own internal test, but it would be fantastic if we added a tensorflow_io-style test using the BigTable emulator.