Loading a csv file in what-if tool to explore data #1794

Closed
ellehoej opened this issue Jan 31, 2019 · 26 comments

@ellehoej

ellehoej commented Jan 31, 2019

Hi,
I'm trying to use the functionality in the What-If Tool, as described in the Readme file, of loading a .csv file in order to just explore and visualise the data.
However, when I specify the file path to the csv (in this case to "adult.csv" which is in the Tensorboard logdir folder) in "Path to examples", I keep getting a RequestNetworkError:

Request failed: RequestNetworkError: RequestNetworkError: 400 at /data/plugin/whatif/examples_from_path?examples_path=adult.csv&max_examples=1000&sampling_odds=1&sequence_examples=false

I have tried to change the formatting and the location of the file with no success.

Is it possible for you to elaborate on the process of loading a .csv file or perhaps give an example?

Thanks,
Mads

@jameswex
Contributor

jameswex commented Feb 1, 2019

Please try using the full path to the file on your machine, as opposed to a relative path from the logdir. I believe that will work, and if it does, I will update the README to be more explicit about the csv path.

@tomalbrecht

tomalbrecht commented Feb 4, 2019

Same Problem here with tf1.12 and full path (OSX Mojave).

Request failed: RequestNetworkError: RequestNetworkError: 400 at /data/plugin/whatif/examples_from_path?examples_path=%2FUsers%2Ftom%2Ftitanic.csv&max_examples=1000&sampling_odds=1&sequence_examples=false

@ellehoej
Author

ellehoej commented Feb 4, 2019

I've tried full (and relative) path on Linux and Windows (tensorboard 1.12.2) but I still get the same error as above.

@jameswex
Contributor

jameswex commented Feb 4, 2019

Does the device/container on which the tensorboard server is executing have access to that file path? The file is loaded by the server itself and the path must be something accessible by the server, not the client/browser connecting to the tensorboard. The documentation isn't clear on this, and I will fix it to be clear. And perhaps we should enable CSV upload from a client so the server doesn't need to be able to load the CSV directly.
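One way to rule out the access question is to run a small check in the same environment (same machine, container, and user) as the TensorBoard server. The sketch below is illustrative only: "diagnose_path" is a hypothetical helper, not part of TensorBoard, and it uses ordinary Python file APIs rather than whatever loader the plugin uses internally.

```python
import os


def diagnose_path(path):
    """Return a list of reasons why this process may fail to read `path`.

    Hypothetical helper for illustration. Run it where the TensorBoard
    server runs, not on the machine where the browser runs.
    """
    problems = []
    if not os.path.isabs(path):
        # Ordinary file APIs resolve a relative path against the process's
        # working directory, which is not necessarily the logdir.
        problems.append("not an absolute path")
    if not os.path.isfile(path):
        problems.append("does not exist or is not a regular file")
    elif not os.access(path, os.R_OK):
        problems.append("exists but is not readable by this user")
    return problems


# Example call with a path from this thread; substitute your own:
# print(diagnose_path("/Users/tom/titanic.csv"))
```

An empty list means the file is visible and readable from that environment, so the cause of the 400 is probably elsewhere.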

Locally I've verified that this works: I built and ran TensorBoard from HEAD on my machine ("bazel run tensorboard:tensorboard -- --logdir /tmp"), navigated to the WIT tab on that TensorBoard, and provided a CSV file that had been downloaded to that local machine as the path to examples ("/Users/jwexler/Downloads/test.csv"). The data points from the CSV load correctly in the tool.

Another thing to check out is the new Jupyter/colab notebook mode where you use WIT directly inside a notebook instead of TensorBoard. In that case, you can load a CSV and convert it to a list of tf.Examples for use in WIT, as shown in this example notebook: https://colab.research.google.com/github/tensorflow/tensorboard/blob/master/tensorboard/plugins/interactive_inference/What_If_Tool_Notebook_Usage.ipynb
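For readers who cannot open the notebook, the conversion can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's code: the helper names ("parse_cell", "csv_to_feature_dicts", "to_tf_example") are hypothetical, the CSV is assumed to have a header row, and types are guessed per cell. TensorFlow is imported lazily, so the CSV parsing part runs without it.

```python
import csv
import io


def parse_cell(text):
    """Convert a CSV cell to int, then float, else leave it as str."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def csv_to_feature_dicts(csv_text):
    """Read CSV text with a header row into a list of {column: value} dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{k: parse_cell(v) for k, v in row.items()} for row in reader]


def to_tf_example(features):
    """Build a tf.train.Example from one feature dict (requires TensorFlow)."""
    import tensorflow as tf  # lazy import: only needed for this step

    example = tf.train.Example()
    for name, value in features.items():
        feature = example.features.feature[name]
        if isinstance(value, int):
            feature.int64_list.value.append(value)
        elif isinstance(value, float):
            feature.float_list.value.append(value)
        else:
            feature.bytes_list.value.append(value.encode("utf-8"))
    return example


rows = csv_to_feature_dicts("age,hours,workclass\n39,40.5,Private\n50,13,Self-emp\n")
# examples = [to_tf_example(r) for r in rows]  # needs TensorFlow installed
```

Because the type is guessed per cell, a column can end up with mixed int and float values (as "hours" does here). For real data, a per-column dtype, for example from pandas, is likely the safer choice.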

@ellehoej
Author

ellehoej commented Feb 5, 2019

Thanks for the effort @jameswex. It might be something with access, although Tensorboard picks up logfiles from the location of the csv.
I've tried through a Docker container on Linux and, more simply, through TensorBoard in conda on Windows. I've installed through pip or conda.
Both times the server sees the log files, but not the CSV.
The Jupyter plugin is great, but the option of CSV upload for quick inspection of a data file sounds good too.

@ellehoej ellehoej closed this as completed Feb 5, 2019
@ellehoej ellehoej reopened this Feb 5, 2019
@jameswex
Contributor

jameswex commented Feb 5, 2019

Oh, I just realized this bug mentions version 1.12. CSV mode wasn't added until TensorBoard 1.12.2 I believe. Please try upgrading your TB version.

@ellehoej
Author

ellehoej commented Feb 5, 2019

I'm using 1.12.2. 😊

@jameswex
Contributor

jameswex commented Feb 5, 2019

Do you see any errors being printed to stderr/stdlog by the tensorboard server being run, when loading the csv? Also, is it the adult.csv from kaggle https://www.kaggle.com/uciml/adult-census-income/version/3 ?

@ellehoej
Author

ellehoej commented Feb 5, 2019

Yes, it's a modified version of the adult data from the original source where I've added headings.
I've tried with other csv's as well.
I don't get any other error than the one specified above. If I deliberately add quotation marks around the path I get an error in the terminal about the server not being able to find the file / wrong formatting.

@jameswex
Contributor

jameswex commented Feb 5, 2019

Let's debug offline and then update this thread when we figure it out. Send me an email at jwexler [at] google.com, thanks

@tomalbrecht

Any news on this?

  • We ran whatif on an Anaconda setup and natively (Windows and Linux).
  • We trained a model and could use whatif - but not the facets part
  • On windows we tried different directories (not the protected root directory)
  • We tried admin rights as well to deal with messed up permissions.

No success. In what kind of setup did this plugin run in your test installation?

@jameswex
Contributor

@tomalbrecht what specifically is failing for you? You say whatif is working but not the facets part. What part isn't working as expected? Is it just the loading of data from csv, or is it something else as well?

I've run the tool on both MacOS and Linux, usually with TensorFlow/TensorBoard installed in virtualenvs.

@ellehoej
Author

@tomalbrecht I've been on vacation until yesterday, so haven't had time to help debugging. I'll send @jameswex a mail in a moment.

@tomalbrecht

tomalbrecht commented Feb 25, 2019

@jameswex: Yes, you were correct. In my former setup I somehow got an output, but we could not change the input values in the data point editor.

Right now I tried to repeat this behavior on MacOS (Mojave 10.14.3 (18D109)) with TensorFlow 1.12.0 and failed; I am not able to reproduce it. When I try to load a CSV I get the following error. The path seems URL-encoded. Is this intended?
I get no tensorboard error on my terminal.

I'll try with tensorflow version 1.13.

Request failed: RequestNetworkError: RequestNetworkError: 400 at /data/plugin/whatif/examples_from_path?examples_path=%2FUsers%2Ftom%2Fdata%2FX_train.csv&max_examples=1000&sampling_odds=1&sequence_examples=false
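On the encoding question: the "%2F" sequences are standard URL percent-encoding of "/" inside a query string, so the encoded path is not by itself a sign of an error. A minimal sketch with Python's standard library shows what the parameters look like after decoding. It does not reproduce TensorBoard's own request handling.

```python
from urllib.parse import parse_qs, unquote, urlsplit

# Request path copied from the error message in this thread.
failed_request = (
    "/data/plugin/whatif/examples_from_path"
    "?examples_path=%2FUsers%2Ftom%2Fdata%2FX_train.csv"
    "&max_examples=1000&sampling_odds=1&sequence_examples=false"
)

# parse_qs percent-decodes each value, so "%2F" becomes "/".
params = parse_qs(urlsplit(failed_request).query)
print(params["examples_path"][0])  # /Users/tom/data/X_train.csv

# unquote decodes a single percent-encoded string.
print(unquote("%2FUsers%2Ftom%2Ftitanic.csv"))  # /Users/tom/titanic.csv
```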

@jameswex
Contributor

@tomalbrecht if you run TensorBoard with the command-line flag "--alsologtostderr", what (if any) error output do you see when TensorBoard tries to load the CSV?

@tomalbrecht

tomalbrecht commented Feb 25, 2019

@jameswex Using TensorBoard 1.12.2 now. No error output at all:

iMac:~ tom$ tensorboard --alsologtostderr --logdir ./tmp

@jameswex
Contributor

I just checked out the code in all the pip packages of TensorBoard for various versions. Turns out I was incorrect. The code that adds csv mode didn't make the 1.12.2 cut. It will be in 1.13. You can also find it if you pip install tf-nightly (to get the nightly latest build of TF and TensorBoard instead of the officially released versions).
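A quick way to tell whether an installed version is new enough is to compare its major and minor numbers against 1.13. The sketch below is an illustration only: "has_csv_mode" is a hypothetical helper, and the 1.13 threshold is taken from this thread rather than from release notes.

```python
def has_csv_mode(version):
    """Return True if a TensorBoard version string is 1.13 or newer.

    Only the first two dot-separated components are compared, so
    pre-release suffixes such as "1.13.0a20190225" are handled.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (1, 13)


for v in ("1.12.0", "1.12.2", "1.13.0", "1.13.0a20190225"):
    print(v, has_csv_mode(v))
```

On Python 3.8 or later, the installed version string can be read with importlib.metadata.version("tensorboard"); nightly builds are published under a different package name, so adjust the name accordingly.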

@nfelt
Contributor

nfelt commented Feb 25, 2019

TB 1.13.0 was released this morning, so you can install the latest tensorboard to get the CSV mode changes.

@tomalbrecht

Well. Something changed. Running tf 1.13.1 and tb 1.13.0 now. The whatif-part seems to work. But the facets part is missing now. No error messages to terminal though. I used the titanic dataset from Kaggle.

[Screenshot, 2019-02-26 13:12:49]

[Screenshot, 2019-02-26 13:12:18]

@ellehoej
Author

I also just got it running on Linux using the nightly-gpu-py3 docker image with TensorBoard 1.13.0a20190225 :)

[Screenshot]

[Screenshot]

Curiously, TensorBoard looks a little different from @tomalbrecht's screenshot.

It didn't work on Anaconda Windows using tf-nightly or TensorFlow/TensorBoard 1.13.0. When starting TensorBoard, I received this error message, which could be related to Windows time formatting:

TensorBoard 1.13.0a20190225 at http://DK1802338:6006 (Press CTRL+C to quit)
Traceback (most recent call last):
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\mellehoej\AppData\Local\Continuum\anaconda3\Scripts\tensorboard.exe\__main__.py", line 9, in <module>
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\site-packages\tensorboard\main.py", line 62, in run_main
    app.run(tensorboard.main, flags_parser=tensorboard.configure)
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\site-packages\absl\app.py", line 300, in run
    _run_main(main, args)
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\site-packages\absl\app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\site-packages\tensorboard\program.py", line 228, in main
    self._register_info(server)
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\site-packages\tensorboard\program.py", line 274, in _register_info
    manager.write_info_file(info)
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\site-packages\tensorboard\manager.py", line 268, in write_info_file
    payload = "%s\n" % _info_to_string(tensorboard_info)
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\site-packages\tensorboard\manager.py", line 128, in _info_to_string
    for k in _TENSORBOARD_INFO_FIELDS
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\site-packages\tensorboard\manager.py", line 128, in <dictcomp>
    for k in _TENSORBOARD_INFO_FIELDS
  File "c:\users\mellehoej\appdata\local\continuum\anaconda3\lib\site-packages\tensorboard\manager.py", line 50, in <lambda>
    serialize=lambda dt: int(dt.strftime("%s")),
ValueError: Invalid format string
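The last frame of the traceback calls dt.strftime("%s"). The "%s" directive (seconds since the epoch) is a platform extension, not one of the directives Python documents as portable, which is consistent with the ValueError appearing only on Windows. The sketch below shows a portable way to get epoch seconds from a datetime. It is an illustration under that assumption, not necessarily the fix adopted in TensorBoard, and "epoch_seconds" is a hypothetical name.

```python
from datetime import datetime, timezone


def epoch_seconds(dt):
    """Return whole seconds since the Unix epoch for `dt`.

    datetime.timestamp() interprets naive datetimes as local time,
    so pass a timezone-aware value when that matters.
    """
    return int(dt.timestamp())


start = datetime(2019, 2, 25, tzinfo=timezone.utc)
print(epoch_seconds(start))  # 1551052800
```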

@tomalbrecht

tomalbrecht commented Feb 26, 2019

Updated to nightly TensorBoard 1.13.0a20190225. Same behavior and layout as @ellehoej. But I got it running somehow.

  1. Load your data (e.g. train.csv)
  2. Performance & Fairness
  3. Select your label (e.g. Survived for Titanic dataset)
  4. Switch to Features

Voilà, the feature stats are shown.

@jameswex
Contributor

Glad to hear it is mainly functioning in 1.13+. The updated What-If Tool visuals you both see in latest nightly builds are from a visual redesign that is being completed this week. The latest version of the pip-installable witwidget for notebooks will be updated with this redesign this week as well.

@ellehoej For the windows startup failure for Tensorboard, I think you should open a separate github issue.

I'll look into why the feature stats aren't showing up immediately upon data loading. Thank you both for your patience.

@tomalbrecht

thank you for your help and work :-)

@ellehoej
Author

Thanks for your help again @jameswex
While typing in the title for the Windows issue, I found that somebody already opened it 7 hours ago:
#1895

@jameswex
Contributor

Fixed the Facets Overview initial rendering issue with CSV mode. This fix will be in the next tb-nightly, or you can wait for the next official release.

Closing this bug as the initial issue and facets rendering issue are both gone from HEAD.

@nfelt nfelt mentioned this issue Mar 5, 2019
@Nithyashree-coder

Convert the CSV to a .tfrecord file and give its path in "Path to examples".
