This repository has been archived by the owner on Sep 3, 2022. It is now read-only.
pydatalab/google/datalab/storage/_object.py
Lines 190 to 205 in d903190
If a text file in GCS contains any non-ASCII characters, attempting to `read_stream` the file fails with the following error message:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x8e in position 54628: ordinal not in range(128)

I suggest adding an `errors` argument, like the one in Python 3's built-in `open` function. The option to ignore encoding errors or replace malformed data would make reading text files from GCS in DataLab much easier. The workaround I resorted to was to download the text locally and clean it before re-uploading it to GCS.

Additionally, the `download` and `read_lines` functions are not documented in the included storage documentation (http://localhost:8081/notebooks/datalab/docs/tutorials/Storage/Storage%20APIs.ipynb).
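
To illustrate the requested behaviour, here is a minimal sketch of what an `errors` option could do once the raw bytes have been read. It does not touch the pydatalab API: `decode_text` is a hypothetical helper, and the sample bytes are made up to reproduce the same kind of failure (a stray 0x8e byte decoded as ASCII). It relies only on the standard `bytes.decode(encoding, errors)` method.

```python
def decode_text(raw, encoding="ascii", errors="strict"):
    """Hypothetical helper: decode raw bytes, forwarding `errors` to bytes.decode.

    `errors` accepts the standard handler names, e.g. "strict", "ignore", "replace".
    """
    return raw.decode(encoding, errors)


# Made-up sample data: 0x8e is outside the ASCII range, like the byte in the traceback.
raw = b"caf\x8e au lait"

# "strict" (the default) raises UnicodeDecodeError, matching the reported failure.
try:
    decode_text(raw)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "ignore" silently drops the undecodable byte.
ignored = decode_text(raw, errors="ignore")

# "replace" substitutes U+FFFD (the Unicode replacement character).
replaced = decode_text(raw, errors="replace")
```

With an option like this, callers could choose between failing loudly, dropping bad bytes, or marking them, without having to download, clean, and re-upload the file.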