This repository has been archived by the owner on Nov 22, 2022. It is now read-only.
facebook-github-bot added the CLA Signed label on Jul 12, 2019.
Titousensei force-pushed the export-D16232774 branch from 11a767c to f1011ef on July 13, 2019 00:06.
Titousensei added a commit to Titousensei/pytext that referenced this pull request on Jul 13, 2019:
Summary: Pull Request resolved: facebookresearch#777 TSVDataSource documentation for delimiter param says Change to "," for csv, but csv files often have quoted fields and TSVDataSource does not support quoted fields. We cannot blindly force all fields starting with quotes to be treated as quoted fields, because it drastically changes the behavior: unclosed quoted fields will merge with the next row, swallowing \n characters until we find the closing quote. Some data sets might contain unclosed fields with quotes and rely on the current behavior. This diff adds a parameter to the TSVDataSource config that allows users to specify whether they want quoted fields. The default is False, which is the current behavior. Differential Revision: D16232774 fbshipit-source-id: 3286c6cceb04ec182a155595a38961a20b2c1c04
Resolves issue #747.
Titousensei force-pushed the export-D16232774 branch from f1011ef to 63fd5be on July 19, 2019 22:59.
Titousensei added a commit to Titousensei/pytext that referenced this pull request on Jul 19, 2019 (fbshipit-source-id: 8a152feaf22f25fbef6892906c55704452624815).
Titousensei force-pushed the export-D16232774 branch from 63fd5be to 8c1f88c on July 22, 2019 18:10.
Titousensei added a commit to Titousensei/pytext that referenced this pull request on Jul 22, 2019 (fbshipit-source-id: 652f91e1462a010185934083f4dbcf82b99e8428).
Titousensei force-pushed the export-D16232774 branch from 8c1f88c to 671b06b on July 23, 2019 18:17.
Titousensei added a commit to Titousensei/pytext that referenced this pull request on Jul 23, 2019 (fbshipit-source-id: 0110293a19f1179cee70b53060123e685a7af988). The same summary appears a second time with fbshipit-source-id: 5bd625f95b8795d7d5cd07774d41b099b4b3766e.
Titousensei force-pushed the export-D16232774 branch from 671b06b to 18d2688 on July 24, 2019 01:02.
This pull request has been merged in 7ae0b2a.
Summary:
The TSVDataSource documentation for the delimiter param says to change it to
"," for CSV, but CSV files often have quoted fields, and TSVDataSource does not
support quoted fields.

We cannot blindly treat every field that starts with a quote as a quoted
field, because that drastically changes the behavior: an unclosed quoted field
merges with the following rows, swallowing \n characters until a closing quote
is found. Some datasets might contain unclosed fields with quotes and rely on
the current behavior.

This diff adds a parameter to the TSVDataSource config that lets users specify
whether they want quoted fields. The default is False, which preserves the
current behavior.

Differential Revision: D16232774
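
The difference between the two modes described in the summary can be illustrated with Python's standard csv module. This is only a sketch under stated assumptions, not the code from this diff: read_rows and its quoted flag are hypothetical names chosen for illustration, and mapping the flag to csv.QUOTE_MINIMAL / csv.QUOTE_NONE is an assumption about how such an option could be wired up.

```python
import csv
import io


def read_rows(text, delimiter=",", quoted=False):
    """Parse delimited text into a list of rows.

    `quoted` is a hypothetical flag for illustration only. When False, quote
    characters are treated as ordinary data; when True, a field that starts
    with a double quote is parsed as a quoted field.
    """
    quoting = csv.QUOTE_MINIMAL if quoted else csv.QUOTE_NONE
    reader = csv.reader(
        io.StringIO(text, newline=""), delimiter=delimiter, quoting=quoting
    )
    return list(reader)


# A well-formed CSV row whose second field contains the delimiter.
well_formed = 'label,"hello, world"\n'

# Two rows where the first row has an opening quote that is never closed.
unclosed = 'a,"unclosed field\nb,next row\n'

if __name__ == "__main__":
    # Without quote handling, the comma inside the quotes splits the field.
    print(read_rows(well_formed, quoted=False))
    # With quote handling, the quoted field stays in one piece.
    print(read_rows(well_formed, quoted=True))
    # Without quote handling, the unclosed quote is harmless: two rows.
    print(read_rows(unclosed, quoted=False))
    # With quote handling, the unclosed field swallows the following row.
    print(read_rows(unclosed, quoted=True))
```

With quote handling off, the unclosed quote stays inside its own row, which is the behavior some existing datasets may depend on. With it on, well-formed CSV parses as expected, but the unclosed field absorbs the newline and the next row. That trade-off is why an opt-in parameter with a default of False is the conservative choice.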