This repository has been archived by the owner on Nov 22, 2022. It is now read-only.
refactor ScriptTensorizer with general tensorize API #1117
Closed
chenyangyu1988 wants to merge 1 commit into facebookresearch:master from chenyangyu1988:export-D18386345
facebook-github-bot added the CLA Signed label Nov 8, 2019
This pull request was exported from Phabricator. Differential Revision: D18386345
chenyangyu1988 added a commit to chenyangyu1988/pytext that referenced this pull request Nov 8, 2019
…h#1117)
Summary:
Pull Request resolved: facebookresearch#1117
This diff introduces a general API for handling different input types.
In the most general case, we expect inputs to be either
1) multiple rows, where each row contains a list of texts (in most cases a single sentence or a pair) ===> List[List[str]]
2) multiple rows, where each row contains a list of pre-processed token sequences (in most cases a single sentence or a pair) ===> List[List[List[str]]]
For single-sentence classification tasks, we expect inputs to be either
1) multiple rows, where each row contains a single text ===> List[str]
2) multiple rows, where each row contains a single pre-processed token sequence ===> List[List[str]]
This refactoring provides two general APIs:
1) def tensorize(
       self,
       texts_list: Optional[List[List[str]]] = None,
       tokens_list: Optional[List[List[List[str]]]] = None,
   ) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
2) def tensorize_single(
       self,
       texts_list: Optional[List[str]] = None,
       tokens_list: Optional[List[List[str]]] = None,
   ):
Internally, it automatically handles whether the passed inputs are texts or tokens.
Differential Revision: D18386345
fbshipit-source-id: 90d19a7d8dad57d16f274a3b389445fa71c8d105
chenyangyu1988 force-pushed the export-D18386345 branch from 3af3589 to 6a42620 on November 8, 2019 22:29
chenyangyu1988 added a commit to chenyangyu1988/pytext that referenced this pull request Nov 8, 2019
…h#1117) Differential Revision: D18386345 fbshipit-source-id: 7f767e3958d3053801137e454f15a1dd5ae37757
chenyangyu1988 force-pushed the export-D18386345 branch from 6a42620 to bdf049a on November 8, 2019 22:30
…h#1117) Differential Revision: D18386345 fbshipit-source-id: 0061b0968b908c1e7d08bc2f73759e1ebd74b9f4
chenyangyu1988 force-pushed the export-D18386345 branch from bdf049a to 4e28314 on November 8, 2019 22:34
This pull request has been merged in 5f5b164.
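Summary:
This diff introduces a general API for handling different input types.
In the most general case, we expect inputs to be either
1) multiple rows, where each row contains a list of texts (in most cases a single sentence or a pair) ===> List[List[str]]
2) multiple rows, where each row contains a list of pre-processed token sequences (in most cases a single sentence or a pair) ===> List[List[List[str]]]
For single-sentence classification tasks, we expect inputs to be either
1) multiple rows, where each row contains a single text ===> List[str]
2) multiple rows, where each row contains a single pre-processed token sequence ===> List[List[str]]
This refactoring provides two general APIs:
    def tensorize(
        self,
        texts_list: Optional[List[List[str]]] = None,
        tokens_list: Optional[List[List[List[str]]]] = None,
    ) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    def tensorize_single(
        self,
        texts_list: Optional[List[str]] = None,
        tokens_list: Optional[List[List[str]]] = None,
    ):
Internally, it automatically handles whether the passed inputs are texts or tokens.
Differential Revision: D18386345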
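
To make the texts-or-tokens dispatch in the summary concrete, here is a minimal sketch. It is not the PyText implementation: `ToyTensorizer`, its whitespace tokenizer, the vocabulary lookup, and the padding scheme are all assumptions made for illustration, and it returns plain Python lists (padded ids and lengths) rather than the three `torch.Tensor` values in the real signature, whose meaning is not stated in this PR.

```python
from typing import Dict, List, Optional, Tuple


class ToyTensorizer:
    """Hypothetical stand-in for ScriptTensorizer (illustration only)."""

    def __init__(self, vocab: Dict[str, int], pad_idx: int = 0, unk_idx: int = 1):
        self.vocab = vocab
        self.pad_idx = pad_idx
        self.unk_idx = unk_idx

    def tokenize(self, text: str) -> List[str]:
        # Assumption: lowercase + whitespace split stands in for the real tokenizer.
        return text.lower().split()

    def tensorize(
        self,
        texts_list: Optional[List[List[str]]] = None,
        tokens_list: Optional[List[List[List[str]]]] = None,
    ) -> Tuple[List[List[int]], List[int]]:
        # Exactly one of the two inputs must be provided.
        if (texts_list is None) == (tokens_list is None):
            raise ValueError("pass exactly one of texts_list or tokens_list")
        if tokens_list is None:
            # Raw texts: tokenize every segment of every row first.
            tokens_list = [[self.tokenize(t) for t in row] for row in texts_list]
        # Concatenate the segments of each row (a sentence or a pair) into ids.
        ids = [
            [self.vocab.get(tok, self.unk_idx) for segment in row for tok in segment]
            for row in tokens_list
        ]
        lengths = [len(row) for row in ids]
        max_len = max(lengths, default=0)
        padded = [row + [self.pad_idx] * (max_len - len(row)) for row in ids]
        return padded, lengths

    def tensorize_single(
        self,
        texts_list: Optional[List[str]] = None,
        tokens_list: Optional[List[List[str]]] = None,
    ) -> Tuple[List[List[int]], List[int]]:
        # Wrap each row in a one-element list, then reuse the general API.
        return self.tensorize(
            texts_list=None if texts_list is None else [[t] for t in texts_list],
            tokens_list=None if tokens_list is None else [[t] for t in tokens_list],
        )


tensorizer = ToyTensorizer({"hello": 2, "world": 3})
from_texts = tensorizer.tensorize_single(texts_list=["hello world", "hello"])
from_tokens = tensorizer.tensorize_single(tokens_list=[["hello", "world"], ["hello"]])
pair = tensorizer.tensorize(texts_list=[["hello", "world foo"]])
```

In this sketch `tensorize_single` only reshapes its input and delegates, so the single-sentence and general paths share one code path; whether the real refactoring is structured the same way is not confirmed by the PR text.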