The _NFP suffix of some lemmas. #3

Closed
foxik opened this issue Oct 14, 2015 · 2 comments

foxik commented Oct 14, 2015

Some lemmas of tokens with the NFP part-of-speech tag have an _NFP suffix. For example, here are the first 20 NFP tokens in the training data:

1   ---------  ---------_NFP  PUNCT  NFP  _  0   root       _  _
1   ***        ***            PUNCT  NFP  _  0   root       _  _
1   +          +              SYM    NFP  _  17  punct      _  _
60  +          +              SYM    NFP  _  55  punct      _  _
9   -          -_NFP          PUNCT  NFP  _  11  punct      _  _
1   ...        ...            SYM    NFP  _  0   root       _  _
1   ...        ...            SYM    NFP  _  0   root       _  _
1   ...        ...            SYM    NFP  _  0   root       _  _
1   ...        ...            SYM    NFP  _  0   root       _  _
9   :-RRB-     :-rrb-_NFP     SYM    NFP  _  5   discourse  _  _
1   -          -_NFP          PUNCT  NFP  _  3   punct      _  _
9   :-RRB-     :-rrb-_NFP     SYM    NFP  _  3   discourse  _  _
1   =          =              SYM    NFP  _  3   punct      _  _
1   ==         ==             SYM    NFP  _  5   discourse  _  _
20  ***        ***            PUNCT  NFP  _  21  punct      _  _
22  ***        ***            PUNCT  NFP  _  21  punct      _  _
18  :-RRB-     :-rrb-_NFP     SYM    NFP  _  3   discourse  _  _
1   *          *              PUNCT  NFP  _  2   punct      _  _
12  *          *              PUNCT  NFP  _  2   punct      _  _
1   *          *              PUNCT  NFP  _  2   punct      _  _

Is this deliberate, or is it some kind of error or compatibility issue?

Personally I would drop the _NFP suffix of lemmas.
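
For illustration, dropping the suffix could look roughly like the sketch below. It assumes the data is in the usual 10-column, tab-separated CoNLL-U layout shown above (lemma in column 3, language-specific POS in column 5); the function name `strip_nfp_suffix` is hypothetical and this is not the maintainers' actual fix.

```python
NFP_SUFFIX = "_NFP"


def strip_nfp_suffix(line):
    """Return a CoNLL-U line with a trailing _NFP removed from the lemma.

    Only token lines whose XPOS column is NFP are changed; comments,
    blank lines, and malformed lines are returned untouched.
    """
    if not line.strip() or line.startswith("#"):
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:  # assumption: standard 10-column CoNLL-U
        return line
    lemma, xpos = cols[2], cols[4]
    # Keep a lemma that consists of the suffix alone, to avoid emptying it.
    if xpos == "NFP" and lemma.endswith(NFP_SUFFIX) and len(lemma) > len(NFP_SUFFIX):
        cols[2] = lemma[: -len(NFP_SUFFIX)]
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(cols) + newline


if __name__ == "__main__":
    import sys

    # Filter a CoNLL-U file from stdin to stdout.
    for raw in sys.stdin:
        sys.stdout.write(strip_nfp_suffix(raw))
```

Run as a filter over a treebank file, this would leave lemmas such as `***` alone and rewrite `-_NFP` to `-` and `:-rrb-_NFP` to `:-rrb-`.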

EDIT: I updated the example to use -RRB- instead of ); I had run grep on the wrong branch.

manning added the bug label Oct 16, 2015

manning commented Oct 16, 2015

Hi @foxik, this is indeed an error, thanks! "NFP" is a new Penn Treebank POS tag for Non-Final Punctuation. I guess something is going wrong with it in our pipeline, and the tag is living on in the lemmas. This should be fixed.

manning commented Oct 28, 2015

Fixed in our version 1.2 release candidate.

manning closed this as completed Oct 28, 2015
amir-zeldes added a commit that referenced this issue Jun 9, 2020
  * Fixes #88
  * Implemented using the following DepEdit script:

```
#no special adverbial status for WH adverb subordinations
lemma=/^(when|how|where|while|why|whenever|wherever)$/&func=/mark/&head=/(.*)/&pos=/ADV/	none	#1:pos=SCONJ
lemma=/^(when|how|where|while|why|whenever|wherever)$/&func=/advmod/&head=/(.*)/	none	#1:func=mark;#1:pos=SCONJ;#1:head2=$2:mark

#exception for WH adverbs in questions, identified by question mark and not being an advcl
func!=/advcl/;lemma=/^(when|how|where|while|why|whenever|wherever)$/&pos=/SCONJ/&func=/mark/&head=/(.*)/;text=/^(\?+)!?$/	#1>#2;#2.*#3	#2:func=advmod;#2:pos=ADV;#2:head2=$2:advmod

#exception for 'why not'
func=/root/;lemma=/why/&func=/mark/&head=/(.*)/;lemma=/not/	#1>#2;#2.#3	#2:func=advmod;#2:pos=ADV;#2:head2=$1:advmod

#exception for do support
func=/root/;lemma=/^(why|how|when|where)$/&func=/mark/&head=/(.*)/;lemma=/do/	#1>#2;#2.#3	#2:func=advmod;#2:pos=ADV;#2:head2=$2:advmod
```