Possible to get parse tree output similar to AllenNLP? #11
The official XML output of Alpino consists of dependency structures, not
parse trees, so it is not just the format that is different.
There are other, less well documented, output formats. With the option
end_hook=syntax you get something that looks close to what you describe.
Alpino -notk end_hook=syntax -parse Dit is een prachtige zin
0| [ @top_cat [ @start [ @max [ @root [ @np [ @pron [ @det Dit ] ] ] [
@optpunct ] [ @sv1 [ @v is ] [ @v2_vp [ @vpx [ @vproj [ @pred [ @np [ @det
een ] [ @n [ @A prachtige ] [ @n zin ] ] ] ] [ @vproj [ @vc [ @vb [ @v ] ]
] ] ] ] ] ] ] ] ] [ @optpunct ] ]
Gertjan
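For readers who want this bracketing in the parenthesised notation from the question, a minimal Python sketch follows. It only changes the notation: the labels remain Alpino's internal ones and the tree shape is Alpino's, not a Penn Treebank analysis. It assumes the output is whitespace-separated exactly as shown above (`[`, `]`, `@label`, words) and that an optional key such as `0|` precedes the tree. The function names are made up for illustration and are not part of Alpino.

```python
import re


def tokenize(line):
    """Split one end_hook=syntax result into tokens, dropping a leading key like '0|'."""
    line = re.sub(r"^\s*\S*\|\s*", "", line)
    return line.split()


def parse(tokens):
    """Build nested (label, children) tuples from '[ @label child ... ]' tokens."""
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "[" or not tokens[pos + 1].startswith("@"):
            raise ValueError(f"expected '[ @label' at token {pos}")
        label = tokens[pos + 1][1:]  # strip the leading '@'
        pos += 2
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(node())       # nested constituent
            else:
                children.append(tokens[pos])  # plain word
                pos += 1
        pos += 1  # consume the closing ']'
        return label, children

    tree = node()
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the tree")
    return tree


def to_penn(tree):
    """Render a (label, children) tree as '(label child ...)'."""
    label, children = tree
    parts = [c if isinstance(c, str) else to_penn(c) for c in children]
    return "(" + " ".join([label] + parts) + ")"


# Fragment taken from the output above.
sample = "0| [ @np [ @det een ] [ @n [ @A prachtige ] [ @n zin ] ] ]"
print(to_penn(parse(tokenize(sample))))
# prints: (np (det een) (n (A prachtige) (n zin)))
```

Nodes without children, such as `[ @optpunct ]`, come out as `(optpunct)`; whether to drop them is left to the caller. Output that is wrapped over several lines can be joined into one string before tokenizing, since the tokenizer splits on any whitespace.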
On Wed, Feb 23, 2022 at 3:14 PM Julián Venhuizen wrote:
Is it possible to have Alpino output the parse tree in the following
format:
In: "Several theories about the higher prevalence in males have been
investigated, but the cause of the difference is unconfirmed; one theory is
that females are underdiagnosed."
Out: (S (S (S (NP (NP (JJ Several) (NNS theories)) (PP (IN about) (NP (NP
(DT the) (JJR higher) (NN prevalence)) (PP (IN in) (NP (NNS males)))))) (VP
(VBP have) (VP (VBN been) (VP (VBN investigated))))) (, ,) (CC but) (S (NP
(NP (DT the) (NN cause)) (PP (IN of) (NP (DT the) (NN difference)))) (VP
(VBZ is) (ADJP (JJ unconfirmed))))) (: ;) (S (NP (CD one) (NN theory)) (VP
(VBZ is) (SBAR (IN that) (S (NP (NNS females)) (VP (VBP are) (ADJP (JJ
underdiagnosed))))))) (. .))
This output is currently achieved through the use of AllenNLP and a minimal
span-based neural constituency parser <https://arxiv.org/abs/1705.03919>.
However, as I'm also working with Dutch data, I intend to use the Alpino
parser. If the above output isn't possible, I suspect I will have to go over
the XML output and work something out myself.
Thank you. That does indeed look similar. Do you have any documentation on the meaning of the tags in your example? I am unable to find anything online. It would help me a lot if I were to 'translate' these tags to the Penn Treebank bracket labels used in my example output above.
Nope, these labels were used internally. Not documented, I fear.
GJ
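Since the labels are undocumented, any translation to Penn Treebank labels has to be built by hand. One way to start, sketched below in Python, is to count which labels actually occur in your own parses and then apply a hand-made mapping, leaving unmapped labels untouched so they stay visible. The helper names and the one-entry mapping are placeholders for illustration, not documented equivalences. The sketch also assumes every token starting with `@` is a label, which fails for words that themselves start with `@`.

```python
import re
from collections import Counter

# A label is written as '@' followed by non-whitespace characters.
LABEL = re.compile(r"@(\S+)")


def label_inventory(bracketed):
    """Count how often each internal label occurs in a bracketed parse."""
    return Counter(LABEL.findall(bracketed))


def relabel(bracketed, mapping):
    """Replace labels found in `mapping`; labels without an entry are kept as-is."""
    return LABEL.sub(lambda m: "@" + mapping.get(m.group(1), m.group(1)), bracketed)


# Placeholder mapping: a guess for illustration, NOT taken from any documentation.
guessed_mapping = {"np": "NP"}

parse_line = "[ @np [ @det een ] [ @n [ @A prachtige ] [ @n zin ] ] ]"
print(label_inventory(parse_line).most_common())
print(relabel(parse_line, guessed_mapping))
```

Running the inventory over a larger set of parsed sentences gives a list of labels to work through; note that the same label (here `n`) can appear at both word and phrase level, so a one-to-one mapping may not be enough.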
On Wed, Mar 2, 2022 at 3:40 PM Julián Venhuizen wrote:
Thank you. That does indeed look similar. Do you have any documentation on
the meaning of the tags in your example? I am unable to find anything
online. It would help me a lot if I were to 'translate' these tags to the
Penn Treebank bracket labels used in my example output above.