SaarlandMweDiscussion
Discussion of multiword expressions (MWEs), inspired by Ann's participation in PARSEME.
We have started a new page on MWEs (MweTop) to which we will link various relevant things.
- things with spaces
  - some interfacing with morphology
- things made into a single predicate
  - look up
- things recognized as larger things with idiom matching
  - determiner-less PPs (in hospital)
    - also occurs outside/slightly genericy
    - semi-lexicalized
  - idiom thingies (keep tabs on)
    - some work at NTU/CSLI on possessive idioms
    - note: not marked as a unit in the MRS output
    - supported by LKB and ACE
  - different types of idiomaticity: detless_pp vs flexible idioms
  - we have paraphrase rules for many of these
    - but not perfect (out of your tiny mind)
- how does this interface with chart mapping/tokenization?
- what about the idiomatic/non-idiomatic distinction?
  - we don't enforce it perfectly
- we maybe have more examples of MWEs with structure than anyone else
  - although we don't have as many examples as e.g. in wordnet
- SRG: words with spaces, verb+particle, idioms (take into account)
- Matrix: no idioms (FCB: there is documentation on the wiki)
- NorSource: not yet
- Burger: some types for verb+complement
- Jacy: all kinds, even documentation
  - not so good with things like te-nakareba-narimasen, complex PPs
- Hegram: nothing
- MCG: nothing
  - Chengyu (four-character idioms)
    - treat them as non-compositional
    - NTU has a list of these with some more information (with help from Mike and Ning)
    - there are also non-Chengyu idioms
- we can have both internal and external modification (for some idioms)
  - the cat kicked all nine buckets (Mike)
- a lot of regional use
- treat "proverbial" the same as "fucking" (can go anywhere)
- in general adding MWEs adds ambiguity so we tend not to add them
  - if they help in parse-selection it would be worth putting them in
  - even very common things like "thank you" and "good morning"
- institutionalized phrases (traffic light/traffic signal)
- light verbs/light verby idioms (give a rat's arse [about])
- proverbs: how do we handle these? (a stitch in time saves nine)
  - interesting cross-lingually
  - often contain frozen bits of older grammars
- fixed foreign phrases (que sera sera)
  - interesting to see if there are differences in flexibility between old English vs foreign
- NPIs are on the edge of this phenomenon
- things like you may wish to -> you should (post-process)
- If you like currently words-with-space in ERG
  - MWEs with structure in wordnets
  - Lots of work in Japan, e.g. on idiom/literal (Chikara)