Compared with many other languages, word segmentation in Khmer is considerably more complicated. The Khmer language has no standard rule for using spaces to separate words; spaces are used only to make text easier to read. Moreover, the meaning of a Khmer word can change depending on the order of the words it is formed from, and a Khmer word can also be a compound of two or more Khmer words.
Because of these inconsistent spacing conventions and the complicated word structure described above, Khmer text is hard to segment into words.
Ref:
- Word segmentation: the user inputs a string of sentences and submits it; the response is the list of words in those sentences.
- Word checking: the user submits sentences; the response contains the sentences along with some suggested words.
- Word contribution: the user can input Khmer words together with their function (noun, verb, ...), which we then use to train our model.
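
To make the word segmentation feature concrete, here is a minimal sketch of the input/output shape: a string goes in and a list of words comes out. It uses greedy longest-match lookup against a dictionary, which is a simple baseline chosen for illustration and not the trained model this project uses. The function name `segment` and the tiny `DICTIONARY` are hypothetical.

```python
def segment(text, dictionary):
    """Greedy longest-match segmentation (illustrative baseline, not a trained model)."""
    max_len = max((len(word) for word in dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Spaces in Khmer text only aid reading, so they are skipped here.
        if text[i].isspace():
            i += 1
            continue
        # Try the longest candidate first and shrink until a dictionary word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                break
        else:
            # No dictionary word starts here: emit a single code point as a fallback.
            candidate = text[i]
        words.append(candidate)
        i += len(candidate)
    return words


# Hypothetical toy dictionary: "I", "love", "language", "Khmer",
# plus the compound "Khmer language" built from two shorter words.
I, LOVE, LANGUAGE, KHMER = "ខ្ញុំ", "ស្រឡាញ់", "ភាសា", "ខ្មែរ"
DICTIONARY = {I, LOVE, LANGUAGE, KHMER, LANGUAGE + KHMER}

sentence = I + LOVE + LANGUAGE + KHMER  # written without spaces
print(segment(sentence, DICTIONARY))
```

With this toy dictionary the compound is kept as one word instead of being split into its two parts, because the longest match wins. A dictionary-only approach like this cannot resolve cases where word order changes the meaning, which is one reason to train a model on contributed words.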