While checking the numeral dimension, I found an issue with the numeral extractor.
In Persian, صدرا is a first name, and صد means "hundred". When the name is fed into Duckling, the output is 100را, which is incorrect. I tried some other examples and found the same problem with the verb بده (which means "give"): it is converted to ب10 because ده means "ten" in Persian.
How can I configure Duckling so that rules apply only to space-separated tokens?
AmirMohamadBabaee changed the title from "extract incorrect number from text in Persian" to "[Persian][Numeral] Extract incorrect number from text" on Feb 8, 2023
Hello @AmirMohamadBabaee, I had the same error. "بده" contains the substring "ده", so the regex matches "ده" inside it. I solved this by adding a boundary group at each end of the regex: (?:^|\s) at the beginning and (?:$|\s) at the end. This ensures that "ده" is extracted only when it is a standalone word, not a substring of another word.
For example, in the ruleToNineteen function, write the regex like this:
regex "(?:^|\\s)(صفر|یک|سه|چهارده|چهار|پنج|شی?ش|هفت|هشت|نه|یازده|دوازده|سیزده|پ(ا|و)نزده|ش(ا|و)نزده|هی?فده|هی?جده|نوزده|ده|دو)(?:$|\\s)"
Note the two backslashes before s (\ + \ + s): the pattern is a Haskell string literal, so the backslash itself must be escaped. The page may render them as a single backslash.
When you test again, the problem should be fixed, I hope :)
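To make the effect of the boundary groups easier to check outside Duckling, here is a minimal sketch in Python. It is not Duckling code: the abbreviated alternation `NUMERAL`, the helper `find_numerals`, and the lookaround variant are assumptions introduced only for illustration, and Python's `re` module stands in for the regex engine Duckling uses.

```python
import re

# Abbreviated numeral alternation, for illustration only (not Duckling's full rule).
NUMERAL = r"(چهارده|یازده|ده|دو|صد)"

# Unanchored: matches numerals anywhere, including inside other words.
NAIVE = re.compile(NUMERAL)

# Boundary groups as suggested in the thread: start/end of text or whitespace.
BOUNDED = re.compile(r"(?:^|\s)" + NUMERAL + r"(?:$|\s)")

# Hypothetical variant with lookarounds: same boundaries, whitespace not consumed.
LOOKAROUND = re.compile(r"(?<!\S)" + NUMERAL + r"(?!\S)")


def find_numerals(pattern, text):
    """Return the numeral words captured by group 1 of `pattern` in `text`."""
    return [m.group(1) for m in pattern.finditer(text)]


if __name__ == "__main__":
    for text in ["بده", "صدرا", "ده", "ده دو"]:
        print(
            text,
            find_numerals(NAIVE, text),
            find_numerals(BOUNDED, text),
            find_numerals(LOOKAROUND, text),
        )
```

In this sketch the unanchored pattern finds "ده" inside "بده" and "صد" inside "صدرا", while the bounded pattern finds neither and still matches a standalone "ده". One caveat: because (?:$|\s) consumes the whitespace, two numerals separated by a single space ("ده دو") yield only the first match with the bounded pattern. The lookaround variant finds both, assuming the target regex engine supports lookbehind and lookahead.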