Commit
Convert double-letter translations to capital letters, as it is very hard to understand what the translation is because of the ambiguity. For example: kh - is it k and h, or kh? tskh - is it t,s,kh or ts,k,h or ts,kh, etc...

0xa2 Hebrew Bible punctuation mark, should be ignored.
0xc6 Opposite Nun, same as 'n'.
0xba Hulam Haser, vowel as 'o'.
0xbf Makaf Raphe, same as Makaf (0xbe).
0xc5 Hebrew Bible punctuation mark, should be ignored.
0xc7 Makaf katan, vowel as 'o'.
0xd0 Aleph, sounds as AHA, must exist to make the string readable. Distinguish from '`'; use capital A to distinguish from the 'a' vowel.
0xf5 Split Vave, same as 'v'.
0xf6 Opposite Nun, same as 'n'.
0xf7 Small Kuf, same as 'q'.

Signed-off-by: Alon Bar-Lev <[email protected]>
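For readers skimming the list, the mappings named in the commit message can be sketched as a small lookup keyed by the low byte of a U+05xx code point. This is only an illustration: `HEBREW_OVERRIDES` and `apply_overrides` are hypothetical names, the layout is not necessarily how the project stores its table, and 0xbf is left out because the message does not say what Makaf (0xbe) itself maps to.

```python
# Hypothetical excerpt: low byte of a U+05xx code point -> replacement string.
# Values are taken from the commit message; names and layout are assumptions.
HEBREW_OVERRIDES = {
    0xA2: "",   # Hebrew Bible punctuation mark: ignored
    0xBA: "o",  # Hulam Haser
    0xC5: "",   # Hebrew Bible punctuation mark: ignored
    0xC6: "n",  # Opposite Nun
    0xC7: "o",  # "Makaf katan" in the commit message
    0xD0: "A",  # Aleph, capital to distinguish it from the 'a' vowel
    0xF5: "v",  # Split Vave
    0xF6: "n",  # Opposite Nun
    0xF7: "q",  # Small Kuf
}


def apply_overrides(text):
    """Replace only the listed U+05xx characters; leave everything else as-is."""
    out = []
    for ch in text:
        cp = ord(ch)
        low = cp & 0xFF
        if 0x0500 <= cp <= 0x05FF and low in HEBREW_OVERRIDES:
            out.append(HEBREW_OVERRIDES[low])
        else:
            out.append(ch)
    return "".join(out)
```

A call such as `apply_overrides("\u05d0")` would go through the 0xD0 entry and yield the capital `A` described above.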
81f938d
@alonbl reverse-nun is a punctuation mark, so it should not be transliterated to 'n'.
Thanks for the notice; not that I thought anyone would actually use it :)
Feel free to submit a pull request as simple as:
Originally I believed it should be marked as unknown (None) rather than ignored (''), but now I see that the paragraph sign ¶ is transliterated to P, so your choice seems to be consistent. However, I do believe that these choices could be an issue, and I would like to have an option to avoid replacing punctuation marks with regular letters. I opened an issue for the more general case.
Also, @alonbl, could you kindly point me to why you added 05f5, 05f6, 05f7? I could not find these in the Unicode specification.
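To make the three outcomes discussed here concrete (mapped to a letter, ignored with `''`, unknown with `None`), plus the requested opt-out for punctuation, here is a rough sketch. Everything in it is an assumption for illustration: `TABLE`, `transliterate_char`, `transliterate`, and the `keep_punctuation` flag are hypothetical and not part of the project's API; punctuation is detected with the standard `unicodedata.category` call.

```python
import unicodedata

# Hypothetical mini-table; entries mirror the cases mentioned in this thread.
TABLE = {
    "\u05c6": "n",  # reverse Nun, mapped to a letter by the commit
    "\u05a2": "",   # ignored: contributes nothing to the output
    "\u00b6": "P",  # paragraph sign, as reported in the comment above
    "\u05d0": "A",  # Aleph
}


def transliterate_char(ch, keep_punctuation=False):
    """Return a replacement string, '' for ignored, or None for unknown."""
    # With the opt-out enabled, punctuation (general category P*) passes through.
    if keep_punctuation and unicodedata.category(ch).startswith("P"):
        return ch
    return TABLE.get(ch)  # None means the table has no entry for this character


def transliterate(text, keep_punctuation=False, unknown="[?]"):
    """Transliterate a string, marking unknown characters with a placeholder."""
    parts = []
    for ch in text:
        result = transliterate_char(ch, keep_punctuation)
        parts.append(unknown if result is None else result)
    return "".join(parts)
```

With `keep_punctuation=True`, the reverse Nun is left untouched instead of becoming `n`, while ignored marks still disappear and unmapped characters remain distinguishable from ignored ones.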