could you add chinese support? #145

Closed
jjwt opened this issue Apr 24, 2014 · 11 comments

@jjwt

jjwt commented Apr 24, 2014

Thank you for the excellent plugin.
In the help, I find that:

|Easymotion| can match multibyte Japanese character with alphabetical input.

I think Chinese could be supported in the same way.
So, after a few days' work, I replaced the content of the file

vim-easymotion/autoload/EasyMotion/migemo/utf8.vim

with the content here (I don't know how to upload a file here), and put the line below in my vimrc:

let g:EasyMotion_use_migemo = 1

and it works!
However, replacing the file is not a good approach, so could you add Chinese support
based on this?

It might look like this:

  1. add a new directory named migemo_chinese
  2. add utf8.vim, cp936.vim and gb18030.vim (since utf8, cp936 and gb18030
    are commonly used in China)
  3. add an option like:
    let g:EasyMotion_use_migemo_chinese = 1
@yuex
Contributor

yuex commented Apr 24, 2014

@jjwt, hi

For Chinese, things like what you mentioned have already been done; for example, check out vimim. Implementing it in easymotion may be a kind of reinventing the wheel. And even if no other plugin had done it yet, I still don't think easymotion should do it. In my opinion, the feature you mentioned belongs in a language plugin, not in a motion plugin like easymotion. Perhaps, it's much better if easymotion can provide a way to let the user-defined vimscript function overwrite the function that returns the list of matched positions. But that's just my opinion.
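
The override idea can be sketched in Python rather than Vimscript, purely as an illustration. Every name here (`find_targets`, `default_matcher`, `caseless_matcher`) is hypothetical and not part of EasyMotion's actual API: the motion side calls whatever matcher it is given, and a language plugin could supply its own.

```python
from typing import Callable, List, Tuple

Position = Tuple[int, int]  # (line, column), both 1-based
Matcher = Callable[[List[str], str], List[Position]]


def default_matcher(lines: List[str], query: str) -> List[Position]:
    """Return every position where `query` occurs literally."""
    hits = []
    for lnum, text in enumerate(lines, start=1):
        col = text.find(query)
        while col != -1:
            hits.append((lnum, col + 1))
            col = text.find(query, col + 1)
    return hits


def caseless_matcher(lines: List[str], query: str) -> List[Position]:
    """A user-supplied replacement: the same search, ignoring case."""
    return default_matcher([t.lower() for t in lines], query.lower())


def find_targets(lines: List[str], query: str,
                 matcher: Matcher = default_matcher) -> List[Position]:
    """The motion side only asks the matcher for positions."""
    return matcher(lines, query)
```

A Pinyin-aware matcher would plug in the same way as `caseless_matcher`, without the motion code knowing anything about Chinese.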

@jjwt
Author

jjwt commented Apr 24, 2014

@yuex hi

it's much better if easymotion can provide a way to let the user-defined vimscript function overwrite the function that returns the list of matched positions.

  1. That really would be much better.
    Every non-English language needs a way
    to move quickly among non-ASCII words, and migemo may offer a general
    solution, not only for Chinese or Japanese.
    Maybe you could provide an interface (a function?) that handles
    a user-supplied file in a characters-to-words format?
  2. As for vimim, I know it, but it has quite a few problems and has not
    been updated for 2 years. And in my opinion, it is more of an
    IME than a fast-motion plugin.

@haya14busa
Member

Hi, @jjwt @yuex

Perhaps, it's much better if easymotion can provide a way to let the user-defined vimscript function overwrite the function that returns the list of matched positions

Yeah, I agree. Maybe this issue is related to #135. It's not difficult to implement, but there are some problems I need to consider further, so please wait a little.

As for Chinese migemo, I don't know of a good dictionary that converts the alphabet to Chinese. Actually, I didn't make EasyMotion's Japanese dictionary; it was just generated with the external cmigemo command-line tool.

If you want EasyMotion to integrate with chinese, please let me know the conversion tool from alphabet to chinese.

@jjwt
Author

jjwt commented May 13, 2014

Hi, @haya14busa
Sorry for replying so late.

There are many Chinese input methods for computers, most of which
fall into one of two categories: phonetic readings or root shapes.
For the dictionary above I used the phonetic kind, which we call the Pinyin input method.
See more here.

Every single Chinese character can be spelled with several Latin letters, e.g.
"我" is spelled "wo".
I use only the first letter, so the mapping is:
w:[我]
The same goes for the other single-character words.
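
A small Python sketch of this first-letter mapping. The three sample readings are standard Pinyin (我 wo, 你 ni, 文 wen); the function name `first_letter_classes` is made up for illustration.

```python
from collections import defaultdict


def first_letter_classes(readings):
    """Group characters by the first letter of their Pinyin reading.

    `readings` maps a character to its reading, e.g. {"我": "wo"}.
    Returns {letter: character class}, e.g. {"w": "[我]"}.
    """
    groups = defaultdict(set)
    for char, pinyin in readings.items():
        groups[pinyin[0].lower()].add(char)
    # Sort the characters so the output is stable between runs.
    return {k: "[" + "".join(sorted(v)) + "]" for k, v in sorted(groups.items())}


classes = first_letter_classes({"我": "wo", "你": "ni", "文": "wen"})
print(classes["w"])  # both 我 and 文 are reachable by typing "w"
```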

As I said above, there are many different input methods, and from here
I learned that the same is true for Japanese :)

Different methods need different dictionaries mapping letters to words,
and these already ship with the operating system. This is what I did on my computer:

  1. My computer runs the Simplified Chinese edition of Windows XP; other systems should be similar.

  2. cd to c:\WINDOWS\system32\, where I find several MB files:

    • WINPY.MB
    • WINSP.MB
    • winwb86.MB
    • WINZM.MB

    Of these, WINPY.MB is the one we need; "PY" stands for "Pinyin".

  3. Run c:\Program Files\Windows NT\Accessories\Imegen.exe.

  4. Go to the "逆转换" tab page

    (that is the label on my computer; in English it may be "reverse transform").

  5. Click "浏览"

    (that is the label on my computer; in English it may be "browse").

  6. Select the WINPY.MB file mentioned above.

  7. Click "逆转换"

    (that is the label on my computer; in English it may be "reverse transform").
    As a result, a file named "WINPY.TXT" is generated in
    c:\WINDOWS\system32\

  8. On my computer, its first 23 lines are:

    [Description]
    Name=全拼
    MaxCodes=12
    MaxElement=1
    UsedCodes=abcdefghijklmnopqrstuvwxyz
    WildChar=?
    NumRules=3
    [Rule]
    ca4=p10+p20+p30+p40
    ce2=p10+p20
    ce3=p10+p20+p30
    [Text]
    啊a
    阿a
    呵a
    吖a
    嗄a
    腌a
    锕a
    錒a
    阿爸aba
    阿昌achang
    厑aes
    
  9. The next step is to remove the entries that are not single characters, then merge the rest and sort them from a to z.
    I wrote the following in convert_mb.py:

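    import vim  # only available inside Vim's embedded Python (:py3file)

    # Drop the 12 header lines ([Description], [Rule] and the [Text] marker).
    vim.command(r'1,12d')
    # Keep only single-character entries: one non-word char, then a letter.
    vim.command(r'v/\v^\W\w/d')
    b = vim.current.buffer
    c = {}  # first letter of a reading -> set of characters
    for v in b:
        # v[0] is the character; the rest is split on whitespace into readings.
        word, pys = v[0], [j[0] for j in v[1:].split()]
        for key in pys:
            c.setdefault(key, set()).add(word)
    d = []
    for key in sorted(c):
        # Sort the characters so the output is the same on every run.
        chars = ''.join(sorted(c[key]))
        # Emit one entry each for the lower-case and upper-case key.
        d.append(r"\ '{}' : '\%([{}]\)',".format(key, chars))
        d.append(r"\ '{}' : '\%([{}]\)',".format(key.upper(), chars))
    b[:] = sorted(d)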
    

    Then I opened the txt file and ran:

    :py3file convert_mb.py
    
  10. You know what comes next.

I hope this helps; the same method should also work for Japanese.
I counted the words in the dictionary and there are only about 26000, so I think
the MB file covers just the GBK code range; maybe I need another MB file that
covers the gb18030 range.
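
For readers without Vim's Python interface, the same conversion can be sketched as plain Python. This is a sketch under assumptions: it expects the export to look like the excerpt in step 8 (entries follow a `[Text]` line, and each entry is Chinese characters immediately followed by lowercase code letters). The function name `build_entries` is illustrative, and the text encoding of the real WINPY.TXT is not stated in this thread, so reading the file is left to the caller.

```python
import re
from collections import defaultdict

# One entry: non-ASCII characters, then lowercase codes (space-separated).
ENTRY = re.compile(r"^([^\x00-\x7f]+)([a-z ]+)$")


def build_entries(lines):
    """Turn exported IME table lines into Vim dictionary entries."""
    groups = defaultdict(set)
    in_text = False
    for line in lines:
        line = line.strip()
        if line == "[Text]":
            in_text = True
            continue
        m = ENTRY.match(line) if in_text else None
        if m is None or len(m.group(1)) != 1:
            continue  # header line, or a multi-character word
        for code in m.group(2).split():
            groups[code[0]].add(m.group(1))
    out = []
    for key in sorted(groups):
        chars = "".join(sorted(groups[key]))
        for k in (key, key.upper()):
            out.append(r"\ '{}' : '\%([{}]\)',".format(k, chars))
    return sorted(out)


sample = ["[Description]", "Name=全拼", "[Text]", "啊a", "阿a", "阿爸aba"]
for entry in build_entries(sample):
    print(entry)
```

Unlike the Vim version, this finds the `[Text]` marker instead of deleting a fixed number of header lines, so it does not depend on the header being exactly 12 lines long.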

@jjwt
Author

jjwt commented May 13, 2014

hi, @haya14busa

If you want EasyMotion to integrate with chinese, please let me know the conversion tool from alphabet to chinese.

The above may be too tedious. In short, for XP:

  1. tool: imegen.exe
  2. dictionary file: winpy.mb
  3. method: reverse transform, then a few operations in Vim.

@weakish

weakish commented Nov 28, 2014

There are many Chinese input methods for computers

@jjwt IMO, although different input methods exist (there are even input methods that convert English words to Chinese), the equivalent of Romaji for Japanese (as used by cmigemo) is Pinyin.

@weakish

weakish commented Nov 28, 2014

As for Chinese migemo, I don't know of a good dictionary that converts the alphabet to Chinese.

@haya14busa There are a lot of dictionaries.

For example, this one contains both Traditional (zh_TW) and Simplified (zh_CN) Chinese characters.

@haya14busa
Member

For example, this one contains both Traditional (zh_TW) and Simplified (zh_CN) Chinese characters.

Thanks @weakish.
It looks good to use as a conversion dictionary. Can you make a dictionary for Vim script like cmigemo's?

@weakish

weakish commented Dec 2, 2014

Can you make a dictionary for Vim script like cmigemo's?

I have converted it to:

  1. a cmigemo compatible dictionary file

    I've tested it with cmigemo. It works.

  2. zh_utf8.vim

    A built-in version, so that a Vim EasyMotion user can use the basic functions without cmigemo installed.

    I followed the format of the Japanese ones in the EasyMotion repository and auto-converted it with this Rakefile (https://github.com/weakish/migemo-dict-zh/blob/master/Rakefile.rb).
    But I don't know VimL, so I am not sure whether the format is correct.
    Please point out any errors.

These dictionaries are UTF-8 only.
It's trivial to convert them into other Chinese encodings.
Please tell me if you think it's necessary to add other encodings.
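
As a hedged illustration of that conversion: Python's built-in codecs include `cp936` and `gb18030`, so re-encoding is a decode followed by an encode. The function name `reencode` is made up here. A character that the narrower cp936 encoding cannot represent raises an error instead of being dropped silently.

```python
def reencode(utf8_bytes: bytes, target: str) -> bytes:
    """Decode UTF-8 dictionary content and re-encode it as `target`.

    Raises UnicodeEncodeError if `target` cannot represent a character.
    """
    return utf8_bytes.decode("utf-8").encode(target)


source = "w:[我]".encode("utf-8")
for name in ("cp936", "gb18030"):
    converted = reencode(source, name)
    # Round-trip check: decoding with the target codec gives the same text.
    assert converted.decode(name) == "w:[我]"
```

If a dictionary file declares its own encoding (for example with Vim's `scriptencoding`), that declaration would need to change along with the bytes.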

@haya14busa
Member

Pull requests are welcome, but I don't have any plans to work on this right now.

@VimWei

VimWei commented Jun 17, 2021

Is there any update? The following plugin works fine independently: Plug 'ppwwyyxx/vim-PinyinSearch'.
