Issues with bicleaner-classify #77

Closed
dbradley407 opened this issue Feb 23, 2023 · 6 comments

dbradley407 commented Feb 23, 2023

I have been trying to get bicleaner-classify to work on a small sample file, using the en-fr data downloaded from the site for the metadata. I am running the following command:

~/.local/bin/bicleaner-classify en-fr_short.txt corpus.en-fr.classified ./en-fr/en-fr.yaml

It exits with an error, leaves the output file empty, and prints the following:

OSError: Cannot read model '/home/daniel.bradley/Documents/VSCODE/bicleanerdocker/en-fr/model.lm.en' (lm/model.cc:49 in void lm::ngram::detail::{anonymous}::CheckCounts(const std::vector<long unsigned int>&) threw FormatLoadException because counts.size() > 6'. This model has order 7 but KenLM was compiled to support up to 6. If your build system supports changing KENLM_MAX_ORDER, change it there and recompile. With cmake: cmake -DKENLM_MAX_ORDER=10 .. With Moses: bjam --max-kenlm-order=10 -a Otherwise, edit lm/max_order.hh.)

During install I used the command:
pip install https://github.com/kpu/kenlm/archive/master.zip --install-option="--max_order 7"
so I was under the impression that the max order was already set to 7. Is there something I am missing? Thanks in advance.


ZJaume commented Feb 23, 2023

It seems that either the pip install you are running is not installing kenlm in the right place, or you previously had kenlm installed and running bicleaner from .local is not picking up the right kenlm installation.

dbradley407 commented

I have uninstalled and reinstalled every instance of kenlm on my device but keep getting the same error. I also moved bicleaner-classify to my working directory and have been running it directly from there. Where is the correct place to install kenlm, and how can I check which installation the program is using?


ZJaume commented Feb 24, 2023

Please install everything in a virtual environment.

python -m venv venv
source venv/bin/activate
pip install pip==22.3.1
pip install bicleaner
pip install https://github.com/kpu/kenlm/archive/master.zip --install-option="--max_order 7"
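
To check which kenlm installation the active interpreter would actually pick up, something like the following can be run with the venv activated. This is a minimal sketch, not part of bicleaner or kenlm: the helper name `describe_kenlm_install` is hypothetical, and it relies only on the standard library (`importlib.util.find_spec`, `sys.prefix`, `sys.base_prefix`). It locates the module without importing it, so it also works when kenlm is missing.

```python
import importlib.util
import sys


def describe_kenlm_install():
    """Report the running interpreter and where `kenlm` would be imported from.

    Hypothetical helper for illustration; not part of bicleaner or kenlm.
    """
    # find_spec locates a top-level module without importing it;
    # it returns None when the module is not installed for this interpreter.
    spec = importlib.util.find_spec("kenlm")
    return {
        "interpreter": sys.executable,
        # Inside a venv, sys.prefix points at the venv and differs from base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "kenlm_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_kenlm_install().items():
        print(f"{key}: {value}")
```

If `kenlm_location` points under `~/.local` or a system directory rather than the venv, the interpreter is probably still resolving an older build.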


ZJaume commented Feb 24, 2023

There have been some changes in the latest kenlm, and I'm not even able to install it (it takes a long time) to check whether I get the same error. Could you please try installing a previous version?

pip uninstall kenlm
pip install https://github.com/kpu/kenlm/archive/ba17d213a27bbc7263cc30d4f293967aa1021cff.zip --install-option="--max_order 7"
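
After reinstalling, one way to confirm that the build accepts the order-7 model, without running the whole classifier, is to load the model directly through the kenlm Python module. The sketch below assumes the module's `kenlm.Model(path)` constructor and its `order` property; the function name `check_model_loads` is hypothetical. As the traceback in this issue shows, a failed load surfaces as `OSError`, including the max-order mismatch.

```python
def check_model_loads(model_path):
    """Try to load a KenLM model; return its n-gram order or the load error.

    Hypothetical helper for illustration; not part of bicleaner or kenlm.
    """
    try:
        import kenlm  # imported lazily so a missing install is reported, not raised
    except ImportError as exc:
        return f"kenlm is not importable: {exc}"
    try:
        model = kenlm.Model(model_path)
    except OSError as exc:
        # Load failures, including "compiled to support up to 6", arrive here.
        return f"could not load model: {exc}"
    return f"loaded model of order {model.order}"


# Example call; the path is the model file named in the error message above:
# print(check_model_loads("en-fr/model.lm.en"))
```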

dbradley407 commented

Thanks for all the advice. It seems the older version of KenLM actually works, so I will just use that for the moment. Thanks again!


ZJaume commented Feb 24, 2023

I'll keep this open until KenLM fixes it or we change the default install instructions.

ZJaume reopened this Feb 24, 2023
ZJaume self-assigned this Feb 24, 2023
ZJaume closed this as completed Mar 10, 2023