Optimised eif_new.py #24
Is this still an active project?
That's a good question, @wundermahn. If you want the optimised Python version, you can get it directly from my fork.
Hi there, this would be the fix for my problem as well, wouldn't it? I am currently trying to pickle the isolationForest model and failing due to some Cython issue.
Hi @psmgeelen, yes, you can't save models from the Cython version. Try my fork: its performance is similar to the Cython version, but it is implemented in Python (with Numba optimisations).
@lpryszcz, you are the best! I will get on it now! So I really only need the EDIT: It works out of the box, I love the script! Small question though: does it make sense to have a threshold that is always 0.5? Instead you could just return the scores directly.
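To make the threshold question concrete, here is a minimal sketch of the two options: labelling with a fixed cutoff of 0.5 versus keeping the raw anomaly scores and letting the caller choose a cutoff, for example from an expected fraction of outliers. The function names, the `contamination` parameter, and the score values are hypothetical and are not taken from the fork.

```python
def label_fixed(scores, threshold=0.5):
    """Flag every score strictly above a fixed threshold."""
    return [s > threshold for s in scores]


def label_top_fraction(scores, contamination):
    """Flag the highest-scoring fraction of points instead of using a fixed cutoff."""
    k = max(1, round(contamination * len(scores)))
    cutoff = sorted(scores, reverse=True)[k - 1]
    return [s >= cutoff for s in scores]


# Made-up anomaly scores in [0, 1]; higher means more anomalous.
scores = [0.41, 0.44, 0.47, 0.43, 0.58, 0.46, 0.49, 0.42]

fixed = label_fixed(scores)                      # only 0.58 exceeds 0.5
top_quarter = label_top_fraction(scores, 0.25)   # flags 0.58 and 0.49
```

Returning the raw scores keeps both options open, since the caller can apply either rule afterwards.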
I'm glad it works for you :) And thanks for the recommendation, @psmgeelen. I'd be more than happy to contribute to scikit-learn given there is interest from their side.
I've optimised the Python version so that it matches the performance of the C++ version and allows saving the models.
A runtime example has been added to Notebooks/comparison_py_cxx.ipynb.
The code was rewritten entirely. Some functions are optimised with Numba.
The iForest is now a NumPy array, which allows fast computation and dumping the model with a low storage footprint.
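As a rough illustration of why an array-based forest is easy to save, the sketch below stores one isolation tree as a flat 2-D NumPy array and round-trips it through `np.save` / `np.load`. The node layout (`feature, threshold, left, right, size`), the helper names, and the axis-aligned splits are assumptions made for this example. They are not the layout or the extended (hyperplane) splits used in this PR.

```python
import io

import numpy as np

# Hypothetical node layout (an assumption, not the PR's actual format):
# one row per node; FEATURE == -1 marks a leaf.
FEATURE, THRESHOLD, LEFT, RIGHT, SIZE = range(5)


def build_tree(X, rng, max_depth):
    """Build one axis-aligned isolation tree and return it as a 2-D float array."""
    nodes = []

    def grow(idx, depth):
        node_id = len(nodes)
        # Start as a leaf; it is turned into an internal node below if a split is made.
        nodes.append([-1.0, 0.0, -1.0, -1.0, float(len(idx))])
        if depth >= max_depth or len(idx) <= 1:
            return node_id
        f = rng.integers(X.shape[1])
        lo, hi = X[idx, f].min(), X[idx, f].max()
        if lo == hi:  # all values identical, nothing to split on
            return node_id
        t = rng.uniform(lo, hi)
        mask = X[idx, f] < t
        left = grow(idx[mask], depth + 1)
        right = grow(idx[~mask], depth + 1)
        nodes[node_id][:4] = [float(f), t, float(left), float(right)]
        return node_id

    grow(np.arange(len(X)), 0)
    return np.array(nodes, dtype=np.float64)


def path_length(tree, x):
    """Count the edges from the root to the leaf that x falls into."""
    node, depth = 0, 0
    while tree[node, FEATURE] >= 0:
        f = int(tree[node, FEATURE])
        go_left = x[f] < tree[node, THRESHOLD]
        node = int(tree[node, LEFT] if go_left else tree[node, RIGHT])
        depth += 1
    return depth


rng = np.random.default_rng(0)
X = rng.normal(size=(256, 2))
tree = build_tree(X, rng, max_depth=8)

# Round-trip through an in-memory buffer; a file path works the same way.
buf = io.BytesIO()
np.save(buf, tree)
buf.seek(0)
restored = np.load(buf)
```

Because the whole tree is a plain numeric array, saving it needs no custom serialisation code, and the restored array gives the same path lengths as the original.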