Question - Is there any benchmarking of ClassImbalance.jl's ability to handle datasets of different sizes? #69

Open
00krishna opened this issue Dec 22, 2019 · 1 comment

00krishna commented Dec 22, 2019

Describe the bug

This is just a question about ClassImbalance.jl's ability to handle datasets of different sizes. I was working with the Python imbalanced-learn package, and it keeps crashing when I give it a dataset of more than 2-3 million rows. Datasets this large are to be expected with imbalanced data, since it takes so many negative examples to get a positive one. I can find creative ways to "thin" the dataset, but I was wondering whether there are any tests of how the Julia package handles larger datasets?
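For illustration, one simple way to "thin" a dataset is random undersampling of the majority class before any resampling step. The sketch below is only that, a sketch: it uses the Python standard library alone, and `thin_majority`, `ratio`, and `seed` are hypothetical names made up for this example. They are not part of imbalanced-learn or ClassImbalance.jl.

```python
import random
from collections import Counter


def thin_majority(rows, labels, ratio=1.0, seed=0):
    """Randomly undersample the most frequent class (hypothetical helper).

    Keeps every row from the other classes and at most
    ratio * (number of non-majority rows) rows from the majority class.
    """
    rng = random.Random(seed)  # fixed seed so the thinning is reproducible
    counts = Counter(labels)
    majority = max(counts, key=counts.get)
    n_minority = sum(c for lab, c in counts.items() if lab != majority)
    n_keep = min(counts[majority], int(ratio * n_minority))

    majority_idx = [i for i, lab in enumerate(labels) if lab == majority]
    keep = set(rng.sample(majority_idx, n_keep))
    keep.update(i for i, lab in enumerate(labels) if lab != majority)

    order = sorted(keep)  # preserve the original row order
    return [rows[i] for i in order], [labels[i] for i in order]


# Toy data: 1000 negatives and 10 positives.
labels = [0] * 1000 + [1] * 10
rows = [[float(i)] for i in range(len(labels))]

thin_rows, thin_labels = thin_majority(rows, labels, ratio=5.0)
print(Counter(thin_labels))
```

With `ratio=5.0` the toy data keeps all 10 positives and 50 of the negatives. Whether discarding majority rows is acceptable depends on the problem, since the dropped rows may carry useful signal.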

Thanks.

Desktop:

  • OS: Ubuntu 18.04 LTS x64
  • Browser: Firefox
  • Version: 71

DilumAluthge (Member) commented

We currently don't have any benchmarks, but a pull request to add some benchmarks would be welcome!

I think there is room to improve the performance of this package.
