HierarchicalClustering Losing Values #11
Comments
The bug occurs when there are two equal values in the list (unless they’re consecutive, so that the second one gets the index of the first when the first is removed). Instead of this (cluster.py around line 637):
Try this:
You could also obtain both the values first and only then delete them, but the above seems simpler. |
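The snippets from the comment above did not survive here, so the following is only a minimal sketch of the kind of failure being described, not the actual code from cluster.py. The function names `remove_by_value_lookup` and `remove_by_position` are hypothetical. It assumes the positions of the two items are looked up by value, which returns the same index for two equal values.

```python
def remove_by_value_lookup(data, a, b):
    """Hypothetical buggy pattern: find the two items by value, then delete.

    list.index() always returns the FIRST match, so when a == b both
    lookups yield the same position and an unrelated neighbour is deleted.
    """
    items = list(data)
    i = items.index(a)
    j = items.index(b)  # equals i whenever a == b
    # Delete the higher position first so the lower one stays valid.
    for k in sorted((i, j), reverse=True):
        del items[k]
    return items


def remove_by_position(data, i, j):
    """Safer pattern: work with the two distinct positions directly."""
    if i == j:
        raise ValueError("positions must differ")
    items = list(data)
    for k in sorted((i, j), reverse=True):
        del items[k]
    return items


# Two equal, non-consecutive values: the 30 is lost, one 10 survives.
print(remove_by_value_lookup([10, 30, 10, 50], 10, 10))  # [10, 50]
# Two equal, consecutive values: works by accident.
print(remove_by_value_lookup([10, 10, 30], 10, 10))      # [30]
# Deleting by position removes exactly the intended pair.
print(remove_by_position([10, 30, 10, 50], 0, 2))        # [30, 50]
```

Deleting the higher position first is what keeps the lower position valid; this corresponds to fixing the deletion itself rather than collecting both values up front.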
Thanks for the update. I'll see to it in the near future. Have you tested your proposed solution? I'm not on my dev-box right now. Also... this is a fairly old piece of the library. I'll have to dig back in and understand my own code again... with any luck it should not take all too long ^_^ Also, you could propose a pull-request to speed things up ;) |
It’s working for me, but I have not thoroughly tested it, which is also the reason why I chose to propose it here rather than filing a pull request right away. |
I think you could also optimize the performance if you found a way to not recalculate the parts of the matrix that could not have changed. |
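One way to read this suggestion, sketched under assumptions: when two clusters are merged, only the distances involving the new cluster change, so every other entry can be copied over instead of recomputed. The function `merge_and_update` is hypothetical, and the sketch assumes single linkage (the distance to a merged cluster is the minimum of the distances to its parts) and a plain list-of-lists distance matrix. The library's internal matrix layout may differ.

```python
def merge_and_update(dist, i, j):
    """Merge clusters i and j in a symmetric distance matrix.

    Assumes single linkage. Entries between untouched clusters are
    copied as they are; only the row/column of the merged cluster is
    derived, and it is appended as the last row/column.
    """
    keep = [k for k in range(len(dist)) if k not in (i, j)]
    # Reuse the distances between clusters that were not merged.
    new = [[dist[r][c] for c in keep] for r in keep]
    # Single linkage: distance to the merged cluster is the smaller one.
    merged = [min(dist[i][k], dist[j][k]) for k in keep]
    for row, d in zip(new, merged):
        row.append(d)
    new.append(merged + [0])
    return new


# Points 1, 2 and 10 on a line, absolute difference as distance.
dist = [
    [0, 1, 9],
    [1, 0, 8],
    [9, 8, 0],
]
print(merge_and_update(dist, 0, 1))  # [[0, 8], [8, 0]]
```

The copy still touches every remaining entry, but no distance function is called again, which is usually the expensive part.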
I tried your proposed fix, but that caused another test to fail. I have been looking into this and I find that my initial implementation of the algorithm could be revised. And more importantly should be better tested. I am looking into this. |
Sorry for the delay... We had a death in the family, and I am still a bit shaken. But this problem keeps me awake at night... I have been working on a better unit-test (see http://michel.albert.lu). Alas, implementing a test with that data set did not reproduce the error! I have now taken up the original values from the unit-test, and while writing up I realised that one value is duplicated ( |
Oh, I’m sorry to hear that, my condolences. |
Hi, and thanks for the library!
I’m losing values though:
I’ll try to understand the algorithm to see where the 30 went, but maybe you’re quicker.
In fluffiness,
Telofy