NotRegularMatrix exception for certain dataframes #32
Weird bug.
I think it's failing because a matrix inverse is being computed, and possibly the determinant is very close to zero, which is why it's that
Here's some info I found. I printed all the matrices whose inverse the algorithm was computing. Here's the result: ...
Matrix[[-8.459899447643453e-14, -5.75239855749016e-12], [-5.75239855749016e-12, -10927.800950741155]]
Matrix[[-3.1308289294429086e-14, -2.128675014034775e-12], [-2.128675014034775e-12, -10927.800950740906]]
Matrix[[-1.1546319456101584e-14, -7.842171356742226e-13], [-7.842171356742226e-13, -10927.800950740813]]
Matrix[[-4.218847493575589e-15, -2.865041537347675e-13], [-2.865041537347675e-13, -10927.800950740779]]
Matrix[[-1.3322676295501873e-15, -8.997247391562266e-14], [-8.997247391562266e-14, -10927.800950740766]]
Matrix[[-6.661338147750937e-16, -4.4986236957811335e-14], [-4.4986236957811335e-14, -10927.800950740762]]
Matrix[[-0.0, -0.0], [-0.0, -10927.80095074076]]
ExceptionForMatrix::ErrNotRegular: Not Regular Matrix
from /home/ubuntu/.rvm/gems/ruby-2.2.3/gems/backports-3.6.8/lib/backports/1.9.2/stdlib/matrix.rb:933:in `block in inverse_from'
In the end it is computing inverse of
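The failure on the last matrix in the log can be reproduced in isolation. This is a minimal sketch using Ruby's standard `matrix` library rather than the statsample-glm code path itself; the variable names `m`, `det`, and `error_class` are made up for illustration.

```ruby
require 'matrix'

# Last matrix printed in the log above: the first row and column have
# collapsed to (negative) zero, so the matrix is exactly singular.
m = Matrix[[-0.0, -0.0], [-0.0, -10927.80095074076]]

# The determinant is zero, so no inverse exists.
det = m.determinant

# Attempting the inverse raises the same exception class seen in the trace.
error_class = begin
  m.inverse
  nil
rescue ExceptionForMatrix::ErrNotRegular => e
  e.class
end

puts "determinant: #{det}"
puts "raised: #{error_class.inspect}"
```

The earlier matrices in the log still have a tiny non-zero entry in the first column, which would explain why the exception only appears on the final one.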
@agisga might this be an issue with the algorithm, or is it a loss of precision in some of the calculations?
It seems to me that the algorithm is theoretically okay, because it gives correct results most of the time. Maybe it fails because it accumulates numerical error quickly when the input matrix is not well conditioned. In particular, since you mention matrix inverses, it sounds to me like the algorithm is not well optimized. It should be changed so that linear systems are solved instead of computing matrix inverses (here is a very concise summary why). Solving a linear system is faster and numerically more stable than finding a matrix inverse. Unfortunately, I don't have the time right now to look at the algorithm in detail, but I hope I can find the time eventually. Probably it would be best to rewrite it such that it utilizes matrix decompositions and linear solvers provided by nmatrix-lapacke.
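To make the "solve instead of invert" suggestion concrete, here is a small sketch. It uses the LUP decomposition from Ruby's standard `matrix` library as a stand-in for the nmatrix-lapacke solvers mentioned above, and the 2x2 system `a`, `b` is a hypothetical example, not data from this issue.

```ruby
require 'matrix'

# Hypothetical, well-conditioned system a * x = b.
a = Matrix[[4.0, 1.0], [1.0, 3.0]]
b = Vector[1.0, 2.0]

# Approach used today: form the explicit inverse, then multiply.
x_via_inverse = a.inverse * b

# Suggested approach: factor once (LUP) and solve the system directly,
# without ever forming the inverse.
x_via_solve = a.lup.solve(b)

puts "inverse: #{x_via_inverse}"
puts "solve:   #{x_via_solve}"
```

Both approaches agree on a well-conditioned system like this one. Note that a direct solve does not rescue an exactly singular matrix (the standard library's `solve` raises for singular input as well), so this change would address error accumulation rather than the degenerate final matrix in the log.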
Thanks for the explanation. I'm getting the same thing, in case another example is helpful:
data = Daru::DataFrame.from_csv 'recruitment_failures.csv'
glm = Statsample::GLM.compute data, 'failed_recruitment', :logistic
Statsample::GLM.compute is failing for certain dataframes. Get the dataframe used in the above code here