Can't compute block entropy when k > 31 #83

Open
kjaquier opened this issue Sep 1, 2019 · 1 comment

kjaquier commented Sep 1, 2019

This looks like an integer overflow: block_entropy.c multiplies the base b by itself k times (i.e. computes b^k) without any overflow check.
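
A quick sanity check of that hypothesis, as a minimal sketch. It assumes the number of states b**k is stored in a fixed-width integer; the actual C types used in block_entropy.c are not shown in this thread, and `first_overflowing_k` is a hypothetical helper written only for illustration.

```python
def first_overflowing_k(base, bits, signed=False):
    """Return the smallest k for which base**k no longer fits in the given integer width."""
    if base < 2:
        raise ValueError("base must be at least 2")
    # Largest value representable in the assumed integer type.
    limit = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    k = 1
    while base ** k <= limit:
        k += 1
    return k


# For a binary series (b = 2):
print(first_overflowing_k(2, 32))               # assumed unsigned 32-bit state count
print(first_overflowing_k(2, 32, signed=True))  # assumed signed 32-bit state count
print(first_overflowing_k(2, 64))               # assumed unsigned 64-bit state count
```

For b = 2, an unsigned 32-bit count first overflows at k = 32, which is consistent with the last successful call being k = 31. A wider integer type would only move the threshold, not remove it.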

I hit this through PyInform's block_entropy function, but the root cause appears to be in Inform itself.

Code:

import numpy as np
from pyinform.blockentropy import block_entropy

# Random binary series of 100 samples
x = (np.random.random([100]) > .5).astype(np.uint8)
for k in range(1, 50):
    print(k, block_entropy(x, k))

Output:

1 0.9953784388202257
2 1.9878129812393763
...
31 6.129283016944973

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-311-af5262631fbd> in <module>
      3 x = (np.random.random([100]) > .5).astype(np.uint8)
      4 for k in range(1, 50):
----> 5     print(k, block_entropy(x, k))

~\AppData\Roaming\Python\Python36\site-packages\pyinform\blockentropy.py in block_entropy(series, k, local)
    109         _local_block_entropy(data, c_ulong(n), c_ulong(m), c_int(b), c_ulong(k), out, byref(e))
    110     else:
--> 111         ai = _block_entropy(data, c_ulong(n), c_ulong(m), c_int(b), c_ulong(k), byref(e))
    112 
    113     error_guard(e)

OSError: exception: access violation writing 0x00000217586956EC

A better solution would be to use the biggest int type available, or at least raise an appropriate error message.

Obviously memory and computational complexity are always going to be limiting factors here. Any suggestion for working around this? Curve-fitting has been suggested here, but in my case I don't think that block entropy converges fast enough to a "fittable" curve (referring to the fact that it is supposed to converge to a straight line with a slope corresponding to the entropy rate, as k goes to infinity).
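
One reason the curve is hard to fit with this little data: a series of n samples contains only n - k + 1 windows of length k, so a histogram-based (plug-in) estimate can never exceed log2(n - k + 1) bits. The sketch below assumes a plug-in estimator reporting bits; `block_entropy_ceiling` is a hypothetical helper, not part of Inform or PyInform.

```python
import math


def block_entropy_ceiling(n, k, base=2):
    """Upper bound (in bits) on a plug-in block entropy estimate from n samples."""
    windows = n - k + 1
    if windows < 1:
        raise ValueError("k must not exceed the series length")
    # Two limits apply: the number of possible blocks (base**k)
    # and the number of windows actually observed.
    return min(k * math.log2(base), math.log2(windows))


for k in (1, 2, 10, 31):
    print(k, block_entropy_ceiling(100, k))
```

For n = 100 and k = 31 the bound is log2(70) ≈ 6.129, which matches the value printed above for k = 31: every window is already unique, so larger k only lowers the estimate rather than revealing the entropy rate.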

dglmoore (Contributor) commented Sep 3, 2019

@kjaquier Sorry I just saw this. You must be running on a machine with quite a bit of memory given it looks like you hit an index overflow before you ran into a memory overflow! We've fixed this problem in inform_transfer_entropy (0d50faa), but haven't propagated the changes to the rest of Inform. It'll probably take me a couple of weeks to take care of this and get it worked into PyInform.

In the meantime, there is a workaround that should postpone this error. The idea is to construct the k-history time series and then coalesce that time series to effectively reduce the base. From there you can compute the k=1 block entropy of the coalesced series to get essentially the same result (up to the usual floating-point madness). That is, essentially apply inform_black_box and then inform_coalesce.

That being said, inform_black_box hasn't been propagated to PyInform yet, and both functions will likely suffer from this same index overflow problem. This Gist is a pure Python implementation that should work, but keep in mind that it's only lightly tested and doesn't do any kind of error checking.
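
For readers without access to the Gist, here is a minimal sketch of the same idea using only the standard library. It is not the linked Gist and not Inform's implementation; `block_entropy_coalesced` is a hypothetical name, and the sketch assumes the plug-in (histogram) estimate in bits.

```python
from collections import Counter
from math import log2


def block_entropy_coalesced(series, k):
    """Plug-in block entropy (bits) that never builds a b**k-sized table."""
    xs = list(series)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(series)")
    # Step 1 (k-history): treat each window of k consecutive symbols as one state.
    windows = (tuple(xs[i:i + k]) for i in range(n - k + 1))
    # Step 2 (coalesce): count only the windows that actually occur,
    # so storage is bounded by n - k + 1 rather than b**k.
    counts = Counter(windows)
    total = n - k + 1
    # Step 3: k=1 Shannon entropy of the coalesced series.
    return -sum(c / total * log2(c / total) for c in counts.values())


x = [0, 1, 1, 0, 1, 0, 0, 1] * 12
for k in (1, 2, 32, 40):
    print(k, block_entropy_coalesced(x, k))
```

Because only observed windows are stored, values of k above 31 pose no indexing problem here; the cost is that hashing tuples in pure Python is much slower than Inform's C code for long series.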

I'll try to get all of these changes rolled out as soon as possible.

@dglmoore dglmoore closed this as completed Sep 3, 2019
@dglmoore dglmoore reopened this Sep 3, 2019