divide by zero error #4

richielo · 2018-11-08T19:19:39Z

Hello, thank you for the work. I am facing the issue of dividing by zero error in the line below when calling the sample function to sample memory. Any idea why?

is_weight /= is_weight.max()


stormont · 2019-02-21T01:53:36Z

It's caused by this, and the error is actually raised by the np.power call on the line above.
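A minimal sketch of how a zero sampling probability can surface in np.power, assuming the weights are computed as (N * P(i)) ** -beta and then normalized by their maximum. The function name importance_weights and the example values are hypothetical and not taken from the repo:

```python
import numpy as np

def importance_weights(sampling_probabilities, n_entries, beta):
    """Unnormalized importance-sampling weights: (N * P(i)) ** -beta."""
    probs = np.asarray(sampling_probabilities, dtype=float)
    return np.power(n_entries * probs, -beta)

# One sample with probability 0 is enough: 0 ** -beta is a division by zero.
# NumPy warnings are silenced here only to keep the demonstration quiet.
with np.errstate(divide="ignore", invalid="ignore"):
    weights = importance_weights([0.5, 0.0], n_entries=2, beta=0.4)
    normalized = weights / weights.max()  # inf / inf yields nan
```

Under these assumptions the infinite weight appears inside np.power, and the later division by is_weight.max() only spreads it into the rest of the batch.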

I've forked and partially fixed the issues and made a couple other changes (plus made the PER memory more configurable).

I avoid the division by zero by changing the beta step so that beta always stays just below 1: np.min([1. - self.e, ...]).
I also found that the SumTree sometimes pulls uninitialized samples into its batch (these show up as simply a 0), which can cause exceptions down the line if you don't guard against them. I haven't found the root cause yet, so for now I discard those samples when they occur and raise a warning. It happens rarely enough that discarding them shouldn't affect training. The warning looks like this:

/Users/{user}/repos/per/prioritized_memory.py:48: UserWarning: Pulled 1 uninitialized samples
  warnings.warn('Pulled {} uninitialized samples'.format(uninitialized_samples))
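The two guards described above could look roughly like the sketch below. It is not the code from the fork: step_beta, drop_uninitialized, and the eps default are hypothetical names and values, and the second argument of the beta clamp is an assumption because the original expression is elided.

```python
import warnings

def step_beta(beta, increment, eps=0.01):
    """Anneal beta upward but cap it at 1 - eps (eps is an assumed small constant)."""
    return min(1.0 - eps, beta + increment)

def drop_uninitialized(idxs, priorities, batch):
    """Discard samples whose data is still the placeholder 0 and warn about them."""
    kept = [
        (i, p, d)
        for i, p, d in zip(idxs, priorities, batch)
        if not (isinstance(d, int) and d == 0)
    ]
    dropped = len(batch) - len(kept)
    if dropped:
        warnings.warn('Pulled {} uninitialized samples'.format(dropped))
    if not kept:
        return [], [], []
    kept_idxs, kept_priorities, kept_batch = (list(col) for col in zip(*kept))
    return kept_idxs, kept_priorities, kept_batch
```

Note that a batch filtered this way can be smaller than requested, so any code downstream should not assume a fixed batch size.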

I'm happy to PR my changes into this repo if @rlcode wants them.

emunaran · 2019-03-20T11:04:57Z

Hi guys,
@rlcode - many thanks for your work. I also observed uninitialized samples being pulled and got the unwanted "0" mentioned above. I haven't figured out the problem yet, and I wonder whether the sampling process works as it is supposed to.
@stormont did you figure out the root cause?

josiahls · 2019-08-24T21:43:20Z

As referenced in (Schaul et al., 2015), as the TD error approaches 0 we will get divide-by-zero errors. They fix this via:

p_i = |δ_i| + ε

where ε (epsilon) is a small positive value that prevents this. I think you are missing this from your algorithm? I am pretty confident that if you have been testing on cartpole you will never run into this issue; however, in discrete state spaces (like mazes) this becomes a real problem.
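A short sketch of the proportional priority from Schaul et al., where epsilon keeps the priority strictly positive even for a zero TD error. The function name and the default values for eps and alpha are illustrative assumptions, not values confirmed for this repo:

```python
def priority(td_error, eps=0.01, alpha=0.6):
    """Proportional priority (|delta| + eps) ** alpha; eps and alpha are assumed values."""
    return (abs(td_error) + eps) ** alpha

def sampling_probabilities(td_errors, eps=0.01, alpha=0.6):
    """P(i) = p_i / sum_k p_k over the given TD errors."""
    priorities = [priority(e, eps, alpha) for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]

# Even a transition with zero TD error keeps a nonzero chance of being sampled.
probs = sampling_probabilities([0.0, 0.5, 2.0])
```

With eps > 0 every stored transition has a nonzero probability, so (N * P(i)) ** -beta stays finite for initialized entries.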

yougeyxt · 2019-09-04T00:40:10Z

Hello, I also found that uninitialized samples get sampled, returning the unwanted data "0". I tried to find the root cause but failed. Did you guys figure out the reason? @stormont @emunaran
Many thanks!

yougeyxt · 2019-09-04T00:55:24Z

Also, according to the paper, when storing a new transition (s, a, r, s_) in the memory, the priority should be the maximum priority among the leaf nodes, right? But the code uses the TD error of s and s_, which is different from the paper. I am wondering whether this is a bug or not.
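For comparison, the paper's rule of storing new transitions with the current maximum priority can be sketched as below. MaxPriorityTracker and its methods are hypothetical helpers, not part of this repo, and the initial priority of 1.0 is an assumption:

```python
class MaxPriorityTracker:
    """Tracks the largest priority seen so far, for use when inserting new transitions."""

    def __init__(self, initial_priority=1.0):
        # Assumed starting value so the very first transition gets a nonzero priority.
        self.max_priority = initial_priority

    def priority_for_new_transition(self):
        # New transitions get the current maximum, so each is replayed at least once soon.
        return self.max_priority

    def record_update(self, priority):
        # Call whenever a sampled transition's priority is recomputed from its TD error.
        self.max_priority = max(self.max_priority, priority)


tracker = MaxPriorityTracker()
tracker.record_update(2.5)
new_priority = tracker.priority_for_new_transition()
```

This avoids a forward pass to compute the TD error at insertion time, at the cost of initially over-prioritizing new transitions.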

Jspujol · 2019-11-07T16:14:29Z

> Hello, I also found that uninitialized samples get sampled, returning the unwanted data "0". I tried to find the root cause but failed. Did you guys figure out the reason? @stormont @emunaran
> Many thanks!

Hi there! I faced the same issue. What I did was sample another value from the same interval until the returned data is not an integer (the data array is initialized with np.zeros, so unfilled slots still hold 0). In the prioritized memory I added the following:

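for i in range(n):
    # Bounds of the i-th priority segment
    a = segment * i
    b = segment * (i + 1)
    # Resample within the segment until an initialized slot is returned
    while True:
        s = random.uniform(a, b)
        (idx, p, data) = self.tree.get(s)
        # Unfilled slots still hold the integer 0 they were initialized with
        if not isinstance(data, int):
            break
    priorities.append(p)
    batch.append(data)
    idxs.append(idx)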

This did the trick for me. Hope it does the same for you.

being-aerys · 2020-12-01T06:35:18Z

If anyone is still wondering why it pulls 0 from the replay memory: the sampled location in the replay memory had not been filled yet, so it still contained the values the replay buffer was initialized with, i.e., 0s. If you add a condition that training does not start until the buffer is completely filled, you never encounter this issue.
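A minimal sketch of that warm-up condition, using a stand-in ring buffer rather than the repo's SumTree. RingBuffer, is_full, and the capacity of 3 are hypothetical and only illustrate the idea:

```python
class RingBuffer:
    """Fixed-size buffer whose slots start as the placeholder 0, like an np.zeros array."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = [0] * capacity
        self.write = 0
        self.n_entries = 0

    def add(self, transition):
        # Overwrite the oldest slot once the buffer wraps around.
        self.data[self.write] = transition
        self.write = (self.write + 1) % self.capacity
        self.n_entries = min(self.n_entries + 1, self.capacity)

    def is_full(self):
        return self.n_entries == self.capacity


buffer = RingBuffer(capacity=3)
trained_steps = 0
for step in range(5):
    buffer.add(("state", step))
    if buffer.is_full():
        # Only sample and train once no placeholder zeros remain.
        trained_steps += 1
```

The trade-off is that training is delayed by a full buffer's worth of environment steps, which may matter for large capacities.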

ZINZINBIN · 2023-03-28T07:44:39Z

> Jspujol: I faced the same issue and what I did is to sample another value of that same interval [...] This did the trick for me.

Brilliant!

Silent-Zebra mentioned this issue Jul 11, 2019

Errors Running Directly After Clone #6

Closed
