Fix - sample type inconsistency in (Multi)Categorical Probability Distribution #588

Conversation

@seheevic commented Nov 28, 2019

Description

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist:

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have ensured pytest and pytype both pass.

@araffin added the "PR template not filled" label Nov 28, 2019
@Miffyli (Collaborator) commented Nov 28, 2019

Thanks for the PR! However, this could have been discussed further in the issue before opening a PR. Namely: I do not think the probability distributions should return int32s, as the default dtype for integers (in indexing/slicing) seems to be long / int64 in TF/PyTorch.

And more importantly, Gym spaces use int64. Rather than updating sampling to return int32, I think the distributions should keep returning int64 as before and change sample_dtype instead.
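
As a quick check of the Gym side of this (a minimal sketch, assuming a recent Gym release; the values noted in comments are what recent versions report), the discrete spaces expose their sample dtype directly:

    import gym

    # Gym's discrete action spaces advertise the dtype of their samples.
    discrete_space = gym.spaces.Discrete(4)
    multi_space = gym.spaces.MultiDiscrete([3, 5])

    print(discrete_space.dtype)        # int64 on recent Gym versions
    print(multi_space.dtype)           # int64 on recent Gym versions
    print(multi_space.sample().dtype)  # samples follow the space dtype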

@seheevic (Author) commented Nov 28, 2019

@Miffyli
As you said, I should have discussed this more on the issue thread.
I just think this is a trivial bug, because my modification follows not my own preference but other code already in this repository, as you can see below:

  • CategoricalProbabilityDistributionType already assumes a sample type of tf.int32:

    class CategoricalProbabilityDistributionType(ProbabilityDistributionType):
        def __init__(self, n_cat):
            """
            The probability distribution type for categorical input

            :param n_cat: (int) the number of categories
            """
            self.n_cat = n_cat

        def probability_distribution_class(self):
            return CategoricalProbabilityDistribution

        def proba_distribution_from_latent(self, pi_latent_vector, vf_latent_vector, init_scale=1.0, init_bias=0.0):
            pdparam = linear(pi_latent_vector, 'pi', self.n_cat, init_scale=init_scale, init_bias=init_bias)
            q_values = linear(vf_latent_vector, 'q', self.n_cat, init_scale=init_scale, init_bias=init_bias)
            return self.proba_distribution_from_flat(pdparam), pdparam, q_values

        def param_shape(self):
            return [self.n_cat]

        def sample_shape(self):
            return []

        def sample_dtype(self):
            return tf.int32

  • MultiCategoricalProbabilityDistributionType also assumes a sample type of tf.int32:

    class MultiCategoricalProbabilityDistributionType(ProbabilityDistributionType):
        def __init__(self, n_vec):
            """
            The probability distribution type for multiple categorical input

            :param n_vec: ([int]) the vectors
            """
            # Cast the variable because tf does not allow uint32
            self.n_vec = n_vec.astype(np.int32)
            # Check that the cast was valid
            assert (self.n_vec > 0).all(), "Casting uint32 to int32 was invalid"

        def probability_distribution_class(self):
            return MultiCategoricalProbabilityDistribution

        def proba_distribution_from_flat(self, flat):
            return MultiCategoricalProbabilityDistribution(self.n_vec, flat)

        def proba_distribution_from_latent(self, pi_latent_vector, vf_latent_vector, init_scale=1.0, init_bias=0.0):
            pdparam = linear(pi_latent_vector, 'pi', sum(self.n_vec), init_scale=init_scale, init_bias=init_bias)
            q_values = linear(vf_latent_vector, 'q', sum(self.n_vec), init_scale=init_scale, init_bias=init_bias)
            return self.proba_distribution_from_flat(pdparam), pdparam, q_values

        def param_shape(self):
            return [sum(self.n_vec)]

        def sample_shape(self):
            return [len(self.n_vec)]

        def sample_dtype(self):
            return tf.int32

  • MultiCategoricalProbabilityDistribution already returns samples cast to tf.int32:

    def mode(self):
        return tf.cast(tf.stack([p.mode() for p in self.categoricals], axis=-1), tf.int32)

    def sample(self):
        return tf.cast(tf.stack([p.sample() for p in self.categoricals], axis=-1), tf.int32)

Practically, I think it would be better to use tf.int32, but if the other libraries are moving to int64, then a parameter could help those of us who stick to int32 for service reasons. Anyway, sorry about the hurried PR.
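
To make the mismatch concrete, here is a minimal sketch of the Gumbel-max style sampling used for the categorical distribution (illustrative only, assuming TF 1.x; not the repository code verbatim). tf.argmax produces int64 tensors by default, while sample_dtype() above advertises tf.int32:

    import tensorflow as tf  # TF 1.x assumed

    logits = tf.placeholder(tf.float32, [None, 4])

    # Gumbel-max trick, as used to sample from a categorical distribution.
    uniform = tf.random_uniform(tf.shape(logits), dtype=logits.dtype)
    sample = tf.argmax(logits - tf.log(-tf.log(uniform)), axis=-1)

    print(sample.dtype)  # tf.int64: tf.argmax defaults to int64 output

    # A placeholder created with sample_dtype() (tf.int32) therefore does not
    # match the dtype of the sampled tensor above.
    action_ph = tf.placeholder(tf.int32, [None])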

@araffin removed the "PR template not filled" label Nov 28, 2019
@Miffyli (Collaborator) commented Nov 28, 2019

I would switch to using int64 throughout the code for consistency with Gym and the other libraries (I have to double-check this is the case, though; I recall integers are int64 by default in TF/PyTorch, but I am not 100% sure).

@araffin

Any comments on this? Something I am missing?

@araffin (Collaborator) commented Nov 28, 2019

> Something I am missing?

I don't think so, or at least I'm not aware of anything. Yes, I'm for consistency ;)

@seheevic (Author) commented Nov 29, 2019

@araffin @Miffyli
Ok, I got it.
Then may I close this PR?

@araffin (Collaborator) commented Nov 29, 2019

> Then may I close this PR?

Well, you can make the changes in this PR, no need to close it.

…babilityDistribution, MultiCategoricalProbabilityDistribution to tf.int64)
@seheevic changed the title from "Fix - sample type inconsistency in CategoricalProbabilityDistribution" to "Fix - sample type inconsistency in CategoricalProbabilityDistribution/MultiCategoricalProbabilityDistribution" Dec 2, 2019
@araffin requested review from araffin, hill-a, AdamGleave, ernestum and Miffyli and removed the request for araffin December 2, 2019 10:44
@seheevic (Author) commented Dec 2, 2019

@Miffyli
Ok, now tf.int32 becomes tf.int64.
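
As a sanity check that tf.int64 lines up with the Gym spaces (a minimal sketch, assuming Gym and TF 1.x are installed; purely illustrative):

    import numpy as np
    import gym
    import tensorflow as tf

    # With sample_dtype() returning tf.int64, the advertised dtype now matches
    # what Gym's discrete spaces use for actions.
    assert gym.spaces.Discrete(4).dtype == np.int64
    assert gym.spaces.MultiDiscrete([3, 5]).dtype == np.int64
    assert tf.int64.as_numpy_dtype == np.int64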

@araffin (Collaborator) left a comment


LGTM, thanks

@araffin changed the title from "Fix - sample type inconsistency in CategoricalProbabilityDistribution/MultiCategoricalProbabilityDistribution" to "Fix - sample type inconsistency in (Multi)Categorical Probability Distribution" Dec 2, 2019
@araffin merged commit 6039b89 into hill-a:master Dec 2, 2019
@seheevic deleted the CategoricalProbabilityDistribution-cast-tf-int32 branch December 3, 2019 02:11

Successfully merging this pull request may close these issues.

[Bug] sampling from CategoricalProbabilityDistribution should return tf.int32