Fix - sample type inconsistency in (Multi)Categorical Probability Distribution #588

Conversation

@seheevic commented Nov 28, 2019

Description

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist:

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have ensured pytest and pytype both pass.

@araffin added the "PR template not filled" label Nov 28, 2019
@Miffyli (Collaborator) commented Nov 28, 2019

Thanks for the PR! However, this could have been discussed further in the issue before opening a PR. Namely: I do not think the probability distributions should return int32s, as the default dtype for integers (in indexing/slicing) seems to be long / int64 in TF/PyTorch.

And more importantly, Gym spaces use int64. Rather than updating sampling to return int32, I think the distributions should keep returning int64 as before and change sample_dtype instead.
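
As a quick check of the Gym side of this (a minimal sketch, assuming a recent Gym release; the values noted in comments are what recent versions report), the discrete spaces expose their sample dtype directly:

    import gym

    # Gym's discrete action spaces advertise the dtype of their samples.
    discrete_space = gym.spaces.Discrete(4)
    multi_space = gym.spaces.MultiDiscrete([3, 5])

    print(discrete_space.dtype)        # int64 on recent Gym versions
    print(multi_space.dtype)           # int64 on recent Gym versions
    print(multi_space.sample().dtype)  # samples follow the space dtype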

@seheevic (Author) commented Nov 28, 2019

@Miffyli
As you said, I should have discussed this more on the issue thread.
I just think this is a trivial bug, because my modification follows not my own preference but other code already in this repository, as you can see below:

  • CategoricalProbabilityDistributionType already assumes a sample type of tf.int32:

    class CategoricalProbabilityDistributionType(ProbabilityDistributionType):
        def __init__(self, n_cat):
            """
            The probability distribution type for categorical input

            :param n_cat: (int) the number of categories
            """
            self.n_cat = n_cat

        def probability_distribution_class(self):
            return CategoricalProbabilityDistribution

        def proba_distribution_from_latent(self, pi_latent_vector, vf_latent_vector, init_scale=1.0, init_bias=0.0):
            pdparam = linear(pi_latent_vector, 'pi', self.n_cat, init_scale=init_scale, init_bias=init_bias)
            q_values = linear(vf_latent_vector, 'q', self.n_cat, init_scale=init_scale, init_bias=init_bias)
            return self.proba_distribution_from_flat(pdparam), pdparam, q_values

        def param_shape(self):
            return [self.n_cat]

        def sample_shape(self):
            return []

        def sample_dtype(self):
            return tf.int32

  • MultiCategoricalProbabilityDistributionType also assumes a sample type of tf.int32:

    class MultiCategoricalProbabilityDistributionType(ProbabilityDistributionType):
        def __init__(self, n_vec):
            """
            The probability distribution type for multiple categorical input

            :param n_vec: ([int]) the vectors
            """
            # Cast the variable because tf does not allow uint32
            self.n_vec = n_vec.astype(np.int32)
            # Check that the cast was valid
            assert (self.n_vec > 0).all(), "Casting uint32 to int32 was invalid"

        def probability_distribution_class(self):
            return MultiCategoricalProbabilityDistribution

        def proba_distribution_from_flat(self, flat):
            return MultiCategoricalProbabilityDistribution(self.n_vec, flat)

        def proba_distribution_from_latent(self, pi_latent_vector, vf_latent_vector, init_scale=1.0, init_bias=0.0):
            pdparam = linear(pi_latent_vector, 'pi', sum(self.n_vec), init_scale=init_scale, init_bias=init_bias)
            q_values = linear(vf_latent_vector, 'q', sum(self.n_vec), init_scale=init_scale, init_bias=init_bias)
            return self.proba_distribution_from_flat(pdparam), pdparam, q_values

        def param_shape(self):
            return [sum(self.n_vec)]

        def sample_shape(self):
            return [len(self.n_vec)]

        def sample_dtype(self):
            return tf.int32

  • MultiCategoricalProbabilityDistribution already returns samples cast to tf.int32:

    def mode(self):
        return tf.cast(tf.stack([p.mode() for p in self.categoricals], axis=-1), tf.int32)

    def sample(self):
        return tf.cast(tf.stack([p.sample() for p in self.categoricals], axis=-1), tf.int32)

Practically, I think it would be better to use tf.int32, but if the other libraries are moving to int64, then a parameter could help those of us who stick to int32 for service reasons. Anyway, sorry about the hurried PR.
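
To make the mismatch concrete, here is a minimal sketch of the Gumbel-max style sampling used for the categorical distribution (illustrative only, assuming TF 1.x; not the repository code verbatim). tf.argmax produces int64 tensors by default, while sample_dtype() above advertises tf.int32:

    import tensorflow as tf  # TF 1.x assumed

    logits = tf.placeholder(tf.float32, [None, 4])

    # Gumbel-max trick, as used to sample from a categorical distribution.
    uniform = tf.random_uniform(tf.shape(logits), dtype=logits.dtype)
    sample = tf.argmax(logits - tf.log(-tf.log(uniform)), axis=-1)

    print(sample.dtype)  # tf.int64: tf.argmax defaults to int64 output

    # A placeholder created with sample_dtype() (tf.int32) therefore does not
    # match the dtype of the sampled tensor above.
    action_ph = tf.placeholder(tf.int32, [None])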

@araffin removed the "PR template not filled" label Nov 28, 2019
@Miffyli (Collaborator) commented Nov 28, 2019

I would switch to using int64 throughout the code for consistency with Gym and the other libraries (I have to double-check this is the case, though; I recall integers are int64 by default in TF/PyTorch, but I am not 100% sure).

@araffin

Any comments on this? Something I am missing?

@araffin (Collaborator) commented Nov 28, 2019

> Something I am missing?

I don't think so, or at least I'm not aware of anything. Yes, I'm for consistency ;)

@seheevic (Author) commented Nov 29, 2019

@araffin @Miffyli
Ok, I got it.
Then may I close this PR?

@araffin (Collaborator) commented Nov 29, 2019

> Then may I close this PR?

Well, you can make the changes in this PR, no need to close it.

…babilityDistribution, MultiCategoricalProbabilityDistribution to tf.int64)
@seheevic changed the title from "Fix - sample type inconsistency in CategoricalProbabilityDistribution" to "Fix - sample type inconsistency in CategoricalProbabilityDistribution/MultiCategoricalProbabilityDistribution" Dec 2, 2019
@araffin requested review from araffin, hill-a, AdamGleave, ernestum and Miffyli and removed the request for araffin December 2, 2019 10:44
@seheevic (Author) commented Dec 2, 2019

@Miffyli
Ok, now tf.int32 becomes tf.int64.
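
As a sanity check that tf.int64 lines up with the Gym spaces (a minimal sketch, assuming Gym and TF 1.x are installed; purely illustrative):

    import numpy as np
    import gym
    import tensorflow as tf

    # With sample_dtype() returning tf.int64, the advertised dtype now matches
    # what Gym's discrete spaces use for actions.
    assert gym.spaces.Discrete(4).dtype == np.int64
    assert gym.spaces.MultiDiscrete([3, 5]).dtype == np.int64
    assert tf.int64.as_numpy_dtype == np.int64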

@araffin (Collaborator) left a comment


LGTM, thanks

@araffin changed the title from "Fix - sample type inconsistency in CategoricalProbabilityDistribution/MultiCategoricalProbabilityDistribution" to "Fix - sample type inconsistency in (Multi)Categorical Probability Distribution" Dec 2, 2019
@araffin merged commit 6039b89 into hill-a:master Dec 2, 2019
@seheevic deleted the CategoricalProbabilityDistribution-cast-tf-int32 branch December 3, 2019 02:11

Successfully merging this pull request may close these issues.

[Bug] sampling from CategoricalProbabilityDistribution should return tf.int32