
[xdoctest] No.44-47 and No.50-59 doc style #55813

Merged
merged 9 commits into PaddlePaddle:develop from xdoctest_50_52 on Aug 7, 2023

Conversation

@gouzil (Member) commented Jul 30, 2023:

PR types

Others

PR changes

Others

Description

Modify the example code in the following files so that it passes the xdoctest check:

Preview:

Related PR:

@sunzhongkai588 @SigureMo @megemini

@paddle-bot bot commented Jul 30, 2023:

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added contributor External developers status: proposed labels Jul 30, 2023
@gouzil gouzil changed the title [xdoctest] No.50-52 doc style [xdoctest] No.50-55 doc style Jul 30, 2023
@gouzil gouzil changed the title [xdoctest] No.50-55 doc style [xdoctest] No.44-47 and No.56-59 doc style Jul 30, 2023
@luotao1 luotao1 added the HappyOpenSource 快乐开源活动issue与PR label Jul 31, 2023
@gouzil gouzil changed the title [xdoctest] No.44-47 and No.56-59 doc style [xdoctest] No.44-47 and No.50-59 doc style Jul 31, 2023
Tensor(shape=[2], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[-0.22843921, -0.22843921])
>>> reinterpreted_beta = independent.Independent(beta, 1)
>>> print(reinterpreted_beta.batch_shape, reinterpreted_beta.event_shape)
# () (2,)
Contributor:

The # here should be removed~

Member Author:

Done

Member:

This doesn't seem to have been updated?

@gouzil (Member Author) Aug 2, 2023:

> This doesn't seem to have been updated?

Updated in commit 7fe022e (#55813).

Comment on lines 47 to 52
>>> import paddle

m = paddle.distribution.Laplace(paddle.to_tensor(0.0), paddle.to_tensor(1.0))
m.sample() # Laplace distributed with loc=0, scale=1
# Tensor(shape=[], dtype=float32, place=Place(cpu), stop_gradient=True,
# 3.68546247)
>>> m = paddle.distribution.Laplace(paddle.to_tensor(0.0), paddle.to_tensor(1.0))
>>> m.sample() # Laplace distributed with loc=0, scale=1
Tensor(shape=[], dtype=float32, place=Place(cpu), stop_gradient=True,
3.68546247)
Contributor:

Please add a seed~

>>> import paddle
>>> paddle.seed(2023)
>>> m = paddle.distribution.Laplace(paddle.to_tensor(0.0), paddle.to_tensor(1.0))
>>> print(m.sample())

Member Author:

Done

Member:

Meow meow meow? Am I looking at an old version?

Comment on lines 324 to 328
>>> import paddle
>>> m = paddle.distribution.Laplace(paddle.to_tensor([0.0]), paddle.to_tensor([1.0]))
>>> m.rsample((1,)) # Laplace distributed with loc=0, scale=1
Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=True,
[[0.04337667]])
Contributor:

Add a seed~

Member Author:

Done

Comment on lines 56 to 67
>>> import paddle

>>> multinomial = paddle.distribution.Multinomial(10, paddle.to_tensor([0.2, 0.3, 0.5]))
>>> print(multinomial.sample((2, 3)))
Tensor(shape=[2, 3, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
[[[1., 4., 5.],
[0., 2., 8.],
[2., 4., 4.]],

[[1., 6., 3.],
[3., 3., 4.],
[3., 4., 3.]]])
Contributor:

Don't leave blank lines in the output~ Also, add a seed~

Member Author:

Done

Member Author:

The spaces should be kept here, right?

Python 3.10.8 (v3.10.8:aaaf517424, Oct 11 2022, 10:14:40) [Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
>>> paddle.seed(2023)
<paddle.fluid.libpaddle.Generator object at 0x1070bb270>
>>> multinomial = paddle.distribution.Multinomial(10, paddle.to_tensor([0.2, 0.3, 0.5]))
>>> print(multinomial.sample((2, 3)))
Tensor(shape=[2, 3, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[[1., 5., 4.],
         [0., 4., 6.],
         [1., 3., 6.]],

        [[2., 2., 6.],
         [0., 6., 4.],
         [3., 3., 4.]]])
>>> 

Contributor:

Spaces are fine, as long as it looks good~ But don't leave blank lines~

>>> x = paddle.ones((1,2,3))
>>> reshape_transform = paddle.distribution.ReshapeTransform((2, 3), (3, 2))
>>> print(reshape_transform.forward_shape((1,2,3)))
(5, 2, 6)
Contributor:

The output is wrong~

Member Author:

Done

Comment on lines 1254 to 1257
>>> print(tanh.inverse(tanh.forward(x)))
Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
[[1.00000012, 2. , 3.00000286],
[4.00002146, 5.00009823, 6.00039864]])
Contributor:

I set the default floating-point precision to 5 decimal places, and this is what the CI run produces here:

2023-08-01 13:14:27     Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
2023-08-01 13:14:27            [[1.        , 2.        , 2.99999666],
2023-08-01 13:14:27             [3.99993253, 4.99977016, 6.00527668]])

So the actual precision is only about 1 decimal place (6.00039864 vs 6.00527668)~

@SigureMo has the underlying tanh implementation changed? The difference feels a bit large.

Setting the CI floating-point precision to 1 wouldn't be great either~ How should we handle this?!

Member:

This shouldn't be caused by device differences; I get the same result here:

>>> import paddle
>>> tanh = paddle.distribution.TanhTransform()
>>> x = paddle.to_tensor([[1., 2., 3.], [4., 5., 6.]])
>>> print(tanh.forward(x))
Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[0.76159418, 0.96402758, 0.99505472],
        [0.99932921, 0.99990916, 0.99998784]])
>>> print(tanh.inverse(tanh.forward(x)))
Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[1.        , 2.        , 2.99999666],
        [3.99993253, 4.99977016, 6.00527668]])

Contributor:

Yeah, same as the result on my side (Ubuntu 22.04 + Docker paddle cpu 2.5)~

In [4]: >>> import paddle
   ...: >>> tanh = paddle.distribution.TanhTransform()
   ...: >>> x = paddle.to_tensor([[1., 2., 3.], [4., 5., 6.]])
   ...: >>> print(tanh.forward(x))
   ...: 
Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[0.76159418, 0.96402758, 0.99505472],
        [0.99932921, 0.99990916, 0.99998784]])

In [5]: >>> print(tanh.inverse(tanh.forward(x)))
   ...: 
Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[1.        , 2.        , 2.99999666],
        [3.99993253, 4.99977016, 6.00527668]])

I don't know where the original result

Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
        [[1.00000012, 2.        , 3.00000286],
        [4.00002146, 5.00009823, 6.00039864]])

came from~

@SigureMo @sunzhongkai588 @luotao1 The precision here differs quite a bit from before, and I'm not sure what caused it~ We could manually copy the current result so the test passes; as for whether we should track down the root cause, could you all take a look?
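One possible workaround for cases like this, sketched below only as an illustration (it assumes paddle.allclose and is not necessarily how the docstring was finally written), is to check the round-trip against the input with a loose tolerance instead of matching raw floats:

>>> import paddle
>>> tanh = paddle.distribution.TanhTransform()
>>> x = paddle.to_tensor([[1., 2., 3.], [4., 5., 6.]])
>>> y = tanh.inverse(tanh.forward(x))
>>> # Compare with a loose tolerance instead of printing raw floats,
>>> # since the last digits differ across builds and platforms.
>>> print(paddle.allclose(x, y, atol=1e-2).item())
True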

Comment on lines 32 to 46
>>> import paddle
>>> from paddle.distribution import transformed_distribution

>>> d = transformed_distribution.TransformedDistribution(
... paddle.distribution.Normal(0., 1.),
... [paddle.distribution.AffineTransform(paddle.to_tensor(1.), paddle.to_tensor(2.))]
>>> )

>>> print(d.sample([10]))
Tensor(shape=[10], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[-0.10697651, 3.33609009, -0.86234951, 5.07457638, 0.75925219,
-4.17087793, 2.22579336, -0.93845034, 0.66054249, 1.50957513])
>>> print(d.log_prob(paddle.to_tensor(0.5)))
Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=True,
-1.64333570)
Contributor:

Please add a seed~

            >>> import paddle
            >>> paddle.seed(2023)
            >>> from paddle.distribution import transformed_distribution
            >>> d = transformed_distribution.TransformedDistribution(
            ...     paddle.distribution.Normal(0., 1.),
            ...     [paddle.distribution.AffineTransform(paddle.to_tensor(1.), paddle.to_tensor(2.))]
            >>> )
            >>> print(d.sample([10]))

Member Author:

Done

Comment on lines 75 to 81
>>> # [1.4189385] with shape: [1]
>>> lp = lognormal_a.log_prob(value_tensor)
>>> # [-0.72069150] with shape: [1]
>>> p = lognormal_a.probs(value_tensor)
>>> # [0.48641577] with shape: [1]
>>> kl = lognormal_a.kl_divergence(lognormal_b)
>>> # [0.34939718] with shape: [1]
Contributor:

Replace the outputs in these 4 places with something like print(entropy)~

>>> entropy = lognormal_a.entropy()
>>> print(entropy)
Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
[1.41893852])

Member Author:

Done

Member:

I don't see the update here either?

Comment on lines 79 to 86
>>> entropy = normal_a.entropy()
>>> # [1.4189385] with shape: [1]
>>> lp = normal_a.log_prob(value_tensor)
>>> # [-1.2389386] with shape: [1]
>>> p = normal_a.probs(value_tensor)
>>> # [0.28969154] with shape: [1]
>>> kl = normal_a.kl_divergence(normal_b)
>>> # [0.34939718] with shape: [1]
Contributor:

Same as above, use print for the output~

Member Author:

Done

Member:

Was print added here?


Comment on lines 61 to 67
[[[1., 4., 5.],
[0., 2., 8.],
[2., 4., 4.]],

[[1., 6., 3.],
[3., 3., 4.],
[3., 4., 3.]]])

Member:

Suggested change (remove the blank line between the two blocks):

[[[1., 4., 5.],
[0., 2., 8.],
[2., 4., 4.]],
[[1., 6., 3.],
[3., 3., 4.],
[3., 4., 3.]]])


@gouzil gouzil requested review from SigureMo and megemini August 2, 2023 16:16
@SigureMo (Member) commented Aug 3, 2023:

[screenshot: two failing doctest checks in CI]

These two failures need a look; if the randomness is caused by device differences, they can be skipped.

@megemini (Contributor) commented Aug 3, 2023:

> These two failures need a look; if the randomness is caused by device differences, they can be skipped.

The first one is still the TanhTransform precision issue ... ...

For the second one, my result matches what CI gets:

In [21]: paddle.device.set_device('cpu')
Out[21]: Place(cpu)

In [22]: >>> import paddle
    ...: >>> from paddle.distribution import transformed_distribution
    ...: >>> paddle.seed(2023)
    ...: >>> d = transformed_distribution.TransformedDistribution(
    ...: ...     paddle.distribution.Normal(0., 1.),
    ...: ...     [paddle.distribution.AffineTransform(paddle.to_tensor(1.), paddle.to_tensor(2.))]
    ...: ... )
    ...: >>> print(d.sample([10]))
Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [ 1.12264419,  3.22699189,  1.83812487,  0.50283587, -2.70338631,
        -2.00740123,  4.47909021,  1.26663208,  4.32719326, -0.11529565])

@gouzil was the result in the file produced locally?! In theory the seed should take effect ... ...

>>> print(d.sample([10]))
Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
    [ 3.22699189,  1.12264419,  0.50283587,  1.83812487, -2.00740123,
    -2.70338631,  1.26663208,  4.47909021, -0.11529565,  4.32719326])

@gouzil (Member Author) commented Aug 3, 2023:

> @gouzil was the result in the file produced locally?! In theory the seed should take effect ... ...

Both of these results were produced on my local machine, emmmm

@megemini (Contributor) commented Aug 3, 2023:

> Both of these results were produced on my local machine, emmmm

Huh?! Now that's interesting~

def seed(seed):
    """
    Sets the seed for global default generator, which manages the random number generation.

    Args:
        seed(int): The random seed to set. It is recommend to set a large int number.

    Returns:
        Generator: The global default generator object.

    Examples:
        .. code-block:: python

            >>> import paddle
            >>> gen = paddle.seed(102)
    """
    # TODO(zhiqiu): 1. remove program.random_seed when all random-related op upgrade
    # 2. support gpu generator by global device
    seed = int(seed)

    if core.is_compiled_with_cuda():
        for i in range(core.get_cuda_device_count()):
            core.default_cuda_generator(i).manual_seed(seed)
    elif core.is_compiled_with_xpu():
        for i in range(core.get_xpu_device_count()):
            core.default_xpu_generator(i).manual_seed(seed)
    place = fluid.framework._current_expected_place()
    if isinstance(place, core.CustomPlace):
        dev_cnt = sum(
            [
                place.get_device_type() == s.split(':')[0]
                for s in core.get_available_custom_device()
            ]
        )
        for i in range(dev_cnt):
            core.default_custom_device_generator(
                core.CustomPlace(place.get_device_type(), i)
            ).manual_seed(seed)
    return core.default_cpu_generator().manual_seed(seed)

This is how paddle handles the seed~ The same seed giving different results on CPU vs GPU is understandable~

What is your exact environment?! Hardware, OS, paddle version?!

@gouzil (Member Author) commented Aug 3, 2023:

> What is your exact environment?! Hardware, OS, paddle version?!

OS: macOS 12.4 21F79 x86_64
Kernel: 21.5.0
CPU: Intel i7-7700HQ (8) @ 2.80GHz
GPU: Intel HD Graphics 630
Memory: 16374MiB / 32768MiB

The paddle version is a nightly build from a couple of days ago, but I don't think it has anything to do with the hardware.

@megemini (Contributor) commented Aug 3, 2023:

> The paddle version is a nightly build from a couple of days ago, but I don't think it has anything to do with the hardware.

In theory it shouldn't be related to the hardware~ Could you run the following three snippets locally? (Below are the results on my side.)

>>> import paddle
>>> from paddle.distribution import transformed_distribution
>>> # paddle.seed(2023)
>>> d = paddle.distribution.Normal(0., 1.)
>>> s = d.sample([10], seed=1)
>>> t = paddle.to_tensor(1.) + paddle.to_tensor(2.) * s
>>> print(d, s, t)
<paddle.distribution.normal.Normal object at 0x7f5d505a7760> Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [-0.38683185, -0.03939996,  0.68682355, -0.24894781, -0.79514617,
        -0.05464687,  1.93794608,  1.00095284,  0.11751921, -0.85881203]) Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [ 0.22633630,  0.92120010,  2.37364721,  0.50210440, -0.59029233,
         0.89070624,  4.87589216,  3.00190568,  1.23503840, -0.71762407])

>>> import paddle
>>> from paddle.distribution import transformed_distribution
>>> paddle.seed(2023)
>>> d = paddle.distribution.Normal(0., 1.)
>>> s = d.sample([10], seed=0)
>>> t = paddle.to_tensor(1.) + paddle.to_tensor(2.) * s
>>> print(d, s, t)
<paddle.distribution.normal.Normal object at 0x7f5d3c934b50> Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [ 0.06132207,  1.11349595,  0.41906244, -0.24858207, -1.85169315,
        -1.50370061,  1.73954511,  0.13331604,  1.66359663, -0.55764782]) Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [ 1.12264419,  3.22699189,  1.83812487,  0.50283587, -2.70338631,
        -2.00740123,  4.47909021,  1.26663208,  4.32719326, -0.11529565])

>>> import paddle
>>> from paddle.distribution import transformed_distribution
>>> paddle.seed(2023)
>>> d = paddle.distribution.Normal(0., 1.)
>>> s = d.sample([10], seed=1)
>>> t = paddle.to_tensor(1.) + paddle.to_tensor(2.) * s
>>> print(d, s, t)
<paddle.distribution.normal.Normal object at 0x7f5d3c88ecd0> Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [-0.38683185, -0.03939996,  0.68682355, -0.24894781, -0.79514617,
        -0.05464687,  1.93794608,  1.00095284,  0.11751921, -0.85881203]) Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [ 0.22633630,  0.92120010,  2.37364721,  0.50210440, -0.59029233,
         0.89070624,  4.87589216,  3.00190568,  1.23503840, -0.71762407])

@gouzil (Member Author) commented Aug 3, 2023:

> In theory it shouldn't be related to the hardware~ Could you run the following three snippets locally? (Below are the results on my side.)

These results don't look that different, though:

Python 3.10.8 (v3.10.8:aaaf517424, Oct 11 2022, 10:14:40) [Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
>>> from paddle.distribution import transformed_distribution
>>> d = paddle.distribution.Normal(0., 1.)
>>> s = d.sample([10], seed=1)
>>> t = paddle.to_tensor(1.) + paddle.to_tensor(2.) * s
>>> print(d, s, t)
<paddle.distribution.normal.Normal object at 0x10726bfd0> Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [-0.03939996, -0.38683185, -0.24894781,  0.68682355, -0.05464687,
        -0.79514617,  1.00095284,  1.93794608, -0.85881203,  0.11751921]) Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [ 0.92120010,  0.22633630,  0.50210440,  2.37364721,  0.89070624,
        -0.59029233,  3.00190568,  4.87589216, -0.71762407,  1.23503840])

>>> import paddle
>>> from paddle.distribution import transformed_distribution
>>> paddle.seed(2023)
<paddle.fluid.libpaddle.Generator object at 0x10dfc1970>
>>> d = paddle.distribution.Normal(0., 1.)
>>> s = d.sample([10], seed=0)
>>> t = paddle.to_tensor(1.) + paddle.to_tensor(2.) * s
>>> print(d, s, t)
<paddle.distribution.normal.Normal object at 0x10dcbf2e0> Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [ 1.11349595,  0.06132207, -0.24858207,  0.41906244, -1.50370061,
        -1.85169315,  0.13331604,  1.73954511, -0.55764782,  1.66359663]) Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [ 3.22699189,  1.12264419,  0.50283587,  1.83812487, -2.00740123,
        -2.70338631,  1.26663208,  4.47909021, -0.11529565,  4.32719326])

>>> import paddle
>>> from paddle.distribution import transformed_distribution
>>> paddle.seed(2023)
<paddle.fluid.libpaddle.Generator object at 0x100ea9970>
>>> d = paddle.distribution.Normal(0., 1.)
>>> s = d.sample([10], seed=1)
>>> t = paddle.to_tensor(1.) + paddle.to_tensor(2.) * s
>>> print(d, s, t)
<paddle.distribution.normal.Normal object at 0x100ba72e0> Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [-0.03939996, -0.38683185, -0.24894781,  0.68682355, -0.05464687,
        -0.79514617,  1.00095284,  1.93794608, -0.85881203,  0.11751921]) Tensor(shape=[10], dtype=float32, place=Place(cpu), stop_gradient=True,
       [ 0.92120010,  0.22633630,  0.50210440,  2.37364721,  0.89070624,
        -0.59029233,  3.00190568,  4.87589216, -0.71762407,  1.23503840])

@megemini (Contributor) commented Aug 3, 2023:

> These results don't look that different, though:

The 1st and 3rd snippets give the same result, which means d.sample([10], seed=1) takes effect and overrides paddle.seed ~

The 2nd snippet shows that d.sample([10], seed=0) has no effect and paddle.seed takes effect ~

Putting this together with my results: even when sample is given an explicit seed, the output still isn't reproducible across platforms ... ...

In the TransformedDistribution example, what mainly matters is this sample method of Normal:

def sample(self, shape=(), seed=0):
    """Generate samples of the specified shape.

    Args:
        shape (Sequence[int], optional): Shape of the generated samples.
        seed (int): Python integer number.

    Returns:
        Tensor, A tensor with prepended dimensions shape.The data type is float32.
    """
    if not isinstance(shape, Iterable):
        raise TypeError('sample shape must be Iterable object.')
    if not in_dynamic_mode():
        check_type(seed, 'seed', (int), 'sample')

    shape = list(shape)
    batch_shape = list((self.loc + self.scale).shape)
    name = self.name + '_sample'
    if -1 in batch_shape:
        output_shape = shape + batch_shape
        fill_shape = list(batch_shape + shape)
        fill_shape[0] = paddle.shape(self.loc + self.scale)[0].item()
        zero_tmp = paddle.full(fill_shape, 0.0, self.dtype)
        zero_tmp_reshape = paddle.reshape(zero_tmp, output_shape)
        zero_tmp_shape = paddle.shape(zero_tmp_reshape)
        normal_random_tmp = random.gaussian(
            zero_tmp_shape, mean=0.0, std=1.0, seed=seed, dtype=self.dtype
        )
        output = normal_random_tmp * (zero_tmp_reshape + self.scale)
        output = paddle.add(output, self.loc, name=name)
        return output
    else:
        output_shape = shape + batch_shape
        output = random.gaussian(
            output_shape, mean=0.0, std=1.0, seed=seed, dtype=self.dtype
        ) * (paddle.zeros(output_shape, dtype=self.dtype) + self.scale)
        output = paddle.add(output, self.loc, name=name)
        if self.all_arg_is_float:
            return paddle.reshape(output, shape, name=name)
        else:
            return output

sample uses random.gaussian directly to implement sampling, and the seed passed to random.gaussian doesn't hold across platforms ... ... not sure we can really draw that conclusion, it seems a bit absurd ~

Also, in the sample method above, seed defaults to 0?!?!?! Was this written to match GaussianKernel?!

void GaussianKernel(const Context& dev_ctx,
                    const IntArray& shape,
                    float mean,
                    float std,
                    int seed,
                    DataType dtype,
                    DenseTensor* out) {
  auto tensor = out;
  std::normal_distribution<T> dist(mean, std);
  tensor->Resize(phi::make_ddim(shape.GetData()));
  int64_t size = tensor->numel();
  T* data = dev_ctx.template Alloc<T>(tensor);
  std::shared_ptr<std::mt19937_64> engine;
  if (seed) {
    engine = std::make_shared<std::mt19937_64>();
    engine->seed(seed);
  } else {
    engine = dev_ctx.GetGenerator()->GetCPUEngine();
  }

That doesn't seem right ... ... (I haven't touched C++ in a long time, so I don't dare to mess with it...)

@SigureMo what should we do about this?

@SigureMo (Member) commented Aug 3, 2023:

Master Sun asked about this before in #55732 (comment): the underlying implementations differ, so results aren't guaranteed to match after setting a seed, not even on the same device if the OS differs. So in this case skipping is the better option, in case the CI environment changes in the future.

Also, I think that when the exact values an API produces don't matter (we don't need to guarantee what the output looks like, which is true for most scenarios with randomness), we can just print the shape. That also avoids output mismatches if the CI environment changes later.
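A minimal sketch of the shape-only approach, reusing the Multinomial example touched by this PR (illustrative only; the skip option would instead mark the example with a doctest skip directive such as # doctest: +SKIP, whose exact accepted form depends on the doc checker):

>>> import paddle
>>> multinomial = paddle.distribution.Multinomial(10, paddle.to_tensor([0.2, 0.3, 0.5]))
>>> samples = multinomial.sample((2, 3))
>>> # The sampled values differ across devices and platforms,
>>> # but the sample shape is deterministic, so only the shape is checked.
>>> print(samples.shape)
[2, 3, 3]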

@luotao1 luotao1 added HappyOpenSource Pro 进阶版快乐开源活动,更具挑战性的任务 and removed HappyOpenSource 快乐开源活动issue与PR labels Aug 7, 2023
@gouzil (Member Author) commented Aug 7, 2023:

@sunzhongkai588

@SigureMo (Member) left a comment:

LGTMeow~

@luotao1 luotao1 merged commit 84fe045 into PaddlePaddle:develop Aug 7, 2023
@gouzil gouzil deleted the xdoctest_50_52 branch September 24, 2023 14:44