I'm curious about the initialization for the CNTK, so I replaced the `kernel_fn` in the `c_map(W_var, b_var)` function in the colab with:
```python
# Create a single layer of a network as an affine transformation composed
# with an Erf nonlinearity.
# kernel_fn = stax.serial(stax.Dense(1024, W_std, b_std), stax.Erf())[2]
kernel_fn = stax.serial(
    stax.Conv(out_chan=1024, filter_shape=(3, 3), strides=None,
              padding='SAME', W_std=W_std, b_std=b_std),
    stax.Relu(),
    stax.Flatten(),
    stax.Dense(10, W_std=W_std, b_std=b_std, parameterization='ntk')
)[2]
```
However, there seems to be a bottom-layer error when I try to plot, with the error message as follows:
Am I misunderstanding the Phase Diagram somehow? (Is the CNTK fundamentally un-plottable?)
Also, I've found that there's no difference at all in the Phase Diagram when I simply make an FC network deeper, e.g.
@SiuMath and @sschoenholz may answer better, but I can give some brief comments:
Re changing the depth, your observation is correct. The $c^*$ diagram shows the fixed-point correlation, i.e. the limiting correlation value $c^*$ in the infinite-depth limit, so it shouldn't matter whether you repeat 1 or 3 identical layers infinitely many times. The $\chi$ plot will change, but note that, by definition and the chain rule, the $\chi$ for $n$ identical layers equals the $\chi$ for one layer raised to the power $n$, so the phase boundary where $\chi = 1$ will remain the same.
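As a quick numerical sanity check of the chain-rule argument, here is a toy sketch. The one-layer map below is made up purely for illustration (it is not the notebook's actual recursion); the only thing that matters is its slope $\chi$ at the fixed point.

```python
# Toy 1-D map standing in for one layer's correlation recursion c -> c';
# the exact map depends on the nonlinearity, but only the slope chi at
# the fixed point (here c* = 0) matters for the phase-boundary argument.
def layer_map(c, chi=0.8):
    return chi * c + 0.1 * c**3  # chi is the derivative at c* = 0

def compose(f, n):
    """n-fold composition of f, i.e. n identical layers."""
    def g(c):
        for _ in range(n):
            c = f(c)
        return c
    return g

# By the chain rule, the derivative of the n-fold composition at the
# fixed point is chi**n, so it crosses 1 exactly where chi does:
# the phase boundary chi = 1 is unchanged by depth.
eps = 1e-6
chi_1 = layer_map(eps) / eps               # ~ 0.8
chi_3 = compose(layer_map, 3)(eps) / eps   # ~ 0.8**3 = 0.512
print(chi_1, chi_3)
```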
I imagine the code could be generalized to CNNs, but it would need to support vector-valued variances $q$ and covariances $c$ (one entry per spatial location), so it may need some work. Note that per https://arxiv.org/abs/1806.05393, for the standard/NTK parameterization and CIRCULAR padding it should yield the same phase diagram as the fully-connected network. In Figure 11 of https://arxiv.org/abs/1810.05148 we ran some experiments with SAME padding and obtained reasonable agreement too.
One other comment: this notebook relies on being able to determine a fixed-point variance $q^*$, which does not always exist for the ReLU you used in your example (for weight variance above 2, no stable non-zero fixed-point variance exists, and the variance explodes in the infinite-depth limit), so the ReLU nonlinearity won't work in the notebook at the moment, even for FCNs. But you can find the ReLU phase diagram in Figure 4 (b) of https://arxiv.org/abs/1711.00165.
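For concreteness, the large-width variance recursion for ReLU is $q_{l+1} = \frac{\sigma_w^2}{2} q_l + \sigma_b^2$ (since $\mathbb{E}[\mathrm{relu}(z)^2] = q/2$ for $z \sim \mathcal{N}(0, q)$). A minimal sketch of why no stable fixed point exists once the weight variance exceeds 2 (the values of `b_var`, `q0`, and `depth` below are arbitrary):

```python
# Iterate the large-width ReLU variance map q_{l+1} = (W_var / 2) * q_l + b_var.
# For W_var < 2 it contracts to the fixed point q* = b_var / (1 - W_var / 2);
# for W_var > 2 the variance grows without bound with depth.
def iterate_q(W_var, b_var=0.1, q0=1.0, depth=100):
    q = q0
    for _ in range(depth):
        q = 0.5 * W_var * q + b_var
    return q

print(iterate_q(1.5))  # converges to q* = 0.1 / (1 - 0.75) = 0.4
print(iterate_q(2.5))  # explodes with depth
```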
Finally, note that these diagrams study the forward propagation of the signal, so they only work with the CNN-GP kernel (and not the CNTK); hence the `parameterization` argument should have no impact on them.
Does this mean that the depth of an NN won't affect the initialization?