Unify on NCHW and maybe support CHWN and HWCN tensor layout #132

Closed
mratsim opened this issue Oct 29, 2017 · 0 comments
mratsim commented Oct 29, 2017

N: batch size
C: Channel / convolution feature_map
H: Height
W: Width
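As a minimal sketch of what these layouts mean in memory (dimensions chosen arbitrarily for illustration): in a row-major NCHW buffer, the flat offset of element (n, c, h, w) is `n*C*H*W + c*H*W + h*W + w`.

```python
import numpy as np

# Hypothetical dimensions, for illustration only
N, C, H, W = 2, 3, 4, 5

# Row-major NCHW buffer: W is the innermost (fastest-varying) axis
x = np.arange(N * C * H * W).reshape(N, C, H, W)

# Flat offset of element (n, c, h, w) in NCHW:
n, c, h, w = 1, 2, 3, 4
offset = n * C * H * W + c * H * W + h * W + w
assert x.flat[offset] == x[n, c, h, w]
```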

NCHW

NCHW is the most widespread format and is the default format in:

  • CuDNN, Torch, PyTorch, Chainer

N is the first index, which is the most familiar convention for data scientists. However, it presents optimization challenges even on Nvidia's side. See soumith/convnet-benchmarks#93

CHWN

CHWN is the format used by Neon and by the pioneering (now defunct) cuda-convnet.
It is ideal for Winograd convolution. The main issue is that having N on the right is unfamiliar. Also, for RNNs it might be worse to have the batch as the innermost dimension.

Some models also concatenate feature maps, which can be done directly in this layout, similar to (C1 + C2 + C3)HWN.
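A small sketch of why this works (dimensions are arbitrary placeholders): with C as the outermost axis, concatenating along channels is just appending one buffer after the other, with no interleaving.

```python
import numpy as np

# Hypothetical dimensions, for illustration only
H, W, N = 2, 2, 3
C1, C2 = 2, 3

a = np.random.rand(C1, H, W, N)  # CHWN layout
b = np.random.rand(C2, H, W, N)  # CHWN layout

cat = np.concatenate([a, b], axis=0)  # shape (C1 + C2, H, W, N)

# Because C is the outermost axis, the concatenated buffer is simply
# a's buffer followed by b's buffer:
assert np.array_equal(cat.ravel()[: a.size], a.ravel())
assert np.array_equal(cat.ravel()[a.size :], b.ravel())
```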

  • CuDNN claims to support it according to the documentation:
    [screenshot 2017-10-29_10-03-54]
    but CuDNN v7's cudnn.h only defines:
```c
typedef enum
{
    CUDNN_TENSOR_NCHW = 0,        /* row major (wStride = 1, hStride = w) */
    CUDNN_TENSOR_NHWC = 1,        /* feature maps interleaved (cStride = 1) */
    CUDNN_TENSOR_NCHW_VECT_C = 2  /* each image point is a vector of C elements;
                                     the vector length is carried by the data type */
} cudnnTensorFormat_t;
```

HWCN

HWCN is the format used by Tensorflow. It is also better than NCHW for Winograd convolution (though not as good as CHWN), and it is the best format for implementing "Memory Efficient Convolution" (#131).

Format conversion

Converting between NCHW and CHWN can be done very efficiently by treating it as a transposition between a [N, CHW] matrix and a [CHW, N] matrix.

Implementation by Neon: NervanaSystems/neon@682dde6

Implementation by NVIDIA: https://devblogs.nvidia.com/parallelforall/efficient-matrix-transpose-cuda-cc/

Paper: Optimizing memory efficiency for DCNN on GPUs
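The idea above can be sketched in a few lines (dimensions are arbitrary placeholders): collapsing C, H, W into a single axis turns an NCHW tensor into a [N, C*H*W] matrix, and a plain 2D transpose of that matrix yields the CHWN layout.

```python
import numpy as np

# Hypothetical dimensions, for illustration only
N, C, H, W = 2, 3, 4, 5
x = np.random.rand(N, C, H, W)  # NCHW layout

# View NCHW as a [N, C*H*W] matrix, transpose it to [C*H*W, N],
# then reinterpret the result as a (C, H, W, N) tensor.
as_matrix = x.reshape(N, C * H * W)
chwn = as_matrix.T.reshape(C, H, W, N)

# Same result as a full 4D axis permutation:
assert np.array_equal(chwn, x.transpose(1, 2, 3, 0))
```

Any cache-friendly 2D transpose kernel (such as the tiled CUDA transpose in the NVIDIA post above) therefore performs the layout conversion.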

[screenshot 2017-10-29_10-36-13: figure from the paper]
