
The layer times are:

(Unnamed Layer* 0) 0.197ms

MobileNet/conv_1/BiasAdd + MobileNet/conv_1/batch_norm/Relu 0.427ms

MobileNet/conv_ds_2/depthwise_conv/BiasAdd + MobileNet/conv_ds_2/dw_batch_norm/Relu 8.436ms

MobileNet/conv_ds_2/pointwise_conv/BiasAdd + MobileNet/conv_ds_2/pw_batch_norm/Relu 0.731ms

(Unnamed Layer* 13) 0.850ms

MobileNet/conv_ds_3/depthwise_conv/BiasAdd + MobileNet/conv_ds_3/dw_batch_norm/Relu 4.350ms

MobileNet/conv_ds_3/pointwise_conv/BiasAdd + MobileNet/conv_ds_3/pw_batch_norm/Relu 0.502ms

MobileNet/conv_ds_4/depthwise_conv/BiasAdd + MobileNet/conv_ds_4/dw_batch_norm/Relu 8.709ms

MobileNet/conv_ds_4/pointwise_conv/BiasAdd + MobileNet/conv_ds_4/pw_batch_norm/Relu 0.814ms

(Unnamed Layer* 30) 0.430ms

MobileNet/conv_ds_5/depthwise_conv/BiasAdd + MobileNet/conv_ds_5/dw_batch_norm/Relu 2.753ms

MobileNet/conv_ds_5/pointwise_conv/BiasAdd + MobileNet/conv_ds_5/pw_batch_norm/Relu 0.417ms

MobileNet/conv_ds_6/depthwise_conv/BiasAdd + MobileNet/conv_ds_6/dw_batch_norm/Relu 5.513ms

MobileNet/conv_ds_6/pointwise_conv/BiasAdd + MobileNet/conv_ds_6/pw_batch_norm/Relu 0.751ms

(Unnamed Layer* 47) 0.226ms

MobileNet/conv_ds_7/depthwise_conv/BiasAdd + MobileNet/conv_ds_7/dw_batch_norm/Relu 2.938ms

MobileNet/conv_ds_7/pointwise_conv/BiasAdd + MobileNet/conv_ds_7/pw_batch_norm/Relu 0.433ms

MobileNet/conv_ds_8/depthwise_conv/BiasAdd + MobileNet/conv_ds_8/dw_batch_norm/Relu 5.846ms

MobileNet/conv_ds_8/pointwise_conv/BiasAdd + MobileNet/conv_ds_8/pw_batch_norm/Relu 0.796ms

MobileNet/conv_ds_9/depthwise_conv/BiasAdd + MobileNet/conv_ds_9/dw_batch_norm/Relu 4.293ms

MobileNet/conv_ds_9/pointwise_conv/BiasAdd + MobileNet/conv_ds_9/pw_batch_norm/Relu 0.786ms

MobileNet/conv_ds_10/depthwise_conv/BiasAdd + MobileNet/conv_ds_10/dw_batch_norm/Relu 4.885ms

MobileNet/conv_ds_10/pointwise_conv/BiasAdd + MobileNet/conv_ds_10/pw_batch_norm/Relu 0.787ms

MobileNet/conv_ds_11/depthwise_conv/BiasAdd + MobileNet/conv_ds_11/dw_batch_norm/Relu 5.855ms

MobileNet/conv_ds_11/pointwise_conv/BiasAdd + MobileNet/conv_ds_11/pw_batch_norm/Relu 0.748ms

MobileNet/conv_ds_12/depthwise_conv/BiasAdd + MobileNet/conv_ds_12/dw_batch_norm/Relu 4.874ms

MobileNet/conv_ds_12/pointwise_conv/BiasAdd + MobileNet/conv_ds_12/pw_batch_norm/Relu 0.791ms

(Unnamed Layer* 96) 0.118ms

MobileNet/conv_ds_13/depthwise_conv/BiasAdd + MobileNet/conv_ds_13/dw_batch_norm/Relu 5.715ms

MobileNet/conv_ds_13/pointwise_conv/BiasAdd + MobileNet/conv_ds_13/pw_batch_norm/Relu 0.502ms

MobileNet/conv_ds_14/depthwise_conv/BiasAdd 10.942ms

Time over all layers: 85.414ms

Why does the depthwise conv cost so much time?

May I know which data format you use, NCHW or NHWC?

Thanks.

Hi AastaLLL,

The TensorFlow depthwise conv API only supports NHWC, so I use the NHWC data format.
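For readers unfamiliar with the two layouts: NHWC stores the channel dimension last (TensorFlow's default on CPU), while NCHW stores it right after the batch dimension (common for GPU backends such as cuDNN/TensorRT). Converting between them is just a transpose; a minimal illustration with made-up shapes:

```python
import numpy as np

# NHWC: batch, height, width, channels (TensorFlow default)
x_nhwc = np.zeros((1, 224, 224, 3))

# NCHW: batch, channels, height, width (common on GPU backends)
x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))

print(x_nhwc.shape)  # (1, 224, 224, 3)
print(x_nchw.shape)  # (1, 3, 224, 224)
```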

Thanks

Currently, separable convolution is implemented as a grouped convolution with groups=C followed by a 1x1 convolution, and this is not efficient enough.
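For clarity, a depthwise separable convolution factors a standard convolution into a per-channel (groups=C) spatial filter followed by a 1x1 pointwise convolution that mixes channels. A minimal NumPy sketch of the idea (NHWC layout without the batch dimension, stride 1, no padding; all names here are illustrative, not TensorRT's implementation):

```python
import numpy as np

def depthwise_separable_conv(x, dw, pw):
    """Depthwise (groups=C) spatial conv followed by a 1x1 pointwise conv.

    x:  input feature map, shape (H, W, C)   -- NHWC without the batch dim
    dw: depthwise filter,  shape (k, k, C)   -- one spatial filter per channel
    pw: pointwise filter,  shape (C, C_out)  -- the 1x1 conv that mixes channels
    """
    H, W, C = x.shape
    k = dw.shape[0]
    Ho, Wo = H - k + 1, W - k + 1
    out = np.zeros((Ho, Wo, C))
    for i in range(Ho):
        for j in range(Wo):
            patch = x[i:i + k, j:j + k, :]               # (k, k, C) window
            out[i, j] = np.sum(patch * dw, axis=(0, 1))  # per-channel, no mixing
    return out @ pw  # the pointwise 1x1 conv mixes channels

# Sanity check: a 1x1 all-ones depthwise filter plus an identity
# pointwise filter makes the layer a no-op.
x = np.arange(12, dtype=float).reshape(2, 2, 3)
y = depthwise_separable_conv(x, np.ones((1, 1, 3)), np.eye(3))
print(np.allclose(y, x))  # True
```

The depthwise stage is what maps poorly to hardware tuned for dense convolutions: each output channel touches only one input channel, so the arithmetic intensity is low compared to a regular convolution of the same spatial size.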

We're looking into the possibility of optimizing general grouped convolutions, but we can't provide any firm commitments or estimates at this time.

Thanks and sorry for the inconvenience.

@373197201, can you please specify which implementation of "tensorflow mobilenet" you were using?

We have implemented our own kernels in CUDA, but would like more optimized convolutions such as Winograd.
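For context, Winograd convolution trades multiplications for additions by transforming input tiles and filters before an elementwise product. A minimal 1-D F(2,3) sketch — 2 outputs of a 3-tap filter using 4 multiplies instead of 6; the transform matrices are the standard ones from the Winograd/Lavin-Gray formulation, and the function names are illustrative:

```python
import numpy as np

# Standard F(2,3) Winograd transform matrices:
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output transform

def winograd_f23(d, g):
    """Two outputs of a 3-tap correlation over a 4-sample input tile,
    computed with 4 elementwise multiplies instead of 6."""
    return AT @ ((G @ g) * (BT @ d))

d = np.array([1.0, 2.0, 3.0, 4.0])   # input tile
g = np.array([1.0, 1.0, 1.0])        # filter taps
direct = np.array([sum(d[i + j] * g[j] for j in range(3)) for i in range(2)])
print(winograd_f23(d, g))  # matches direct correlation: [6. 9.]
```

The 2-D F(2x2,3x3) case used for CNN layers nests the same transforms along both spatial axes; the savings grow with tile size at the cost of numerical headroom.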

Regards.

We're looking into the possibility, but we can't provide any firm commitments or estimates at this time.

Thanks.