Hi all,

I'm currently studying different approximation schemes for neural network propagation.

Suppose I have input feature maps (IFMs) of shape 100 x 3 x 28 x 28 and kernels of shape 32 x 3 x 3 x 3.

Because of the approximate computing, I'd like to adapt the IFMs separately for each output feature map (OFM).

This means I have inputs of shape 100 x 32 x 3 x 28 x 28, and each of the 32 input slices (100 x 3 x 28 x 28) has to be convolved with its own corresponding 3 x 3 x 3 kernel. Is there a way to parallelize this process? Right now I use a loop implementation, which is too slow.
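
To make the operation concrete, here is a minimal NumPy sketch of it. It assumes stride 1, no padding, and cross-correlation (no kernel flip), as is usual in NN frameworks, so a 28 x 28 input gives a 26 x 26 output. The function names are hypothetical and only for illustration. The first function is a plain loop reference; the second computes the same thing in one vectorized call.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def per_ofm_conv_loop(x, w):
    """Reference loop. x: (N, O, C, H, W), w: (O, C, kH, kW) -> (N, O, H', W')."""
    N, O, C, H, W = x.shape
    kH, kW = w.shape[-2:]
    out = np.zeros((N, O, H - kH + 1, W - kW + 1), dtype=x.dtype)
    for n in range(N):
        for o in range(O):
            for i in range(H - kH + 1):
                for j in range(W - kW + 1):
                    # OFM o only sees its own IFM slice x[n, o] and kernel w[o]
                    out[n, o, i, j] = np.sum(x[n, o, :, i:i + kH, j:j + kW] * w[o])
    return out


def per_ofm_conv_vectorized(x, w):
    """Same result without Python loops."""
    kH, kW = w.shape[-2:]
    # windows has shape (N, O, C, H', W', kH, kW); it is a view, not a copy
    windows = sliding_window_view(x, (kH, kW), axis=(-2, -1))
    # sum over channels c and kernel offsets i, j, keeping n, o, h, w
    return np.einsum("nochwij,ocij->nohw", windows, w)


# Small shapes so the loop reference finishes quickly
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 3, 8, 8))
w = rng.standard_normal((4, 3, 3, 3))
assert np.allclose(per_ofm_conv_loop(x, w), per_ofm_conv_vectorized(x, w))
```

If the framework in use supports grouped convolution, the same computation can probably be expressed by reshaping the input to 100 x 96 x 28 x 28 and running a single convolution with 32 groups and the unchanged 32 x 3 x 3 x 3 kernels, since each group then maps 3 input channels to 1 output channel.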

Thank you in advance!

Xin