Caffe2 vs CUDA performance comparison

Hi,

Has anyone compared the performance of Caffe2 with that of a hand-tuned C/C++ CUDA production engine (even if it is just your general experience working with both)?

It would be great if you could share the results.

Thanks