Why not use neon's backend?

According to Soumith's benchmarks at https://github.com/soumith/convnet-benchmarks (which have not been updated recently), neon's custom CUDA backend is quite fast; nothing new here, we already knew that. So my thought was: why not use that backend to speed up PyTorch? neon is an open source project, and interoperability should be possible since the backend is C/C++. So my question is: are there any restrictions or valid reasons for not trying to bring that backend's speed to PyTorch?

Thanks!

The benchmarks are old. cuDNN should be on par with neon now. Also, neon's data layout is CHWB (channels, height, width, batch), which means PyTorch would need to do a transpose from its NCHW layout to use it.
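For illustration, here is a minimal sketch of what that layout conversion would look like in PyTorch. The shapes are just examples, and the CHWB dimension order follows the description above:

```python
import torch

# PyTorch convolution tensors are NCHW: (batch, channels, height, width).
x = torch.randn(32, 3, 224, 224)

# Rearranging to a CHWB layout (channels, height, width, batch) requires
# a permute plus .contiguous() to actually rearrange the memory, which
# costs an extra copy on every handoff between the two backends.
x_chwb = x.permute(1, 2, 3, 0).contiguous()
print(x_chwb.shape)  # torch.Size([3, 224, 224, 32])
```

That extra copy in each direction is overhead that a native-layout backend like cuDNN avoids.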

Thanks for the response, Soumith! Any plans to update the benchmarks?

No plans; they are effectively dead. The space has converged, with cuDNN becoming the best.
Other good benchmarks that are more end-to-end are being developed, for example:

http://dawn.cs.stanford.edu/benchmark/

Thanks, Soumith! 🙂 Appreciate it!