I currently train my neural network on my home computer. I have a 1660 Super, and my input data is 880k entries, which comes to ~8 GB. So there's enough data to really get that 1660 Super working.
Now gpustat reports an average GPU utilization of 58%. That looks a bit low to me, but maybe it's fine; I have zero experience with CUDA.
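For context, here is how I'm watching it (a simple sketch using `nvidia-smi`, which gpustat wraps; sampling once per second shows whether the 58% is a steady level or an average of 0%/100% bursts, which would instead point at a data-loading bottleneck):

```shell
# Sample GPU compute and memory utilization once per second.
# A steady ~58% suggests the kernels themselves don't saturate the GPU;
# alternating 0%/100% suggests the GPU is waiting on input data.
nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv -l 1
```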
Now, if a parallel application of mine running on a CPU cluster achieved 60% of peak performance, that would be believable: you are usually limited by communication, file I/O, or something similar. But with CUDA on a GPU? What can be expected?
I'm running Arch Linux and didn't configure much. I connect via SSH; nothing graphical is running on that machine.
Is 58% good, or should I expect more?