Hi,
I would like to read the source code for cudnn.benchmark, but there is no link to it in the docs, and I couldn’t find it by googling. Where can I find it?
The benchmark flag will eventually be passed to algorithm_search, where either cudnnGet or cudnnFind will be called in the v7 API. A similar approach is used for the experimental v8 API.
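To give a rough intuition for the difference between the two calls (cudnnGet chooses an algorithm from heuristics without running anything, while cudnnFind actually executes the candidates and times them), here is a loose pure-Python sketch. The function names and the timing loop are illustrative assumptions, not the actual cuDNN implementation:

```python
import time

def heuristic_pick(algos, input_shape):
    # "get"-style: choose based on simple rules, no timing is performed.
    # (Here the rule is trivially "take the first candidate".)
    return algos[0]

def benchmark_pick(algos, run):
    # "find"-style: execute every candidate algorithm and keep the fastest.
    timings = []
    for algo in algos:
        start = time.perf_counter()
        run(algo)
        timings.append((time.perf_counter() - start, algo))
    return min(timings)[1]
```

The trade-off is the usual one: the heuristic path is cheap but may pick a suboptimal algorithm, while the benchmarking path pays a one-time search cost to find the fastest kernel for the given shapes.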
Thanks!
Are these the various convolution implementations being benchmarked? link
Where are these algorithms defined?
The algorithms are closed source and shipped in cuDNN.
Okay,
Thanks for the help
Hi,
In the v7 API, cudnnFindConvolutionForwardAlgorithmEx() is used to find the convolution algorithm with the lowest compute time, according to algorithm_search.
But in the v8 API, I can't find the corresponding part. Is get_plans_from_find the v8 approach for selecting which convolution algorithm to run? Could you help me understand this part?
Thanks!
cudnn_frontend::time_sorted_plan should contain the execution plans sorted by their run time, which are then moved to sorted_plans and returned.
@eqy would know the details about this implementation as he is the code owner.
@ptrblck is right; the corresponding function is get_plans_from_find, but the actual timing is done by time_sorted_plan, which is called by get_plans_from_find.
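As a very loose illustration of what "sorting plans by measured runtime" means, here is a pure-Python sketch. The function name and structure are hypothetical stand-ins for the cudnn_frontend behavior described above, not its actual code:

```python
import time

def time_sorted_plans(plans, execute):
    # Run each candidate plan once, record its wall-clock time, and
    # return the plans ordered fastest-first, roughly mimicking the
    # idea behind cudnn_frontend::time_sorted_plan.
    results = []
    for plan in plans:
        start = time.perf_counter()
        execute(plan)
        results.append((time.perf_counter() - start, plan))
    results.sort(key=lambda pair: pair[0])
    return [plan for _, plan in results]
```

The real implementation also handles warm-up iterations and plans that fail to execute, which this sketch omits.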
Thanks,
After tracing the corresponding part of the cuDNN frontend API, I now understand that time_sorted_plan can find the fastest plan.
But how does each plan correspond to a different convolution algorithm? Is it because the cudnnBackendDescriptorType_t passed into get_plans_from_find is CUDNN_BACKEND_OPERATION_CONVOLUTION_FORWARD_DESCRIPTOR?
Plans are derived from engine configs, which can be thought of as a combination of algorithm + specific knob settings. Benchmarking is done to find the best choice of algorithm + settings.
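To make "algorithm + specific knob settings" concrete, here is a hedged Python sketch of enumerating engine configs and timing them to pick the best one. The algorithm names and knob values below are invented for illustration; they are not actual cuDNN identifiers:

```python
import itertools
import time

# Hypothetical algorithm names and knob settings; a real engine config
# in cuDNN pairs an engine (algorithm) with its tuning knobs.
ALGORITHMS = ["implicit_gemm", "winograd", "fft"]
KNOB_SETTINGS = [{"tile": 16}, {"tile": 32}]

def make_configs():
    # Every (algorithm, knob setting) pair is one candidate engine config.
    return [(algo, tuple(knobs.items()))
            for algo, knobs in itertools.product(ALGORITHMS, KNOB_SETTINGS)]

def pick_best(configs, execute):
    # Time each config once and return the fastest, as benchmarking would.
    best_time, best_cfg = float("inf"), None
    for cfg in configs:
        start = time.perf_counter()
        execute(cfg)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_time, best_cfg = elapsed, cfg
    return best_cfg
```

This is why two plans can share the same underlying algorithm yet differ in performance: the knob settings (e.g. tiling choices) change the generated kernel even when the algorithm is the same.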