In this document (slide 43) I read that it is recommended to use at::parallel_for
instead of raw OpenMP pragmas.
In another post here, the individual elements of the tensor are accessed via operator[], e.g.:
torch::Tensor z_out = at::empty({z.size(0), z.size(1)}, z.options());
int64_t batch_size = z.size(0);
at::parallel_for(0, batch_size, 0, [&](int64_t start, int64_t end) {
    for (int64_t b = start; b < end; b++) {
        z_out[b] = z[b] * z[b];
    }
});
Is this the right way to do it, or should one still use a tensor accessor (even when using at::parallel_for)?