Is there a faster way to assign values in a matrix in LibTorch?

I’m assigning new values to a matrix one element at a time. Is there a faster way to do this in LibTorch? If I were using PyTorch (in Python instead of C++), I could do this in one line:
matrix_two[:, tril[0], tril[1]] = matrix_one

Since I couldn’t find a way to use indexing like this to store values in C++, I wrote it with nested loops, and it is very slow when called in the forward function of a model.

The outer ‘for’ loop over “o” (ten iterations) in the code below is only there to make the run time visible; when the same code is inside a model it runs even slower.

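// The outer loop only repeats the work ten times to make the run time visible.
for (int o = 0; o < 10; o++) {
    auto matrix_one = torch::rand({42, 903});
    auto matrix_two = torch::zeros({42, 42, 42});
    // Row and column indices of the lower triangle, shape {2, 903}.
    auto trl = torch::tril_indices(42, 42, 0);

    // Copy one element at a time: matrix_two[p, row, col] = matrix_one[p, z].
    for (int64_t p = 0; p < matrix_one.size(0); p++) {
        for (int64_t z = 0; z < matrix_one.size(1); z++) {
            matrix_two[p][trl[0][z]][trl[1][z]] = matrix_one[p][z];
        }
    }
    std::cout << o << "\n";
}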