Can we use a preallocated output tensor for forward execution during model inference?

Is it possible to run forward execution with preallocated output tensors?

For example, can the forward function be called like this?

model = torch.load(…)
input = torch.Tensor(…)
output = torch.Tensor(…)
model.forward(input, output=[output])

During inference, I need to save the output into a CUDA IPC shared-memory-based tensor, but I haven't found a way to execute the model with a preallocated shared-memory tensor. So for now I just memcpy from the model's output to the shared-memory tensor, which seems inefficient.
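Roughly, my current workaround looks like the sketch below (placeholder model and shapes; shown on CPU for simplicity, while in practice the output buffer is a CUDA IPC shared-memory tensor):

```python
import torch

# Stand-in for the real model and shapes (assumptions for illustration).
model = torch.nn.Linear(8, 4)
inp = torch.randn(2, 8)

# Preallocated output buffer; in my real setup this lives in
# CUDA IPC shared memory.
shared_out = torch.empty(2, 4)

with torch.no_grad():
    result = model(inp)        # forward allocates its own output tensor
    shared_out.copy_(result)   # the extra copy I would like to avoid
```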