How to freeze weights in TorchScript IR?

Hi, I just added a pass over the TorchScript IR that converts BertLayer to a fastertransformer Encoder, but I find that the model is slow after conversion to TorchScript. I profiled it with nvprof and found one time-consuming activity:

Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   57.50%  1.49484s     25200  59.319us  3.2000us  151.55us  _ZN2at6native27unrolled_elementwise_kernelIZZZNS0_21copy_device_to_deviceERNS_14TensorIteratorEbENKUlvE0_clEvENKUlvE2_clEvEUlfE_NS_6detail5ArrayIPcLi2EEE16OffsetCalculatorILi1EjESC_NS0_6memory15LoadWithoutCastENSD_16StoreWithoutCastEEEviT_T0_T1_T2_T3_T4_

I looked at my final TorchScript IR, and I guess the reason is that every run executes aten::contiguous several times, for example:

%1752 : Float(*, *, requires_grad=1, device=cuda:0) = aten::contiguous(%1153, %21)

aten::contiguous is needed for the tensors that are sent to the custom op, because they are first transformed by .transpose(-1, -2), but aten::contiguous seems to be time-consuming. Is there any way to convert the model weights to constants in the TorchScript IR, so that aten::contiguous(weights) is folded into a constant tensor? Or is there something else I can do to avoid aten::contiguous? Thank you very much!

.contiguous() copies the data if it isn’t stored in a contiguous block of memory, e.g. after a transpose.
If your kernel needs to work on a contiguous array and you need to permute the tensor (i.e. you cannot pass it in the expected shape from the beginning), I don’t think there is a workaround.
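To illustrate the point about copying, here is a minimal sketch on the CPU. The tensor name and shape are made up for the example; the same behaviour should apply to CUDA tensors.

```python
import torch

w = torch.randn(4, 3)            # hypothetical weight, shape chosen arbitrarily
wt = w.transpose(-1, -2)         # a view: same storage, swapped strides
c = wt.contiguous()              # allocates new memory and copies the data

view_is_contiguous = wt.is_contiguous()           # the transposed view is not contiguous
view_shares_memory = wt.data_ptr() == w.data_ptr()
copy_shares_memory = c.data_ptr() == w.data_ptr()

# On an already contiguous tensor, .contiguous() returns the tensor itself.
noop_shares_memory = w.contiguous().data_ptr() == w.data_ptr()

print(view_is_contiguous, view_shares_memory, copy_shares_memory, noop_shares_memory)
```

So the cost only shows up when the layout really has to change, which is the case after every transpose of a weight matrix with more than one row and column.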

Thank you for the response. So can I freeze the weights to constants in the TorchScript IR? I mean, I just want to add Tensor t = model.weight.data.transpose(-1, -2).contiguous() in the TorchScript IR; if I convert model.weight to a constant, then Tensor t will be folded into a constant by a pass.
My current IR is:

%1128 : Float(*, *, requires_grad=1, device=cuda:0) = prim::GetAttr[name="weight"](%1126)
%1436 : Float(*, *, requires_grad=1, device=cuda:0) = aten::transpose(%1128, %10, %1147)
%1610 : Float(*, *, requires_grad=1, device=cuda:0) = aten::contiguous(%1436, %21)

And I want to convert it to:

%1060 : Tensor = prim::Constant[value=<Tensor>]()
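One possible route to a graph of this shape, sketched under the assumption that the scripted module can be put into eval mode and frozen from Python: torch.jit.freeze inlines parameters and attributes into the graph as constants, after which constant folding should be able to precompute transpose and contiguous on the weights. The module M below is a hypothetical stand-in, not the BERT model from this thread.

```python
import torch
import torch.nn as nn


class M(nn.Module):  # hypothetical layer, only for illustration
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(4, 3))

    def forward(self, x):
        # Same pattern as in the IR above: GetAttr -> transpose -> contiguous
        w = self.weight.transpose(-1, -2).contiguous()
        return x @ w


scripted = torch.jit.script(M()).eval()  # freezing expects a module in eval mode
frozen = torch.jit.freeze(scripted)      # attributes are inlined as constants

x = torch.randn(2, 3)
kinds = [node.kind() for node in frozen.graph.nodes()]
print(frozen.graph)  # inspect which ops remain after freezing
```

Printing the frozen graph is the easiest way to check whether the transpose and contiguous nodes were actually folded away for a given model.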

Are you calling the transpose operation in the __init__ method of your model, the forward or somewhere else?
Could you transpose the parameter before and pass it to the model directly?
Also, don’t use the .data attribute as it might yield unwanted side effects.

Sorry, I need to write a pass for the TorchScript IR, so I can’t control the model. That is why I want to convert the model’s weights to constants within the IR pass itself.

Or can I get the actual tensor of a model weight in a TorchScript pass? If I have:

%weight : Tensor = prim::GetAttr[name="weight"](%1)

can I get the actual tensor from the Value %weight?

Yes, you should be able to get the tensor by its name. However, based on the IR it seems your graph contains the transpose and contiguous ops, as they are apparently needed in a custom layer.
I’m unsure how you would like to avoid these operations.

I want to get the actual tensor behind %weight, transpose it, and then create a new Constant node from the result and insert it into the graph. So how do I get the tensor by name? Thank you very much!
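A rough sketch of looking a weight up by name from Python, assuming the pass has access to the scripted module and not only to its graph (the graph itself does not store the tensor values). The module M is hypothetical, and the loop only handles attributes read directly from the top-level module; nested submodules would require following the chain of prim::GetAttr nodes.

```python
import torch
import torch.nn as nn


class M(nn.Module):  # hypothetical layer, only for illustration
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(4, 3))

    def forward(self, x):
        return x @ self.weight.transpose(-1, -2).contiguous()


scripted = torch.jit.script(M())

weights = {}
for node in scripted.graph.nodes():
    if node.kind() == "prim::GetAttr":
        name = node.s("name")                    # attribute name stored on the node
        weights[name] = getattr(scripted, name)  # the actual tensor held by the module

# Precompute the layout the custom op expects, once, outside the graph.
t = weights["weight"].detach().transpose(-1, -2).contiguous()
print(list(weights), t.shape, t.is_contiguous())
```

A pass written in C++ would need the same two ingredients: the Module object to resolve the attribute name to a tensor, and the graph to insert the precomputed result as a constant.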