Adjusting the Output Shape of an Auxiliary Matrix in PyTorch

In the code below, after the hidden-layer activation I create an auxiliary matrix to better capture the temporal aspect of the data. After the computation, the returned variable has shape [16, 16], i.e. [out_channels, out_channels], but I want it to be [4, 16], i.e. [input_channels, out_channels]. Which part of the code should be modified to achieve the desired output while keeping the idea/logic as it is?

def my_fun(self, H: torch.FloatTensor) -> torch.FloatTensor:
    self.input_channels = 4
    self.out_channels = 16
    self.forgettingFactor = 0.92
    self.lamb = 0.01
    # Initial inverse of the regularized matrix; equals (1 / lamb) * I
    self.M = torch.inverse(self.lamb * torch.eye(self.out_channels))  # [16, 16]
    HH = self.calculateHiddenLayerActivation(H)  # [4, 16]
    Ht = HH.t()                                  # [16, 4]

    ###### Computation of the auxiliary matrix
    initial_product = torch.mm((1 / self.forgettingFactor) * self.M, Ht)             # [16, 4]
    intermediate_matrix = torch.mm(HH, initial_product)                              # [4, 4]
    sum_inside_pseudoinverse = torch.eye(self.input_channels) + intermediate_matrix  # [4, 4]
    pseudoinverse_sum = torch.pinverse(sum_inside_pseudoinverse)                     # [4, 4]
    product_inside_expression = torch.mm(HH, (1 / self.forgettingFactor) * self.M)   # [4, 16]
    dot_product_pseudo = torch.mm(pseudoinverse_sum, product_inside_expression)      # [4, 16]
    dot_product_with_hidden_matrix = torch.mm(Ht, dot_product_pseudo)                # [16, 16]

    res = (1 / self.forgettingFactor) * self.M - torch.mm(
        (1 / self.forgettingFactor) * self.M, dot_product_with_hidden_matrix
    )  # [16, 16]

    return res
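
For context, the [16, 16] shape is forced by the final line: both terms of res are built from (1/forgettingFactor) * self.M, which is [out_channels, out_channels], and right-multiplying by the [16, 16] product Ht @ dot_product_pseudo cannot change that. A minimal shape trace of the same computation, with calculateHiddenLayerActivation stubbed as an elementwise tanh (an assumption; only its shape-preserving behavior matters here):

import torch

ff, lamb, in_ch, out_ch = 0.92, 0.01, 4, 16
HH = torch.tanh(torch.randn(in_ch, out_ch))    # stand-in for calculateHiddenLayerActivation(H): [4, 16]
M = torch.inverse(lamb * torch.eye(out_ch))    # [16, 16]

K = torch.pinverse(torch.eye(in_ch) + HH @ ((1 / ff) * M) @ HH.t())  # pseudoinverse_sum: [4, 4]
D = K @ HH @ ((1 / ff) * M)                                          # dot_product_pseudo: [4, 16]
res = (1 / ff) * M - ((1 / ff) * M) @ (HH.t() @ D)                   # [16, 16] - [16, 16] @ [16, 16]
print(res.shape)                               # torch.Size([16, 16]) -- fixed by M's shape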

Invert the matmul order when computing dot_product_with_hidden_matrix (dot_product_pseudo times Ht, giving [4, 4] instead of [16, 16]), build res from initial_product instead of from (1/forgettingFactor) * self.M, and transpose the result so the input_channels dimension leads:

dot_product_with_hidden_matrix = torch.mm(dot_product_pseudo, Ht)  # [4, 16] @ [16, 4] -> [4, 4]
res = initial_product - torch.mm(initial_product, dot_product_with_hidden_matrix)  # [16, 4]
return res.t()  # [4, 16], i.e. the original [16, 16] result left-multiplied by HH
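
As a sanity check, the reordered version computes the same quantity as the original, just left-multiplied by HH so the leading dimension becomes input_channels. A self-contained sketch (the tanh stub and float64 dtype are assumptions made only for this check):

import torch

ff, lamb, in_ch, out_ch = 0.92, 0.01, 4, 16
HH = torch.tanh(torch.randn(in_ch, out_ch, dtype=torch.float64))  # stand-in activation: [4, 16]
Ht = HH.t()                                                       # [16, 4]
M = (1 / lamb) * torch.eye(out_ch, dtype=torch.float64)           # [16, 16]

IP = torch.mm((1 / ff) * M, Ht)                                   # initial_product: [16, 4]
K = torch.pinverse(torch.eye(in_ch, dtype=torch.float64) + torch.mm(HH, IP))  # pseudoinverse_sum: [4, 4]
D = torch.mm(K, torch.mm(HH, (1 / ff) * M))                       # dot_product_pseudo: [4, 16]

res16 = (1 / ff) * M - torch.mm((1 / ff) * M, torch.mm(Ht, D))    # original update: [16, 16]
res4 = (IP - torch.mm(IP, torch.mm(D, Ht))).t()                   # reordered update: [4, 16]
print(res4.shape)                                                 # torch.Size([4, 16])
print(torch.allclose(res4, torch.mm(HH, res16)))                  # True: same result, projected through HH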

If I simply substitute H for Ht, I am unable to perform the second step after 'initial_product', because the matrices then have the same shape and cannot be multiplied.
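
To be clear, the reordering only changes the last two lines; Ht is still needed in initial_product. Substituting H there cannot work, because H has the same [4, 16] shape as HH (assuming the activation preserves shape), so the inner dimensions no longer match:

import torch

ff, lamb = 0.92, 0.01
M = torch.inverse(lamb * torch.eye(16))  # [16, 16]
H = torch.randn(4, 16)                   # same shape as HH for an elementwise activation
torch.mm((1 / ff) * M, H)                # raises RuntimeError: [16, 16] @ [4, 16] shape mismatch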