There seems to be a bug in the torch.baddbmm() interface. When alpha is equal to 0 and the tensors are relatively large, beta does not participate in the calculation and the function simply returns the value of the torch_tensor_3 variable. But when the tensors are small, beta does participate in the calculation. Why is that?
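For context, torch.baddbmm(input, batch1, batch2, beta=b, alpha=a) is documented to compute b * input + a * (batch1 @ batch2), so with alpha=0 the result should be exactly beta * input. A minimal sketch on a small batch (the tensor names here are illustrative, not from the report above), where beta is applied as documented:

import torch

inp = torch.randn(4, 10, 1)
b1 = torch.randn(4, 10, 124)
b2 = torch.randn(4, 124, 1)

# With alpha=0 the bmm term vanishes, so the expected output is beta * inp.
out = torch.baddbmm(inp, b1, b2, beta=2, alpha=0)
print(torch.allclose(out, 2 * inp))  # True on small batches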
This is a bug in the PyTorch 1.9 release: beta does not participate in the calculation. The following reproduces it:
import torch

device = torch.device("cuda")
torch_tensor_1 = torch.randn(887040, 10, 124).to(device)
torch_tensor_2 = torch.randn(887040, 124, 1).to(device)
torch_tensor_3 = torch.randn(887040, 10, 1).to(device)
print("torch_tensor_3 device:", torch_tensor_3.device)
print("torch_tensor_2 device:", torch_tensor_2.device)
print("torch_tensor_1 device:", torch_tensor_1.device)
# With alpha=0 and beta=2, the expected output is 2 * torch_tensor_3.
torch_out = torch.baddbmm(torch_tensor_3, torch_tensor_1, torch_tensor_2, beta=2, alpha=0)
print(torch_tensor_3.cpu().view(-1)[:10])
print(torch_out.cpu().view(-1)[:10])
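If beta were honored, the second printout would be exactly twice the first; at this batch size on 1.9 the two lines reportedly print identical values instead. A direct check (reusing the tensors above), plus a hypothetical workaround of applying beta by hand when alpha is 0:

# Should print True under the documented semantics; reportedly False on 1.9 at this scale.
print(torch.allclose(torch_out, 2 * torch_tensor_3))

# Hypothetical workaround: with alpha=0 the bmm term contributes nothing,
# so beta * input reproduces the documented result without calling baddbmm.
torch_out_fixed = 2 * torch_tensor_3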