dp = torch.zeros(batch, height + 1, width + 1, self.hiden_size).cuda(device)
for b in range(batch):
    for i in range(1, height + 1):
        for j in range(1, width + 1):
            dp[b, i, j] = self.compute(interaction[b, i - 1, j - 1],
                                       dp[b, i, j - 1],
                                       dp[b, i - 1, j],
                                       dp[b, i - 1, j - 1])
return dp[:, height, width]
I can't quite understand the definition of an in-place operation: is it defined per variable (tensor), or per element of a tensor?

In the code above, there is a DP procedure in my computation graph. During the DP loop I never modify an element of the dp tensor after its value has been computed; I only write the current element, computed from three previously filled elements of dp. Yet I get a RuntimeError: "one of the variables needed for gradient computation has been modified by an inplace operation". It works when I use .clone(), as follows:
dp = torch.zeros(batch, height + 1, width + 1, self.hiden_size).cuda(device)
for b in range(batch):
    for i in range(1, height + 1):
        for j in range(1, width + 1):
            dp[b, i, j] = self.compute(interaction[b, i - 1, j - 1],
                                       dp[b, i, j - 1].clone(),
                                       dp[b, i - 1, j].clone(),
                                       dp[b, i - 1, j - 1].clone())
return dp[:, height, width]