I am trying to use the code for IIC (Invariant Information Clustering): https://github.com/xu-ji/IIC

It basically calls the network twice: once on the original input and once on a modified (noisy) version of it.

```python
x_out = net(sample)           # softmax output for the original sample
x_tf_out = net(noisy_sample)  # softmax output for the noisy sample
loss = IIC_loss(x_out, x_tf_out)
```
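For context, here is a self-contained toy version of that call pattern. The `nn.Sequential` head and the input sizes are placeholders of my own, not the repository's architecture; the only point is that `net` is assumed to end in a softmax over `k` cluster assignments and is applied to both the clean and the perturbed batch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real network: a linear layer followed by
# a softmax over k cluster-assignment probabilities.
k = 5
net = nn.Sequential(nn.Linear(10, k), nn.Softmax(dim=1))

sample = torch.randn(4, 10)
noisy_sample = sample + 0.1 * torch.randn_like(sample)  # perturbed copy

x_out = net(sample)           # (bn, k) softmax output for the original batch
x_tf_out = net(noisy_sample)  # (bn, k) softmax output for the noisy batch
```

Each row of `x_out` and `x_tf_out` is a probability distribution over the `k` clusters, which is what `IIC_loss` below expects.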

The loss encourages the original sample and the noisy sample to be assigned to the same class. It is computed by the IIC loss:

```python
import sys
import torch

def IIC_loss(x_out, x_tf_out, EPS=sys.float_info.epsilon):
    # x_out and x_tf_out have had softmax applied
    _, k = x_out.size()
    p_i_j = compute_joint(x_out, x_tf_out)
    assert (p_i_j.size() == (k, k))

    p_i = p_i_j.sum(dim=1).view(k, 1).expand(k, k)
    p_j = p_i_j.sum(dim=0).view(1, k).expand(k, k)  # but should be same, symmetric

    # avoid NaN losses. Effect will get cancelled out by p_i_j tiny anyway
    p_i_j[(p_i_j < EPS).data] = EPS
    p_j[(p_j < EPS).data] = EPS
    p_i[(p_i < EPS).data] = EPS

    loss = - p_i_j * (torch.log(p_i_j)
                      - torch.log(p_j)
                      - torch.log(p_i))
    loss = loss.sum()
    return loss
```

```python
def compute_joint(x_out, x_tf_out):
    # produces variable that requires grad (since args require grad)
    bn, k = x_out.size()
    assert (x_tf_out.size(0) == bn and x_tf_out.size(1) == k)

    p_i_j = x_out.unsqueeze(2) * x_tf_out.unsqueeze(1)  # bn, k, k
    p_i_j = p_i_j.sum(dim=0)  # k, k
    p_i_j = (p_i_j + p_i_j.t()) / 2.  # symmetrise
    p_i_j = p_i_j / p_i_j.sum()  # normalise
    return p_i_j
```
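As a sanity check (inlining the body of `compute_joint` on random softmax outputs, so the snippet stands alone), the result should be a symmetric `k × k` matrix that sums to 1, i.e. a valid joint distribution over cluster-assignment pairs:

```python
import torch

torch.manual_seed(0)
bn, k = 8, 5
x_out = torch.softmax(torch.randn(bn, k), dim=1)
x_tf_out = torch.softmax(torch.randn(bn, k), dim=1)

# Same steps as compute_joint above
p_i_j = (x_out.unsqueeze(2) * x_tf_out.unsqueeze(1)).sum(dim=0)  # k, k
p_i_j = (p_i_j + p_i_j.t()) / 2.  # symmetrise
p_i_j = p_i_j / p_i_j.sum()       # normalise
```

So `compute_joint` itself is well-behaved; the failure reported below comes from what `IIC_loss` does with the expanded marginals, not from the joint.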

When I now call backward() on the loss:

```python
loss.backward()
optimizer.step()
```

it leads to the following error:

```
in backward
    # allow_unreachable=True) # allow_unreachable flag
RuntimeError: unsupported operation: more than one element of the written-to tensor refers to a single memory location. Please clone() the tensor before performing the operation.
```

What changes to the code are needed for the new version of PyTorch? The code ran perfectly on PyTorch 1.0.