Inconsistent in-place addition of transposed tensor

Adding the transpose of a tensor to the tensor itself in place (+=) produces a result that differs from the out-of-place addition.
Is this expected?

>>> tt = torch.arange(16).view(4,4)
>>> tt
tensor([[  0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.],
        [ 12.,  13.,  14.,  15.]])
>>> tt.t()
tensor([[  0.,   4.,   8.,  12.],
        [  1.,   5.,   9.,  13.],
        [  2.,   6.,  10.,  14.],
        [  3.,   7.,  11.,  15.]])
>>> tt + tt.t()
tensor([[  0.,   5.,  10.,  15.],
        [  5.,  10.,  15.,  20.],
        [ 10.,  15.,  20.,  25.],
        [ 15.,  20.,  25.,  30.]])
>>> tt += tt.t()
>>> tt
tensor([[  0.,   5.,  10.,  15.],
        [  9.,  10.,  15.,  20.],
        [ 18.,  24.,  20.,  25.],
        [ 27.,  33.,  39.,  30.]])

PyTorch 0.4 on Ubuntu 16.04.

In-place operations are not guaranteed to be correct when the source and destination share storage, and that is what happens here: tt.t() is a view of tt, not a copy, so both tensors point at the same underlying data, just with swapped strides. As the in-place addition walks over the elements, it reads values that earlier steps have already overwritten. You can see this in the output above: tt[1][0] became 9 = 4 + 5 because tt[0][1] had already been updated from 1 to 5 before it was read.
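
A quick way to confirm the aliasing: data_ptr() reports the address of the first element, and stride() shows that the transpose merely swaps the strides.

>>> tt = torch.arange(16).view(4, 4)
>>> tt.data_ptr() == tt.t().data_ptr()
True
>>> tt.stride(), tt.t().stride()
((4, 1), (1, 4))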

If you want the expected result, materialize the transpose into its own storage first: tt += tt.t().clone() should work.
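
For example, since clone() copies the transposed values out before the addition starts:

>>> tt = torch.arange(16).view(4, 4)
>>> expected = tt + tt.t()
>>> tt += tt.t().clone()
>>> torch.equal(tt, expected)
True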

If that is the case, a warning when an in-place operation reads from storage it is overwriting would be appreciated, rather than silently returning wrong results.

Still, is a correct in-place version really not achievable?
For very large matrices like this one, saving the memory of the extra copy with a true in-place operation would be handy.
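
One workaround is to update mirrored pairs together, so every element is read before either copy of it is overwritten. Below is a minimal sketch, assuming a newer PyTorch that provides torch.triu_indices (not available in 0.4); add_transpose_ is a made-up name here, and the temporary s still holds one value per off-diagonal pair, so this halves the extra memory rather than eliminating it:

import torch

def add_transpose_(a):
    # Sketch of an in-place a += a.t() for a square 2-D tensor.
    # Mirrored entries (i, j) and (j, i) are combined in a single step,
    # so neither one is read after it has been overwritten.
    n = a.size(0)
    a.diagonal().mul_(2)                       # diagonal entries simply double
    i, j = torch.triu_indices(n, n, offset=1)  # strictly upper-triangular pairs
    s = a[i, j] + a[j, i]                      # temporary: one value per pair
    a[i, j] = s
    a[j, i] = s
    return a

>>> tt = torch.arange(16.).view(4, 4)
>>> expected = tt + tt.t()
>>> torch.equal(add_transpose_(tt), expected)
True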