Histogram function in PyTorch

Is there any function in PyTorch like numpy.histogram?

numpy.histogram(torch_tensor.numpy()) ?

torch.histc
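For example (the bin count and range here are just for illustration):

import torch

x = torch.randn(1000)
# 10 bins spanning [-4, 4]; values outside the range are ignored
h = torch.histc(x, bins=10, min=-4, max=4)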


Thanks for pointing that out.

I have a 3D tensor and I want to compute a histogram along the 3rd dimension. How can I do that?

A histogram puts scalar values from a vector into different bins. If you pass in a multidimensional tensor, it treats it like a vector. Could you clarify what you mean?
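For instance, these two calls give the same counts (shapes here are made up for illustration):

import torch

x = torch.randn(4, 5)
a = torch.histc(x, bins=10, min=-3, max=3)
b = torch.histc(x.view(-1), bins=10, min=-3, max=3)
print(torch.equal(a, b))  # True: the 2D input was flattened first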

I think he wants a histogram along the z-axis for each pair (x, y) of an (x, y, z) tensor. That can be useful, for example, in a loss function that compares histograms, which is a quick, greedy way to compare statistical properties between two tensors (I tried to do that once for art-style transfer).

Unfortunately, since torch.histc flattens the tensor, I see no way to avoid a loop over the (x, y) pairs.


In that case, then yes, this needs to be done manually, probably using a loop. Unless numpy.histogram does this, but I assume you would have mentioned it if that were the case.
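The loop version would look roughly like this (shapes and bin range are made up for illustration):

import torch

x = torch.randn(4, 7, 16)  # batch x seq x features
num_bins = 10
hist = torch.stack([
    torch.stack([torch.histc(row, bins=num_bins, min=-3, max=3) for row in seq])
    for seq in x
])  # -> 4 x 7 x num_bins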

Yes, @alexis-jacq understood my problem. I am implementing a model where I have a 3D tensor of shape batch_size x sequence_len x feature_size, and after applying the histogram I expect a tensor of shape batch_size x sequence_len x num_bins.

I am hoping for a loop-free solution; otherwise it will be slow on the GPU.
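For anyone finding this later: in more recent PyTorch versions, a loop-free version can be sketched with torch.nn.functional.one_hot (the value range and bin count below are assumptions):

import torch
import torch.nn.functional as F

B, S, D, num_bins = 4, 7, 16, 10
x = torch.rand(B, S, D)                               # values assumed in [0, 1)
idx = (x * num_bins).long().clamp_(max=num_bins - 1)  # bin index per element
hist = F.one_hot(idx, num_bins).float().sum(dim=2)    # batch x seq x num_bins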

Hi,

If you are in the mood for a weekend hack, here is a not entirely serious solution without a for loop; it can be unstable if the bin indices fall outside the grid, which is why the indices are clamped below.

I’m not sure how well this scales beyond a very small number of features, but here you go.
It uses the fact that sparse-tensor indices may contain the same coordinate multiple times; the corresponding entry is then the sum of all values at that coordinate.

import torch
from matplotlib import pyplot
%matplotlib inline

data = torch.randn(2, 1000, 2)  # batch x draws x (x, y)
data[1, :, 1] += data[1, :, 0]  # correlate the two coordinates in batch 1
d_scaled = (data * 2.5).long().view(-1, 2)  # discretize into integer bin indices
d_scaled -= d_scaled.min()                  # shift so the smallest index is 0
d_scaled.clamp_(max=19)                     # keep indices inside the 20x20 grid
# batch index of each point, repeated once per draw
d_idx0 = (torch.arange(0, data.size(0)).long().view(-1, 1)
          * torch.LongTensor(1, data.size(1)).fill_(1)).view(-1, 1)
d_idx = torch.cat([d_idx0, d_scaled], dim=1)  # (batch, x-bin, y-bin) per point
d_ones = torch.FloatTensor(d_idx0.size(0)).fill_(1.0)
# duplicate coordinates are summed when the sparse tensor is densified,
# which is exactly the counting we want
st = torch.sparse.FloatTensor(d_idx.t(), d_ones, torch.Size((2, 20, 20)))
hist = st.to_dense()
pyplot.subplot(1, 2, 1)
pyplot.contour(hist[0].numpy(), extent=(-4, 4, -4, 4))
pyplot.subplot(1, 2, 2)
pyplot.contour(hist[1].numpy(), extent=(-4, 4, -4, 4))

Best regards

Thomas


I am picking up on this topic from 3 years ago, and I see your clever solution for interpolating or binning a set of N-dimensional points into an N-dimensional tensor. That’s what I want to do, but I’m wondering: since you wrote that, is there now a better way than building a sparse tensor and calling to_dense()?

Personally, I think there is nothing wrong with it, but you could try using scatter_add from PyTorch Scatter instead.
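For a plain 1D histogram, a minimal sketch with the in-core Tensor.scatter_add_ (torch_scatter's scatter_add works analogously; the bin range here is an assumption) looks like this:

import torch

x = torch.randn(1000)
num_bins = 20
# map each value in [-4, 4] to a bin index
idx = ((x + 4) / 8 * num_bins).long().clamp_(0, num_bins - 1)
hist = torch.zeros(num_bins)
hist.scatter_add_(0, idx, torch.ones_like(x))  # duplicate indices accumulate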

Best regards

Thomas

Ah, interesting! Thanks, Tom. However, I am trying to wrap my head around how I would scatter points into a 2D image/tensor, since it looks like scatter only scatters along a single dimension.
And even if I figure that out, I actually have floating-point “indices” that I want to accumulate into a tensor. I suppose I could do a bilinear interpolation myself and then add the results, unless you know of another function that would do that.
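In case it helps, here is a sketch of that bilinear idea: each floating-point (x, y) position spreads its weight over the four surrounding cells, accumulated with index_put_(accumulate=True). The function name and grid size are made up, and the small loop runs over the four corners, not over the points.

import torch

def splat_bilinear(points, size=20):
    # points: N x 2 float coordinates in [0, size - 1)
    grid = torch.zeros(size, size)
    x, y = points[:, 0], points[:, 1]
    x0, y0 = x.floor().long(), y.floor().long()
    fx, fy = x - x0.float(), y - y0.float()  # fractional offsets
    for dx, dy, w in [(0, 0, (1 - fx) * (1 - fy)),
                      (1, 0, fx * (1 - fy)),
                      (0, 1, (1 - fx) * fy),
                      (1, 1, fx * fy)]:
        ix = (x0 + dx).clamp(0, size - 1)
        iy = (y0 + dy).clamp(0, size - 1)
        # accumulate=True sums the weights at repeated indices
        grid.index_put_((ix, iy), w, accumulate=True)
    return grid

hist = splat_bilinear(torch.rand(1000, 2) * 19)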