Problems with target arrays of int (int32) types in loss functions

Hi,

I’ll try to address your questions in the following points.

  • For training neural networks, single-precision float (float32) is more than enough, so there's no need for double. PyTorch creates tensors and module parameters in float32 by default; see the first sketch after this list.
  • For keeping track of indices, int32 might not be enough for large models, so int64 (long) is preferred. That's probably one of the reasons why we use long whenever we pass indices to functions, including the targets of NLLLoss (second sketch below).
  • Note that you can also convert a numpy array to a tensor using torch.Tensor(numpy_array), and you can specify the type of the output tensor you want, in your case torch.LongTensor(numpy_array). This constructor does not share memory with the numpy array, so it's slower and less memory-efficient than the from_numpy equivalent (third sketch below).
  • You can get the type of a tensor by calling type() with no arguments: tensor.type() returns the type as a string, and you can do things like
tensor = torch.rand(3).double()                  # tensor.type() == 'torch.DoubleTensor'
new_tensor = torch.rand(5).type(tensor.type())   # new_tensor is a DoubleTensor as well
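
First sketch: a quick way to confirm that PyTorch works in single precision out of the box, assuming nothing in your code has called torch.set_default_dtype:

import torch

print(torch.get_default_dtype())   # torch.float32
layer = torch.nn.Linear(4, 2)      # parameters are created in float32 by default
print(layer.weight.dtype)          # torch.float32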
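
Second sketch: a minimal NLLLoss call with long targets. The shapes and values are made up for illustration; the point is only the dtype of the target tensor:

import torch
import torch.nn as nn

log_probs = torch.log_softmax(torch.randn(4, 10), dim=1)  # 4 samples, 10 classes
targets = torch.tensor([1, 0, 4, 9], dtype=torch.int64)   # must be long; an int32 target raises a RuntimeError

loss = nn.NLLLoss()(log_probs, targets)
print(loss.item())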
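
Third sketch, comparing the two conversion routes (labels is a hypothetical int32 numpy array):

import numpy as np
import torch

labels = np.array([1, 0, 4, 9], dtype=np.int32)

copied = torch.LongTensor(labels)          # copies the data and casts to int64
shared = torch.from_numpy(labels)          # zero-copy view of the same buffer, still int32
as_long = torch.from_numpy(labels).long()  # the .long() cast allocates a new int64 tensor

labels[0] = 7
print(copied[0].item())   # 1 -- the copy is unaffected
print(shared[0].item())   # 7 -- shares memory with the numpy array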