Hi, I am looking for some information on best practice in PyTorch for feeding inputs of different sizes into a model.
Problem description:
I work with satellite channels of different resolutions. Because of this, one input may be half the size of another along each spatial dimension (e.g. a 1 km resolution channel vs. a 0.5 km resolution channel covering the same area). When I access the data in a generator, I load a tensor of inputs from a NetCDF file as well as a tensor of targets (i.e. x and y). To account for the difference in resolution, I have previously used numpy to create a concatenated array of inputs, with an initial upsampling step (mode nearest) inside my U-Net model. However, I want to get away from numpy and use Torch tensors instead, to help reduce I/O. Correct me if I am wrong, but I believe torch.cat() needs the tensors to be the same size in every dimension except the one being concatenated. I found the function torch.nn.functional.interpolate(), which could be applied before running the model (i.e. in the data generator), but I am not sure whether it is proper to do this.
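For concreteness, here is a minimal sketch of what I mean by resizing in the data generator. The function name stack_channels, the tensor shapes, and the random data are made up for illustration; I am assuming channel-first tensors of shape (C, H, W) and that the coarse grid is exactly half the fine grid in each spatial dimension.

```python
import torch
import torch.nn.functional as F


def stack_channels(fine: torch.Tensor, coarse: torch.Tensor) -> torch.Tensor:
    """Upsample `coarse` to the spatial size of `fine` and concatenate on the channel axis.

    fine:   (C_fine, H, W)        e.g. the 0.5 km channels
    coarse: (C_coarse, H/2, W/2)  e.g. the 1 km channels
    (hypothetical helper, not part of PyTorch)
    """
    # F.interpolate expects a batch dimension, so add one and remove it afterwards.
    coarse_up = F.interpolate(
        coarse.unsqueeze(0),      # (1, C_coarse, H/2, W/2)
        size=fine.shape[-2:],     # target (H, W) taken from the fine channels
        mode="nearest",           # same mode as the upsampling step in my U-Net
    ).squeeze(0)                  # (C_coarse, H, W)

    # The spatial sizes now match, so torch.cat works along the channel dimension.
    return torch.cat([fine, coarse_up], dim=0)


# Dummy data standing in for what would be read from the NetCDF file.
fine = torch.rand(1, 8, 8)
coarse = torch.rand(2, 4, 4)
x = stack_channels(fine, coarse)
print(x.shape)
```

Something like this would be called per sample in the generator (or Dataset), so the model itself would only ever see a single tensor of a fixed size.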
Thus, in the spirit of best practice, what is the best way to deal with data of multiple sizes?
Disclaimer:
I am just switching to PyTorch from Keras for the customizability with regard to GPU use, so forgive me if this is rather obvious.