I have a batch of 100 data vectors, each of length 784, so my batch has size:
>>> images_vec.size()
torch.Size([100, 784])
How do I copy each vector seq_length=28 times, so that I have a batch of data vectors that are simply repeated seq_length times? The batch should then have size:
torch.Size([100, 784, 28])
Here's the batch-loading code:
import torch
import torchvision.datasets as dsets
import torchvision.transforms as transforms

# Hyper-parameters
sequence_length = 28
input_size = 28 * 28
batch_size = 100

# MNIST dataset
train_dataset = dsets.MNIST(root='../data_tmp/',
                            train=True,
                            transform=transforms.ToTensor(),
                            download=True)

# Data loader (input pipeline)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

for i, (images, labels) in enumerate(train_loader):
    # images_repeated = images.view(-1, sequence_length, input_size)
    images_vec = images.view(-1, input_size)
    images_repeated = torch.empty(batch_size, input_size, sequence_length)  # pre-allocate
    # for j in range(sequence_length):
    #     images_repeated[:, :, j] = images_vec
    y_true = labels
    if i == 1:
        break

images.size()
images_vec.size()
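For the copy itself, a loop-free sketch (assuming `images_vec` has shape `[batch_size, input_size]`): add a trailing singleton dimension with `unsqueeze`, then tile it along that dimension with `repeat`, or use `expand` for a broadcasted view that allocates no extra memory:

```python
import torch

batch_size, input_size, sequence_length = 100, 784, 28
images_vec = torch.randn(batch_size, input_size)  # stand-in for the real batch

# [100, 784] -> [100, 784, 1] -> [100, 784, 28], copying the data.
images_repeated = images_vec.unsqueeze(2).repeat(1, 1, sequence_length)
print(images_repeated.size())  # torch.Size([100, 784, 28])

# Same shape as a view (no copy); -1 keeps the existing dimension sizes.
images_expanded = images_vec.unsqueeze(2).expand(-1, -1, sequence_length)
```

Note that an `expand`ed view shares storage with the original tensor, so writing into it (or into the original) affects both; use `repeat` if the copies will be modified independently.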
If you want to stay in PyTorch (e.g. for GPU tensors), you can verify the copies with (X[1] != X_tmp[1,:,0]).sum() == 0. The caveat is that NaN != NaN, so the check fails on NaN entries (X != X was the easiest way to test for NaN in PyTorch a while back).
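A NaN-safe version of that equality check, sketched with `torch.isnan` (available in current PyTorch releases), treats two tensors as equal where the values match or where both are NaN:

```python
import torch

X = torch.tensor([1.0, float('nan'), 3.0])
Y = X.clone()

# Naive check: fails here, because NaN != NaN evaluates to True elementwise.
naive_equal = (X != Y).sum() == 0

# NaN-aware check: equal where values match OR both entries are NaN.
nan_safe_equal = ((X == Y) | (torch.isnan(X) & torch.isnan(Y))).all()
```

Both expressions stay on whatever device the tensors live on, so the comparison also works for GPU arrays.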