Custom image dataset for autoencoder

Zaide · April 8, 2018, 8:50am

Hi all,

I am trying to replicate the experiments done with autoencoders in the following article: https://arxiv.org/pdf/1606.08921.pdf

Basically, I want to use an autoencoder to “filter” noise and artifacts from images, more specifically, in my case, MRI scans of the brain.

I’ve tried some experiments with the MNIST dataset, but obviously that is not the end goal. I’d like to build my own custom dataset. I have already built an image library (in .png format). The inputs would be the noisy images with artifacts, while the outputs would be the clean images. As of now, I have my images in two folders structured like this:
Folder 1 - Clean images
    img1.png
    img2.png
    …
    imgX.png
Folder 2 - Transformed images
    img1_transform1.png
    img1_transform2.png
    …
    img1_transformY.png
    …
    imgX_transformY.png

So basically, for each of the X “clean” images, I have Y different transformations (various levels of noise, artifacts, etc.).

Does anyone have an idea of how to build a custom dataset for that kind of experiment (or a link to a detailed tutorial)? I’ve read other topics, but since I’m quite new to PyTorch, I don’t really understand everything, and all I’ve tried so far has failed miserably.

Thank you!

ptrblck · April 8, 2018, 10:05am

You could create a mapping between the clean images and the transformations, i.e.:

img1.png, img1_transform1.png
img1.png, img1_transform2.png
...
img2.png, img2_transform1.png
...

Saving this mapping to a text or .csv file, you can pass it to the Dataset as image paths:

from torch.utils.data import Dataset
import torchvision.transforms.functional as TF

class MyDataset(Dataset):
    def __init__(self, image_paths):
        # image_paths: the (clean, transformed) path pairs from the mapping
        self.image_paths = image_paths

    def __getitem__(self, index):
        # load_image is a placeholder for your own loading function
        image, image_transformed = load_image(self.image_paths[index])
        # transformations, e.g. Random Crop etc. 
        # Make sure to perform the same transformations on image and target
        # Here is a small example: https://discuss.pytorch.org/t/torchvision-transfors-how-to-perform-identical-transform-on-both-image-and-target/10606/7?u=ptrblck

        x, y = TF.to_tensor(image), TF.to_tensor(image_transformed)
        return x, y

    def __len__(self):
        return len(self.image_paths)

Wrap this Dataset into a DataLoader and you are good to go!
Let me know, if this works for you.
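
A minimal sketch of how such a mapping file could be generated from the two folders, assuming the naming scheme from the question (`imgN.png` and `imgN_transformM.png`). The helper names `build_pairs`, `save_pairs` and `load_pairs` and the csv column names are made up for illustration:

```python
import csv
import re
from pathlib import Path


def build_pairs(clean_dir, transformed_dir):
    """Return (clean_path, transformed_path) pairs matched by file name."""
    clean_by_stem = {p.stem: p for p in Path(clean_dir).glob('*.png')}
    pairs = []
    for t_path in sorted(Path(transformed_dir).glob('*.png')):
        # 'img1_transform2' maps back to the clean stem 'img1'
        match = re.fullmatch(r'(.+)_transform\d+', t_path.stem)
        if match and match.group(1) in clean_by_stem:
            pairs.append((str(clean_by_stem[match.group(1)]), str(t_path)))
    return pairs


def save_pairs(pairs, csv_path):
    """Write the pairs to a csv file with a header row."""
    with open(csv_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['clean', 'transformed'])  # assumed column names
        writer.writerows(pairs)


def load_pairs(csv_path):
    """Read the pairs back as a list of (clean, transformed) tuples."""
    with open(csv_path, newline='') as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        return [tuple(row) for row in reader]
```

The list returned by `load_pairs` could then be passed as `image_paths` to the Dataset above. Transformed files without a matching clean image are silently skipped in this sketch, which may or may not be what you want.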

Zaide · April 9, 2018, 4:16am

Thank you! I will try this tomorrow.

Stanley_C · July 28, 2020, 2:52am

Hello, could you please demonstrate how the csv or txt file of matching pairs would be used for loading point clouds, and which functions would be used? (I’m also not quite sure which parameters would need to change.)

ptrblck · July 28, 2020, 9:07am

It’s a bit hard to give an example without seeing the data structure.
Generally you should write a method (which would then be used as the __getitem__ method) that accepts an index and loads a single sample (data and target).
Usually the file will be (pre-)loaded in __init__, while each sample will be loaded and transformed in __getitem__.
To load csv or txt files I would recommend using e.g. pandas (or any other library you are more familiar with).

Stanley_C · July 29, 2020, 9:09pm

I’ll also start a new thread, just in case I am clogging up this thread.

import pandas as pd
import numpy as np

df = pd.read_csv('train.csv', sep=',', usecols=['input', 'output'])
shape = df.shape

for current in range(shape[0]):
    input_pc = np.loadtxt(df.iloc[current, 0], delimiter=' ')
    print(input_pc.shape)
    output_pc = np.loadtxt(df.iloc[current, 1], delimiter=' ')
    print(output_pc.shape)

So I wrote some code to load the csv file of mappings and then load each corresponding input and output point cloud. The input point cloud has the shape (900, 3) and the output point cloud has the shape (8100, 3). Each “point” has its x coordinate in the first column, its y coordinate in the second column, and its z coordinate in the third column. Now that I have the input and the corresponding output point clouds loaded as numpy arrays, could you please help me modify this class:

class MyDataset(Dataset):
    def __init__(self, image_paths):
        self.image_paths = image_paths

    def __getitem__(self, index):
        image, image_transformed = load_image(self.image_paths[index])
        # transformations, e.g. Random Crop etc. 
        # Make sure to perform the same transformations on image and target
        # Here is a small example: https://discuss.pytorch.org/t/torchvision-transfors-how-to-perform-identical-transform-on-both-image-and-target/10606/7?u=ptrblck

        x, y = TF.to_tensor(image), TF.to_tensor(image_transformed)
        return x, y

    def __len__(self):
        return len(self.image_paths)

to work with the dataloader function?

train_loader = DataLoader(MyDataset("./train.csv"), batch_size=16, shuffle=False, num_workers=0, worker_init_fn=None)

I’m currently not sure what I should pass in as the “Dataset” to the MyDataset class. I’m sure I’ll need to pass in the mappings in the form of the csv at some point, but I’m not quite sure how to load the mappings into the DataLoader or the custom class.

ptrblck · July 30, 2020, 3:55am

Your custom Dataset implementation could look like this:

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, train_path):
        # read the input/output path mapping once
        self.df = pd.read_csv(train_path, sep=',', usecols=['input', 'output'])

    def __getitem__(self, index):
        # load a single (input, output) point cloud pair
        input_pc = np.loadtxt(self.df.iloc[index, 0], delimiter=' ')
        output_pc = np.loadtxt(self.df.iloc[index, 1], delimiter=' ')
        x, y = torch.from_numpy(input_pc), torch.from_numpy(output_pc)
        return x, y

    def __len__(self):
        return len(self.df)

This dataset can then be created and passed to the DataLoader via:

dataset = MyDataset('./train.csv')
loader = DataLoader(dataset, batch_size=...)
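
As a rough check of what the loader returns, here is a minimal sketch that writes two tiny synthetic point cloud pairs to a temporary folder and pulls one batch. The file names, the small point counts, and the names `PointCloudDataset` and `write_dummy_data` are made up for illustration. The `.float()` cast is an addition: `np.loadtxt` yields float64 arrays, while model parameters are float32 by default.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader


class PointCloudDataset(Dataset):
    """Hypothetical variant of MyDataset that casts samples to float32."""

    def __init__(self, csv_path):
        self.df = pd.read_csv(csv_path, usecols=['input', 'output'])

    def __getitem__(self, index):
        row = self.df.iloc[index]
        input_pc = np.loadtxt(row['input'], delimiter=' ')
        output_pc = np.loadtxt(row['output'], delimiter=' ')
        return torch.from_numpy(input_pc).float(), torch.from_numpy(output_pc).float()

    def __len__(self):
        return len(self.df)


def write_dummy_data(folder, n_pairs=2, n_in=9, n_out=81):
    """Write synthetic pairs plus the mapping csv; return the csv path."""
    rows = []
    for i in range(n_pairs):
        in_path = os.path.join(folder, f'in_{i}.txt')
        out_path = os.path.join(folder, f'out_{i}.txt')
        np.savetxt(in_path, np.random.rand(n_in, 3), delimiter=' ')
        np.savetxt(out_path, np.random.rand(n_out, 3), delimiter=' ')
        rows.append({'input': in_path, 'output': out_path})
    csv_path = os.path.join(folder, 'train.csv')
    pd.DataFrame(rows).to_csv(csv_path, index=False)
    return csv_path


with tempfile.TemporaryDirectory() as folder:
    dataset = PointCloudDataset(write_dummy_data(folder))
    loader = DataLoader(dataset, batch_size=2, shuffle=False)
    x, y = next(iter(loader))
    print(x.shape, x.dtype)  # batch of input clouds: (batch, points, 3)
    print(y.shape, y.dtype)  # batch of target clouds: (batch, points, 3)
```

The default collate function stacks the samples along a new first dimension, so this only works as written when every input cloud has the same number of points (and likewise every output cloud).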

Stanley_C · August 9, 2020, 9:26pm

I’m first trying to replicate the image autoencoder, where the input and output image are different.
Here is my code.

"""
Dependencies:
torch: 0.4
matplotlib
numpy
"""
import torch
import torch.nn as nn
import torch.utils.data as Data
import torchvision
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
import numpy as np
import pandas as pd
from barbar import Bar



# torch.manual_seed(1)    # reproducible

# Hyper Parameters
EPOCH = 10
BATCH_SIZE = 256
LR = 0.005         # learning rate
DOWNLOAD_MNIST = False
N_TEST_IMG = 5

# Mnist digits dataset
class MyDataset():
    def __init__(self, image_paths):
        self.image_paths = image_paths

    def __getitem__(self, index):
        image, image_transformed = load_image(self.image_paths[index])
        # transformations, e.g. Random Crop etc. 
        # Make sure to perform the same transformations on image and target
        # Here is a small example: https://discuss.pytorch.org/t/torchvision-transfors-how-to-perform-identical-transform-on-both-image-and-target/10606/7?u=ptrblck

        x, y = TF.to_tensor(image), TF.to_tensor(image_transformed)
        return x, y

    def __len__(self):
        return len(self.image_paths)

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')


# Data Loader for easy mini-batch return in training, the image batch shape will be (50, 1, 28, 28)
trainDataSet=MyDataset("./train.csv")
testDataSet=MyDataset("./test.csv")
train_loader = Data.DataLoader(trainDataSet, batch_size=BATCH_SIZE, shuffle=True)
test_loader = Data.DataLoader(testDataSet, batch_size=BATCH_SIZE, shuffle=True)


class Autoencoder(nn.Module):
	def __init__(self):
		super(Autoencoder,self).__init__()
		
		self.encoder = nn.Sequential(
			nn.Conv2d(3, 6, kernel_size=5),
			nn.ReLU(True),
			nn.Conv2d(6,16,kernel_size=5),
			nn.ReLU(True))
		self.decoder = nn.Sequential(             
			nn.ConvTranspose2d(16,6,kernel_size=5),
			nn.ReLU(True),
			nn.ConvTranspose2d(6,3,kernel_size=5),
			nn.ReLU(True))
	def forward(self,x):
		x = self.encoder(x)
		x = self.decoder(x)
		return x


autoencoder = Autoencoder()
autoencoder.to(device)
print(autoencoder)


optimizer = torch.optim.Adam(autoencoder.parameters(), lr=LR)
loss_func = nn.MSELoss()

df = pd.DataFrame({'Epoch':[],'Train': [], 
					'Test':[]}) 
count = 1
for epoch in range(EPOCH):

	for step, (x, y) in enumerate(train_loader):
		autoencoder.train()
		running_loss = 0.0
		b_x = x.view(-1, 750*750).to(device)   # batch x, shape (batch, 28*28)
		b_y = y.view(-1, 900*800).to(device)   # batch y, shape (batch, 28*28)

		encoded, decoded = autoencoder(b_x)

		loss = loss_func(decoded, b_y)      # mean square error
		optimizer.zero_grad()               # clear gradients for this training step
		loss.backward()                     # backpropagation, compute gradients
		optimizer.step()                    # apply gradients
		running_loss += loss.item()
		

		
		if step % 100 == 0:
		
			for step, (x, b_label) in enumerate(test_loader):

				autoencoder.eval()
				test_loss = 0.0
				b_x = x.view(-1, 750*750).to(device)   # batch x, shape (batch, 28*28)
				b_y = x.view(-1, 900*800).to(device)   # batch y, shape (batch, 28*28)

				encoded, decoded = autoencoder(b_x)

				loss = loss_func(decoded, b_y)      # mean square error
				optimizer.zero_grad()               # clear gradients for this training step
				loss.backward()                     # backpropagation, compute gradients
				optimizer.step()                    # apply gradients
				test_loss += loss.item()
			print('Epoch: ', epoch + 1, '| train loss: %.4f' % running_loss, '| test loss: %.4f' % test_loss)

			df.loc[count] = [epoch + 1,running_loss, test_loss]
			count = count + 1		


torch.save(autoencoder.state_dict(), "./model.pt")
df.to_csv("./train-test-loss.csv", index = False)

However, I’m getting an error from the Dataset:

Autoencoder(
  (encoder): Sequential(
    (0): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
    (3): ReLU(inplace=True)
  )
  (decoder): Sequential(
    (0): ConvTranspose2d(16, 6, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace=True)
    (2): ConvTranspose2d(6, 3, kernel_size=(5, 5), stride=(1, 1))
    (3): ReLU(inplace=True)
  )
)
Traceback (most recent call last):
  File "train_depth_map.py", line 92, in <module>
    for step, (x, y) in enumerate(train_loader):
  File "C:\Users\Username\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 363, in __next__
    data = self._next_data()
  File "C:\Users\Username\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 403, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "C:\Users\Username\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Username\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "train_depth_map.py", line 38, in __getitem__
    image, image_transformed = load_image(self.image_paths[index])
NameError: name 'load_image' is not defined

I’m currently unsure why the Dataset is causing the issue. It seems to load into the DataLoader, but an error occurs in the main training loop. I tried adapting this example, which was originally written for CIFAR, but it appears that the Dataset is not loading the images properly. I also had to remove Dataset from class MyDataset(Dataset):, since I was getting errors that it was not defined.

ptrblck · August 10, 2020, 4:26am

The error points to the load_image function, which is undefined.
Did you forget to define this method in the current script?
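
For reference, a minimal sketch of what such a function could look like, assuming each entry of `image_paths` is a (clean_path, transformed_path) pair and that Pillow is installed. The conversion to RGB is an assumption made to match the 3 input channels of the first `Conv2d` layer above:

```python
from PIL import Image


def load_image(path_pair):
    """Open one (clean_path, transformed_path) pair as PIL images."""
    clean_path, transformed_path = path_pair
    # convert('RGB') forces 3 channels, even for grayscale .png files
    image = Image.open(clean_path).convert('RGB')
    image_transformed = Image.open(transformed_path).convert('RGB')
    return image, image_transformed
```

Two related points: `MyDataset("./train.csv")` passes a plain string, so `self.image_paths[index]` would return a single character rather than a path pair, and the csv would have to be read into a list of pairs first. The `Dataset` base class is available via `from torch.utils.data import Dataset`, or as `Data.Dataset` with the `import torch.utils.data as Data` line already in the script.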