I want to train an SSD-MobileNet model using my own dataset.
My dataset is labelled; below is the structure of my data:
Dataset
  JPEGImages
    0001.jpeg
    0002.jpeg …
  Annotations
    0001.XML
    0002.XML
Almost all tutorials I can find either use built-in datasets or datasets that come with a CSV file.
Any ideas on how I can load the above structure into PyTorch? I'll be using torchvision.
Thanks
Flock1
(Flock Anizak)
June 11, 2021, 5:28pm
2
You can get a list of files using os.listdir. Then, you can write a custom Dataset like this:
class data_gen(torch.utils.data.Dataset):
    def __init__(self, files):
        self.files = files

    def __getitem__(self, i):
        # read the image at self.files[i], convert it to a torch tensor, and return it
        ...

    def __len__(self):
        return len(self.files)
Then create an instance with train_dl = data_gen(image_files) and finally wrap it in a DataLoader:

train_loader = torch.utils.data.DataLoader(
    train_dl, batch_size=..., shuffle=True, num_workers=..., pin_memory=True)  # fill in batch_size / num_workers
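Before building the Dataset, you need the list of (image, annotation) file pairs. A minimal sketch, assuming the JPEGImages/Annotations layout from the question and that each image shares its stem with one annotation file (the helper name pair_files is just for illustration):

```python
import os

def pair_files(image_dir, annotation_dir):
    """Pair each image with its annotation file by shared stem,
    e.g. 0001.jpeg <-> 0001.xml (or 0001.XML)."""
    pairs = []
    for name in sorted(os.listdir(image_dir)):
        stem, _ = os.path.splitext(name)
        # the question shows uppercase .XML extensions, so try both cases
        for ext in (".xml", ".XML"):
            xml_path = os.path.join(annotation_dir, stem + ext)
            if os.path.exists(xml_path):
                pairs.append((os.path.join(image_dir, name), xml_path))
                break
    return pairs
```

The resulting list of pairs is what you would hand to the custom Dataset's constructor instead of a bare file list.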
Thanks for the prompt reply, I'll give it a shot now.
Below is the example code I'm trying to adapt, but the issue is I don't have a CSV file; my annotations are simply .xml files in the Annotations folder.
import os

import pandas as pd
import torch
from skimage import io

class data_Gen(torch.utils.data.Dataset):
    def __init__(self, csv_file, root_dir, transform=None):
        self.annotations = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.annotations)

    def __getitem__(self, index):
        img_path = os.path.join(self.root_dir, self.annotations.iloc[index, 0])
        image = io.imread(img_path)
        y_label = torch.tensor(int(self.annotations.iloc[index, 1]))
        if self.transform:
            image = self.transform(image)
        return image, y_label
Flock1
(Flock Anizak)
June 12, 2021, 3:26am
5
You can read XML files in Python; the standard library's xml.etree.ElementTree module does it. Parse the annotations from there and return them as y_label.
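For example, if the annotation files follow the Pascal VOC layout (an assumption — the actual schema isn't shown in the thread, but JPEGImages/Annotations folders with per-image XML files are typical of VOC-style exports), a sketch of an ElementTree parser might look like this (parse_voc_xml is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET

def parse_voc_xml(xml_source):
    """Parse a Pascal VOC-style annotation into bounding boxes and class names."""
    root = ET.parse(xml_source).getroot()
    boxes, labels = [], []
    for obj in root.iter("object"):
        labels.append(obj.find("name").text)
        bb = obj.find("bndbox")
        # VOC stores corners as xmin/ymin/xmax/ymax
        boxes.append([int(float(bb.find(tag).text))
                      for tag in ("xmin", "ymin", "xmax", "ymax")])
    return boxes, labels
```

Inside __getitem__ you would then convert boxes and labels to tensors and return them alongside the image, instead of reading a row from a CSV.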
Thanks, and really sorry for the late reply; I just noticed your answer.