Help with reading, debugging, and resizing data before feeding it into PyTorch

The output shows the shape of the contents of the first folder only, i.e. (19, 384, 384). I want to get the data from all folders into one big array. However, some files have a different shape, (21, 384, 384) or (23, 384, 384). Can you tell me where and how I should use a resize operation for this? I want the final array to have shape (5, 19, 384, 384). I have read the docs but still find it hard to do. If you can make the changes, I will be grateful. The directory structure is as follows:
C:/Users/Desktop/train/

PX1_0000 - T2W.nii.gz + seg.nii.gz

PX1_0001 - T2W.nii.gz + seg.nii.gz
.
.
… PX1_0200 - T2W.nii.gz + seg.nii

import os
import numpy as np
import SimpleITK as sitk
from tqdm import tqdm

path1='C:/Users/Desktop/train/'

def load_data(path):
  my_dir = sorted(os.listdir(path))
  for p in tqdm(my_dir):
    data1=[]
    gt=[]
    data_list = os.path.join(path+p)
    img = sitk.ReadImage(path + p + '/'+'seg.nii.gz')    
    seg =  sitk.GetArrayFromImage(img)
    img = sitk.ReadImage(path + p + '/'+ 'T2W.nii.gz')
    T2W = sitk.GetArrayFromImage(img)
    data1.append([T2W])
    gt.append(seg)
  data1 = np.asarray(data1,dtype=np.uint8)
  gt = np.asarray(gt,dtype=np.uint8)
  return data1,gt

data1,gt1 = load_data(path1)

I guess you are doing segmentation, so crop-and-pad is more appropriate than resizing. Since the maximum size along the 0th dimension is 23, you can pad the smaller volumes using np.pad:

seg = np.pad(seg, [(23-seg.shape[0], 0), (0, 0), (0, 0)]) # pads in the 0th dimension
T2W = np.pad(T2W, [(23-T2W.shape[0], 0), (0, 0), (0, 0)]) # use T2W's own depth, not seg's

before appending into data1 and gt.
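
If it helps, here is a minimal sketch of a crop-or-pad helper for the slice axis. The name `crop_or_pad_depth` and the choice to crop or pad symmetrically are my own assumptions (the lines above pad only at the front), and small 4x4 stand-in volumes replace the real 384x384 ones. Apply the same call to both T2W and seg so the image and its mask stay aligned.

```python
import numpy as np

def crop_or_pad_depth(volume, target_depth):
    """Return a copy of `volume` whose axis 0 has exactly `target_depth` slices.

    Hypothetical helper, not part of NumPy or SimpleITK.
    """
    depth = volume.shape[0]
    if depth >= target_depth:
        # Center crop: drop the surplus slices evenly from both ends.
        start = (depth - target_depth) // 2
        return volume[start:start + target_depth].copy()
    # Zero-pad: split the missing slices between the front and the back.
    missing = target_depth - depth
    before = missing // 2
    after = missing - before
    return np.pad(volume, [(before, after), (0, 0), (0, 0)])

# Stand-in volumes with the three depths mentioned in the question.
for d in (19, 21, 23):
    vol = np.ones((d, 4, 4), dtype=np.uint8)
    print(d, "->", crop_or_pad_depth(vol, 23).shape, crop_or_pad_depth(vol, 19).shape)
```

With a target of 19 this gives the (5, 19, 384, 384) layout asked for, at the cost of discarding slices from the deeper volumes; with a target of 23 nothing is discarded.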

But my files are not being read from the directories. The padding I can check and add myself, but can you please look into the os.path.join part? Only files from the first folder are read!
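
On the os.path.join point, a small sketch of the difference between joining and concatenating may help. It uses the root path and one folder name from the listing above; no files are opened, so it runs anywhere.

```python
import os

root = 'C:/Users/Desktop/train/'
folder = 'PX1_0000'

# With a single argument there is nothing to join, so this is plain string
# concatenation and the value comes back unchanged.
print(os.path.join(root + folder) == root + folder)   # True

# Passing the parts separately lets os.path.join add separators where needed.
seg_path = os.path.join(root, folder, 'seg.nii.gz')
t2w_path = os.path.join(root, folder, 'T2W.nii.gz')
print(seg_path)
print(t2w_path)
```

So the `data_list = os.path.join(path+p)` line is harmless but does nothing useful, and it is not what limits the result to one folder.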

@user_123454321

Shouldn't data1 and gt be initialized outside the for loop? As written, they are re-initialized to empty lists each time you move to a new folder…
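
To illustrate, here is a tiny sketch with plain strings standing in for the image arrays (the two function names are made up for this example). Note that when the list is reset inside the loop, the entry that survives is the one from the last folder processed.

```python
def collect_reset_inside(folders):
    for name in folders:
        items = []          # re-created on every iteration
        items.append(name)
    return items            # only the last folder's entry is left

def collect_init_outside(folders):
    items = []              # created once, before the loop
    for name in folders:
        items.append(name)
    return items

folders = ["PX1_0000", "PX1_0001", "PX1_0200"]
print(collect_reset_inside(folders))   # ['PX1_0200']
print(collect_init_outside(folders))   # ['PX1_0000', 'PX1_0001', 'PX1_0200']
```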

Padding is okay; it is to be done in the z direction, according to the 3D image format!
So what changes should I make? I mean, do you think it is even going into the other directories?

import os
import numpy as np
import SimpleITK as sitk
from tqdm import tqdm

path1 = 'C:/Users/Desktop/train/'

def load_data(path):
  my_dir = sorted(os.listdir(path))
  # Initialize the lists once, outside the loop, so every folder is kept
  data1 = []
  gt = []
  for p in tqdm(my_dir):
    case_dir = os.path.join(path, p)
    img = sitk.ReadImage(os.path.join(case_dir, 'seg.nii.gz'))
    seg = sitk.GetArrayFromImage(img)
    img = sitk.ReadImage(os.path.join(case_dir, 'T2W.nii.gz'))
    T2W = sitk.GetArrayFromImage(img)
    # Zero-pad along axis 0 (z) up to 23 slices; the new slices go at the front
    seg = np.pad(seg, [(23-seg.shape[0], 0), (0, 0), (0, 0)])
    T2W = np.pad(T2W, [(23-T2W.shape[0], 0), (0, 0), (0, 0)])
    data1.append([T2W])
    gt.append(seg)

  data1 = np.asarray(data1, dtype=np.uint8)
  gt = np.asarray(gt, dtype=np.uint8)
  return data1, gt

data1, gt1 = load_data(path1)

Can you try this, please?


It’s showing 5 folders now, but the T2W shape is (5, 1, 23, 384, 384). The z direction is okay, but where is this 1 coming from? If I guess correctly, it’s the number of channels, right? They are grayscale 3D images!


I believe it is because of this line, which wraps each volume in a one-element list and so adds an axis of length 1:

data1.append([T2W])

Can you change this to

data1.append(T2W)
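
A quick sketch of the two variants, using small zero-filled stand-ins (8x8 in-plane instead of 384x384) so it runs without the data:

```python
import numpy as np

# Five stand-in volumes, already padded to 23 slices.
volumes = [np.zeros((23, 8, 8), dtype=np.uint8) for _ in range(5)]

wrapped = np.asarray([[v] for v in volumes])   # like data1.append([T2W])
plain = np.asarray([v for v in volumes])       # like data1.append(T2W)

print(wrapped.shape)   # (5, 1, 23, 8, 8)
print(plain.shape)     # (5, 23, 8, 8)

# The extra axis can also be removed afterwards, or added back on purpose.
print(np.squeeze(wrapped, axis=1).shape)   # (5, 23, 8, 8)
print(plain[:, np.newaxis].shape)          # (5, 1, 23, 8, 8)
```

One caveat, depending on your model: PyTorch's 3D convolution layers take input shaped (N, C, D, H, W), so for single-channel volumes you may actually want that axis of length 1 as the channel dimension when the array goes into the network.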

Thanks man, if I face any issues, I will msg you!!

Hey @user_123454321, I have sent you one. Can you please check it?