Appending a high-dimensional tensor to create a training set

Hi,
I have an image of dimension 1024x1024, which after flattening becomes a 1048576x1 vector.
Apart from this, I have a JSON file containing the coordinate information (3x1). In all, there are 17,000 entries in that JSON file.
I append the 3x1 vector to the flattened image, which creates one sample of dimension 1048579x1.
The final dataset would therefore have dimension 1048579x17,000.

Is there an efficient way to create this dataset and load it appropriately in PyTorch?

I tried reducing the image size to 256x256, but my current code still leads to a memory issue (even when run on Colab Pro).

My current code (written in NumPy) is as follows:

import json
import cv2
import numpy as np

# Load the image as grayscale and downsample it to 256x256
image = cv2.imread('blender_files/Model_0/image_0/model_0_0.jpg', 0)
image_1 = cv2.resize(image, (256, 256))
flat_img = image_1.ravel()
print(flat_img.shape)

# Load the per-vertex coordinate entries
with open('blender_files/Model_0/image_0/vertices_0_0.json') as f:
    json1_data = json.load(f)
print(len(json1_data))

# Append each 3x1 coordinate to the flattened image to form one sample
local_coordinates = []
for entry in json1_data:
    local_coord = np.array(entry['local_coordinate'])
    new_arr = np.hstack((flat_img, local_coord))
    local_coordinates.append(new_arr)

print(new_arr.shape)
local_coordinates = np.array(local_coordinates)
print(local_coordinates.shape)

In my experience, it is not a good approach to stack such a large tensor all at once. Even at 256x256, the full matrix is about 65539 x 17,000 x 8 bytes ≈ 9 GB in float64, and building it through nested Python lists costs several times more, which explains the memory error on Colab.
I recommend storing one tensor of size (256x256 + 3, 1) per sample, i.e. a single data point.

Storing each data point separately has the advantages of easier management, faster loading, and so on.
Think about why huge, well-known open datasets manage all their data points (images) separately.
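
A minimal sketch of what I mean, reusing the paths and the 'local_coordinate' field from your code (the output folder and file naming are just placeholders, adjust them to your setup):

import json
import os
import cv2
import numpy as np

out_dir = 'blender_files/Model_0/image_0/samples'  # assumed output location
os.makedirs(out_dir, exist_ok=True)

# Flatten the downsampled image once; it is shared by every sample
image = cv2.imread('blender_files/Model_0/image_0/model_0_0.jpg', 0)
flat_img = cv2.resize(image, (256, 256)).ravel().astype(np.float32)

with open('blender_files/Model_0/image_0/vertices_0_0.json') as f:
    json1_data = json.load(f)

# Write one (256*256 + 3,) array per entry instead of one giant matrix
for i, entry in enumerate(json1_data):
    coord = np.asarray(entry['local_coordinate'], dtype=np.float32)
    sample = np.hstack((flat_img, coord))
    np.save(os.path.join(out_dir, f'sample_{i}.npy'), sample)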

So just to rephrase: you are saying that once I get that tensor X, I should simply save it in a .npy file. Thanks for the comment; in hindsight, I think that makes more sense.
And hence my final dataset will contain about 17,000 .npy files, right?

Also, is it better to downsample the image from 1024 to 256? I feel that having a 1048579-dimensional vector during training might be a problem.
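
For loading, I am thinking of a Dataset along these lines (just a rough sketch; the sample_*.npy naming and the folder path are assumptions on my side):

import os
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class VertexSampleDataset(Dataset):
    """Lazily loads one (256*256 + 3,) sample per .npy file."""

    def __init__(self, sample_dir):
        self.sample_dir = sample_dir
        self.files = sorted(f for f in os.listdir(sample_dir) if f.endswith('.npy'))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        sample = np.load(os.path.join(self.sample_dir, self.files[idx]))
        return torch.from_numpy(sample).float()

dataset = VertexSampleDataset('blender_files/Model_0/image_0/samples')
loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=2)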

Exactly.

And whether to downsample is up to your model. However, I have never seen a 1048579-dimensional vector used for training.
I recommend looking at models that make use of the 3x1 vector more effectively; they usually have 'Conditional ...' something in their name.