Hi,
I am new to PyTorch and it is unclear to me how I should format my dataset before feeding it into a model. In the videos and most of the posts I have found, the datasets were images, but I am working with numerical dataframes to build an RNN. Currently I have a list of NumPy arrays, each representing a separate dataframe, and a correspondingly ordered list of targets/labels, where each target corresponds to one dataframe. What is the best way to turn this data into a dataset for PyTorch? Should I use a dictionary with targets as keys and arrays as values? Or should I use lists, as I currently have the data arranged? If I do use lists, is there a specific function I will want to call? Also, will I want to turn the whole dataset into one tensor? Will I want to turn each dataframe into a separate tensor? Or will I want to create batches? Lastly, if I turn each dataframe into its own tensor, what should I do with the corresponding target? I believe it takes more than one number to make a tensor. I imagine this is a beginner's question, but I have looked around extensively for answers and I don't think there is much in the way of detailed explanation. I would be very appreciative of any answers you could give. Thanks!
What kind of information is in each NumPy array or dataframe? Why did you choose an RNN for it? If you have a classification problem, I would recommend first testing classical ML algorithms such as decision trees, random forests, gradient boosting, and other algorithms.
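For the layout described in the question (a list of NumPy arrays with one label per array), one common approach is to wrap the two lists in a `torch.utils.data.Dataset` and let a `DataLoader` build the batches. The sketch below is only an illustration under some assumptions that the thread does not confirm: each array has shape `(time_steps, n_features)`, the number of time steps may differ between arrays, and the targets are integer class labels. The names `SequenceDataset` and `pad_collate` are hypothetical helpers, and the data is made up.

```python
import numpy as np
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import Dataset, DataLoader


class SequenceDataset(Dataset):
    """Pairs each (time_steps, n_features) array with its single label."""

    def __init__(self, arrays, labels):
        assert len(arrays) == len(labels)
        # One float tensor per dataframe; lengths may differ between samples.
        self.sequences = [torch.as_tensor(a, dtype=torch.float32) for a in arrays]
        # Assumption: integer class labels, stored together as one 1-D tensor.
        # Indexing it returns a 0-dimensional tensor holding a single number.
        self.labels = torch.as_tensor(labels, dtype=torch.long)

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        return self.sequences[idx], self.labels[idx]


def pad_collate(batch):
    """Pads variable-length sequences so a batch fits in one tensor."""
    sequences, labels = zip(*batch)
    lengths = torch.tensor([len(s) for s in sequences])
    # Result shape: (batch, max_len, n_features), zero-padded at the end.
    padded = pad_sequence(sequences, batch_first=True)
    return padded, lengths, torch.stack(labels)


# Made-up stand-in data: 5 sequences of different lengths, 3 features each.
rng = np.random.default_rng(0)
arrays = [rng.normal(size=(n, 3)) for n in (4, 7, 5, 6, 3)]
labels = [0, 1, 0, 1, 1]

dataset = SequenceDataset(arrays, labels)
loader = DataLoader(dataset, batch_size=2, shuffle=False, collate_fn=pad_collate)

padded, lengths, targets = next(iter(loader))
print(padded.shape, lengths.tolist(), targets.tolist())
```

With this arrangement the lists stay as they are, each dataframe becomes its own tensor, and the `DataLoader` handles batching. If every array has the same number of time steps, the custom `collate_fn` can be dropped, since the default collation stacks equally sized tensors on its own. The `lengths` tensor is returned so that padded positions can be ignored later, for example with `torch.nn.utils.rnn.pack_padded_sequence`.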