I am trying to replicate the results of VPGNet (ICCV 2017). For its vanishing point (VP) prediction task, the target has 5 channels based on the location of the vanishing point. If a VP exists in the image, the first 4 channels are set to 1 in their respective quadrant locations and the 5th channel is 0. If there is no VP, the 5th channel is set to 1. So for an image that contains a VP, the model output is a 5D vector and the target is also a 5D vector (roughly one-hot encoded). My question is how I should divide the dataset into train and val sets, given that there are 19,302 images with an easy VP, 262 with a hard VP, and 1,533 with no VP.
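One option that might fit such imbalanced counts is a stratified split, where each group (easy VP, hard VP, no VP) is split separately so the rare "hard" images end up in both sets in the same proportion. Below is a minimal sketch of that idea, not something taken from the VPGNet paper. The `stratified_split` helper, the 80/20 ratio, the seed, and the label names `"easy"`, `"hard"`, `"none"` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split indices into train/val so each class keeps the same share in both.

    `labels` is a list with one class label per image (hypothetical names here).
    """
    rng = random.Random(seed)

    # Group image indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take val_fraction of every class, but at least one image per class.
        n_val = max(1, round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])

    return sorted(train_idx), sorted(val_idx)


# Assumed label list built from the counts in the post.
labels = ["easy"] * 19302 + ["hard"] * 262 + ["none"] * 1533
train_idx, val_idx = stratified_split(labels, val_fraction=0.2, seed=0)

print(len(train_idx), len(val_idx))
```

The resulting index lists could then be passed to `torch.utils.data.Subset` to build the train and val datasets. With only 262 hard images, the val set would hold about 52 of them at this ratio, so metrics on that group will be noisy whatever split is used.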
@ptrblck, any suggestions from your end?