Small object localization

I have a small dataset (around 50 images) of 2D CT scans, and I need to build a CNN that predicts the x and y coordinates of a specific landmark in each image. Every image is 512 x 512 pixels and is known to contain the landmark, so the task is to pinpoint its coordinates. I have already built a vanilla FCN model, trained with the Adam optimizer and an MSE criterion, for this “regression” problem.

The issue is that improving the accuracy of the predictions would require a substantially larger dataset, which I do not currently have. Instead, I decided to make the model more complex: divide each image into patches (e.g. a 4 x 4 grid of cells) and use a multi-output approach that first classifies whether a patch contains the landmark and then regresses the landmark coordinates within the positive patches.

However, I am not sure whether this approach makes sense or whether it would actually improve the accuracy of the predictions. I am also not sure whether an example of this or a similar approach exists in PyTorch. See the illustration of the problem: the red dot is the landmark, whose coordinates are given in y_regression, and the grid cell containing the dot is labeled 1 in y_classification.
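To make the two targets concrete, here is a minimal sketch of how y_classification and y_regression could be built for one image. It is plain Python with no framework. The helper names `encode_target` and `decode_target`, the row-major cell indexing, and the choice to store the regression target as an offset normalized to the cell size are all assumptions made for illustration, not part of any existing library.

```python
# Sketch: grid-cell targets for one 512 x 512 image with a single landmark.
# Assumptions (illustrative only): 4 x 4 grid, row-major cell indexing,
# regression target stored as an offset in [0, 1) within the positive cell.

IMAGE_SIZE = 512
GRID = 4
CELL = IMAGE_SIZE // GRID  # 128 pixels per cell


def encode_target(x, y):
    """Turn pixel coordinates (x, y) into (y_classification, y_regression)."""
    # Clamp so that a landmark on the far edge still falls in the last cell.
    col = min(int(x // CELL), GRID - 1)
    row = min(int(y // CELL), GRID - 1)
    # One-hot label over the 16 cells: 1 for the cell holding the landmark.
    y_classification = [0] * (GRID * GRID)
    y_classification[row * GRID + col] = 1
    # Offset from the cell's top-left corner, scaled by the cell size.
    y_regression = ((x - col * CELL) / CELL, (y - row * CELL) / CELL)
    return y_classification, y_regression


def decode_target(y_classification, y_regression):
    """Invert encode_target: recover pixel coordinates from the two targets."""
    # Pick the cell with the highest label (or score, for model outputs).
    idx = max(range(len(y_classification)), key=lambda i: y_classification[i])
    row, col = divmod(idx, GRID)
    dx, dy = y_regression
    return (col + dx) * CELL, (row + dy) * CELL


# Example: a landmark at pixel (300, 70).
cls, reg = encode_target(300.0, 70.0)
print(cls.index(1), reg)        # index of the positive cell and its offset
print(decode_target(cls, reg))  # round trip back to pixel coordinates
```

With targets like these, one option would be to train the classification output with a cross-entropy loss over the 16 cells and to apply the MSE loss to the regression output only for the positive cell. Whether that improves on direct coordinate regression with only about 50 images is the part I am unsure about.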
I appreciate any hints and/or suggestions.