Which specific parts of PyTorch should I study to meet this requirement?

Hi, guys.

I am new to PyTorch and have no idea where to start.
I am a college student working on a PyTorch homework assignment with limited time, so I cannot learn all of PyTorch.

Let me tell you a little bit about the assignment:

icon1 and icon2 are fixed images (I will provide the icon images in the project, so that PyTorch can recognize them and record their sequence of locations).

There is a video that records the icons’ movement trails.

I need to extract their location data from it, like this:

{
  "0s": {
    "icon1": (91.21, 87.77),
    "icon2": (64.32, 67.15)
  },
  "1s": {
    "icon1": (91.09, 85.73),
    "icon2": (62.11, 68.85)
  },
  ...
  "60s": {
    "icon1": (25.54, 80.76),
    "icon2": (13.24, 27.11)
  }
}
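
To make the target format concrete, here is a minimal sketch in plain Python of how I imagine the final result being assembled. The function name `build_track` and the shape of its input are my own assumptions for illustration only; the actual detection step (finding each icon in a video frame) is not shown, since that is the part I do not know how to do.

```python
# Hypothetical helper (not part of PyTorch): collects per-second icon
# positions into the structure shown above.
def build_track(positions_per_second):
    """positions_per_second[i] is a dict {icon_name: (x, y)} for second i."""
    track = {}
    for second, positions in enumerate(positions_per_second):
        # Key each entry as "0s", "1s", ... and round coordinates to 2 decimals.
        track[f"{second}s"] = {
            name: (round(x, 2), round(y, 2))
            for name, (x, y) in positions.items()
        }
    return track


# Sample input reusing the first two entries from the example above.
sample = [
    {"icon1": (91.21, 87.77), "icon2": (64.32, 67.15)},
    {"icon1": (91.09, 85.73), "icon2": (62.11, 68.85)},
]
result = build_track(sample)
print(result)
```

So what I am missing is whatever produces the per-second positions that go into something like `build_track`.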

====

Please tell me which specific parts of PyTorch I should study to accomplish this.