Can anyone tell me why the torchvision.io.read_video() function returns different durations for the video and audio streams?

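```
from torchvision.io import read_video

# video_path is the path to the video file (defined elsewhere)
video_tensor, audio_tensor, info = read_video(video_path, start_pts=10, end_pts=20, pts_unit="sec")
frame_rate = info["video_fps"]
sample_rate = info["audio_fps"]

# Duration in seconds, estimated from the frame count and from the sample count
print(video_tensor.shape[0] / frame_rate)
print(audio_tensor.shape[1] / sample_rate)
```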
The two durations are different: the former is 34.892712724434034, while the latter is 31.79925.
The frame_rate is 11.72164523951137 and the sample_rate is 48,000.

I have since found that the original video has some stuttering and dropped frames.
Maybe that is the reason.
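
To illustrate why dropped frames could matter, here is a minimal sketch with made-up timestamps, not my real video. It compares a duration computed as frame count divided by a single frame rate (which assumes evenly spaced frames) with a duration taken from the frame timestamps. The helper names `duration_from_count` and `duration_from_timestamps` and the 10 fps clip are hypothetical. For a real file, the timestamps could presumably come from torchvision.io.read_video_timestamps(video_path, pts_unit="sec"), but I have not run that here.

```python
def duration_from_count(num_frames, fps):
    """Duration estimate that assumes a constant frame rate."""
    return num_frames / fps


def duration_from_timestamps(pts_sec):
    """Duration from the first to the last frame timestamp, in seconds."""
    if len(pts_sec) < 2:
        return 0.0
    return pts_sec[-1] - pts_sec[0]


# Hypothetical clip: nominally 10 fps for 5 seconds (50 frames),
# but frames 20..29 were dropped, leaving a one-second gap.
nominal_fps = 10.0
timestamps = [i / nominal_fps for i in range(50) if not 20 <= i < 30]

count_based = duration_from_count(len(timestamps), nominal_fps)
timestamp_based = duration_from_timestamps(timestamps)

print("frames kept:", len(timestamps))
print("duration from count / fps:", count_based)
print("duration from timestamps:", timestamp_based)
```

In this made-up case the two numbers disagree because the frames are not evenly spaced, so a single frame rate does not describe the clip. Whether that is what happens with my file is still only a guess.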