Hooking into PyTorch CUDA calls

Sorry if this is not a relevant question for this forum. I am trying to intercept cudaMemcpy calls made by the PyTorch C++ library for some data analysis. Can anyone share the correct way to write a hook or interceptor for this? I looked at the cuHook example from the NVIDIA toolkit samples, but it requires modifying the source code, which is not possible in my case.