Hello, I'm currently working on a project that requires me to know the GPU memory addresses of the input and output tensors of each convolution operation.
First, I recorded the weight addresses of the model as shown below:
model_param_dict = {}
for name, param in model.named_parameters():
    # data_ptr() returns the address of the tensor's first element
    model_param_dict[name] = param.data_ptr()
Then I captured a CUDA graph and extracted kernelParams from its nodes as shown below:
stream = torch.cuda.Stream()
with torch.cuda.stream(stream):
    (err,) = cudart.cudaStreamBeginCapture(stream.cuda_stream, cudart.cudaStreamCaptureMode.cudaStreamCaptureModeGlobal)
    model(inputs)
    err, graph = cudart.cudaStreamEndCapture(stream.cuda_stream)
with torch.cuda.stream(stream):
    # The first call only queries the node count; the second call fetches the nodes
    err, _, num_nodes = cudart.cudaGraphGetNodes(graph, 0)
    err, nodes, num_nodes = cudart.cudaGraphGetNodes(graph, num_nodes)
kernel_params = []
for node in nodes:
    param_addresses = []
    if node.kernelParams:
        param_array = ctypes.cast(node.kernelParams, ctypes.POINTER(ctypes.c_void_p))
        # Collect every parameter address until a NULL entry is reached
        param_idx = 0
        while True:
            addr = param_array[param_idx]
            if addr:
                param_addresses.append(hex(addr))
                param_idx += 1
            else:
                break
    kernel_params.append(param_addresses)
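For reference, here is a minimal, CUDA-free sketch of the same ctypes technique on a synthetic buffer. The CUDA runtime documentation describes kernelParams as an array of pointers to the individual kernel arguments, so the sketch reads both the array entries and the values they point to. The helper names, the fixed argument count, and the assumption that every argument is pointer-sized are illustrative only. Real kernels may mix argument sizes or pass structs by value, and the array is not documented as NULL-terminated.

```python
import ctypes

def read_pointer_array(array_addr, count):
    """Return `count` pointer-sized entries stored at `array_addr`."""
    arr = ctypes.cast(array_addr, ctypes.POINTER(ctypes.c_void_p))
    # Indexing yields None for a NULL entry, so normalise it to 0.
    return [arr[i] or 0 for i in range(count)]

def deref_pointer_args(array_addr, count):
    """Treat each entry as the address of a pointer-sized argument and read it.

    Assumption: every argument is pointer-sized, which need not hold for a
    real kernel signature.
    """
    values = []
    for entry in read_pointer_array(array_addr, count):
        values.append(ctypes.c_void_p.from_address(entry).value or 0)
    return values

# Synthetic stand-in for a kernel argument list: two made-up "device pointers".
fake_args = [ctypes.c_void_p(0x1000), ctypes.c_void_p(0x2000)]
# Array of pointers to those arguments, mimicking the kernelParams layout.
fake_params = (ctypes.c_void_p * 2)(*(ctypes.addressof(a) for a in fake_args))

entries = read_pointer_array(ctypes.addressof(fake_params), 2)
values = deref_pointer_args(ctypes.addressof(fake_params), 2)
print([hex(e) for e in entries])  # host addresses of the argument slots
print([hex(v) for v in values])   # the argument values themselves
```

In this layout the array entries are the locations of the arguments, while the argument values are obtained by dereferencing each entry.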
As far as I know, the kernel_params list should contain every address in model_param_dict, since the kernel parameters must include the weight addresses. However, kernel_params contains only a subset of the addresses in model_param_dict. It seems that kernel_params includes the weight address only when the convolution algorithm is Winograd. Can someone explain why this happens, and how I can find the weight tensor address for every convolution algorithm in a PyTorch CNN model?
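The comparison described above can be sketched roughly as follows. `find_weight_matches` is a hypothetical helper name, and the addresses are made up purely for illustration. The sketch assumes kernel_params holds one list of hex strings per graph node and only detects exact address matches.

```python
def find_weight_matches(model_param_dict, kernel_params):
    """Map each parameter name to the indices of the graph nodes whose
    recorded kernel parameters contain that parameter's address."""
    matches = {name: [] for name in model_param_dict}
    for node_idx, param_addresses in enumerate(kernel_params):
        # Convert the hex strings back to integers for comparison.
        seen = {int(addr, 16) for addr in param_addresses}
        for name, ptr in model_param_dict.items():
            if ptr in seen:
                matches[name].append(node_idx)
    return matches

# Made-up addresses, not taken from a real run.
model_param_dict = {"conv1.weight": 0x7000, "conv2.weight": 0x8000}
kernel_params = [["0x7000", "0x1234"], ["0x5678"]]

matches = find_weight_matches(model_param_dict, kernel_params)
missing = [name for name, node_ids in matches.items() if not node_ids]
print(matches)
print("parameters not found in any node:", missing)
```

Listing the missing parameter names next to the convolution algorithm chosen for each layer makes it easier to see which algorithms expose the weight address.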
Thank you in advance!