I have a question about how to preserve stream priorities when capturing a CUDA graph. I have a set of kernels running on two streams with priorities -1 and 0. However, my profiling results show that these priorities are not preserved in the captured graph.
I also tried setting the cudaGraphInstantiateFlagUseNodePriority flag when instantiating the graph, but it doesn't seem to do the job.
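For context, here is a minimal sketch of the kind of setup described above. It is not the actual application code: the kernel scaleKernel, the buffer size, the CHECK macro, and the fork/join event pattern are all placeholders, and it assumes a toolkit recent enough to provide cudaGraphInstantiateFlagUseNodePriority and the kernel node priority attribute (CUDA 11.7 or later, as far as I know). After capture, it prints the priority attribute stored on each kernel node, which is one way to check what the capture recorded.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Placeholder kernel standing in for the real workload.
__global__ void scaleKernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 2.0f + 1.0f;
}

int main() {
    const int n = 1 << 20;  // assumed size, for illustration only
    float *a = nullptr, *b = nullptr;
    CHECK(cudaMalloc(&a, n * sizeof(float)));
    CHECK(cudaMalloc(&b, n * sizeof(float)));
    CHECK(cudaMemset(a, 0, n * sizeof(float)));
    CHECK(cudaMemset(b, 0, n * sizeof(float)));

    // Lower numbers mean higher priority; values outside the range are clamped.
    int least = 0, greatest = 0;
    CHECK(cudaDeviceGetStreamPriorityRange(&least, &greatest));
    std::printf("priority range: least=%d greatest=%d\n", least, greatest);

    cudaStream_t highStream, lowStream;
    CHECK(cudaStreamCreateWithPriority(&highStream, cudaStreamNonBlocking, -1));
    CHECK(cudaStreamCreateWithPriority(&lowStream, cudaStreamNonBlocking, 0));

    cudaEvent_t forkEvent, joinEvent;
    CHECK(cudaEventCreateWithFlags(&forkEvent, cudaEventDisableTiming));
    CHECK(cudaEventCreateWithFlags(&joinEvent, cudaEventDisableTiming));

    // Capture both streams into one graph using a fork/join pattern.
    const dim3 block(256), grid((n + 255) / 256);
    cudaGraph_t graph;
    CHECK(cudaStreamBeginCapture(lowStream, cudaStreamCaptureModeGlobal));
    CHECK(cudaEventRecord(forkEvent, lowStream));
    CHECK(cudaStreamWaitEvent(highStream, forkEvent, 0));  // fork into highStream
    scaleKernel<<<grid, block, 0, lowStream>>>(a, n);
    scaleKernel<<<grid, block, 0, highStream>>>(b, n);
    CHECK(cudaEventRecord(joinEvent, highStream));
    CHECK(cudaStreamWaitEvent(lowStream, joinEvent, 0));   // join back
    CHECK(cudaStreamEndCapture(lowStream, &graph));

    // Inspect the priority attribute recorded on each captured kernel node.
    size_t numNodes = 0;
    CHECK(cudaGraphGetNodes(graph, nullptr, &numNodes));
    std::vector<cudaGraphNode_t> nodes(numNodes);
    CHECK(cudaGraphGetNodes(graph, nodes.data(), &numNodes));
    for (size_t i = 0; i < numNodes; ++i) {
        cudaGraphNodeType type;
        CHECK(cudaGraphNodeGetType(nodes[i], &type));
        if (type != cudaGraphNodeTypeKernel) continue;
        cudaKernelNodeAttrValue value;
        CHECK(cudaGraphKernelNodeGetAttribute(
            nodes[i], cudaKernelNodeAttributePriority, &value));
        std::printf("kernel node %zu: priority attribute = %d\n", i, value.priority);
    }

    // Instantiate with the flag mentioned above, then launch once.
    cudaGraphExec_t exec;
    CHECK(cudaGraphInstantiateWithFlags(
        &exec, graph, cudaGraphInstantiateFlagUseNodePriority));
    CHECK(cudaGraphLaunch(exec, lowStream));
    CHECK(cudaStreamSynchronize(lowStream));

    CHECK(cudaGraphExecDestroy(exec));
    CHECK(cudaGraphDestroy(graph));
    CHECK(cudaEventDestroy(forkEvent));
    CHECK(cudaEventDestroy(joinEvent));
    CHECK(cudaStreamDestroy(highStream));
    CHECK(cudaStreamDestroy(lowStream));
    CHECK(cudaFree(a));
    CHECK(cudaFree(b));
    return 0;
}
```

If the printed attribute differs from the stream priority, it should be possible to set it per node with cudaGraphKernelNodeSetAttribute before instantiating, though I have not confirmed that this changes what the profiler shows.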