Okay, this seems to be related to this post (Time for moving data to GPU varies a lot).
Basically, in the first case the calls are all asynchronous, so they return immediately without actually finishing the work.
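
To make the effect concrete, here is a rough sketch of why timing an asynchronous call can be misleading. It does not touch a GPU at all: `fake_async_copy` and `measure` are hypothetical names, and a thread pool stands in for the device queue, so treat it as an analogy rather than the actual mechanism. Assuming the code in question is PyTorch, the equivalent of the `pending.result()` wait would be calling `torch.cuda.synchronize()` before reading the timer.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_async_copy(pool, seconds):
    """Hypothetical stand-in for an async GPU call: queue the work, return at once."""
    return pool.submit(time.sleep, seconds)


def measure(seconds=0.2):
    """Return (time seen without waiting, time seen after waiting for the work)."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        start = time.perf_counter()
        pending = fake_async_copy(pool, seconds)
        # The call has already returned, but the work is still running.
        launch_time = time.perf_counter() - start
        # Block until the queued work is done (the role a device sync plays).
        pending.result()
        total_time = time.perf_counter() - start
    return launch_time, total_time


if __name__ == "__main__":
    launch, total = measure()
    print(f"without waiting: {launch:.4f}s, after waiting: {total:.4f}s")
```

The first number only reflects the cost of queuing the work, which is why it looks so small; the second one includes the work itself.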