Significant performance loss of ML code on GPU

Here are the posts I referred to: