First pass is very slow

Thanks, running the model inside a with torch.jit.optimized_execution(False): block really helped.
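For anyone landing here later, this is roughly how the fix can be checked. It is only a sketch: the small `Sequential` model, the input shape, and the helper names `time_calls` and `compare_first_pass` are made up for illustration, so substitute your own scripted module. The only PyTorch-specific piece taken from this thread is the `torch.jit.optimized_execution(False)` context manager.

```python
import time
from contextlib import nullcontext


def time_calls(fn, *args, repeats=3):
    """Return the wall-clock seconds of each of `repeats` consecutive calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return timings


def compare_first_pass(disable_optimization=True):
    """Time a scripted model with or without the TorchScript optimizer."""
    import torch  # imported here so time_calls stays usable without PyTorch

    # Hypothetical stand-in model (assumption); use your own scripted module.
    model = torch.jit.script(torch.nn.Sequential(
        torch.nn.Linear(64, 64),
        torch.nn.ReLU(),
        torch.nn.Linear(64, 8),
    ))
    model.eval()
    x = torch.randn(16, 64)

    # Skip the optimization passes that run during the first calls.
    if disable_optimization:
        ctx = torch.jit.optimized_execution(False)
    else:
        ctx = nullcontext()

    with torch.no_grad(), ctx:
        return time_calls(model, x)
```

Comparing the first entry of `compare_first_pass(True)` with the first entry of `compare_first_pass(False)` should show whether the optimizer warm-up is what makes the first pass slow in your setup. Turning optimization off may cost some throughput on later calls, so it is worth timing both cases on your real model.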