Processing speed during training

Can anyone share their training throughput in PyTorch?

I have a 6-layer CNN and train with a batch size of 900 images (221 x 221 x 3) split across 4 Titan X Ti GPUs.

I get a throughput of 500 images/second.
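So that numbers from different setups are comparable, here is a rough sketch of how an images/second figure like this can be measured. It is not the exact code behind the number above. `measure_throughput`, `train_step`, and `seconds_per_batch` are hypothetical names, and `train_step` stands in for one full forward/backward/optimizer step on one batch. On GPU, the assumption is that you pass `torch.cuda.synchronize` as `sync`, since CUDA kernels run asynchronously and the timer would otherwise stop too early.

```python
import time


def measure_throughput(train_step, batch_size, num_iters=20, warmup=5, sync=None):
    """Return images/second averaged over `num_iters` timed iterations.

    train_step: hypothetical zero-argument callable that runs one training
                iteration on a batch of `batch_size` images.
    sync:       optional callable that blocks until queued device work is done
                (assumed to be torch.cuda.synchronize when training on GPU).
    """
    # Warm-up iterations are not timed (first steps often include one-off setup).
    for _ in range(warmup):
        train_step()
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(num_iters):
        train_step()
    if sync is not None:
        sync()  # wait for pending GPU work before stopping the timer
    elapsed = time.perf_counter() - start

    return batch_size * num_iters / elapsed


def seconds_per_batch(batch_size, images_per_second):
    """Convert a throughput figure into wall-clock time per batch."""
    return batch_size / images_per_second
```

With the figures in this post, `seconds_per_batch(900, 500)` comes to 1.8 seconds per batch, and 900 images split evenly over 4 GPUs is 225 images per GPU.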

Is this speed on par with what other users are seeing?