Model takes twice the memory footprint with distributed data parallel

ptrblck September 3, 2021, 3:59am 6

You are most likely seeing the same effect described here.

1 Like

show post in topic

Home
Categories
Guidelines
Terms of Service
Privacy Policy

Powered by Discourse, best viewed with JavaScript enabled