Memory usage when constructing the graph

How does PyTorch consume memory when it dynamically constructs the graph during training and testing? For example, does it save every intermediate Variable? How can I estimate the memory usage of a network? Is there a function I can call to view the memory usage?
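
To make the question concrete, the only estimate I know how to compute myself is the parameter footprint. The sketch below uses a toy model and a helper name (`parameter_bytes`) that are both just made up for illustration. It counts only the weights and says nothing about the intermediate results that may be kept for the backward pass, which is the part I am unsure about. The CUDA call at the end is guarded because I am assuming it is only meaningful when a GPU is present.

```python
import torch
import torch.nn as nn


def parameter_bytes(model: nn.Module) -> int:
    """Sum the storage size of all parameters (hypothetical helper).

    This ignores activations, gradients, and optimizer state.
    """
    return sum(p.numel() * p.element_size() for p in model.parameters())


# Toy model chosen only for illustration (float32 parameters by default).
model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))

param_bytes = parameter_bytes(model)
print(f"parameters: {param_bytes} bytes")

# On a GPU, the allocator can report how much memory tensors currently occupy.
if torch.cuda.is_available():
    print(f"CUDA allocated: {torch.cuda.memory_allocated()} bytes")
```

What I would like to know is how far the real usage during a forward and backward pass is from a number like this.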