RNN (LSTM, GRU) hidden states

I’ve seen two ways to handle the hidden state.
First way (the hidden state is passed in and returned by forward):
In the class:
    self.rnn = nn.RNN(…)

    def forward(self, x, h):
        out, h = self.rnn(x, h)
        return out, h

In training:
    for … epochs:
        h = torch.zeros(num_layers, batch_size, hidden_size)

        for … batch loop:
            out, h = model(x, h)
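
To make the first way concrete, here is a minimal runnable sketch as I understand it. The sizes, the random data, the MSE loss and the SGD optimizer are placeholders I made up for illustration, and I assumed nn.RNN with batch_first=True. One thing I had to add that is not in the pseudocode above: h.detach() before each batch, because otherwise the second loss.backward() tries to backpropagate through the previous batch's graph and raises an error.

```python
import torch
import torch.nn as nn


class Model(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)

    def forward(self, x, h):
        out, h = self.rnn(x, h)
        return out, h


# Hypothetical sizes, chosen only for illustration.
input_size, hidden_size, num_layers = 4, 8, 2
batch_size, seq_len, num_batches = 3, 5, 4

torch.manual_seed(0)
model = Model(input_size, hidden_size, num_layers)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(2):
    # Hidden state is reset once per epoch and carried across batches.
    h = torch.zeros(num_layers, batch_size, hidden_size)

    for _ in range(num_batches):
        # Random stand-ins for a real batch and its targets.
        x = torch.randn(batch_size, seq_len, input_size)
        target = torch.randn(batch_size, seq_len, hidden_size)

        # Keep the values of h but cut the graph to the previous batch.
        h = h.detach()

        out, h = model(x, h)
        loss = nn.functional.mse_loss(out, target)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

If the batches are independent sequences rather than consecutive chunks of one long sequence, I assume the zeros line would move inside the batch loop instead.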


Second way (the hidden state is stored on the model):
In the class:
    self.rnn = nn.RNN(…)

    def weight_init(self):
        self.h = torch.zeros(num_layers, batch_size, hidden_size)

    def forward(self, x):
        out, self.h = self.rnn(x, self.h)
        return out

In training:
    for … epochs:
        for … batch loop:
            out = model(x)
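
Here is a similar minimal sketch of the second way, again with made-up sizes and random inputs, and nn.RNN with batch_first=True as an assumption. I kept the name weight_init from above even though it resets the hidden state rather than the weights, and I detach self.h inside forward so that the stored state does not keep the previous batch's graph alive. The class name StatefulModel is just a label I picked.

```python
import torch
import torch.nn as nn


class StatefulModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, batch_size):
        super().__init__()
        self.num_layers = num_layers
        self.batch_size = batch_size
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
        self.weight_init()

    def weight_init(self):
        # Resets the stored hidden state to zeros (not the network weights).
        self.h = torch.zeros(self.num_layers, self.batch_size, self.hidden_size)

    def forward(self, x):
        # Detach so gradients do not flow back into earlier batches.
        out, self.h = self.rnn(x, self.h.detach())
        return out


torch.manual_seed(0)
model = StatefulModel(input_size=4, hidden_size=8, num_layers=2, batch_size=3)

for epoch in range(2):
    # Reset once per epoch here; calling it inside the batch loop
    # would instead start every batch from a zero hidden state.
    model.weight_init()

    for _ in range(4):
        x = torch.randn(3, 5, 4)  # (batch, seq_len, input_size), random stand-in
        out = model(x)
```

Written this way, the only visible difference from the first way seems to be who owns h: the training loop or the model itself.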

So which way is correct, or what is the difference between them?
Also, in the second way, should I call weight_init before every batch or once per epoch?
Thank you.