Are these two neural network structures equivalent?

import torch
import torch.nn as nn

class ModelOne(nn.Module):
  def __init__(self):
    super().__init__()
    self.weights = nn.Parameter(torch.randn(300, 10))
    self.bias = nn.Parameter(torch.zeros(10))

  def forward(self, x):
    return x @ self.weights + self.bias

class ModelTwo(nn.Module):
  def __init__(self):
    super().__init__()
    self.linear = nn.Linear(300, 10)

  def forward(self, x):
    return self.linear(x)

If so, then why does

mo = ModelOne()
[len(param) for param in mo.parameters()]

give
[300, 10]

while

mt = ModelTwo()
[len(param) for param in mt.parameters()]

gives
[10, 10]

It turns out that

[param.size() for param in mo.parameters()]

gives
[torch.Size([300, 10]), torch.Size([10])]

while

[param.size() for param in mt.parameters()]

gives
[torch.Size([10, 300]), torch.Size([10])]

Transpose when using nn.Linear

nn.Linear stores its weight as a tensor of shape (out_features, in_features) and transposes it during the forward pass (it computes x @ weight.t() + bias), so the two networks compute the same thing. The difference you see is only in how the weight is stored: len(param) returns the size of a tensor's first dimension, which is 300 for ModelOne's (300, 10) weight and 10 for nn.Linear's (10, 300) weight. (The default initializations still differ: torch.randn and zeros versus nn.Linear's Kaiming-uniform scheme.)
To match the parameter shapes and the behaviour of nn.Linear you have to do:

class ModelOne(nn.Module):
  def __init__(self):
    super().__init__()
    # store the weight as (out_features, in_features), the same layout nn.Linear uses
    self.weights = nn.Parameter(torch.randn(10, 300))
    self.bias = nn.Parameter(torch.zeros(10))

  def forward(self, x):
    # transpose before multiplying, as nn.Linear does internally
    return x @ self.weights.t() + self.bias
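
As a quick sanity check (a minimal sketch using the ModelOne definition above; variable names are just for illustration), you can copy nn.Linear's parameters into ModelOne and confirm that both modules produce identical outputs:

import torch

mo = ModelOne()
mt = ModelTwo()

# copy nn.Linear's weight and bias into ModelOne; the shapes now line up
with torch.no_grad():
    mo.weights.copy_(mt.linear.weight)
    mo.bias.copy_(mt.linear.bias)

x = torch.randn(4, 300)
print(torch.allclose(mo(x), mt(x)))  # True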