Why doesn't LLaMA use FSDP?

Instead, they are using:

```python
from fairscale.nn.model_parallel.layers import (
    ParallelEmbedding,
    RowParallelLinear,
    ColumnParallelLinear,
)
```
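For reference, here's a minimal sketch of how those layers get constructed. It runs single-process on CPU with a model-parallel world of size 1; the dimensions and layer names are illustrative, not LLaMA's actual hyperparameters:

```python
import os
import torch
import torch.distributed as dist
from fairscale.nn.model_parallel.initialize import initialize_model_parallel
from fairscale.nn.model_parallel.layers import (
    ParallelEmbedding,
    RowParallelLinear,
    ColumnParallelLinear,
)

# Single-process "distributed" setup so the fairscale layers can be built.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)
initialize_model_parallel(1)  # model-parallel group of size 1

dim, vocab = 64, 1000  # toy sizes, not LLaMA's
tok = ParallelEmbedding(vocab, dim, init_method=torch.nn.init.xavier_normal_)
# Column-parallel: output features are split across model-parallel ranks.
wq = ColumnParallelLinear(dim, dim, bias=False, gather_output=False,
                          init_method=torch.nn.init.xavier_normal_)
# Row-parallel: input features are split; partial outputs get all-reduced.
wo = RowParallelLinear(dim, dim, bias=False, input_is_parallel=True,
                       init_method=torch.nn.init.xavier_normal_)

x = tok(torch.randint(0, vocab, (1, 8)))
print(wo(wq(x)).shape)  # torch.Size([1, 8, 64])
```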

I'm guessing they used FSDP only for training (it shards parameters and optimizer state across data-parallel workers to save memory) and didn't think it was necessary for inference. The fairscale layers above implement tensor (model) parallelism instead: each weight matrix is split across GPUs, which matches how the released checkpoints are sharded (one consolidated.*.pth file per model-parallel rank).
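To make the distinction concrete, here's a single-process sketch of the tensor-parallel math those two Linear variants implement. This is plain PyTorch with names and shapes of my own choosing, not fairscale's code; in the real layers each chunk lives on a different GPU and the final sum is a torch.distributed all-reduce:

```python
import torch

world_size = 2            # pretend 2-way tensor parallelism in one process
batch, d_in, d_out = 4, 8, 8

x = torch.randn(batch, d_in)
w = torch.randn(d_out, d_in)  # full weight of one Linear layer (no bias)

# ColumnParallelLinear: split W along the output dim; each "rank" computes
# a slice of the output features, which are then concatenated (gathered).
w_cols = w.chunk(world_size, dim=0)
y_col = torch.cat([x @ wc.t() for wc in w_cols], dim=-1)

# RowParallelLinear: split W along the input dim; each "rank" sees only its
# slice of the input, and the partial outputs are summed (all-reduced).
w_rows = w.chunk(world_size, dim=1)
x_parts = x.chunk(world_size, dim=-1)
y_row = sum(xp @ wr.t() for xp, wr in zip(x_parts, w_rows))

# Both schemes recover the full matmul.
y_ref = x @ w.t()
assert torch.allclose(y_col, y_ref, atol=1e-5)
assert torch.allclose(y_row, y_ref, atol=1e-5)
```

FSDP, by contrast, gathers the full layer onto each rank just in time for the computation, so it solves a training-memory problem rather than defining the parallel compute layout the shipped checkpoints assume.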
Any other ideas?