Including multiple and/or chained transforms for audio models

Does anyone have an efficient way to implement and/or chain audio transforms within PyTorch Datasets or similar? I’m trying to make the dataset flexible enough that the user can specify which transforms they want and whether any of them are chained together. torchvision has Compose(), which allows chaining, so one possible solution is to use torch.nn.Sequential() in a similar fashion. But that only covers transforms that are chained together, not two independent transforms that output two separate features, e.g. magnitude and phase.
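
To make the question concrete, here is a rough sketch of one way this could look: the dataset takes a dict of named transforms, each entry is either a single module or a torch.nn.Sequential chain, and every entry is applied independently to the same waveform. The names STFT, Magnitude, Phase and MultiTransformDataset are hypothetical helpers written for this example, not part of PyTorch or torchaudio, and the random waveforms stand in for real audio.

```python
import torch
from torch import nn
from torch.utils.data import Dataset


class STFT(nn.Module):
    """Hypothetical helper: complex spectrogram of a 1-D waveform."""

    def __init__(self, n_fft=256, hop_length=64):
        super().__init__()
        self.n_fft = n_fft
        self.hop_length = hop_length
        self.register_buffer("window", torch.hann_window(n_fft))

    def forward(self, waveform):
        return torch.stft(
            waveform,
            n_fft=self.n_fft,
            hop_length=self.hop_length,
            window=self.window,
            return_complex=True,
        )


class Magnitude(nn.Module):
    """Hypothetical helper: magnitude of a complex spectrogram."""

    def forward(self, spec):
        return spec.abs()


class Phase(nn.Module):
    """Hypothetical helper: phase angle of a complex spectrogram."""

    def forward(self, spec):
        return spec.angle()


class MultiTransformDataset(Dataset):
    """Applies several independent, named transforms to each waveform.

    `transforms` maps a feature name to a callable; each callable may be
    a single module or an nn.Sequential chain of modules.
    """

    def __init__(self, waveforms, transforms):
        self.waveforms = waveforms
        self.transforms = dict(transforms)

    def __len__(self):
        return len(self.waveforms)

    def __getitem__(self, idx):
        waveform = self.waveforms[idx]
        # Every branch sees the same input and yields its own feature.
        return {name: t(waveform) for name, t in self.transforms.items()}


# Placeholder data: four random 1600-sample "waveforms".
waveforms = [torch.randn(1600) for _ in range(4)]

# Two independent branches, each of which is itself a chain.
transforms = {
    "magnitude": nn.Sequential(STFT(), Magnitude()),
    "phase": nn.Sequential(STFT(), Phase()),
}

dataset = MultiTransformDataset(waveforms, transforms)
sample = dataset[0]  # dict with "magnitude" and "phase" tensors
```

In this sketch the STFT is computed once per branch, which is wasteful when branches share a front end. One option would be to compute the shared step once in `__getitem__` and feed its output to the per-feature branches, at the cost of a less generic interface.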