Are there any good tutorials on building a dataset out there?

(Specifically an audio dataset, but anything would be useful.)