Ideal dataset format for Adversarial Autoencoder?

Hi,

I am building an Adversarial Autoencoder and would like feedback on the best way to represent my dataset, which is not imagery / pixel data but rather a large set of records from a digital acquisition system. I have been sampling a temperature thermistor along with a number of other environmental variables from a piece of equipment, and I would like to build a generative model that mimics the behavior of the system, using the thermistor temperature column as the label for each row in the CSV dataset file.

The dataset file looks like this:

columns:   aaa,aab,aac,aad,aae,aaf…mmm,mmn,mmo,mmp,label
datatypes: bytes,bytes,bytes,bytes,bool,bool,bool…int,int,int,float

So I’ve got a mixture of different datatypes, and my thought was to first cast each column to an integer, assemble the resulting ints into one vector per row, and then normalize. Those vectors would then be the input to the encoder network that produces the latent space for the AAE.
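Concretely, what I have in mind is something along these lines (just a sketch; the filename and the assumption that every column parses cleanly as a number are placeholders for my real schema):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("acquisition_log.csv")  # placeholder filename

# Cast every feature column to int: bools become 0/1, the byte and int
# columns stay numeric (assuming they all parse as numbers in the CSV).
feature_cols = [c for c in df.columns if c != "label"]
features = df[feature_cols].astype(np.int64)

# Normalize each column to [0, 1] so no single field dominates the encoder input.
scaler = MinMaxScaler()
X = scaler.fit_transform(features).astype(np.float32)  # shape: (rows, features)

y = df["label"].to_numpy(dtype=np.float32)  # thermistor temperature per row
```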

Is this a logical approach, or should I instead hand everything to the encoder's input layer as binary data, similar to how pixel-wise AAEs treat their raw inputs? The binary alternative I'm picturing would look more like the sketch below.
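(Again just a sketch, with an assumed fixed bit width per field:)

```python
import numpy as np

def row_to_bits(row_ints, bits_per_field=16):
    """Unpack each integer field into its individual bits (assumed 16-bit fields)."""
    bits = []
    for v in row_ints:
        bits.extend((int(v) >> np.arange(bits_per_field)) & 1)
    return np.asarray(bits, dtype=np.float32)

# Each row becomes a flat 0/1 vector of length num_features * bits_per_field,
# which is closer to how a pixel-wise AAE sees a binarized image.
```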

What I'm essentially after is the ability to sample exemplars from the latent space that match thermistor readings collected in the future, so the generative model can tell me what state the rest of the machine should be in for a given temperature, if that makes sense.
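At inference time the usage I picture is roughly this (assuming a trained, Keras-style decoder conditioned on the temperature label; the names here are hypothetical):

```python
import numpy as np

def sample_machine_state(decoder, temperature, latent_dim=8, n_samples=32):
    """Draw latent samples at a given thermistor temperature and decode them."""
    z = np.random.normal(size=(n_samples, latent_dim)).astype(np.float32)
    temps = np.full((n_samples, 1), temperature, dtype=np.float32)
    # Returns generated rows in the same normalized feature space as the training data.
    return decoder.predict(np.concatenate([z, temps], axis=1))
```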

Thanks in advance for any feedback!