New to machine learning and need some clarification

Clueless · August 2, 2018, 8:19am

I just finished a linear algebra/differential equations class, so I am familiar with how machine learning works under the hood. I need some help with the terminology and how the numbers are dealt with. If I draw comparisons to what I learned in calculus and LA/Diff Eq that are wrong or bad, let me know. I am trying to learn by association, as it’s the most effective way for me.
If I have, let’s say, a furnace with a pressure sensor, a temperature sensor and an outside skin sensor (easiest for me to picture, as all the variables are directly related), this would produce a vector of 3 variables, let’s say [100, 200, 150].
If that furnace had 3 zones of sensors, it would produce a matrix [100, 200, 150; 120, 220, 160; 110, 240, 220] (using semicolons between rows because I took a Matlab class last semester).
How would I enter this in a machine learning model? Would the matrix be considered 1 batch? Are the measurements (matrix elements) what are called features in this forum? If I lost the temp sensor, that would become my y variable and the other 2 would be x_1 and x_2, so how do I solve for a single y given 2 x as input?
I saw where the input dimensions are batch, sequence length and feature. Am I wrong, or is batch the whole matrix at time t, or is it a group of matrices, say from t-5 to t, a 3D matrix with the third dimension as time? Sequence length is the second input. Is that the maximum length of any 1 row of the matrix, or is it the whole matrix? The second dimension, or number of columns? From what I read, the 3rd input is number of features. I believe this is the number of elements in the matrix, is that right? Any help would be very appreciated.
Also, I have used some example code and tried to plug multiple inputs (x’s) into the model to get one y out, and it complains about the input and output being different sizes. I have seen some examples of multi-input to single-output models, but I don’t know what’s going on and am trying to understand it.
Thank you

Clueless · August 2, 2018, 9:15am

I found this Keras explanation. Is it the same in PyTorch, or different?

https://machinelearningmastery.com/reshape-input-data-long-short-term-memory-networks-keras/

willmyers · June 15, 2023, 11:32pm

You have a good foundation in linear algebra and differential equations, which are indeed key mathematical underpinnings of machine learning (ML). It sounds like you’re trying to understand how to structure and interpret data in a machine-learning context.

In the context of machine learning, the measurements from your furnace sensors (pressure, temperature, outside skin) can indeed be called features. One complete set of readings (like [100, 200, 150]) is a feature vector. Each such vector represents a single instance of data, also called an example or observation.

In the matrix you’ve created with the sensor readings from the three zones, each row would be a separate instance (i.e., a feature vector). Therefore, this matrix could be considered a batch of data, with each row being an individual instance with three features.
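As a minimal sketch of that layout, assuming PyTorch and assuming the columns are ordered pressure, temperature, skin (the order the question lists them in), the three-zone matrix can be written as a 2D tensor whose first dimension is the batch and whose second is the features:

```python
import torch

# The three zone readings from the question; each row is one instance.
# Assumed column order: pressure, temperature, outside skin.
batch = torch.tensor([
    [100.0, 200.0, 150.0],
    [120.0, 220.0, 160.0],
    [110.0, 240.0, 220.0],
])

n_instances, n_features = batch.shape
print(n_instances, n_features)  # 3 3

# A single row is one feature vector with shape (3,).
first_instance = batch[0]
print(first_instance.shape)     # torch.Size([3])
```

The Matlab-style semicolons from the question map directly onto the rows of the nested list here.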

If one of the sensors fails (e.g., the temperature sensor), you might decide to predict that missing feature based on the other sensor readings. In that case, the missing feature (temperature) would indeed become your output variable (y), and the other sensor readings would be your input variables (x_1 and x_2). The task of predicting the missing temperature is a regression problem.
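One way this regression could look in PyTorch is sketched below. It is only an illustration under several assumptions: the column order (pressure, temperature, skin), the choice of a plain linear model, the division by 100 to keep the numbers small, and the learning rate and step count are all made up for the example, and three data points are far too few for a real model.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Readings from the question, scaled by an arbitrary factor of 100.
# Assumed column order: pressure, temperature, outside skin.
readings = torch.tensor([
    [100.0, 200.0, 150.0],
    [120.0, 220.0, 160.0],
    [110.0, 240.0, 220.0],
]) / 100.0

x = readings[:, [0, 2]]  # inputs x_1, x_2 (pressure, skin), shape (3, 2)
y = readings[:, [1]]     # target y (temperature), shape (3, 1)

# A linear model: y_hat = w_1 * x_1 + w_2 * x_2 + b
model = nn.Linear(in_features=2, out_features=1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

initial_loss = nn.functional.mse_loss(model(x), y).item()

for _ in range(200):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)  # mean squared error
    loss.backward()                             # gradients w.r.t. w and b
    optimizer.step()                            # gradient descent update

final_loss = nn.functional.mse_loss(model(x), y).item()
print(initial_loss, final_loss)
```

In linear algebra terms this is a least-squares fit found iteratively instead of by solving the normal equations; a neural network replaces the single linear map with a stack of linear maps and nonlinearities.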

When it comes to structuring data for a model, the three dimensions you mentioned (batch, sequence length, feature) are typical for sequential data (such as in recurrent neural networks or other sequence models). In that case, ‘batch’ refers to the number of sequences processed together, ‘sequence length’ refers to the number of time steps in each sequence, and ‘feature’ refers to the number of values recorded at each time step (3 for your sensors), not the total number of elements in the matrix. Note that PyTorch’s recurrent layers such as nn.LSTM expect the order (sequence length, batch, feature) by default and accept (batch, sequence length, feature) only when constructed with batch_first=True. If you’re working with non-sequential data, you might only need to worry about ‘batch’ and ‘feature’, which correspond to the number of instances and the number of features per instance, respectively.
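To make the three dimensions concrete, here is a small sketch assuming PyTorch's nn.LSTM. The sizes are illustrative: 6 time steps stands in for the "t-5 to t" window from the question, and the batch size of 4, the hidden size of 8 and the random data are arbitrary choices.

```python
import torch
from torch import nn

batch_size, seq_len, n_features = 4, 6, 3  # illustrative sizes

# 4 sequences, each 6 time steps long, each step a vector of 3 readings.
x = torch.randn(batch_size, seq_len, n_features)

# batch_first=True makes the layer accept (batch, seq_len, feature).
lstm = nn.LSTM(input_size=n_features, hidden_size=8, batch_first=True)
out, (h_n, c_n) = lstm(x)

print(out.shape)  # torch.Size([4, 6, 8]): one hidden vector per time step
print(h_n.shape)  # torch.Size([1, 4, 8]): final hidden state per sequence
```

Here input_size is the number of features per time step, so it is 3 regardless of how many time steps or sequences there are.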

If you’re trying to input multiple x’s to predict one y but are getting an error about input and output sizes, it may be due to how your data is structured or how your model is configured. It’s hard to say without more information; posting the model definition and the shapes of your input and target tensors would help pin it down.
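Since the original code isn't shown, the following is only a guess at one frequent source of size complaints: the model returns predictions of shape (N, 1) while the targets are stored as a flat vector of shape (N,). In that case PyTorch's mse_loss warns about mismatched sizes and broadcasts the two together. The sketch below, with made-up random data, prints the shapes and aligns them before computing the loss.

```python
import torch
from torch import nn

x = torch.randn(5, 2)  # 5 instances, 2 input features (x_1, x_2)
y = torch.randn(5)     # targets stored as a flat vector, shape (5,)

model = nn.Linear(2, 1)
pred = model(x)        # shape (5, 1)

print(pred.shape, y.shape)  # torch.Size([5, 1]) torch.Size([5])

# Give the target the same shape as the prediction before the loss.
y = y.unsqueeze(1)     # shape (5, 1)
loss = nn.functional.mse_loss(pred, y)
print(loss.shape)      # torch.Size([]): a single scalar
```

Printing the shapes right before the failing line is usually the quickest way to see which dimension disagrees.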