To make training the network easier, we standardize each of the continuous variables. That is, we'll shift and scale the variables such that they have zero mean and a standard deviation of 1.
The scaling factors are saved so we can go backwards when we use the network for predictions.
SHIFTING
If we have one random variable, that is constructed by adding a constant to another random variable
- We would shift the mean by that constant
- It would not shift the standard deviation