Technique details for machine learning

1. Why and when do we need standardization for machine learning?

Normalization typically means rescaling values into the range [0, 1]. Standardization typically means rescaling data to have a mean of 0 and a standard deviation of 1 (unit variance).

The short answer is that you generally need some form of gradient descent to train your model, and this relies on selecting initial conditions. Poor initial conditions lead to poor convergence. As an example, consider a single neuron σ(ax + b) with only one feature, where σ is a ReLU. If you initialize a and b from N(0, 1), they will have magnitudes on the order of 1. Now imagine your inputs are data with x around −100. Most of your activations will be zero, and hence gradient descent becomes impossible. You could remedy this with a leaky ReLU, but then training would be slow. Whereas if you standardize your data to be commensurate with your initial conditions, you'll have nonzero gradients.
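The argument above can be checked numerically. The sketch below (with hypothetical values a = 0.5, b = 0.1 standing in for N(0, 1)-scale parameters) shows that almost no ReLU units fire on raw inputs near −100, while roughly half fire after standardization:

```python
import numpy as np

# Hypothetical raw feature centered around -100, as in the text.
x = np.random.default_rng(0).normal(loc=-100.0, scale=5.0, size=1000)

# With weights on the order of 1, the pre-activation a*x + b is large
# and negative for every sample, so ReLU(a*x + b) is zero everywhere
# and its gradient vanishes.
a, b = 0.5, 0.1  # example parameters at N(0, 1) scale
relu = lambda z: np.maximum(z, 0.0)
frac_active_raw = np.mean(relu(a * x + b) > 0)

# Standardize: zero mean, unit variance.
x_std = (x - x.mean()) / x.std()
frac_active_std = np.mean(relu(a * x_std + b) > 0)

print(frac_active_raw)  # 0.0 -- every activation is dead
print(frac_active_std)  # roughly half the units are active
```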

References:

  1. Why and When Do We Need Standardization for Machine Learning
  2. Normalization vs Standardization — Quantitative analysis

2. Irregular time series

Irregular time series are also called unevenly spaced time series.

Unevenly spaced time series naturally occur in many industrial and scientific domains. 

A common approach to analyzing unevenly spaced time series is to transform the data into equally spaced observations using some form of interpolation - most often linear - and then to apply existing methods for equally spaced data. 
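This resampling step can be sketched with `np.interp`, using hypothetical timestamps and values. The irregular observations are linearly interpolated onto an equally spaced grid, after which any method for regular series can be applied:

```python
import numpy as np

# Hypothetical unevenly spaced observations: timestamps and values.
t = np.array([0.0, 1.3, 2.1, 4.8, 5.0, 7.0])
y = np.array([1.0, 1.5, 1.2, 3.0, 2.8, 4.1])

# Resample onto an equally spaced grid (step of 1 time unit)
# via linear interpolation.
t_regular = np.arange(t[0], t[-1] + 1e-9, 1.0)
y_regular = np.interp(t_regular, t, y)

print(t_regular)  # [0. 1. 2. 3. 4. 5. 6. 7.]
print(y_regular)  # interpolated values on the regular grid
```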

3. Global anomalies for univariate or multivariate time series

For the multivariate case, we define global anomalies per dimension rather than across all dimensions: a point may be a global anomaly in every dimension or in just one.

Methods for detecting global anomalies include simple statistical threshold rules, such as flagging points whose z-score exceeds a cutoff (the 3σ rule).


Original source: https://www.cnblogs.com/dulun/p/14244356.html