Generalized normal distribution and Skew normal distribution

 Density Function

The Generalized Gaussian density has the following form:

p(x) = \frac{\rho}{2\,\Gamma(1/\rho)}\, e^{-|x|^{\rho}},

where \rho (rho) is the "shape parameter" and \Gamma is the gamma function. The density is plotted in the following figure:

[Figure: Ggpdf.png]

Matlab code used to generate this figure is available here: ggplot.m.
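For readers without Matlab, the density can also be evaluated in Python. This is a minimal sketch (not a transcription of ggplot.m; it assumes NumPy and SciPy, and the function name gg_pdf is mine):

```python
import numpy as np
from scipy.special import gamma

def gg_pdf(x, rho):
    """Generalized Gaussian density p(x) = rho / (2*Gamma(1/rho)) * exp(-|x|**rho)."""
    x = np.asarray(x, dtype=float)
    return rho / (2.0 * gamma(1.0 / rho)) * np.exp(-np.abs(x) ** rho)
```

Setting rho = 2 gives a normal density with variance 1/2, and rho = 1 gives a Laplace density.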

Adding an arbitrary location parameter, \mu, and inverse scale parameter, b, the density has the form,

p(x) = \frac{\rho\, b}{2\,\Gamma(1/\rho)}\, e^{-(b\,|x-\mu|)^{\rho}}.

[Figure: Ggpdf2.png]

Matlab code used to generate this figure is available here: ggplot2.m.

Generating Random Samples

Samples from the Generalized Gaussian can be generated by a transformation of Gamma random samples, using the fact that if G is a \mathrm{Gamma}(1/\rho) distributed random variable, and S is an independent random variable taking the value -1 or +1 with equal probability, then,

X = S\, G^{1/\rho}

is distributed \mathrm{GG}(\rho). That is,

p_X(x) = \frac{1}{2}\cdot \frac{(|x|^{\rho})^{1/\rho - 1}\, e^{-|x|^{\rho}}}{\Gamma(1/\rho)} \cdot \rho\,|x|^{\rho-1} = \frac{\rho}{2\,\Gamma(1/\rho)}\, e^{-|x|^{\rho}},

where the density of X is written in a non-standard but suggestive form: the middle factor is the Gamma density evaluated at |x|^{\rho}, \rho|x|^{\rho-1} is the Jacobian of the transformation, and the factor 1/2 accounts for the random sign.

Matlab Code

Matlab code to generate random variates from the Generalized Gaussian density with the parameters described above is available here:

gg6.m
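A Python sketch of the same Gamma-transformation sampler (not a transcription of gg6.m; NumPy is assumed, and the function name is mine):

```python
import numpy as np

def gg_sample(rho, size, rng=None):
    """Generalized Gaussian samples via X = S * G**(1/rho),
    with G ~ Gamma(1/rho, 1) and S = -1 or +1 with equal probability."""
    rng = rng if rng is not None else np.random.default_rng()
    g = rng.gamma(shape=1.0 / rho, scale=1.0, size=size)
    s = rng.choice([-1.0, 1.0], size=size)
    return s * g ** (1.0 / rho)
```

For rho = 2 this reproduces a normal distribution with variance 1/2, consistent with the density above.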

As an example, we generate random samples from the example Generalized Gaussian densities shown above.

[Figure: Ggpdf3.png]

Matlab code used to generate this figure is available here: ggplot3.m.

Mixture Densities

A more general family of densities can be constructed from mixtures of Generalized Gaussians. A mixture density,

p(x) = \sum_{k=1}^{K} \pi_k\, p_k(x),

is made up of K constituent densities p_k(x) together with probabilities \pi_k \ge 0, \sum_{k=1}^{K} \pi_k = 1, associated with each constituent density.

The densities p_k(x) have different forms, or parameter values. A random variable with a mixture density can be thought of as being generated by a two-part process: first a decision is made as to which constituent density to draw from, where the k-th density is chosen with probability \pi_k; then the value of the random variable is drawn from the chosen density. Independent repetitions of this process result in a sample having the mixture density p(x).
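The two-part generation process can be sketched in Python as follows (the component samplers and weights below are illustrative choices of mine, not the example from the figures):

```python
import numpy as np

def mixture_sample(weights, samplers, size, rng=None):
    """Two-part process: choose component k with probability pi_k,
    then draw the value from the chosen constituent density."""
    rng = rng if rng is not None else np.random.default_rng()
    ks = rng.choice(len(weights), size=size, p=weights)   # which component
    out = np.empty(size)
    for k, sampler in enumerate(samplers):
        idx = ks == k
        out[idx] = sampler(int(idx.sum()), rng)           # draw from component k
    return out

# Illustrative two-component example (parameters are mine).
rng = np.random.default_rng(1)
samplers = [lambda n, r: r.normal(-2.0, 0.5, n),
            lambda n, r: r.normal(2.0, 0.5, n)]
x = mixture_sample([0.5, 0.5], samplers, 100000, rng)
```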

As an example, consider the mixture density plotted in the following figures:

[Figures: Ggmix1.png, Ggmix2.png]

Matlab code used to generate these figures is available here: ggplot4.m.


The generalized normal distribution or generalized Gaussian distribution (GGD) is either of two families of parametric continuous probability distributions on the real line. Both families add a shape parameter to the normal distribution. To distinguish the two families, they are referred to below as "version 1" and "version 2"; however, this is not standard nomenclature.

Version 1

Generalized Normal (version 1)
Parameters: \mu, location (real); \alpha, scale (positive, real); \beta, shape (positive, real)
Support: x \in (-\infty, +\infty)
PDF: \frac{\beta}{2\alpha\Gamma(1/\beta)}\, e^{-(|x-\mu|/\alpha)^{\beta}}, where \Gamma denotes the gamma function
CDF: \frac{1}{2} + \operatorname{sgn}(x-\mu)\, \frac{\gamma\left[1/\beta,\, \left(\frac{|x-\mu|}{\alpha}\right)^{\beta}\right]}{2\Gamma(1/\beta)}, where \gamma denotes the lower incomplete gamma function
Mean: \mu
Median: \mu
Mode: \mu
Variance: \frac{\alpha^2\,\Gamma(3/\beta)}{\Gamma(1/\beta)}
Skewness: 0
Ex. kurtosis: \frac{\Gamma(5/\beta)\,\Gamma(1/\beta)}{\Gamma(3/\beta)^2} - 3
Entropy: \frac{1}{\beta} - \log\left[\frac{\beta}{2\alpha\Gamma(1/\beta)}\right][1]

Known also as the exponential power distribution, or the generalized error distribution, this is a parametric family of symmetric distributions. It includes all normal and Laplace distributions, and as limiting cases it includes all continuous uniform distributions on bounded intervals of the real line.

This family includes the normal distribution when \beta = 2 (with mean \mu and variance \alpha^2/2) and it includes the Laplace distribution when \beta = 1. As \beta \rightarrow \infty, the density converges pointwise to a uniform density on (\mu-\alpha, \mu+\alpha).

This family allows for tails that are either heavier than normal (when \beta < 2) or lighter than normal (when \beta > 2). It is a useful way to parametrize a continuum of symmetric, platykurtic densities spanning from the normal (\beta = 2) to the uniform density (\beta = \infty), and a continuum of symmetric, leptokurtic densities spanning from the Laplace (\beta = 1) to the normal density (\beta = 2).

Parameter estimation

Parameter estimation via maximum likelihood and the method of moments has been studied.[2] The estimates do not have a closed form and must be obtained numerically. Estimators that do not require numerical calculation have also been proposed.[3]

The generalized normal log-likelihood function has infinitely many continuous derivatives (i.e. it belongs to the class C^{\infty} of smooth functions) only if \beta is a positive, even integer. Otherwise, the function has \lfloor \beta \rfloor continuous derivatives. As a result, the standard results for consistency and asymptotic normality of maximum likelihood estimates of \beta only apply when \beta \ge 2.

Maximum likelihood estimator

It is possible to fit the generalized normal distribution adopting an approximate maximum likelihood method.[4][5] With \mu initially set to the sample first moment m_1, \beta is estimated by using a Newton–Raphson iterative procedure, starting from an initial guess \beta = \beta_0,

\beta_0 = \frac{m_1}{\sqrt{m_2}},

where

m_1 = \frac{1}{N} \sum_{i=1}^{N} |x_i|

is the first statistical moment of the absolute values and m_2 is the second statistical moment. The iteration is

\beta_{i+1} = \beta_i - \frac{g(\beta_i)}{g'(\beta_i)},

where

g(\beta) = 1 + \frac{\psi(1/\beta)}{\beta} - \frac{\sum_{i=1}^{N} |x_i-\mu|^{\beta} \log|x_i-\mu|}{\sum_{i=1}^{N} |x_i-\mu|^{\beta}} + \frac{\log\left(\frac{\beta}{N} \sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)}{\beta},

and

g'(\beta) = -\frac{\psi(1/\beta)}{\beta^2} - \frac{\psi'(1/\beta)}{\beta^3} + \frac{1}{\beta^2} - \frac{\sum_{i=1}^{N} |x_i-\mu|^{\beta} (\log|x_i-\mu|)^2}{\sum_{i=1}^{N} |x_i-\mu|^{\beta}} + \frac{\left(\sum_{i=1}^{N} |x_i-\mu|^{\beta} \log|x_i-\mu|\right)^2}{\left(\sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)^2} + \frac{\sum_{i=1}^{N} |x_i-\mu|^{\beta} \log|x_i-\mu|}{\beta \sum_{i=1}^{N} |x_i-\mu|^{\beta}} - \frac{\log\left(\frac{\beta}{N} \sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)}{\beta^2},

and where \psi and \psi' are the digamma function and trigamma function, respectively.

Given a value for \beta, it is possible to estimate \mu by finding the minimum of:

\min_{\mu} \sum_{i=1}^{N} |x_i-\mu|^{\beta}.

Finally \alpha is evaluated as

\alpha = \left(\frac{\beta}{N} \sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)^{1/\beta}.
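The fitting procedure can be sketched in Python. This is a hedged sketch, not a reference implementation: for numerical robustness it finds the root of g(\beta) = 0 with a bracketing root finder instead of the raw Newton–Raphson update, it fixes \mu at the sample mean, and the function name is mine (NumPy and SciPy assumed):

```python
import numpy as np
from scipy.special import psi
from scipy.optimize import brentq

def fit_gennorm(x):
    """Approximate ML fit of the version 1 generalized normal; returns (mu, alpha, beta)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu = x.mean()                          # mu set to the sample first moment
    a = np.abs(x - mu)
    a = a[a > 0]                           # guard: log() needs nonzero deviations

    def g(beta):
        s = np.sum(a ** beta)
        return (1.0 + psi(1.0 / beta) / beta
                - np.sum(a ** beta * np.log(a)) / s
                + np.log(beta / n * s) / beta)

    beta = brentq(g, 0.3, 15.0)            # root of g(beta) = 0
    alpha = (beta / n * np.sum(a ** beta)) ** (1.0 / beta)
    return mu, alpha, beta
```

On normal data the fit should recover beta near 2 and alpha near sqrt(2) times the standard deviation.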

Applications

This version of the generalized normal distribution has been used in modeling when the concentration of values around the mean and the tail behavior are of particular interest.[6][7] Other families of distributions can be used if the focus is on other deviations from normality. If the symmetry of the distribution is the main interest, the skew normal family or version 2 of the generalized normal family discussed below can be used. If the tail behavior is the main interest, the Student's t family can be used, which approximates the normal distribution as the degrees of freedom grow to infinity. The t distribution, unlike this generalized normal distribution, attains heavier-than-normal tails without acquiring a cusp at the origin.

Properties

The multivariate generalized normal distribution, i.e. the product of n exponential power distributions with the same \beta and \alpha parameters, is the only probability density that can be written in the form p(\mathbf{x}) = g(\|\mathbf{x}\|_{\beta}) and has independent marginals.[8] The result for the special case of the multivariate normal distribution was originally attributed to Maxwell.[9]

Version 2

Generalized Normal (version 2)
Parameters: \xi, location (real); \alpha, scale (positive, real); \kappa, shape (real)
Support: x \in (-\infty, \xi + \alpha/\kappa) if \kappa > 0; x \in (-\infty, \infty) if \kappa = 0; x \in (\xi + \alpha/\kappa, +\infty) if \kappa < 0
PDF: \frac{\phi(y)}{\alpha - \kappa(x-\xi)}, where y = -\frac{1}{\kappa} \log\left[1 - \frac{\kappa(x-\xi)}{\alpha}\right] if \kappa \neq 0 and y = \frac{x-\xi}{\alpha} if \kappa = 0; \phi is the standard normal pdf
CDF: \Phi(y), with y as in the PDF; \Phi is the standard normal CDF
Mean: \xi - \frac{\alpha}{\kappa}\left(e^{\kappa^2/2} - 1\right)
Median: \xi
Variance: \frac{\alpha^2}{\kappa^2}\, e^{\kappa^2}\left(e^{\kappa^2} - 1\right)
Skewness: \frac{3e^{\kappa^2} - e^{3\kappa^2} - 2}{\left(e^{\kappa^2} - 1\right)^{3/2}}\, \operatorname{sign}(\kappa)
Ex. kurtosis: e^{4\kappa^2} + 2e^{3\kappa^2} + 3e^{2\kappa^2} - 6

This is a family of continuous probability distributions in which the shape parameter can be used to introduce skew.[10][11] When the shape parameter is zero, the normal distribution results. Positive values of the shape parameter yield left-skewed distributions bounded to the right, and negative values of the shape parameter yield right-skewed distributions bounded to the left. Only when the shape parameter is zero is the density function for this distribution positive over the whole real line: in this case the distribution is a normal distribution; otherwise the distributions are shifted and possibly reversed log-normal distributions.
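The version 2 density is a direct transformation of the standard normal, and can be sketched in Python (SciPy's standard normal supplies \phi; the function name is mine):

```python
import numpy as np
from scipy.stats import norm

def gn2_pdf(x, xi, alpha, kappa):
    """Version 2 generalized normal pdf: phi(y) / (alpha - kappa*(x - xi))."""
    x = np.asarray(x, dtype=float)
    if kappa == 0.0:
        return norm.pdf((x - xi) / alpha) / alpha     # reduces to the normal density
    y = -np.log(1.0 - kappa * (x - xi) / alpha) / kappa   # valid only inside the support
    return norm.pdf(y) / (alpha - kappa * (x - xi))
```

For kappa = 0 this is exactly the normal density with mean xi and standard deviation alpha.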

Parameter estimation

Parameters can be estimated via maximum likelihood estimation or the method of moments. The parameter estimates do not have a closed form, so numerical calculations must be used to compute the estimates. Since the sample space (the set of real numbers where the density is non-zero) depends on the true value of the parameter, some standard results about the performance of parameter estimates will not automatically apply when working with this family.

Applications

This family of distributions can be used to model values that may be normally distributed, or that may be either right-skewed or left-skewed relative to the normal distribution. The skew normal distribution is another distribution that is useful for modeling deviations from normality due to skew. Other distributions used to model skewed data include the gamma, lognormal, and Weibull distributions, but these do not include the normal distributions as special cases.

The two generalized normal families described here, like the skew normal family, are parametric families that extend the normal distribution by adding a shape parameter. Due to the central role of the normal distribution in probability and statistics, many distributions can be characterized in terms of their relationship to the normal distribution. For example, the lognormal, folded normal, and inverse normal distributions are defined as transformations of a normally-distributed value, but unlike the generalized normal and skew-normal families, these do not include the normal distributions as special cases.
In fact, all distributions with finite variance are, in the limit, closely related to the normal distribution. The Student's t distribution, the Irwin–Hall distribution and the Bates distribution also extend the normal distribution, and include the normal distribution in the limit. So there is no strong reason to prefer the "generalized" normal distribution of type 1 over, for example, a combination of Student's t and a normalized extended Irwin–Hall; such a combination would include, e.g., the triangular distribution (which cannot be modeled by the generalized Gaussian type 1).
A symmetric distribution which can model both tail behavior (long and short) and center behavior (flat, triangular or Gaussian-like) completely independently could be derived, e.g., by using X = IH/chi.


Skew normal distribution

Skew Normal
Parameters: \xi, location (real); \omega, scale (positive, real); \alpha, shape (real)
Support: x \in (-\infty, +\infty)
PDF: \frac{1}{\omega\pi}\, e^{-\frac{(x-\xi)^2}{2\omega^2}} \int_{-\infty}^{\alpha\left(\frac{x-\xi}{\omega}\right)} e^{-\frac{t^2}{2}}\, dt
CDF: \Phi\left(\frac{x-\xi}{\omega}\right) - 2T\left(\frac{x-\xi}{\omega}, \alpha\right), where T(h,a) is Owen's T function
Mean: \xi + \omega\delta\sqrt{\frac{2}{\pi}}, where \delta = \frac{\alpha}{\sqrt{1+\alpha^2}}
Variance: \omega^2\left(1 - \frac{2\delta^2}{\pi}\right)
Skewness: \gamma_1 = \frac{4-\pi}{2}\, \frac{\left(\delta\sqrt{2/\pi}\right)^3}{\left(1 - 2\delta^2/\pi\right)^{3/2}}
Ex. kurtosis: 2(\pi-3)\, \frac{\left(\delta\sqrt{2/\pi}\right)^4}{\left(1 - 2\delta^2/\pi\right)^2}
MGF: M_X(t) = 2\exp\left(\xi t + \frac{\omega^2 t^2}{2}\right)\Phi(\omega\delta t)
CF: e^{it\xi - t^2\omega^2/2}\left(1 + i\,\operatorname{Erfi}\left(\frac{\delta\omega t}{\sqrt{2}}\right)\right)
In probability theory and statistics, the skew normal distribution is a continuous probability distribution that generalises the normal distribution to allow for non-zero skewness.

Definition

Let \phi(x) denote the standard normal probability density function

\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}},

with the cumulative distribution function given by

\Phi(x) = \int_{-\infty}^{x} \phi(t)\, dt = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right],

where erf is the error function. Then the probability density function (pdf) of the skew-normal distribution with parameter \alpha is given by

f(x) = 2\phi(x)\Phi(\alpha x).

This distribution was first introduced by O'Hagan and Leonard (1976). A popular alternative parameterization is due to Mudholkar and Hutson (2000), which has a form of the c.d.f. that is easily inverted such that there is a closed form solution to the quantile function.

A stochastic process that underpins the distribution was described by Andel, Netuka and Zvara (1984).[1] Both the distribution and its stochastic process underpinnings were consequences of the symmetry argument developed in Chan and Tong (1986), which applies to multivariate cases beyond normality, e.g. skew multivariate t distribution and others. The distribution is a particular case of a general class of distributions with probability density functions of the form f(x)=2 φ(x) Φ(x) where φ() is any PDF symmetric about zero and Φ() is any CDF whose PDF is symmetric about zero.[2]

To add location and scale parameters to this, one makes the usual transform x \rightarrow \frac{x-\xi}{\omega}. One can verify that the normal distribution is recovered when \alpha = 0, and that the absolute value of the skewness increases as the absolute value of \alpha increases. The distribution is right skewed if \alpha > 0 and is left skewed if \alpha < 0. The probability density function with location \xi, scale \omega, and parameter \alpha becomes

f(x) = \frac{2}{\omega}\, \phi\left(\frac{x-\xi}{\omega}\right) \Phi\left(\alpha\left(\frac{x-\xi}{\omega}\right)\right).

Note, however, that the skewness of the distribution is limited to the interval (-1,1).
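The location-scale density can be sketched in Python (SciPy's standard normal supplies \phi and \Phi; the function name is mine):

```python
import numpy as np
from scipy.stats import norm

def skewnorm_pdf(x, xi=0.0, omega=1.0, alpha=0.0):
    """Skew normal pdf: (2/omega) * phi(z) * Phi(alpha*z), with z = (x - xi)/omega."""
    z = (np.asarray(x, dtype=float) - xi) / omega
    return 2.0 / omega * norm.pdf(z) * norm.cdf(alpha * z)
```

Setting alpha = 0 recovers the normal density, since \Phi(0) = 1/2 cancels the factor of 2.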

Estimation

Maximum likelihood estimates for \xi, \omega, and \alpha can be computed numerically, but no closed-form expression for the estimates is available unless \alpha = 0. If a closed-form expression is needed, the method of moments can be applied to estimate \alpha from the sample skew, by inverting the skewness equation. This yields the estimate

|\delta| = \sqrt{\frac{\pi}{2}\, \frac{|\hat\gamma_1|^{2/3}}{|\hat\gamma_1|^{2/3} + ((4-\pi)/2)^{2/3}}},

where \delta = \frac{\alpha}{\sqrt{1+\alpha^2}}, and \hat\gamma_1 is the sample skew. The sign of \delta is the same as the sign of \hat\gamma_1. Consequently, \hat\alpha = \delta/\sqrt{1-\delta^2}.

The maximum (theoretical) skewness is obtained by setting \delta = 1 in the skewness equation, giving \gamma_1 \approx 0.9952717. However it is possible that the sample skewness is larger, and then \alpha cannot be determined from these equations. When using the method of moments in an automatic fashion, for example to give starting values for maximum likelihood iteration, one should therefore let (for example) |\hat\gamma_1| = \min\left(0.99,\; \left|\frac{1}{n}\sum \left((x_i - \bar{x})/s\right)^3\right|\right).
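The moment inversion can be sketched in Python (function names are mine; the forward skewness map is included so the round trip can be checked):

```python
import numpy as np

def skewnorm_skewness(alpha):
    """Theoretical skewness gamma_1 of the skew normal as a function of alpha."""
    d = alpha / np.sqrt(1.0 + alpha ** 2)
    u = d * np.sqrt(2.0 / np.pi)
    return (4.0 - np.pi) / 2.0 * u ** 3 / (1.0 - 2.0 * d ** 2 / np.pi) ** 1.5

def skewnorm_alpha_from_skew(g1):
    """Invert the skewness equation: method-of-moments estimate of alpha
    from a sample skewness g1 (|g1| must stay below about 0.9952717)."""
    a = np.abs(g1) ** (2.0 / 3.0)
    c = ((4.0 - np.pi) / 2.0) ** (2.0 / 3.0)
    delta = np.sign(g1) * np.sqrt(np.pi / 2.0 * a / (a + c))
    return delta / np.sqrt(1.0 - delta ** 2)
```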

Concern has been expressed about the impact of skew normal methods on the reliability of inferences based upon them.[3]

Differential equation

The differential equation leading to the pdf of the skew normal distribution is


\omega^4 f''(x) + \left(\alpha^2 + 2\right)\omega^2 (x-\xi)\, f'(x) + f(x)\left(\left(\alpha^2 + 1\right)(x-\xi)^2 + \omega^2\right) = 0,

with initial conditions


f(0) = \frac{\exp\left(-\frac{\xi^2}{2\omega^2}\right) \operatorname{erfc}\left(\frac{\alpha\xi}{\sqrt{2}\,\omega}\right)}{\sqrt{2\pi}\,\omega} \quad \text{and}

f'(0) = \frac{\exp\left(-\frac{(\alpha^2+1)\xi^2}{2\omega^2}\right)\left(2\alpha\omega + \sqrt{2\pi}\,\xi\, \exp\left(\frac{\alpha^2\xi^2}{2\omega^2}\right) \operatorname{erfc}\left(\frac{\alpha\xi}{\sqrt{2}\,\omega}\right)\right)}{2\pi\,\omega^3}.

Generalized Gaussian distribution: sub-Gaussian, Gaussian, and super-Gaussian signals

The Gaussianity of a signal is defined through its kurtosis. For a zero-mean signal x, the kurtosis is defined as:

kurt(x) = E{x^4} - 3[E{x^2}]^2

kurt(x) < 0: sub-Gaussian signal
kurt(x) = 0: Gaussian signal
kurt(x) > 0: super-Gaussian signal

Given a sample of an arbitrary signal x, we can compute its kurtosis as follows and thereby judge its Gaussianity. Assuming x is a 1×N row vector:

x = x - mean(x)*ones(1,N);             % remove the mean
KurtX = mean(x.^4) - 3*(mean(x.^2))^2; % compute the kurtosis

A uniformly distributed signal is sub-Gaussian; a Laplace-distributed signal is super-Gaussian; speech signals are super-Gaussian. By the central limit theorem, the joint distribution of N differently distributed signals tends toward a Gaussian, so the non-Gaussianity of a signal is a good optimization criterion for blind signal separation.
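The same kurtosis check can be done in Python (a sketch; NumPy's default generator is assumed):

```python
import numpy as np

def excess_kurtosis(x):
    """kurt(x) = E{x^4} - 3*(E{x^2})^2 for a zero-mean signal."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # remove the mean
    return np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2

rng = np.random.default_rng(0)
n = 200000
k_uniform = excess_kurtosis(rng.uniform(-1.0, 1.0, n))   # sub-Gaussian: < 0
k_gauss   = excess_kurtosis(rng.normal(0.0, 1.0, n))     # Gaussian: ~ 0
k_laplace = excess_kurtosis(rng.laplace(0.0, 1.0, n))    # super-Gaussian: > 0
```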
Compared with a Gaussian signal, a sub-Gaussian signal is flatter and possibly multimodal, while a super-Gaussian signal is sharper and has heavier tails.
For Gaussian signals, second-order statistics suffice to describe their characteristics; but for the signals typical of communication systems, whose distributions are usually sub-Gaussian, second-order statistics are not sufficient, and higher-order statistics must be used.
Non-stationary signal: can be understood simply as a signal whose distribution parameters or distribution law change over time.
Gaussian signal: a signal whose amplitude distribution follows the normal distribution.
Non-stationary Gaussian signal: the distribution law does not change over time and is always Gaussian, but the distribution parameters (mean and variance) vary with time.
For non-stationary signals, the main tools are time-frequency analysis and wavelet analysis.

Supplement:
A Gaussian signal is one whose amplitude values occur with probabilities satisfying a Gaussian distribution.
From the standpoint of ICA, the problem with Gaussian signals is that a Gaussian looks like a pile of corn (incidentally, its probability density curve really does resemble a corn pile): if you pour one pile of corn onto another, you still get a pile of corn, and you cannot tell that it came from two original piles, so in theory the sources cannot be separated.

A super-Gaussian distribution is more concentrated than the Gaussian;
a sub-Gaussian distribution is flatter than the Gaussian.

Super-Gaussian: fourth-order cumulant greater than 0
Sub-Gaussian: fourth-order cumulant less than 0
Original source: https://www.cnblogs.com/sddai/p/9283038.html