Notes on “The Role of Manifold Learning in Human Motion Analysis” 1

1. Extrinsic high dimensionality versus intrinsic low dimensionality

Model-based approaches

Tracking

The cost of initialization

Role of Manifold:

Serves as a constraint in the task

An HMM combining linear models to approximate the nonlinearity

Linear, Bilinear and Multi-linear Models:

Can the whole configuration space be decomposed?

1. PCA, SVD

2. Bilinear and multi-linear tensor analysis, HOSVD
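A minimal HOSVD sketch in NumPy, to make item 2 concrete. The tensor shape and its (person, pose, pixel) reading are assumptions for illustration, not data from the paper; the helper names are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def hosvd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        # Left singular vectors of each unfolding span that mode's subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Multiply the core back by every factor matrix."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

rng = np.random.default_rng(0)
data = rng.standard_normal((4, 5, 6))  # hypothetical (person, pose, pixel) tensor
core, factors = hosvd(data)
recon = reconstruct(core, factors)
print(np.allclose(recon, data))
```

With all factors kept, the reconstruction is exact; truncating columns of each factor gives a low-rank multi-linear approximation.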

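The target is dynamic; can a linear embedding factor out the shape information?

Deformation and self-occlusion make the manifold nonlinear and twisted.

Points that are very far apart on the manifold may be very close in the Euclidean visual input space.

Therefore PCA, MDS, and similar linear methods cannot correctly discover the manifold.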
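A toy check of the "far on the manifold, close in Euclidean space" point. The 350-degree arc is an assumed stand-in for a twisted motion manifold, not data from the paper.

```python
import numpy as np

# Hypothetical toy manifold: a 350-degree arc of the unit circle, sampled densely.
theta = np.deg2rad(np.linspace(0.0, 350.0, 351))
points = np.column_stack([np.cos(theta), np.sin(theta)])

# Straight-line (Euclidean) distance between the two ends of the arc.
euclidean = np.linalg.norm(points[0] - points[-1])

# Distance along the manifold, approximated by summing the short segments.
geodesic = np.linalg.norm(np.diff(points, axis=0), axis=1).sum()

# 1-D PCA via SVD of the centered data: project onto the first principal axis.
centered = points - points.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pca_gap = abs((centered[0] - centered[-1]) @ vt[0])

print(f"euclidean={euclidean:.3f} geodesic={geodesic:.3f} pca_gap={pca_gap:.3f}")
```

A linear projection can never separate the two endpoints by more than their Euclidean distance, so the 1-D PCA coordinate keeps them close even though they are at opposite ends of the curve.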

Nonlinear Dimensionality Reduction and Decomposition of Orthogonal Factors:

embedding of nonlinear manifolds based on local structure of the manifold.

2 categories:

spectral-embedding approaches: Isomap, LLE, Laplacian eigenmaps, Manifold Charting

        construct an affinity matrix between data points using data-dependent kernels, which reflect the local manifold structure

        all of these can be seen as instances of kernel-based learning, in particular KPCA

statistical approaches:
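For the spectral-embedding category above, a rough Laplacian-eigenmaps sketch: build a k-nearest-neighbor affinity matrix, then embed with the bottom eigenvectors of the graph Laplacian. The closed curve in R^6, the 0/1 edge weights, and the neighbor count are assumptions for illustration.

```python
import numpy as np

def laplacian_eigenmap(points, n_neighbors=4, n_components=2):
    """Embed `points` using eigenvectors of the normalized graph Laplacian."""
    n = len(points)
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    # Affinity matrix: 0/1 weights on a symmetrized k-nearest-neighbor graph.
    order = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    w = np.zeros((n, n))
    rows = np.repeat(np.arange(n), n_neighbors)
    w[rows, order.ravel()] = 1.0
    w = np.maximum(w, w.T)
    deg = w.sum(axis=1)
    # Symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(n) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    # Skip the trivial eigenvector; rescale to the generalized eigenvectors.
    emb = vecs[:, 1:n_components + 1] * d_inv_sqrt[:, None]
    return vals, emb

# Hypothetical closed 1-D manifold (one motion cycle) living in R^6.
t = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
curve = np.column_stack([np.cos(t), np.sin(t), np.cos(2 * t),
                         np.sin(2 * t), np.cos(3 * t), np.sin(3 * t)])
vals, emb = laplacian_eigenmap(curve)
print(emb.shape)
```

Only local neighborhoods enter the affinity matrix, which is what lets the embedding follow the curve instead of the ambient Euclidean distances.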

Biological Motivation:

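2.2 Learning a Simple Motion Manifold

2.2.1 Case Study: The Gait Manifold

Nonlinear dimensionality reduction yields an embedding in a Euclidean space.

But what is the minimum dimensionality of the embedding space?

It depends on the viewpoint: frontal and back views need only 2 dimensions, while the side view needs 3.

2.2.2 Learning the Visual Manifold: Generative Model

How can the manifold be used to learn a representation of the moving object that supports synthesis, pose recovery, reconstruction, and tracking?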

y_t = T_\alpha \gamma(x_t; a)

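Maps from the body configuration to the image representation, followed by a global geometric transformation.

!! The body configuration lies on the manifold and is therefore constrained by it to plausible states.

1. On the embedded manifold: represent it with a function, with exemplars, or model it with an HMM trained by EM, etc.

2. How is the embedded manifold mapped to the visual input space?

       Problem: recover the body configuration from the input, i.e. learn R^d \to R^e. This is not feasible, because the visual input is very high dimensional, so learning such a mapping would require a large number of samples to interpolate. There is also the inherent ambiguity in 2D data.

       Instead: learn the mapping from the embedding space to the visual input space, i.e. in a generative manner, and add a mechanism to directly solve for the inverse mapping.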

3. Y = {y_i \in R^d, i = 1,…,N}, X = {x_i \in R^e, i = 1,…,N}

     f^k : R^e \to R, f^k(x) = p^k(x) + \sum_{i=1}^{N} w_i^k \phi(|x - x_i|)

     Common choices of basis function: thin-plate spline: \phi(u) = u^2 \log(u); multiquadric: \phi(u) = \sqrt{u^2 + c^2}; Gaussian: \phi(u) = e^{-c u^2}; biharmonic: \phi(u) = u; triharmonic: \phi(u) = u^3

    f(x) = B \Phi(x)

   To ensure orthogonality and make the problem well posed: \sum_{i=1}^{N} w_i p_j(x_i) = 0, j = 1,…,m
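A sketch of fitting f(x) = B \Phi(x) with the thin-plate spline basis and the orthogonality constraint above. Assumptions, not taken from the notes: the polynomial part is linear, [1, x]; the embedding points sit on a unit circle; the "images" are random vectors. The function names are hypothetical.

```python
import numpy as np

def tps(u):
    """Thin-plate spline basis phi(u) = u^2 log(u), with phi(0) = 0."""
    out = np.zeros_like(u)
    mask = u > 0
    out[mask] = u[mask] ** 2 * np.log(u[mask])
    return out

def basis_vector(x, centers):
    """Rows of Phi(x): N basis responses followed by the linear part [1, x]."""
    x = np.atleast_2d(x)
    dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=-1)
    return np.hstack([tps(dist), np.ones((len(x), 1)), x])

def fit_mapping(centers, images):
    """Solve for B (d x (N+e+1)) with B Phi(x_i) = y_i and sum_i w_i p_j(x_i) = 0."""
    n, e = centers.shape
    a = tps(np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1))
    p = np.hstack([np.ones((n, 1)), centers])
    # Top block rows: interpolation conditions; bottom block rows: orthogonality.
    system = np.block([[a, p], [p.T, np.zeros((e + 1, e + 1))]])
    rhs = np.vstack([images, np.zeros((e + 1, images.shape[1]))])
    return np.linalg.solve(system, rhs).T

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
centers = np.column_stack([np.cos(t), np.sin(t)])  # x_i in R^e, e = 2 (assumed)
images = rng.standard_normal((12, 30))             # y_i in R^d, d = 30 (random stand-ins)
B = fit_mapping(centers, images)
fitted = basis_vector(centers, centers) @ B.T
print(B.shape, np.allclose(fitted, images))
```

Synthesis at a new embedding point x is then `basis_vector(x, centers) @ B.T`.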

2.2.3 Solving for the Embedding Coordinates

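Given a new input y \in R^d, find x \in R^e by solving for the inverse mapping.

x^* = \arg\min_{x} \| y - B \Phi(x) \|

Least squares:

B = U S V^T, \Phi(x) = V \hat{S} U^T y, where \hat{S} is the diagonal matrix obtained by inverting the nonzero singular values in S and setting the rest to zero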
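The least-squares step above is a pseudo-inverse built from the SVD. A small sketch, assuming a random rank-deficient B as a stand-in for a learned coefficient matrix; the tolerance is an arbitrary choice.

```python
import numpy as np

def pinv_via_svd(b, tol=1e-10):
    """Pseudo-inverse V S_hat U^T: invert nonzero singular values, zero the rest."""
    u, s, vt = np.linalg.svd(b, full_matrices=False)
    s_hat = np.zeros_like(s)
    keep = s > tol * s.max()
    s_hat[keep] = 1.0 / s[keep]
    return vt.T @ np.diag(s_hat) @ u.T

rng = np.random.default_rng(1)
# Hypothetical coefficient matrix: d = 40 rows, 8 columns, rank 6.
b = rng.standard_normal((40, 6)) @ rng.standard_normal((6, 8))
phi_true = rng.standard_normal(8)
y = b @ phi_true                # an input that the model can generate exactly

b_pinv = pinv_via_svd(b)
phi_hat = b_pinv @ y            # least-squares estimate of Phi(x)
print(np.allclose(b @ phi_hat, y))
```

Because this B is rank deficient, phi_hat is the minimum-norm solution and need not equal phi_true, but it reproduces y. Recovering x itself from the estimated \Phi(x) is a separate step not shown here.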

2.2.4 Synthesis, Recovery and Reconstruction

2.3 Adding More Variability: Factoring out the Style

Any input image is a function of many aspects, such as the person's body structure, appearance, viewpoint, illumination, and body configuration.

1. Consider silhouettes only, and add a variable describing the variability in people's shapes.

Aim to learn a decomposable generative model that explicitly decomposes the following two factors:

1. Content

2. Style

y_t^s = \gamma(x_t^c; a, b^s)

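1. For a single person the style is fixed, so the earlier approach applies as is.

2. Across different people, build a linear model.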
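One possible reading of item 2, sketched under assumptions: learn a coefficient matrix B^s per person as before, stack the vectorized matrices, and take an SVD so each person is a low-dimensional style vector b^s. The synthetic coefficients, the rank of 2, and all names are illustrative choices, not the paper's stated procedure.

```python
import numpy as np

rng = np.random.default_rng(2)
n_people, d, m = 4, 30, 10

# Hypothetical per-person coefficient matrices B^s (d x m), generated here
# from two shared basis matrices so that the style space is exactly 2-D.
shared = rng.standard_normal((2, d, m))
true_styles = rng.standard_normal((n_people, 2))
coeffs = np.einsum('sk,kdm->sdm', true_styles, shared)

# Stack each vectorized B^s as one column and factor with the SVD.
stacked = coeffs.reshape(n_people, -1).T          # (d*m, n_people)
u, s, vt = np.linalg.svd(stacked, full_matrices=False)

rank = 2
style_basis = (u[:, :rank] * s[:rank]).reshape(d, m, rank)  # shared across people
style_vectors = vt[:rank].T                                  # row s plays the role of b^s

# Each person's mapping is a linear combination of the shared basis.
rebuilt = np.einsum('dmk,sk->sdm', style_basis, style_vectors)
print(style_vectors.shape, np.allclose(rebuilt, coeffs))
```

Under this reading, the content x_t^c stays on the shared manifold while the person-specific variation is carried by the short vector b^s.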

2.4 Style Adaptive Tracking: Bayesian Tracking on a Manifold