Maximum Likelihood Estimation of Intrinsic Dimension
Introduction
In dimensionality reduction (or manifold learning), the foundation of all methods is the belief that the observed data [math]\displaystyle{ \left\{ \mathbf{x}_{j} \right\} }[/math] do not truly fill the high-dimensional space [math]\displaystyle{ \mathbb{R}^{D} }[/math]. Rather, there exists a smooth mapping [math]\displaystyle{ \varphi }[/math] such that the data can be efficiently represented in a lower-dimensional space [math]\displaystyle{ \mathbb{R}^{d} }[/math] ([math]\displaystyle{ 0\lt d \leq D }[/math], with [math]\displaystyle{ d }[/math] called the intrinsic dimension) via the mapping [math]\displaystyle{ \mathbf{y}=\varphi(\mathbf{x}), \mathbf{y} \in \mathbb{R}^{d} }[/math]. Most methods (such as PCA, MDS, LLE, ISOMAP, etc.) focus on recovering the embedding of the high-dimensional data, i.e. [math]\displaystyle{ \left\{ \widehat{\mathbf{y}}_{j} \right\} }[/math]. However, there is no consensus on how the intrinsic dimension [math]\displaystyle{ d }[/math] should be determined.
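To make the manifold assumption concrete, below is a minimal sketch (in Python; the variable names and constants are illustrative assumptions, not from the paper) of data observed in [math]\displaystyle{ \mathbb{R}^{3} }[/math] that are generated from only two coordinates, so their intrinsic dimension is [math]\displaystyle{ d=2 }[/math]:

<pre>
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Low-dimensional coordinates (t, h): the "true" y in R^2.
t = 1.5 * np.pi * (1 + 2 * rng.uniform(size=n))
h = 21 * rng.uniform(size=n)

# A smooth embedding of (t, h) into R^3 (a "Swiss roll" surface),
# so the ambient dimension is D = 3 but the intrinsic dimension is d = 2.
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])
print(X.shape)  # (1000, 3): observed data in R^D with D = 3
</pre>

A dimension estimator applied to such data should return a value near 2, even though each observed point has three coordinates.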
This paper reviews several previous works on this topic and proposes a new maximum likelihood estimator of the intrinsic dimension. The properties of the estimator are discussed, and it is compared with other estimators in numerical experiments.
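As a preview of the approach, here is a minimal sketch of a k-nearest-neighbour maximum likelihood estimator in the spirit of the paper, assuming the standard per-point form [math]\displaystyle{ \hat{m}_k(\mathbf{x}) = \left[ \frac{1}{k-1}\sum_{j=1}^{k-1} \log\frac{T_k(\mathbf{x})}{T_j(\mathbf{x})} \right]^{-1} }[/math], where [math]\displaystyle{ T_j(\mathbf{x}) }[/math] is the Euclidean distance from [math]\displaystyle{ \mathbf{x} }[/math] to its [math]\displaystyle{ j }[/math]-th nearest neighbour (the function name, the choice [math]\displaystyle{ k=10 }[/math], and the toy data are illustrative assumptions, not the paper's code):

<pre>
import numpy as np
from sklearn.neighbors import NearestNeighbors

def mle_intrinsic_dimension(X, k=10):
    # Distances to the k nearest neighbours of every point; we request
    # k + 1 neighbours because each point is its own nearest neighbour.
    dists, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    T = dists[:, 1:]  # T_1 <= ... <= T_k for each point
    # Per-point MLE: (k - 1) / sum_{j<k} log(T_k / T_j), then average.
    log_ratios = np.log(T[:, -1:] / T[:, :-1])
    m_hat = (k - 1) / log_ratios.sum(axis=1)
    return m_hat.mean()

# Sanity check: 2-D Gaussian data embedded linearly in R^5.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 2)) @ rng.normal(size=(2, 5))
print(mle_intrinsic_dimension(X, k=10))  # should be close to 2
</pre>

Averaging the per-point estimates over all points (and possibly over a range of [math]\displaystyle{ k }[/math]) is one common way to produce a single estimate [math]\displaystyle{ \hat{d} }[/math].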