stat441w18/summary 1
Revision as of 09:46, 7 March 2018
Random features for large scale kernel machines
Group members
Faith Lee
Jacov Lisulov
Shiwei Gong
Introduction and problem motivation
In classification problems, kernel methods are used for pattern analysis and require only a user-specified kernel. Kernel methods can be thought of as instance-based methods: the training examples are "remembered", and a corresponding weight is learned for each of them. Prediction on an unseen example is then made through a similarity function k (also called a kernel) between this example and each of the training inputs. In short, a similarity function measures the similarity between two objects. By conventional notation, we have that
[math]\displaystyle{ f(x; \alpha) = \sum_{i = 1}^{N} \alpha_i k(x, x_i) }[/math] where k is a kernel function that can be approximated by a randomized feature map z as [math]\displaystyle{ k(x, x') \approx \sum_{j = 1}^{D} z(x; w_j)\, z(x'; w_j) }[/math]
and [math]\displaystyle{ \alpha_i }[/math] are the corresponding learned weights.
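The approximation above can be illustrated with a small NumPy sketch (not part of the summary). It uses random Fourier features for the Gaussian (RBF) kernel as an assumed example: frequencies w are drawn from the kernel's Fourier transform, and the inner product of the feature maps concentrates around the exact kernel value as D grows.

```python
import numpy as np

rng = np.random.default_rng(0)

d, D = 5, 2000       # input dimension, number of random features (illustrative values)
gamma = 0.5          # assumed RBF kernel: k(x, x') = exp(-gamma * ||x - x'||^2)

# For this kernel, draw w ~ N(0, 2*gamma*I) and b ~ Uniform[0, 2*pi];
# then z(x) = sqrt(2/D) * cos(W x + b) gives E[z(x) . z(x')] = k(x, x').
W = rng.normal(scale=np.sqrt(2 * gamma), size=(D, d))
b = rng.uniform(0, 2 * np.pi, size=D)

def z(x):
    # Randomized low-dimensional feature map
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, xp = rng.standard_normal(d), rng.standard_normal(d)
exact = np.exp(-gamma * np.sum((x - xp) ** 2))   # true kernel value
approx = z(x) @ z(xp)                            # random feature estimate
```

With D = 2000 features the estimate typically falls within a few percent of the exact kernel value; larger D tightens the approximation at the cost of a longer feature vector.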
An example of a kernel method is the support vector machine. Kernel methods provide a means of approximating a non-linear function or decision boundary. However, the problems with using kernel methods are that:
1) They scale poorly with the size of the dataset.
2) They require computing (and often inverting) a large kernel matrix.
3) They require solving a large linear system of equations.
In this paper, the authors propose mapping the input data to a randomized low-dimensional feature space in which inner products approximate a user-specified shift-invariant kernel; fast linear learning methods can then be applied in place of the corresponding nonlinear kernel machine.