Search results

  • ...<math>g \,</math> and <math>h \,</math>, <math>g(y_i) \,</math> and <math>h(y_j) \,</math> are uncorrelated. ...possible values <math>\{x_1, x_2, ..., x_n\} \,</math> is defined as <math>H(X) = -\sum_{i=1}^n {p(x_i) \log p(x_i)}</math> ...
    15 KB (2,422 words) - 09:45, 30 August 2017
  • ...finite sequences of words in the source and target language, and let <math>H'</math> denote the set of finite sequences of vectors in the latent space. ...s a sequence of hidden states <math display="inline">(h_1,\ldots, h_m) \in H'</math> in the latent space. Crucially, because the word vectors of the tw ...
    28 KB (4,522 words) - 21:29, 20 April 2018
  • ...to the largest singular value of A. Therefore, for a linear layer <math> g(h)=Wh </math>, the norm is given by <math> ||g||_{Lip}=\sigma(W) </math>. Obs ...ator more sensitive, one would hope to make the norm of <math> \bar{W_{WN}}h </math> large. For weight normalization, however, this comes at the cost of ...
    16 KB (2,645 words) - 10:31, 18 April 2018
  • ...The authors define the deconfusing function as an indicator function <math>h(x, y, g_k) </math> which takes some sample <math>(x,y)</math> and determine $$ R(g,h) = \int_x \sum_{j,k} (f_j(x) - g_k(x))^2 \; h(x, f_j(x), g_k) \;p(f_j) \; p(x) \;\mathrm{d}x $$ ...
    27 KB (4,358 words) - 15:35, 7 December 2020
  • stochastic binary feature vector <math> \mathbf h </math> are modeled by products of conditional Bernoulli distributions: <br> <center> <math> p(x_i=1|\mathbf h)= \sigma(b_i+\sum_{j}W_{ij}h_j) </math> </center> ...
    20 KB (3,263 words) - 09:45, 30 August 2017
  • Let <math>({ H}_1, k_1)</math> and <math>({H}_2, k_2)</math> be RKHS over <math>(\Omega_1, { B}_1)</math> and <math>(\Omega_2, { B}_2)</math> ...<math>\langle f, \Sigma_{YU}g\rangle_{{H}_1} \approx \frac{1}{n} ...
    14 KB (2,403 words) - 09:45, 30 August 2017
  • ...purpose of the latent vector is to model the conditional distribution $p(x|h)$ such that we get a probability as to whether the image suits this description $$p(x|h) = \prod\limits_{i=1}^{n^2} p(x_i | x_1, ..., x_{i-1}, h)$$ ...
    31 KB (4,917 words) - 12:47, 4 December 2017
  • <math> E\left(\mathbf{v}, \mathbf{h}; \mathbf{W}\right) = - \sum_{i \in visible}a_iv_i - \sum_{j \in hidden}b_j * <math>\mathbf{h}</math> is the vector of hidden units, with components <math>h_j</math> and ...
    24 KB (3,699 words) - 09:46, 30 August 2017
  • :<math>PP(p) := 2^{H(p)}=2^{-\sum_x p(x)\log_2 p(x)}</math> Here <math>H(p)</math> is the entropy in bits and <math>p(x)</math> is the probability o ...
    13 KB (2,144 words) - 05:41, 10 December 2020
  • h <math>(x, y)</math> and <math>(w, h)</math> are normalized to the range <math>(0, 1)</math>. Further, <math>p_c ...
    19 KB (2,746 words) - 16:04, 20 November 2018
  • [1] S. Y. Xia, H. Pan, and L. Z. Jin, “Multi-class SVM method based on a non-balanced binary H. Yu and C. K. Mao, “Automatic three-way decision clustering algorithm based ...
    9 KB (1,392 words) - 01:45, 23 November 2021
  • <math> I = \displaystyle\int h(x)f(x)\,dx </math> by <math>\hat{I} = \frac{1}{N}\displaystyle\sum_{i=1}^N h ...
    5 KB (865 words) - 09:45, 30 August 2017
  • ...n its value never changes quicker than the function <math display="inline">h(x)=Kx</math>. The reason the activation functions are Lipschitz continuous [3] Han, S., Mao, H., and Dally, W. J. Deep compression: Compressing deep neural networks with ...
    28 KB (4,367 words) - 00:30, 23 November 2021
  • ...all $f \in \mathcal{H}_K$. Now, if we take $\phi: \mathcal{X} \to \mathcal{H}_K$, then we can define the MMD between two distributions $p$ and $q$ as fo ...thbf{E}_{x\sim p}(\phi(x^s)) - \mathbf{E}_{x\sim q}(\phi(x^t))||_{\mathcal{H}_K} ...
    35 KB (5,630 words) - 10:07, 4 December 2017
  • ...to the subspace spanned by the columns of <math>U_d</math>. A unique <math>H^+</math> solution can be obtained by finding the pseudo inverse of <math>X< ...ath> <math>X= U \Sigma V^T</math> <math>X^+ = V \Sigma^+ U^T</math> <math>H^+= U \Sigma V^T V \Sigma^+ U^T =UU^T</math> For each rank <math>d</math>, ...
    29 KB (4,816 words) - 09:46, 30 August 2017
  • * Li, H., Kadav, A., Durdanovic, I., Samet, H., & Graf, H. P. (2016). Pruning filters for efficient convnets. arXiv preprint arXiv:16 * Han, S., Mao, H., & Dally, W. J. (2015). Deep compression: Compressing deep neural networks ...
    13 KB (1,942 words) - 00:18, 21 April 2018
  • ...hbf{h}_1), (\mathbf{x}_{2k}, \mathbf{h}_2) , ... (\mathbf{x}_{nk}, \mathbf{h}_n) } ...
    12 KB (1,916 words) - 17:34, 18 March 2018
  • ...\bf h}_{t-1} \in \mathbb{R}^d</math> and outputs the new state <math>{\bf h}_t </math> (although the dimensions of the hidden state and input are the ...\alpha({\bf x}_t, {\bf h}_{t-1})) = \text{softmax}({\bf W}[{\bf x}_t; {\bf h}_{t-1}]+{\bf b}) \in \mathbb{R}^k</math> ...
    27 KB (4,321 words) - 05:09, 16 December 2020
  • ...^m-y_j^m ||^2, \quad z_i=\sum_{h}\sum_{m} \pi_{i}^{m} \pi_{h}^{m} e^{-d_{i,h}^{m}} </math> </center> ...
    15 KB (2,530 words) - 09:45, 30 August 2017
  • ...uch that <math>\beta \leq \frac{wh}{WH}</math> and <math>\gamma \leq \frac{h}{w} \leq \gamma^{-1}</math>. The smallest size of crops is at least <math>\b ...
    12 KB (1,792 words) - 00:08, 13 December 2020