Search results

  • [1] S. Y. Xia, H. Pan, and L. Z. Jin, “Multi-class SVM method based on a non-balanced binary ... H. Yu and C. K. Mao, “Automatic three-way decision clustering algorithm based ...
    9 KB (1,392 words) - 01:45, 23 November 2021
  • <math> I = \displaystyle\int h(x)f(x)\,dx </math> by <math>\hat{I} = \frac{1}{N}\displaystyle\sum_{i=1}^Nh ...
    5 KB (865 words) - 09:45, 30 August 2017
  • ...n its value never changes quicker than the function <math display="inline">h(x)=Kx</math>. The reason the activation functions are Lipschitz continuous ... [3] Han, S., Mao, H., and Dally, W. J. Deep compression: Compressing deep neural networks with ...
    28 KB (4,367 words) - 00:30, 23 November 2021
  • ...all $f \in \mathcal{H}_K$. Now, if we take $\phi: \mathcal{X} \to \mathcal{H}_K$, then we can define the MMD between two distributions $p$ and $q$ as fo ...thbf{E}_{x\sim p}(\phi(x^s)) - \mathbf{E}_{x\sim q}(\phi(x^t))||_{\mathcal{H}_K} ...
    35 KB (5,630 words) - 10:07, 4 December 2017
  • ...to the subspace spanned by the columns of <math>U_d</math>. A unique <math>H^+</math> solution can be obtained by finding the pseudo inverse of <math>X< ...ath> <math>X= U \Sigma V^T</math> <math>X^+ = V \Sigma^+ U^T</math> <math>H^+= U \Sigma V^T V \Sigma^+ U^T =UU^T</math> For each rank <math>d</math>, ...
    29 KB (4,816 words) - 09:46, 30 August 2017
  • * Li, H., Kadav, A., Durdanovic, I., Samet, H., & Graf, H. P. (2016). Pruning filters for efficient convnets. arXiv preprint arXiv:16 ... * Han, S., Mao, H., & Dally, W. J. (2015). Deep compression: Compressing deep neural networks ...
    13 KB (1,942 words) - 00:18, 21 April 2018
  • ...hbf{h}_1), (\mathbf{x}_{2k}, \mathbf{h}_2) , ... (\mathbf{x}_{nk}, \mathbf{h}_n) } ...
    12 KB (1,916 words) - 17:34, 18 March 2018
  • ...\bf h}_{t-1} \in \mathbb{R}^d</math> and outputs the new state <math>{\bf h}_t </math> (although the dimensions of the hidden state and input are the ...\alpha({\bf x}_t, {\bf h}_{t-1})) = \text{softmax}({\bf W}[{\bf x}_t; {\bf h}_{t-1}]+{\bf b}) \in \mathbb{R}^k</math> ...
    27 KB (4,321 words) - 05:09, 16 December 2020
  • ...^m-y_j^m ||^2, \quad z_i=\sum_{h}\sum_{m} \pi_{i}^{m} \pi_{h}^{m} e^{-d_{i,h}^{m}} </math> </center> ...
    15 KB (2,530 words) - 09:45, 30 August 2017
  • ...uch that <math>\beta \leq \frac{wh}{WH}</math> and <math>\gamma \leq \frac{h}{w} \leq \gamma^{-1}</math>. The smallest size of crops is at least <math>\b ...
    12 KB (1,792 words) - 00:08, 13 December 2020
  • ...oth mappings of labelled and unlabelled images by <math>g</math> and <math>h</math> respectively will be utilized. ...tion loss <math>\mathcal{L}_{ss}</math> utilizes a separate function <math>h</math> which maps the embeddings of unlabeled images to a separate label sp ...
    17 KB (2,644 words) - 01:46, 13 December 2020
  • where x's are the feature values of each data point, and h's are the weights of the corresponding x's. <math>r_k(z) = \frac{1}{\sum_{(x,h) \in D_k} h} \sum_{(x,h) \in D_k, x<z} h,</math> ...
    15 KB (2,406 words) - 18:07, 28 November 2018
  • ...</math> equal to the prediction on the corresponding clean example <math> h(x) </math>. ...h>x</math> is a perturbed image <math>x'</math>, such that <math>h(x) \neq h(x')</math> and <math>d(x, x') \leq \rho</math> for some dissimilarity func ...
    32 KB (4,769 words) - 18:45, 16 December 2018
  • ...{x})}}[E(\mathbf{x})]- E_{\mathbf{x} \sim q(\mathbf{x})}[E(\mathbf{x})] + H(q) ...lity was used to obtain the variational lower bound on the NLL given <math>H(q) </math>. This bound is tight if <math> q(x) \propto e^{-E(\mathbf{x})} \ ...
    22 KB (3,540 words) - 17:50, 6 December 2020
  • ...h>-dimensional vector <math> \boldsymbol{c} = \left[ c_1, c_2, \dots, c_{n-h+1} \right] </math>, called a ''feature map''. ...et, we set all the hyperparameters: rectified linear units, filter windows (h) of 3, 4, 5 with 100 feature maps each, dropout rate (p) of 0.5, l2 constr ...
    21 KB (3,330 words) - 03:15, 13 March 2018
  • ...h> \mathcal{U} \in \mathbb{R}^{n_{h} \times n_{x} \times T} </math>, where <math> n_{h} </math> is the number of hidden units and <math> n_{x} </math> is the size ...multiplication of three terms: <math>\boldsymbol W_{a} \in \mathbb{R}^{n_{h} \times n_{f}}, \boldsymbol W_{b} \in \mathbb{R}^{n_{f} \times T}, </math>and <math> \b ...
    18 KB (2,810 words) - 23:45, 14 November 2018
  • ...on distribution $q(\mathbf{x}_{t+1}|\mathbf{x}_t)$, and an episode length $H$. In i.i.d. supervised learning problems, the length $H =1$. The model may generate samples of length $H$ by choosing an output at each time $t$. The cost $\mathcal{L}$ provides ...
    26 KB (4,205 words) - 10:18, 4 December 2017
  • ...low, L, frequency components. The assumption is that the high frequency band, H, is conditionally independent of the lower frequency bands, given the middl ... P(H|M,L) = P(H|M) ...
    18 KB (3,001 words) - 09:46, 30 August 2017
  • ...th> n </math> and the output value of the hidden layer of the model, <math>h</math>. The idea of this method is to represent the output classes as the l ...\frac{\partial Err}{\partial v_{n_i}^{'}h} \cdot \frac{\partial v_{n_i}^{'}h }{\partial v_{n_i}^{'}} </math> <br></div> ...
    32 KB (5,160 words) - 22:32, 27 March 2018
  • ...set of transformations through hidden states (a.k.a. layers) <math>\mathbf{h}</math>, given by the equation ...le="text-align:center;"><math> \mathbf{h}_{t+1} = \mathbf{h}_t + f(\mathbf{h}_t,\theta_t) </math> (1) </div> ...
    24 KB (3,891 words) - 15:01, 7 December 2020