Search results

Deep Learning for Extreme Multi-label Text Classification
...out. However, the shortcomings of the existing methods are inevitable due to data sparsity and scalability. With deep learning and Convolutional Neural ...interpret. Therefore, the concept of compressing label space is introduced to effectively create lower-dimensional label vectors using either linear or n ...

6 KB (969 words) - 21:50, 13 November 2021
binomial Probability Monte Carlo Sampling June 2 2009
...order to get a distribution for the probability 'p' of a Binomial, we have to divide the Binomial distribution by n. This new distribution has the same s # Compute <math>\displaystyle \delta = p_1 - p_2</math> in order to get n values for <math>\displaystyle \delta</math>; ...

5 KB (788 words) - 09:45, 30 August 2017
Dynamic Routing Between Capsules
...cases, we want to reduce the number of dimensions because we always want to save computations. The reason behind this kind of pooling method is based o ...od is that it only passes the local patterns into the next layer. That is to say, if our original data set doesn't have the good property of neighborhoo ...

8 KB (1,394 words) - 19:54, 20 March 2018
Unsupervised Machine Translation Using Monolingual Corpora Only
The paper presents an unsupervised approach to machine translation using only monolingual corpora without any alignment bet ... The general approach of the methodology is to first use an unsupervised word-by-word translation model proposed by [Connea ...

8 KB (1,359 words) - 22:48, 19 November 2018
generating Random Numbers
...lling a fair die repeatedly to produce a series of random numbers from 1 to 6). One way to generate pseudo-random numbers from the uniform distribution is using the ' ...

8 KB (1,324 words) - 09:45, 30 August 2017
a Dynamic Bayesian Network Click Model for Web Search Ranking
...users click on what appears as the first search results and are unlikely to click on results that do not appear at the beginning, even though they are relevant. ...el'' of user behavior, assumes that the user scans search results from top to bottom and eventually stops because either their information need is satisf ...

11 KB (1,852 words) - 09:45, 30 August 2017
hierarchical Dirichlet Processes
...osal generally cannot model shared information between groups. One idea is to make <math>G_0</math> become discrete by limiting the choice of <math> G_0 ...e measure. Note that <math>G_0</math> is discrete with probability one due to a property of the Dirichlet process. ...

8 KB (1,341 words) - 09:46, 30 August 2017
large-Scale Supervised Sparse Principal Component Analysis
...s that it is computationally expensive. Many algorithms have been proposed to solve the sparse PCA problem, and the authors introduced a fast block coord ...nsion of the data. Since <math>\hat{n}</math> could be very small compared to the dimension <math>n</math> of the data, this algorithm is computationally ...

7 KB (1,209 words) - 09:46, 30 August 2017
nonlinear Dimensionality Reduction by Semidefinite Programming and Kernel Matrix Factorization
...s with computing k-nearest neighbors of each input and adding a constraint to preserve distances and angles between k-nearest neighbors:<br /> and also a constraint on outputs to be centered on the origin:<br /> ...

7 KB (1,093 words) - 09:45, 30 August 2017
deep Sparse Rectifier Neural Networks
...easy to train and easy to generalize, while neuroscientists' objective is to produce a useful representation of the scientific data. In other words, machi ...e at 1/2 of their maximum rate when at zero. A solution to this problem is to use a rectifier neuron which does not fire at its zero value. This rectifi ...

9 KB (1,338 words) - 09:46, 30 August 2017
stat441w18/summary 1
...based methods, where the i-th training examples are "remembered" in order to learn their corresponding weights. Predictions on untrained examples are then ...nal feature space and then apply existing linear methods. The main goal is to reduce the bottleneck of kernel-based inference methods. ...

5 KB (753 words) - 12:51, 7 March 2018
stat946F18
|width="30pt"|Link to the paper |width="30pt"|Link to the summary ...

14 KB (1,851 words) - 03:22, 2 December 2018
stat441F21
|width="15pt"|Link to the paper |width="30pt"|Link to the summary ...

8 KB (1,194 words) - 04:28, 1 December 2021
measuring and testing dependence by correlation of distances
...o random variables could be in different dimensions. Second, dCov is equal to zero if and only if the two variables are independent. ...ritten in terms of the expectations of Euclidean distances which is easier to interpret: ...

4 KB (586 words) - 09:46, 30 August 2017
a Rank Minimization Heuristic with Application to Minimum Order System Approximation
...stics and signal processing. Except in some special cases the RMP is known to be computationally hard. \mbox{subject to: } & X \in C, ...

8 KB (1,446 words) - 09:45, 30 August 2017
on the Number of Linear Regions of Deep Neural Networks
...rger. Furthermore, having many layers can theoretically cause problems due to vanishing gradients. ...number of input regions. This is caused by the deep hierarchy, which allows the same computation to be applied across different regions of the input space. ...

8 KB (1,391 words) - 09:46, 30 August 2017
STAT946F20/BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
..."bank" as a "financial institution" or the "land alongside or sloping down to a river or lake". ...e positional encoding, which has the same dimension as the word embedding, to obtain the sequential information of the inputs. BERT is built by the N uni ...

9 KB (1,342 words) - 06:36, 10 December 2020
Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases
...f pneumonia in CT scan images. Then they carried out 10-fold cross-validation to estimate how the model will perform on an unseen dataset. And finally they evaluat ...iologists, radiologists and computer scientists have been working together to detect microbial diseases such as tuberculosis, malaria and pneumonia using ...

7 KB (974 words) - 14:56, 21 November 2021
techniques for Normal and Gamma Sampling
...sform Method and sample from independent uniform distributions seen before to generate a sample following a Gamma distribution. ...le to use the Acceptance-Rejection method, but there are still better ways to sample from a Standard Normal Distribution. ...

7 KB (1,114 words) - 09:45, 30 August 2017
test
...imal). This creates a big problem, as this method becomes very susceptible to poor data (i.e., not very robust). This intuitively makes sense, as the age ...e noisy demonstrations to be ranked according to their relative performance to each other. Another similar method requires extra labelling of the data wit ...

10 KB (1,526 words) - 17:39, 26 November 2021
