Search results

  • |Oct 6 || Tameem Adel |Dec 6 || Ali-Akbar Samadani ...
    501 bytes (73 words) - 09:45, 30 August 2017
  • |Nov 6|| Durgesh Saraph |Dec 4 || Project #6 Mohammad Derakhshani || Project #8 Aurelien Quevenne || Project #3 Yao Ya ...
    1 KB (207 words) - 09:45, 30 August 2017
  • |Mar 1 || Ilia Sucholutsky || 6|| One-Shot Imitation Learning || [https://papers.nips.cc/paper/6709-one-sh |Mar 6 || George (Shiyang) Wen || 7|| AmbientGAN: Generative models from lossy me ...
    9 KB (1,240 words) - 18:05, 19 November 2018
  • |Nov 30 ||Project #6 by Ahmed Ibrahim || Project #15 by Jenna Voisin || Project #9 by Ali-Akb ...
    1 KB (160 words) - 09:45, 30 August 2017
  • |Oct 6 || Joel Smith || || || ...
    2 KB (143 words) - 09:45, 30 August 2017
  • |Oct 30 || Glen Chalatov || 6 || Pixels to Graphs by Associative Embedding || [http://papers.nips.cc/pape |Nov 6 || Nargess Heydari || 10 ||Wavelet Pooling For Convolutional Neural Netwo ...
    14 KB (1,851 words) - 03:22, 2 December 2018
  • |Oct 6 ||Johnny Chow || Jennifer Smith || Zhe Wang ...
    1 KB (158 words) - 09:45, 30 August 2017
  • ...ed Sorting[http://alex.smola.org/papers/2009/QuaSonSmo09.pdf])|| Yun Wang (6) ...
    1 KB (193 words) - 09:45, 30 August 2017
  • |Nov 6 || Ali Ghodsi || || Lecturer|||| |Nov 6 || Ali Ghodsi || || Lecturer|||| ...
    11 KB (1,453 words) - 13:01, 16 October 2018
  • ...aphEliminate on your moral graph, using the elimination ordering (7, 8, 9, 6, 3, 5, 4, 2, 1), and show the resulting reconstituted graph (the graph that (c) Report (b), but using the elimination ordering (7, 6, 8, 9, 4, 3, 2, 5, 1). ...
    14 KB (2,497 words) - 09:45, 30 August 2017
  • |Nov 15 ||Project #|| Project #6 by Jeff Glaister || Project #3 Grace Tompkins, Tatianna Krikella, Swaleh H ...
    2 KB (222 words) - 12:49, 6 October 2020
  • ...Hybridization of supervised and unsupervised learning for fraud detection [6] Figure 3: Proposed Architecture [6] ...
    12 KB (1,776 words) - 19:07, 24 November 2021
  • == 6. Experiments == ...
    9 KB (1,428 words) - 09:46, 30 August 2017
  • ...ocols, known as El Escorial criteria, involves a battery of tests taking 3-6 months. This is a considerable amount of time since a quicker diagnosis wou ...owl. Discov. 2015, 29, 1033–1069. [[http://doi.org/10.1007/s10618-014-0386-6 CrossRef]] ...
    8 KB (1,188 words) - 10:31, 17 May 2022
  • ...enes) while in sparse PCA each involves variables corresponding to at most 6 genes. ...CA (dashed) and the method we explained with <math>k=5</math> and <math>k=6</math> (solid lines). ...
    13 KB (2,202 words) - 09:45, 30 August 2017
  • <math>\,\hat{y_i} = \sum_{\alpha}{Q_{i\alpha}l_{\alpha}} </math> (6)<br /> substituting (6) into (7) gives <math>\,K\approx QLQ^T</math> where <math>\,L_{\alpha\beta} ...
    7 KB (1,093 words) - 09:45, 30 August 2017
  • ...nt labels. This is the simplest and the most inaccurate approach among all 6 methods introduced. The XML-CNN model is compared against the 6 existing competitive methods. The results are as shown in the tables below: ...
    6 KB (969 words) - 21:50, 13 November 2021
  • 6. The re-estimation of <math>\sum_{r_p}</math> is then performed using the s 6. Perform state clustering given the parameters of the untied model in step ...
    8 KB (1,374 words) - 09:45, 30 August 2017
  • ...Tangxinxin Yao, Jingyue Huang, Ming Fan, Mingguang Liu, Xiaohan Wang || 6|| A New Method of Region Embedding for Text Classification || [https://op ...
    5 KB (694 words) - 18:02, 31 August 2018
  • For resolution augmentation, 6 scales of input are used, which results in unpooled layer 5 maps of varying (d). The classifier (layers 6,7,8) has a fixed input size of 5x5 and produces a C-dimensional output vect ...
    19 KB (2,961 words) - 09:46, 30 August 2017
  • ...ch image has 6X6 pixels and each pixel has 8 dimensions. Thus we have 32*6*6 pixels at this point. Consider each pixel a capsule. We have 32*6*6 capsules <math>u_i</math> from the second Conv layer. Thus, we have <math>\hat{ ...
    14 KB (2,384 words) - 12:36, 29 March 2018
  • ...mes 256</math> tensor from Conv1 and produce an output of a <math>6 \times 6 \times 8</math> tensor. * Size of each convolutional unit: <math>6 \times 6</math>. ...
    22 KB (3,375 words) - 22:40, 20 April 2018
  • ...of the most common pretext tasks used are rotations and jigsaw puzzle [4,5,6]. As shown in Figure 2, in the rotation task, unlabeled images, <math> </ma \begin{align} \tag{6} \label{eqn:6} ...
    20 KB (3,045 words) - 23:02, 12 December 2020
  • ...ased NLI systems can be broken by changing words by synonyms or hypernyms [6]. ...antic datasets is a useful means to avoid the problems highlighted in [4,5,6] by means of asking humans to (i) provide counterfactual labels, (ii) retai ...
    10 KB (1,605 words) - 19:42, 6 December 2020
  • ...the SSL dataset domain has a positive effect, with diminishing ends. Fig. 6(b) shows the effects of shifting the domain of the SSL dataset, by changing <div align="center">Figure 6: (a) Effect of number of images on SSL. (b) Effect of domain shift on SS ...
    17 KB (2,644 words) - 01:46, 13 December 2020
  • ...lemi, et al. proposed a deep sequence model for premise selection in 2016 [6], and they claim to be the first team to involve deep neural networks in AT ...izar_article.png|thumb|center|Figure 4. An article from MML. Adapted from [6].]] ...
    20 KB (3,127 words) - 20:45, 10 December 2018
  • ...= \underset{j \ne y_i} \Sigma Q_{ij} \le c, \;\;\; ||XQ||_2 \le 1 \;\;\; (6) </math> In <math>\,(6)</math>, <math>\,Q \in \mathbb{R}^{m\times k}</math> is the dual Lagrange v ...
    24 KB (3,815 words) - 09:45, 30 August 2017
  • | Conv 6 || 1 x 1 x 512 || 1 || 56 x 56 x 512 ...rform detection, as shown to be beneficial in Ren et al<sup>[[#References|[6]]]</sup>. ...
    19 KB (2,746 words) - 16:04, 20 November 2018
  • |Nov 20 || Maya(Mahdiyeh) Bayati, Saber Malekmohammadi, Vincent Loung || 6|| Convolutional Neural Networks for Sentence Classification || [https://arxi ...
    6 KB (827 words) - 11:33, 5 September 2020
  • == 6. Conclusions == ...
    12 KB (1,976 words) - 23:37, 20 March 2018
  • ...ccurring afterward. Throughout the four blocks, pooling windows are 10, 8, 6, and 4 respectively. Dilated convolutional layers are also used in lieu of ...ghbor up-sampling followed by conventional convolution with kernel sizes 4,6,9 and 10 and batch normalization. The resulting feature maps are then conca ...
    8 KB (1,170 words) - 01:41, 26 November 2021
  • ...sitive instance in the bag. Some authors combine MIL with Neural Networks[6, 7] and model SMI by max-pooling. This approach is inefficient due to only 6 subtypes of glioma WSI have been tested in this paper: Glioblastoma (GBM), ...
    16 KB (2,470 words) - 14:07, 19 November 2021
  • ...soon as we enter the building (cross the outermost door) set indoors to 1. 6) As soon as we exit, set indoors to 0. 7) Stop recording. 8) Save data as C ...soon as we enter the building (cross the outermost door) set indoors to 1. 6) Finally, enter a building and ascend/descend to any story. 7) Ascend throu ...
    18 KB (2,896 words) - 18:43, 16 December 2018
  • ...ight \|_{1})+\lambda_{2}\left \|\mathbf{\theta} \right \|_{2}^{2}, \;\;\;(6) </math></center> ...{\alpha}_{i} \right \|_{1}</math>.<br /><br /> The learning procedure in (6) minimizes the sum of the costs for the pairs <math>(\mathbf{x}_{i},y_{i})_ ...
    21 KB (3,291 words) - 09:45, 30 August 2017
  • ...eline models, the classic NMT with beam search (NMT-BS)<sup>[[#References|[6]]]</sup> and the one referred as beam search optimization (NMT-BSO), which ...contains 12M, 4.5M and 10M training data for each task.<sup>[[#References|[6]]]</sup> ...
    22 KB (3,543 words) - 00:09, 3 December 2017
  • ...ta is used to generate ground truth labels, such as the Jigsaw puzzle task[6], and the rotation estimation[3]. For example, in the rotation task, we hav * In Jigsaw task [6], the unlabelled images are divided into nine patches and then, the patches ...
    12 KB (1,792 words) - 00:08, 13 December 2020
  • == 6. Conclusions == ...
    14 KB (2,192 words) - 03:01, 23 November 2018
  • ...es are described in the papers by Rabiner and Juang [5] as well as Kalman [6]. The difference with these presentations is that the latent dynamics are c ...einforcement learning: A brief survey. IEEE Signal Processing Magazine, 34(6), 26–38. ...
    13 KB (2,072 words) - 06:07, 10 December 2020
  • ...\leq 1, \; P_1(\textbf{u}) \leq c_1, \; P_2(\textbf{v}) \leq c_2, \;\;\; (6) </math></center> <br /> ...xtbf{v}</math> and the following iterative algorithm can be used to solve (6). <br/> ...
    30 KB (4,829 words) - 09:45, 30 August 2017
  • |Oct 31 || ||6 || || || ...
    10 KB (1,213 words) - 19:28, 19 November 2020
  • L = L_{CE} + L_{Reg} + \lambda \times L_{MR} \tag{6} \label{eq:op5} <div align="center">'''Figure 6:''' Analyzing Co-Exitation</div> ...
    22 KB (3,609 words) - 21:53, 6 December 2020
  • ==Project 6: Dimensionality Reduction for Supervised Learning == ...arge-scale noisy anchor-free graph realization. SIAM J. on Sci. Comp., 31, 6, 4351-4372. </ref>. ...
    17 KB (2,679 words) - 09:45, 30 August 2017
  • ...into three categories: (a) appropriately selecting the neighbors [2] [5] [6] [7] [10]; (b) identifying the outliers [4] [8]; (c) dealing with the insta 6. Optional: extend the analysis to other financial time series, e.g. GDP, un ...
    15 KB (2,332 words) - 09:45, 30 August 2017
  • ...; margin-left: auto; margin-right: auto;">[[File:Screen Shot 2018-11-10 at 6.03.08 PM.png|400px]]</div> ...layers directly to the deep layers are coming from networks like ResNet [6] ...
    21 KB (3,227 words) - 18:12, 14 December 2018
  • ...le 1. The baseline models are D-MTAE[5],Deep-All (Vanilla AlexNet)[2], DSN[6]and AlexNet+TF[2]. On average, the proposed method outperforms other method ...ctors – pole length and cart mass. In both experiments, we randomly choose 6 source domains for training and hold out 3 domains for (true) testing. Sinc ...
    14 KB (2,177 words) - 00:41, 7 December 2020
  • | No conv, 6 full ...3-5% relative improvement over Hybrid DNN. Also CNN-based feature offers 5-6% relative improvement over DNN-based features. ...
    11 KB (1,587 words) - 09:46, 30 August 2017
  • ...eta}</math> we will get the factorized matrix <math>X\approx QYQ^T</math> (6)<br /> First, starting from the m-dimensional solution of eq. (6), use conjugate gradient methods to maximize the objective function in eq. ...
    12 KB (1,953 words) - 09:45, 30 August 2017
  • ...ained InceptionV3 network where all layers except the last one are frozen [6] (Footnote). Adding only one poison instance to the training set causes mis [6] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigni ...
    11 KB (1,590 words) - 18:29, 26 November 2021
  • ...ation of two entities in the same sentence into 6 potential relations. The 6 relations are USAGE, RESULT, MODEL-FEATURE, PART WHOLE, TOPIC, and COMPARE. ...ioned 9 natural language relationships between the word pairs. Among them, 6 potential relationships are USAGE, RESULT, MODEL-FEATURE, PART WHOLE, TOPIC ...
    15 KB (2,408 words) - 21:25, 5 December 2020
  • ...ionary (DND) used in Neural Episodic Control (NEC) found in [[#References|[6]]], though the gradients from the memory in the MbPA model are not used dur * <sup>[6]</sup>Pritzel. Alexander, Uria. Benigno, Srinivasan. Sriram, Puigdome ...
    12 KB (1,963 words) - 23:48, 9 November 2018
  • ...le, "AU 1" stands for the inner portion of the brows being raised, and "AU 6" stands for the cheeks being raised. Such a framework helps in describing a ...hich recognizes both permanent and transient AUs with high accuracy rates [6]. Hand-crafted feature descriptors like the LBP are very powerful facial re ...
    21 KB (3,321 words) - 15:00, 4 December 2017
  • :<math> X = \{0, 1, 2, 3, 4, 5, 6, 7, 8\} \rightarrow</math>'''State Space''' ...
    6 KB (1,113 words) - 09:45, 30 August 2017
  • 6. Both natural and random images are found to be vulnerable to adversarial p The result is summarized in Figure 6: ...
    17 KB (2,650 words) - 23:54, 30 March 2018
  • ...\max_{x_k} \psi(x_j,x_k) \phi(x_k,y_k) \prod_{i!=j} \hat{M}^l_k \,\,\, (6) ...ration uses the messages above as the <math>\hat{M}</math> variables in Eq(6) : ...
    18 KB (3,001 words) - 09:46, 30 August 2017
  • ...es. Particularly, in the car, person, and rider categories, a 12%, 7%, and 6% higher performance than SharpMask is achieved. File:Figure_3_Neel.JPG|Figure 6: Qualitative results: comparison with human annotator.|alt=alt language ...
    21 KB (3,323 words) - 18:41, 16 December 2018
  • ...m \; L(y_{ti} , <w_t,x_ti)>) + \gamma \sum_{t=1}^T \; <w_t,D^+w_t> \;\;\; (6)</math></center> ...d in the Reference section) that the function <math>\,R</math> in <math>\,(6)</math> is jointly convex in both <math>\,W</math> and <math>\,D</math>. ...
    17 KB (2,834 words) - 09:45, 30 August 2017
  • ...rpretations directly in the models, often known as self-explaining models [6, 7, 8, 9]. The alternative option is to generate interpretations in post-ho [6] David Alvarez-Melis and Tommi S Jaakkola. Towards robust interpretability ...
    11 KB (1,594 words) - 13:14, 25 November 2021
  • ...t reached an accuracy of 72%. Reducing the size of the training dataset to 6 billion caused lower accuracy (66%), which suggests that large amount of th Table 6 shows the empirical comparison between different neural network-based repre ...
    19 KB (2,931 words) - 09:46, 30 August 2017
  • ...ons are shown to be useful for language learning[2]. Several studies[3][4][5][6] have shown that feedback is especially useful in second language learning ...s right”. In the datasets, there are 6 templates for positive feedback and 6 templates for negative feedback, e.g. ”Sorry, that’s not it.”, ”Wrong”, etc ...
    26 KB (4,081 words) - 13:59, 21 November 2021
  • ...ics simulator to create a simulated environment where robotic spiders with 6 legs are faced with the task of running due east as quickly as possible. Th ...e task of walking east with the torques of two legs scaled by <math> (i-1)/6 </math> ...
    17 KB (2,846 words) - 00:12, 21 April 2018
  • == 6 End To End Evaluations == === 6.1 System Implementation === ...
    15 KB (2,406 words) - 18:07, 28 November 2018
  • ...even learn the emotion of the agents based on their voices. Aytar et al. [6] proposed a student-teacher training procedure in which a well established 6 seconds of audio was used to compute the spectrogram by taking a Short-time ...
    32 KB (5,152 words) - 03:36, 15 December 2020
  • ...ref>R. Smith, "Size of the Moon", Scientific American, 46 (April 1978): 44-6.</ref> ...
    5 KB (769 words) - 22:53, 5 September 2021
  • ...ion-Answering ('''ReQA''') benchmark [5] and used two datasets '''SQuAD'''[6] and '''Natural Questions'''[7] for training and evaluating their models. ...e experiments they did with augmented the dataset as will be seen in table 6. ...
    22 KB (3,409 words) - 22:17, 12 December 2020
  • ...from each class of the CIFAR-10 validation set. Based on figure 4, 5, and 6, we can see that the <math>L(\delta)</math> (classification loss), <math>T In figure 6 and 7, we can see the effect of <math>\lambda_s</math> on the dissimilarity ...
    15 KB (2,325 words) - 06:58, 6 December 2020
  • ==Project 6: Application of clustering in bioinformatics: How the structure of a molec ...
    15 KB (2,344 words) - 09:45, 30 August 2017
  • ...used for training and the others reserved for testing. Table 2 and Figure 6 outline the results, showing that the proposed RCNN model performs consiste ...EE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007. ...
    16 KB (2,430 words) - 18:30, 16 December 2018
  • ...onventional Neural Networks [1,2,3] and their shifted version 3D CNNs [4,5,6] have been employed in action recognition but they identify and aggregate t * 6 Alternative layers with 64, 128, 256, 256, 512 and 512 kernel response maps ...
    16 KB (2,500 words) - 13:19, 30 November 2017
  • 6. For classification tasks, the idea of learning a “new object” is analogous 6) The simple decaying Hebbian formula in Equation 2 is used to update the He ...
    27 KB (4,100 words) - 18:28, 16 December 2018
  • '''Project 6''' ...
    7 KB (1,125 words) - 09:46, 30 August 2017
  • \tag{6} \label{6} ...
    22 KB (3,540 words) - 17:50, 6 December 2020
  • ...x. If interested, the derivation of the whitening equation can be seen in [6]. Li et al. found that whitening removed styles from the image. Authors use $\alpha$ = 0.6 in the style transfer experiments. ...
    25 KB (4,065 words) - 20:10, 28 November 2017
  • ...ous section (<math>\epsilon = 0.05</math>) gives an approximation of order 6. Additionally, the 8th order and 6th order representations are similar and ...
    8 KB (1,446 words) - 09:45, 30 August 2017
  • ...ng a fair die repetitively to produce a series of random numbers from 1 to 6). ...917, ''b'' = 11, and ''m'' = 2<sup>48</sup><ref>http://java.sun.com/javase/6/docs/api/java/util/Random.html#next(int)</ref>. The class returns at most 3 ...
    8 KB (1,324 words) - 09:45, 30 August 2017
  • <math>f_{1}=6,8,10,\cdots,20</math> <math>f_{1}=6,8,10,\cdots,20</math> ...
    15 KB (2,414 words) - 09:46, 30 August 2017
  • 70 & 76 & -20 & -6 \\ ...training and validation sets as the number of epochs increases, and Table 6 shows the accuracy performance. Average pooling demonstrated the highest ac ...
    15 KB (2,396 words) - 22:57, 20 April 2018
  • ...atasets according to their contextual group <math> X </math> according to [6] and they compare their results using compression ratio <math> \Delta s = \ ...the images mostly taken at night time were selected from DNIM. The figure 6 shows that for DNiM images, the agent's choices are mostly concentrated in ...
    27 KB (4,274 words) - 00:07, 8 December 2020
  • ...for that task, owing again to the overparameterization of the network [5, 6, 7, 8]. [6] Li, Y. and Liang, Y. (2018). Learning overparameterized neural networks vi ...
    15 KB (2,322 words) - 23:30, 7 December 2020
  • ...t the exact architecture and experimental setting of the GrammarVAE (GVAE)[6] and replace its variational framework with that of an RAE's utilizing the [6] Matt J. Kusner, Brooks Paige, and José Miguel Hernández-Lobato. Grammar va ...
    15 KB (2,313 words) - 19:11, 2 December 2020
  • ...ng clean/noisy in order to train the student network on cleaner instances [6]. [[File:Co-Teaching Table 6.png|550px|center]] ...
    15 KB (2,318 words) - 21:02, 11 December 2018
  • ...cipal component analysis. ''Journal of the Royal Statistical Society, B'', 6(3):611-622, 1999.</ref> ; they found that the closed-form solution for <mat ...ssian Process Latent Variable Models. Journal of Machine Learning Research 6 (2005) 1783–1816. November, 2005. ...
    21 KB (3,433 words) - 09:45, 30 August 2017
  • ...s the state-of-the-art on the VQA dataset from 60.3% to 60.5%, and from 61.6% to 63.3% on the COCO-QA dataset. By using ResNet, the performance is furth ...l-attention" has also been implemented in VQA tasks, which is explored in [6]. ...
    27 KB (4,375 words) - 19:50, 28 November 2017
  • ...sed on the magnitude values so eq. (3) and (4) get transformed to (5) and (6), respectively: ...ilon \quad \{ t_{1}, t_{2},....., t_{k} \} \quad else \quad 0 \quad \quad (6) ...
    20 KB (3,272 words) - 20:40, 28 November 2017
  • ...arted to appear, for example, the Gaussian Copula Process Volatility model[6]. For this paper, the authors use coupling AR models and neural networks to ...tasets and check how these networks perform. The result is shown in Figure 6. ...
    29 KB (4,577 words) - 10:13, 14 December 2018
  • ...ities, it will necessarily be higher. The 6.7, for example, indicates that 6.7% of the ground truth tuples appeared in the proposals of the network. ...mages. For this, they used the Visual Genome dataset, with so = 3 and sr = 6. Overall, the new architecture vastly outperformed past models. The results ...
    17 KB (2,749 words) - 18:26, 16 December 2018
  • ...the TLD framework outperforms previous state-of-the-art tracking approaches [6]. 6) Deep learning models like stacked autoencoder have been used to learn good ...
    29 KB (4,453 words) - 18:27, 16 December 2018
  • ...pics. Therefore, the TCNLM uses a diversity regularizer<sup>[[#References|[6]]]</sup><sup>[[#References|[7]]]</sup> to reduce it. The idea is to regular * <sup>[http://www.cs.cmu.edu/~pengtaox/papers/kdd15_drbm.pdf [6]]</sup>P. Xie, Y. Deng, and E. Xing. Diversifying restricted boltzmann mach ...
    18 KB (2,810 words) - 23:45, 14 November 2018
  • ..._i - y_j \|^2}{\sqrt{D_{ii}D_{jj}}} </math> (6) ...<math>\,\, \mathbf{y=\sigma[-17(\sqrt((\theta_r-\pi)^2+(\theta_p-\pi)^2)-0.6\pi)]} </math> where <math>\,\, \sigma[.] </math> is the [http://en.wikipedi ...
    26 KB (4,280 words) - 09:45, 30 August 2017
  • ...SQuAD was evaluated using two different models: LSTM+Attention and BiDAF [6]. The first model was inspired by most then-present QA systems consisting o '''Visualization:''' Figure 6 shows that the model does not skim when the input seems to be relevant to a ...
    27 KB (4,321 words) - 05:09, 16 December 2020
  • ...been widely leveraged for addressing domain generalization [3, 4, 5, 7, 8, 6, 9, 10, 11]. Meta-Learning for domain generalization (MLDG) [4] closely fol ...MetaReg[3], significantly. In addition, the best improvement (6.20%) is achieved when the unseen domain is "sketch", which requires more general knowle ...
    15 KB (2,189 words) - 01:58, 13 December 2020
  • <math>W \thicksim U(-L, +L),L = max \left \{ \sqrt{6/n_{in}}, L_{min} \right \}, L_{min} = \beta \sigma</math> MNIST: Network is LeNet-5 variant<sup>[[#References|[6]]]</sup> with 32C5-MP2-64C5-MP2-512FC-10SSE. ...
    20 KB (2,998 words) - 21:23, 20 April 2018
  • ...other hand, calculates the similarity using the cosine similarity of BERT [6] contextual embeddings. BertScore basically addresses two common pitfalls i ...with the same hardware, the Machine Translation test on BERTScore takes 15.6 secs compared to 5.4 secs for BLEU. The time range is essentially small and ...
    17 KB (2,510 words) - 01:32, 13 December 2020
  • 4. The use of dropout, which contributed ~1-6% improvement in the LMH code for different tissues, and ~2-7% in the DNI co ...
    8 KB (1,353 words) - 09:46, 30 August 2017
  • ...ombining the Probabilistic Label Tree [5] method and the Adaptive Softmax [6] to propose APLC. [6] Grave, E., Joulin, A., Cisse, M., J ´ egou, H., et al. Effi- ´ ...
    15 KB (2,456 words) - 22:04, 7 December 2020
  • ...</math>, is shown in the top row left column. Here, there are 4 nodes with 6 operations defined between them. ...ormance in those more complex architectures. The first large model (Figure 6) is targeted to image classification on the CIFAR-10 dataset and the second ...
    30 KB (4,568 words) - 12:53, 11 December 2018
  • [6] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, ...
    8 KB (1,119 words) - 04:28, 1 December 2021
  • 6). Repeat 3 -- 5 until convergence ...
    9 KB (1,589 words) - 09:46, 30 August 2017
  • ...ctions of the Wiener integral, we refer the reader to the textbook by Kuo [6]. Intuitively, the stochastic process <math>X_t</math> can be thought as th ...Gaussian Process Regression, Journal of Machine Learning Research, Volume 6, P1939-1959. ...
    26 KB (4,302 words) - 23:25, 7 December 2020
  • ...mpting to standardize the field of language understanding tasks. SentEval [6] evaluated fixed-size sentence embeddings for tasks. DecaNLP [7] converts t [6] Alexis Conneau and Douwe Kiela. SentEval: An evaluation toolkit for univer ...
    16 KB (2,331 words) - 16:58, 6 December 2020
  • ...r multiple hypotheses and long-term interactions between multiple agents" [6]. ...precise global position in urban road environment; 5) Micro-Autobox II and 6) a MDPS are used to control and actuate the subject. All data are stored in ...
    29 KB (4,569 words) - 23:12, 14 December 2020