statwiki user contributions feed for J47pan (MediaWiki 1.28.3, retrieved 2022-05-22)
F21-STAT 441/841 CM 763-Proposal (revised 2021-12-22 by J47pan)
<hr />
<div>Use this format (Don’t remove Project 0)<br />
<br />
Project # 0 Group members:<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Title: Making a String Telephone<br />
<br />
Description: We use paper cups to make a string phone and talk with friends while learning about sound waves with this science project. (Explain your project in one or two paragraphs).<br />
<br />
--------------------------------------------------------------------<br />
Project # 1 Group members:<br />
<br />
Feng, Jared<br />
<br />
Huang, Xipeng<br />
<br />
Xu, Mingwei<br />
<br />
Yu, Tingzhou<br />
<br />
Title: Patch-based classification of lung cancers pathological images using convolutional neural networks<br />
<br />
In this project, we explore the classification of lung cancer pathology images. The input images come from three tumour types (LUAD, LUSD, and MESO) and have been split into patches to reduce the computational burden. The classification task is decomposed into patch-wise and whole-image-wise stages. We experiment with three neural networks for patch-wise classification and two classical machine learning models for patient-level classification. Feature extraction techniques and sampling methods for training the neural networks are also implemented and studied. Our results show that a support vector machine (SVM) on extracted feature vectors outperforms all other methods, achieving an accuracy of 67.86% based on the DenseNet-121 model for patch-wise classification.<br />
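<br />
The patch-to-patient pipeline described above can be sketched as follows. This is a minimal illustration in which random vectors stand in for DenseNet-121 patch features; the feature dimension, mean pooling, and SVM settings are assumptions for the example, not the group's actual configuration.<br />

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def pool_patches(patch_features):
    # Aggregate patch-level features to one patient-level vector (mean pooling).
    return patch_features.mean(axis=0)

# Synthetic stand-in for DenseNet-121 patch features: 30 patients,
# 3 tumour classes, a variable number of 64-dim patches per patient.
patients = [rng.normal(loc=label, size=(rng.integers(5, 15), 64))
            for label in (0, 1, 2) for _ in range(10)]
labels = np.repeat([0, 1, 2], 10)

# Patient-level feature matrix, then a patient-level SVM on top.
X = np.stack([pool_patches(p) for p in patients])
clf = SVC(kernel="rbf").fit(X, labels)
accuracy = clf.score(X, labels)
```

In the real pipeline the patch features would come from a trained network and the SVM would be evaluated on held-out patients rather than the training set.<br />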
<br />
Our poster is [https://www.dropbox.com/s/fu6vr2cxcbt4458/Stat_841_poster.pdf?dl=0 here].<br />
--------------------------------------------------------------------<br />
Project # 2 Group members:<br />
<br />
Anderson, Eric<br />
<br />
Wang, Chengzhi<br />
<br />
Zhong, Kai<br />
<br />
Zhou, Yi Jing<br />
<br />
Title: Clean-Label Targeted Poisons for an End-to-End Trained CNN on the MNIST Dataset<br />
<br />
Description: Applying data poisoning techniques to the MNIST Dataset<br />
<br />
--------------------------------------------------------------------<br />
Project # 3 Group members:<br />
<br />
Chopra, Kanika<br />
<br />
Rajcoomar, Yush<br />
<br />
Bhattacharya, Vaibhav<br />
<br />
Title: Cancer Classification<br />
<br />
Description: We will be classifying three tumour types based on pathological data. <br />
<br />
--------------------------------------------------------------------<br />
Project # 4 Group members:<br />
<br />
Li, Shao Zhong<br />
<br />
Kerr, Hannah <br />
<br />
Wong, Ann Gie<br />
<br />
Title: Predicting "Pawpularity" of Pets with Image Regression<br />
<br />
Description: Analyze raw images and metadata to predict the “Pawpularity” of pet photos, helping shelters and rescuers around the world improve the appeal of their pet profiles so that more animals get adopted and find their "furever" homes faster.<br />
<br />
--------------------------------------------------------------------<br />
Project # 5 Group members:<br />
<br />
Chin, Jessie Man Wai<br />
<br />
Ooi, Yi Lin<br />
<br />
Shi, Yaqi<br />
<br />
Ngew, Shwen Lyng<br />
<br />
Title: The Application of Classification in Accelerated Underwriting (Insurance)<br />
<br />
Description: Accelerated Underwriting (AUW), also called “express underwriting,” is a faster and easier process for people in good health to obtain life insurance. The traditional underwriting process is often painful for both customers and insurers. Customers have to complete various questionnaires and provide medical tests involving blood, urine, saliva and other results, while underwriters have to manually go through every single policy to assess the risk of each applicant. AUW allows people who are deemed “healthy” to forgo medical exams. Since COVID-19, it has become an even more pressing topic, as traditional underwriting cannot be performed under stay-at-home orders. However, this places a burden on the insurance company to estimate risk well with fewer test results. <br />
<br />
This is where data science comes in. With classification methods, we can address the underwriting process's five pain points: labor, speed, efficiency, pricing and mortality. This allows us to better estimate risk and classify clients by whether they are eligible for accelerated underwriting. For the final project, we use data from one of the leading US insurers to analyze how clients can be classified for AUW. We will use factors such as health data, medical history, family history and insurance history to determine eligibility.<br />
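<br />
As an illustration of this kind of eligibility classification, the sketch below trains a classifier on synthetic applicant data; the four features and the eligibility rule are invented stand-ins for the insurer's proprietary health, medical, family and insurance history data.<br />

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Hypothetical applicant features standing in for health data, medical
# history, family history, and insurance history.
n = 500
X = rng.normal(size=(n, 4))
# Synthetic rule: applicants with a low overall risk score are AUW-eligible.
eligible = (X.sum(axis=1) < 0.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, eligible, test_size=0.2, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)
```

In practice the insurer would also calibrate a risk score and route borderline applicants to full underwriting; that logic is out of scope here.<br />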
<br />
--------------------------------------------------------------------<br />
Project # 6 Group members:<br />
<br />
Wang, Carolyn<br />
<br />
Cyrenne, Ethan<br />
<br />
Nguyen, Dieu Hoa<br />
<br />
Sin, Mary Jane<br />
<br />
Title: Pawpularity (PetFinder Kaggle Competition)<br />
<br />
Description: Using images and metadata on the images to predict the popularity of pet photos, which is calculated based on page view statistics and other metrics from the PetFinder website.<br />
<br />
--------------------------------------------------------------------<br />
Project # 7 Group members:<br />
<br />
Bhattacharya, Vaibhav<br />
<br />
Chatoor, Amanda<br />
<br />
Prathap Das, Sutej<br />
<br />
Title: PetFinder.my - Pawpularity Contest [https://www.kaggle.com/c/petfinder-pawpularity-score/overview]<br />
<br />
Description: In this competition, we will analyze raw images and metadata to predict the “Pawpularity” of pet photos. We'll train and test our model on PetFinder.my's thousands of pet profiles.<br />
<br />
--------------------------------------------------------------------<br />
Project # 8 Group members:<br />
<br />
Yan, Xin<br />
<br />
Duan, Yishu<br />
<br />
Di, Xibei<br />
<br />
Title: The application of classification on company bankruptcy prediction<br />
<br />
Description: If a company goes bankrupt, all its employees lose their jobs, and it is hard for them to find suitable new jobs in a short period. For the individual, an employee who loses a job to bankruptcy has no income for some time. This can have several negative consequences: increased homelessness, as people cannot cover living expenses, and increased crime rates as poverty rises. For the economy, if many companies go bankrupt at the same time, a huge number of employees lose their jobs, driving up the unemployment rate. This can cause a series of negative impacts: lost government tax revenue, since the unemployed have no income and pay no income tax, and increased inequality in the income distribution. <br />
<br />
Company bankruptcy therefore negatively affects individuals, government, society, and the economy, which makes predicting it essential. The purpose of this project is to predict whether a company will go bankrupt.<br />
--------------------------------------------------------------------<br />
Project # 9 Group members:<br />
<br />
Loke, Chun Waan<br />
<br />
Chong, Peter<br />
<br />
Osmond, Clarice<br />
<br />
Li, Zhilong<br />
<br />
Title: Popularity of Shelter Pet Photo Prediction using Varied ML Techniques<br />
<br />
Description: In this Kaggle competition, we will analyze raw images and metadata to predict the “Pawpularity” of pet photos.<br />
--------------------------------------------------------------------<br />
<br />
Project # 10 Group members:<br />
<br />
O'Farrell, Ethan<br />
<br />
D'Astous, Justin<br />
<br />
Hamed, Waqas<br />
<br />
Vladusic, Stefan<br />
<br />
Title: Pawpularity (Kaggle)<br />
<br />
Description: Predicting the popularity of animal photos based on photo metadata<br />
--------------------------------------------------------------------<br />
Project # 11 Group members:<br />
<br />
Pan, Junbin<br />
<br />
Title: Learning from Normality: Two-Stage Method with Autoencoder and Boosting Trees for Unsupervised Anomaly Detection<br />
<br />
Description: New algorithm for unsupervised anomaly detection<br />
--------------------------------------------------------------------<br />
Project # 12 Group members:<br />
<br />
Ng, Kar Lok<br />
<br />
Li, Muhan (Iris)<br />
<br />
Title: NFL Health & Safety - Helmet Assignment<br />
<br />
Description: Assigning players to the helmet in a given footage of head collision in football play.<br />
--------------------------------------------------------------------<br />
Project # 13 Group members:<br />
<br />
Livochka, Anastasiia<br />
<br />
Wong, Cassandra<br />
<br />
Evans, David<br />
<br />
Yalsavar, Maryam<br />
<br />
Title: TBD<br />
<br />
Description: TBD<br />
--------------------------------------------------------------------<br />
Project # 14 Group Members:<br />
<br />
Zeng, Mingde<br />
<br />
Lin, Xiaoyu<br />
<br />
Fan, Joshua<br />
<br />
Rao, Chen Min<br />
<br />
Title: Toxic Comment Classification, Kaggle<br />
<br />
Description: Using Wikipedia comments labeled for toxicity to train a model that detects toxicity in comments.<br />
--------------------------------------------------------------------<br />
Project # 15 Group Members:<br />
<br />
Huang, Yuying<br />
<br />
Anugu, Ankitha<br />
<br />
Chen, Yushan<br />
<br />
Title: Implementation of the classification task between crop and weeds<br />
<br />
Description: Our work will be based on the paper ''Crop and Weeds Classification for Precision Agriculture using Context-Independent Pixel-Wise Segmentation''.<br />
--------------------------------------------------------------------<br />
Project # 16 Group Members:<br />
<br />
Wang, Lingshan<br />
<br />
Li, Yifan<br />
<br />
Liu, Ziyi<br />
<br />
Title: Implement and Improve CNN in Multi-Class Text Classification<br />
<br />
Description: We will apply Bidirectional Encoder Representations from Transformers (BERT) to classify real-world data (building an efficient classifier for case-study interview materials) and improve the algorithm in the context of text classification, supported by a real-world data set. Implementing BERT also lets us analyze the efficiency and practicality of the algorithm when dealing with imbalanced data at both the input and modelling levels.<br />
The dataset is composed of case-study HTML files containing case information that can be classified into multiple industry categories. We will implement multi-class classification to break down the information in each case into pre-determined subcategories (e.g., behavioural questions, consulting questions, questions for new business/market entry). We will process the data into several formats (e.g., HTML, JSON, pandas data frames) and choose the most efficient raw-data processing logic based on runtime and algorithmic optimization.<br />
--------------------------------------------------------------------<br />
Project # 17 Group members:<br />
<br />
Malhi, Dilmeet<br />
<br />
Joshi, Vansh<br />
<br />
Syamala, Aavinash <br />
<br />
Islam, Sohan<br />
<br />
Title: Kaggle project: PetFinder.my - Pawpularity Contest<br />
<br />
Description: In this competition, we will analyze raw images provided by PetFinder.my to predict the “Pawpularity” of pet photos.<br />
--------------------------------------------------------------------<br />
<br />
Project # 18 Group members:<br />
<br />
Liu, Yuwei<br />
<br />
Mao, Daniel<br />
<br />
Title: Sartorius - Cell Instance Segmentation (Kaggle) [https://www.kaggle.com/c/sartorius-cell-instance-segmentation]<br />
<br />
Description: Detect single neuronal cells in microscopy images<br />
<br />
--------------------------------------------------------------------<br />
<br />
Project #19 Group members:<br />
<br />
Senko, Samuel<br />
<br />
Verhaar, Tyler<br />
<br />
Zhang, Bowen<br />
<br />
Title: NBA Game Prediction<br />
<br />
Description: We will build a win/loss classifier for NBA games using player and game data and also incorporating alternative data (ex. sports betting data).<br />
<br />
-------------------------------------------------------------------<br />
<br />
Project #20 Group members:<br />
<br />
Mitrache, Christian<br />
<br />
Renggli, Aaron<br />
<br />
Saini, Jessica<br />
<br />
Mossman, Alexandra<br />
<br />
Title: Classification and Deep Learning for Healthcare Provider Fraud Detection Analysis<br />
<br />
Description: TBD<br />
<br />
--------------------------------------------------------------------<br />
<br />
Project # 21 Group members:<br />
<br />
Wang, Kun<br />
<br />
Title: TBD<br />
<br />
Description: TBD<br />
<br />
--------------------------------------------------------------------<br />
<br />
Project # 22 Group members:<br />
<br />
Guray, Egemen<br />
<br />
Title: Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network<br />
<br />
Description: I will build a prediction system to recognize road signs in the German Traffic Sign Dataset using a CNN.<br />
--------------------------------------------------------------------<br />
<br />
Project # 23 Group members:<br />
<br />
Bsodjahi<br />
<br />
Title: Modeling Pseudomonas aeruginosa bacteria state through its genes expression activity<br />
<br />
Description: Label Pseudomonas aeruginosa gene expression data through unsupervised learning (e.g., the EM algorithm) and then model the bacterial state as a function of its gene expression.</div>
stat441F21 (revised 2021-12-01 by J47pan)
<hr />
<div><br />
<br />
== [[F21-STAT 441/841 CM 763-Proposal|Project Proposal]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
==Paper presentation==<br />
{| class="wikitable" border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 || Ali Ghodsi || || || || ||<br />
|-<br />
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||<br />
|-<br />
|Week of Nov 29 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||<br />
|-<br />
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://proceedings.neurips.cc/paper/2021/file/652cf38361a209088302ba2b8b7f51e0-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||<br />
|-<br />
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||<br />
|-<br />
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||<br />
|-<br />
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||<br />
|-<br />
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks || [https://arxiv.org/pdf/1804.00792.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Poison_Frogs_Neural_Networks Summary] ||<br />
|-<br />
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || Deep Residual Learning for Image Recognition || [https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Residual_Learning_for_Image_Recognition_Summary Summary] ||<br />
|-<br />
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Double_Descent_Where_Bigger_Models_and_More_Data_Hurt Summary] ||<br />
|-<br />
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || XGBoost: A Scalable Tree Boosting System || [https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=XGBoost Summary] ||<br />
|-<br />
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs Summary] ||<br />
|-<br />
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||<br />
|-<br />
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||<br />
|-<br />
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || Robust Imitation Learning from Noisy Demonstrations || [http://proceedings.mlr.press/v130/tangkaratt21a/tangkaratt21a.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Robust_Imitation_Learning_from_Noisy_Demonstrations Summary] ||<br />
|-<br />
|Week of Nov 29 ||Kun Wang || || Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases|| [https://doi-org.proxy.lib.uwaterloo.ca/10.1111/exsy.12705 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Convolutional_neural_network_for_diagnosis_of_viral_pneumonia_and_COVID-19_alike_diseases Summary] ||<br />
|-<br />
|Week of Nov 29 ||Egemen Guray || || Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network || [https://www.researchgate.net/publication/344399165_Traffic_Sign_Recognition_System_TSRS_SVM_and_Convolutional_Neural_Network Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Traffic_Sign_Recognition_System_(TSRS):_SVM_and_Convolutional_Neural_Network Summary] ||<br />
|-<br />
|Week of Nov 29 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease ||[https://www.mdpi.com/2076-3425/11/2/150/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Bayesian_Network_as_a_Decision_Tool_for_Predicting_ALS_Disease Summary]||<br />
|-<br />
|Week of Nov 29 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||<br />
|-<br />
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || A Game Theoretic Approach to Class-wise Selective Rationalization || [https://arxiv.org/pdf/1910.12853.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=A_Game_Theoretic_Approach_to_Class-wise_Selective_Rationalization#How_does_CAR_work_intuitively Summary]||<br />
|-<br />
|Week of Nov 29 ||Aavinash Syamala, Dilmeet Malhi, Sohan Islam, Vansh Joshi || || Research on Multiple Classification Based on Improved SVM Algorithm for Balanced Binary Decision Tree || [https://www.hindawi.com/journals/sp/2021/5560465/ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Research_on_Multiple_Classification_Based_on_Improved_SVM_Algorithm_for_Balanced_Binary_Decision_Tree Summary]||<br />
|-<br />
|Week of Nov 29 ||Christian Mitrache, Alexandra Mossman, Jessica Saini, Aaron Renggli|| || U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging|| [https://proceedings.neurips.cc/paper/2019/file/57bafb2c2dfeefba931bb03a835b1fa9-Paper.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=U-Time:A_Fully_Convolutional_Network_for_Time_Series_Segmentation_Applied_to_Sleep_Staging_Summary Summary]||<br />
|-<br />
|Week of Nov 29 ||Junbin Pan|| || Wide & Deep Learning for Recommender Systems || [https://arxiv.org/pdf/1606.07792v1.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems Summary]||
|}</div>
Wide and Deep Learning for Recommender Systems (revised 2021-12-01 by J47pan)
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents Wide & Deep Learning, an architecture that jointly trains wide linear models and deep neural networks. Deep neural networks, which are good at generalization, and generalized linear models with nonlinear feature transformations, which are good at memorization, have both been widely used in recommender systems. Combining the two, however, can achieve memorization and generalization at the same time. By jointly training wide linear models and deep neural networks, the paper demonstrates that the proposed Wide & Deep Learning outperforms wide-only and deep-only models on the Google Play recommender system, which serves over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Embedding-based models''' like factorization machines [5] factorize the interactions between two variables as a dot product of two low-dimensional embedding vectors to achieve generalization.<br />
<br />
2. '''Joint training of RNN and maximum entropy models with n-gram features''' in language models has significantly reduced the complexity of RNN by learning direct weights between inputs and outputs [4]. <br />
<br />
3. '''Deep residual learning''' [2] can reduce the difficulty of training deeper models and improve accuracy with shortcut connections.<br />
<br />
4. '''Collaborative deep learning''' has been used to couple deep learning for content information with collaborative filtering for the rating matrix [7].<br />
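<br />
The embedding-based interaction in item 1 can be sketched in a few lines. The feature count, embedding dimension, and weights below are invented for illustration; the second form is the standard O(dk) reformulation of the pairwise sum.<br />

```python
import numpy as np

rng = np.random.default_rng(2)

d, k = 6, 3                    # number of features, embedding dimension
V = rng.normal(size=(d, k))    # one low-dimensional embedding per feature
x = rng.integers(0, 2, size=d).astype(float)  # binary feature vector

# Factorization-machine second-order term:
# sum over feature pairs i < j of <v_i, v_j> * x_i * x_j.
pairwise = sum(V[i] @ V[j] * x[i] * x[j]
               for i in range(d) for j in range(i + 1, d))

# Equivalent linear-time form used in practice.
fast = 0.5 * (((V.T @ x) ** 2).sum() - ((V ** 2).T @ (x ** 2)).sum())
```

Because the interaction weight is a dot product of learned embeddings, the model can generalize to feature pairs never observed together in training.<br />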
<br />
== Motivation ==<br />
<br />
Can we build a model to achieve both memorization and generalization? This question motivates the concept of joint training wide and deep models, specifically Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformations can be improved by adding less granular features, but this requires a lot of feature-engineering work. On the other hand, the performance of embedding-based models can be improved by adding linear models with cross-product feature transformations, which memorize exception rules with a small number of parameters.<br />
<br />
Thus, one way to handle both problems is to combine the wide and deep models during training. The architecture is therefore motivated by Cheng et al. [6], who overcome these difficulties by jointly training wide and deep models together, taking advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
[[File:netowrokstruct.png|700px|thumb|center]]<br />
<br />
The '''wide component''' is a GLM of the form <math>y=w^Tx+b</math>, as illustrated in the left part of Figure 1, where <math>y</math> is the prediction, <math>x</math> is a vector of <math>d</math> features, <math>w</math> is the <math>d</math>-dimensional vector of model parameters, and <math>b</math> is the bias. The feature set includes transformed features produced by the cross-product transformation, defined as:<br />
<br />
[[File:equation.png|700px|thumb|center]]<br />
<br />
This transformation adds nonlinearity to the GLM and captures interactions between the binary features.<br />
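<br />
For binary features, each cross-product transformation reduces to an AND over the features it selects. A minimal sketch (the feature names in the example are invented for illustration):<br />

```python
import numpy as np

def cross_product(x, c):
    # Cross-product transformation phi_k(x) = prod_i x_i^{c_ki}, where
    # c_ki = 1 if feature i belongs to the k-th transformation.
    # x: (d,) binary feature vector; c: (num_transforms, d) binary mask.
    return np.prod(np.where(c == 1, x, 1.0), axis=1)

# Example features: [gender=male, language=en, installed_netflix]
x = np.array([1.0, 1.0, 0.0])
c = np.array([[1, 1, 0],    # AND(gender=male, language=en)        -> 1.0
              [0, 1, 1]])   # AND(language=en, installed_netflix)  -> 0.0
phi = cross_product(x, c)
```

Each output is 1 only if every selected feature is 1, which is exactly the memorized co-occurrence rule the wide model learns a weight for.<br />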
<br />
The '''deep component''' is a feed-forward neural network, as illustrated in the right part of Figure 1. For sparse inputs, high-dimensional categorical features are converted into low-dimensional, dense real-valued vectors (embedding vectors). The embedding vectors are initialized randomly and trained to minimize the final loss function. The low-dimensional dense embedding vectors are then fed into the hidden layers of the network in the forward pass.<br />
<br />
During training, the wide and deep components are combined using a weighted sum of their output log odds. This gives the prediction, which is fed into one common logistic loss function for joint training, back-propagating the gradients from the output to both parts of the model simultaneously using mini-batch stochastic optimization. The authors used FTRL [3] with L1 regularization and AdaGrad [1] as the optimizers for the wide and deep parts, respectively.<br />
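<br />
A single forward pass of the combined model can be sketched as follows. The layer sizes and random weights are placeholders, and the joint FTRL/AdaGrad training loop described above is omitted; the point is the shared sigmoid over the summed log odds of the two components.<br />

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)

# Wide part: linear model over (cross-product-transformed) binary features.
d_wide = 8
x_wide = rng.integers(0, 2, size=d_wide).astype(float)
w_wide = rng.normal(size=d_wide)

# Deep part: embedding lookup followed by one ReLU hidden layer.
vocab, emb_dim, hidden = 10, 4, 16
E = rng.normal(size=(vocab, emb_dim))        # embedding table
cat_ids = np.array([2, 7])                   # sparse categorical inputs
a = np.concatenate([E[i] for i in cat_ids])  # dense embedding vector
W1 = rng.normal(size=(hidden, a.size))
h = np.maximum(0.0, W1 @ a)                  # hidden activations
w_deep = rng.normal(size=hidden)

# Joint prediction: sum of the two components' log odds through one
# shared sigmoid, matching the joint-training setup described above.
b = 0.1
logit = w_wide @ x_wide + w_deep @ h + b
p = sigmoid(logit)
```

During joint training, the gradient of the shared logistic loss flows back through this sum into both the wide weights and the deep network, including the embedding table.<br />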
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, the Google Play app store, in two aspects: app acquisitions and serving performance. <br />
<br />
For app acquisition, the authors conducted live online experiments in an A/B testing framework for three weeks. In the control group, 1% of users were randomly selected and shown recommendations from the previous (wide) model; in the experiment group, 1% of users were shown recommendations from the Wide & Deep model, trained with the same features; a further 1% of users were shown recommendations from the deep part of the model alone, with the same network structure and features. In Table 1, the deep model and the Wide & Deep model improve online acquisition gain by 2.9% and 3.9%, respectively, relative to the control. In offline experiments, the Wide & Deep model outperforms the wide and deep models by 0.002 and 0.006 in AUC. Note that the difference is relatively small offline compared to online, since the labels in offline data are fixed, while the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
[[File:abc.png|700px|thumb|center]]<br />
<br />
For serving performance, during peak traffic the authors implemented multithreading and split each batch into smaller sizes, which reduced the client-side latency from 31 ms to 14 ms, as shown in Table 2.<br />
<br />
[[File:abc2.png|700px|thumb|center]]<br />
<br />
== Conclusion ==<br />
<br />
Achieving both memorization and generalization is important in recommender systems. Wide & Deep Learning combines a wide model and a deep model to achieve both: the wide linear model memorizes sparse feature interactions through cross-product feature transformations, while the deep neural network generalizes to unseen feature interactions through low-dimensional representations. The proposed model led to a significant improvement in app acquisitions over wide-only and deep-only models on the Google Play recommender system.<br />
<br />
== Critiques ==<br />
<br />
The Wide & Deep learning framework has dominated recommender systems over the last five years, and almost every major company uses some variant of it. However, the model tends to extract either low-order or high-order combined features and cannot extract both types at the same time. It therefore requires specialized domain knowledge for feature engineering, and it does not learn low-order combinational features well.<br />
<br />
== References ==<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.<br />
<br />
[6] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, Hemal Shah. Wide & Deep Learning for Recommender Systems. arXiv:1606.07792v1 [cs.LG] 24 Jun 2016<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.</div>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents a jointly trained wide linear models and deep neural networks architecture - Wide & Deep Learning. In the past, deep neural networks which is good at generalization and generalized linear models with nonlinear feature transformations methods which is good at memorization are widely used in the recommender system. However, combining the benefits of the two models can achieve both memorization and generalization at the same time in recommender system. With jointly training wide linear models and deep neural networks, this paper has demonstrated that a newly proposed Wide & Deep learning outperforms wide-only and deep-only models in recommender systems under the Google Play app with over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Embedding-based models''' like factorization machines [5] factorizes the interactions between two variables as a dot product between two low dimensional embedding vectors to achieve generalization.<br />
<br />
2. Joint training of RNN and maximum entropy models with n-gram features in language models has significantly reduced the complexity of RNN by learning direct weights between inputs and outputs [4]. <br />
<br />
3. Deep residual learning [2] can reduce the difficulty of training deeper models and improves the accuracy with shortcut connections.<br />
<br />
4. Collaborative deep learning haven been used to couple deep learning for content information and collaborative filtering for the rating matrix [7].<br />
<br />
== Motivation ==<br />
<br />
Can we build a model to achieve both memorization and generalization? This question motivates the concept of joint training wide and deep models, specifically Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformation can be improved by adding features that are less granular. However, this requires lots of work in feature engineering. On the other hand, the performance of embedding-based models can be improved by linear models with cross-product feature transformations to memorize the rules with a few number of parameters.<br />
<br />
Thus, to handle both problems would be to combine the wide and deep models in the training phase. Therefore, the architecture was motivated by Heng-Tze et al. [1] that overcome these difficulties by jointly training wide models and deep models together. It takes the advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
[[File:netowrokstruct.png|700px|thumb|center]]<br />
<br />
The '''wide component''' is a GLM in the form of <math>y=w^Tx+b</math> as illustrated in the left part of Figure 1 where y is the prediction, x is a vector of d features, w are the model parameters in d-dimensional and b is the bias. And the feature set includes transformed features using the cross-product transformation which can be defined as:<br />
<br />
[[File:equation.png|700px|thumb|center]]<br />
<br />
And, this transformation adds nonlinearity to the GLM and captures the interactions between the binary features.<br />
<br />
The '''deep component''' is a feed-forward neural network as illustrated in the right part of Figure 1. For the sparse inputs, high dimensional categorical features are converted into a low-dimensional and dense real-valued vector (embedding vector). Then, the embedding vector is initialized randomly and trained to minimize the final loss function during training. Last, the low dimensional dense embedding vectors are fed into the hidden layers of the network in the forward pass.<br />
<br />
During the training phase, the wide component and deep component are combined using the weighted sum of their output log odds. This gives the prediction and then fed to one common logistic loss function for joint training with back-propagating the gradients from output to both parts of the model simultaneously using mini=batch stochastic optimization. Also, the author used FTRL [3] with L1 regularization and AdaGrad [1] as optimizers for the wide and deep part respectively.<br />
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, Google Play app in two aspects: app acquisitions and serving performance. <br />
<br />
For app acquisition, the author conducted live online experiments in an A/B testing framework for 3 weeks, where in the control group, 1% users were randomly selected and presented with the previous recommendation models and in the experiment group, 1% users were randomly selected and presented with the Wide & Deep model using the same features as the wide model. Also, 1% users were randomly selected and presented with the deep part of the model with same network structure and features. In Table 1, the Wide & Deep model outperforms the wide model and deep model by 2.9 % and 3.9% respectively on online acquisition gain. And for offline experiments, the Wide & Deep model outperforms the wide model and deep model by 0.002 and 0.006 in terms of AUC. Note that the difference is relative small in offline compared to online since the labels in offline data are fixed while the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
[[File:abc.png|700px|thumb|center]]<br />
<br />
For serving performance, during the peak traffic, the author implemented multithreading and split each batch into smaller sizes which reduced the client-side latency from 31ms to 14ms as shown in Table 2.<br />
<br />
[[File:abc2.png|700px|thumb|center]]<br />
<br />
== Conclusion ==<br />
<br />
Achieving both memorization and generalization is important in recommender system. The Wide & Deep learning proposed in the paper combines wide model and deep model to achieve these two factors, where the wide linear models memorize sparse feature interactions with cross-product feature transformations while the deep neural network uses low-dimensional representation to generalize to unseen feature interactions. And the proposed model led to significant improvement on app acquisitions over wide models and deep models on the Google Play recommender system.<br />
<br />
== Critiques ==<br />
<br />
The Wide & Deep learning framework has dominated in the recommender system over the last 5 years where almost every company uses it. However, the model prefers to extract low dimensional or high dimensional combined features where it cannot extract both types of features at the same time. So, it requires specialized domain knowledge to do feature engineering and the model doesn't learn well on low dimensional combinational features.<br />
<br />
== References ==<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.abc<br />
<br />
[6] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, Hemal Shah. Wide & Deep Learning for Recommender Systems. arXiv:1606.07792v1 [cs.LG] 24 Jun 2016<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=File:equation.png&diff=51195File:equation.png2021-12-01T08:24:01Z<p>J47pan: </p>
<hr />
<div></div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=File:netowrokstruct.png&diff=51194File:netowrokstruct.png2021-12-01T08:21:51Z<p>J47pan: </p>
<hr />
<div></div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51193Wide and Deep Learning for Recommender Systems2021-12-01T08:21:28Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents a jointly trained wide linear models and deep neural networks architecture - Wide & Deep Learning. In the past, deep neural networks which is good at generalization and generalized linear models with nonlinear feature transformations methods which is good at memorization are widely used in the recommender system. However, combining the benefits of the two models can achieve both memorization and generalization at the same time in recommender system. With jointly training wide linear models and deep neural networks, this paper has demonstrated that a newly proposed Wide & Deep learning outperforms wide-only and deep-only models in recommender systems under the Google Play app with over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Embedding-based models''' like factorization machines [5] factorizes the interactions between two variables as a dot product between two low dimensional embedding vectors to achieve generalization.<br />
<br />
2. Joint training of RNN and maximum entropy models with n-gram features in language models has significantly reduced the complexity of RNN by learning direct weights between inputs and outputs [4]. <br />
<br />
3. Deep residual learning [2] can reduce the difficulty of training deeper models and improves the accuracy with shortcut connections.<br />
<br />
4. Collaborative deep learning haven been used to couple deep learning for content information and collaborative filtering for the rating matrix [7].<br />
<br />
== Motivation ==<br />
<br />
Can we build a model to achieve both memorization and generalization? This question motivates the concept of joint training wide and deep models, specifically Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformation can be improved by adding features that are less granular. However, this requires lots of work in feature engineering. On the other hand, the performance of embedding-based models can be improved by linear models with cross-product feature transformations to memorize the rules with a few number of parameters.<br />
<br />
Thus, to handle both problems would be to combine the wide and deep models in the training phase. Therefore, the architecture was motivated by Heng-Tze et al. [1] that overcome these difficulties by jointly training wide models and deep models together. It takes the advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
[[File:netowrokstruct.png|700px|thumb|center]]<br />
<br />
The '''wide component''' is a GLM in the form of <math>y=w^Tx+b</math> as illustrated in the left part of Figure 1 where y is the prediction, x is a vector of d features, w are the model parameters in d-dimensional and b is the bias. And the feature set includes transformed features using the cross-product transformation which can be defined as:<br />
<br />
[[File:equation.png|700px|thumb|center]]<br />
<br />
And, this transformation adds nonlinearity to the GLM and captures the interactions between the binary features.<br />
<br />
The '''deep component''' is a feed-forward neural network as illustrated in the right part of Figure 1. For the sparse inputs, high dimensional categorical features are converted into a low-dimensional and dense real-valued vector (embedding vector). Then, the embedding vector is initialized randomly and trained to minimize the final loss function during training. Last, the low dimensional dense embedding vectors are fed into the hidden layers of the network in the forward pass.<br />
<br />
During the training phase, the wide component and deep component are combined using the weighted sum of their output log odds. This gives the prediction and then fed to one common logistic loss function for joint training with back-propagating the gradients from output to both parts of the model simultaneously using mini=batch stochastic optimization.<br />
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, Google Play app in two aspects: app acquisitions and serving performance. <br />
<br />
For app acquisition, the author conducted live online experiments in an A/B testing framework for 3 weeks, where in the control group, 1% users were randomly selected and presented with the previous recommendation models and in the experiment group, 1% users were randomly selected and presented with the Wide & Deep model using the same features as the wide model. Also, 1% users were randomly selected and presented with the deep part of the model with same network structure and features. In Table 1, the Wide & Deep model outperforms the wide model and deep model by 2.9 % and 3.9% respectively on online acquisition gain. And for offline experiments, the Wide & Deep model outperforms the wide model and deep model by 0.002 and 0.006 in terms of AUC. Note that the difference is relative small in offline compared to online since the labels in offline data are fixed while the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
[[File:abc.png|700px|thumb|center]]<br />
<br />
For serving performance, during the peak traffic, the author implemented multithreading and split each batch into smaller sizes which reduced the client-side latency from 31ms to 14ms as shown in Table 2.<br />
<br />
[[File:abc2.png|700px|thumb|center]]<br />
<br />
== Conclusion ==<br />
<br />
Achieving both memorization and generalization is important in recommender system. The Wide & Deep learning proposed in the paper combines wide model and deep model to achieve these two factors, where the wide linear models memorize sparse feature interactions with cross-product feature transformations while the deep neural network uses low-dimensional representation to generalize to unseen feature interactions. And the proposed model led to significant improvement on app acquisitions over wide models and deep models on the Google Play recommender system.<br />
<br />
== Critiques ==<br />
<br />
The Wide & Deep learning framework has dominated in the recommender system over the last 5 years where almost every company uses it. However, the model prefers to extract low dimensional or high dimensional combined features where it cannot extract both types of features at the same time. So, it requires specialized domain knowledge to do feature engineering and the model doesn't learn well on low dimensional combinational features.<br />
<br />
== References ==<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.abc<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.<br />
<br />
[9] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, Hemal Shah. Wide & Deep Learning for Recommender Systems. arXiv:1606.07792v1 [cs.LG] 24 Jun 2016</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51192Wide and Deep Learning for Recommender Systems2021-12-01T08:04:03Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents a jointly trained wide linear models and deep neural networks architecture - Wide & Deep Learning. In the past, deep neural networks which is good at generalization and generalized linear models with nonlinear feature transformations methods which is good at memorization are widely used in the recommender system. However, combining the benefits of the two models can achieve both memorization and generalization at the same time in recommender system. With jointly training wide linear models and deep neural networks, this paper has demonstrated that a newly proposed Wide & Deep learning outperforms wide-only and deep-only models in recommender systems under the Google Play app with over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Embedding-based models''' like factorization machines [5] factorizes the interactions between two variables as a dot product between two low dimensional embedding vectors to achieve generalization.<br />
<br />
2. Joint training of RNN and maximum entropy models with n-gram features in language models has significantly reduced the complexity of RNN by learning direct weights between inputs and outputs [4]. <br />
<br />
3. Deep residual learning [2] can reduce the difficulty of training deeper models and improves the accuracy with shortcut connections.<br />
<br />
4. Collaborative deep learning haven been used to couple deep learning for content information and collaborative filtering for the rating matrix [7].<br />
<br />
== Motivation ==<br />
<br />
Can we build a model to achieve both memorization and generalization? This question motivates the concept of joint training wide and deep models, specifically Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformation can be improved by adding features that are less granular. However, this requires lots of work in feature engineering. On the other hand, the performance of embedding-based models can be improved by linear models with cross-product feature transformations to memorize the rules with a few number of parameters.<br />
<br />
Thus, to handle both problems would be to combine the wide and deep models in the training phase. Therefore, the architecture was motivated by Heng-Tze et al. [1] that overcome these difficulties by jointly training wide models and deep models together. It takes the advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
The wide component is a GLM in the form of <math>y=w^Tx+b</math> <br />
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, Google Play app in two aspects: app acquisitions and serving performance. <br />
<br />
For app acquisition, the author conducted live online experiments in an A/B testing framework for 3 weeks, where in the control group, 1% users were randomly selected and presented with the previous recommendation models and in the experiment group, 1% users were randomly selected and presented with the Wide & Deep model using the same features as the wide model. Also, 1% users were randomly selected and presented with the deep part of the model with same network structure and features. In Table 1, the Wide & Deep model outperforms the wide model and deep model by 2.9 % and 3.9% respectively on online acquisition gain. And for offline experiments, the Wide & Deep model outperforms the wide model and deep model by 0.002 and 0.006 in terms of AUC. Note that the difference is relative small in offline compared to online since the labels in offline data are fixed while the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
[[File:abc.png|700px|thumb|center]]<br />
<br />
For serving performance, during the peak traffic, the author implemented multithreading and split each batch into smaller sizes which reduced the client-side latency from 31ms to 14ms as shown in Table 2.<br />
<br />
[[File:abc2.png|700px|thumb|center]]<br />
<br />
== Conclusion ==<br />
<br />
Achieving both memorization and generalization is important in recommender system. The Wide & Deep learning proposed in the paper combines wide model and deep model to achieve these two factors, where the wide linear models memorize sparse feature interactions with cross-product feature transformations while the deep neural network uses low-dimensional representation to generalize to unseen feature interactions. And the proposed model led to significant improvement on app acquisitions over wide models and deep models on the Google Play recommender system.<br />
<br />
== Critiques ==<br />
<br />
The Wide & Deep learning framework has dominated in the recommender system over the last 5 years where almost every company uses it. However, the model prefers to extract low dimensional or high dimensional combined features where it cannot extract both types of features at the same time. So, it requires specialized domain knowledge to do feature engineering and the model doesn't learn well on low dimensional combinational features.<br />
<br />
== References ==<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.abc<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.<br />
<br />
[9] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, Hemal Shah. Wide & Deep Learning for Recommender Systems. arXiv:1606.07792v1 [cs.LG] 24 Jun 2016</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=File:abc2.png&diff=51191File:abc2.png2021-12-01T07:08:01Z<p>J47pan: </p>
<hr />
<div></div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51190Wide and Deep Learning for Recommender Systems2021-12-01T07:07:50Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents a jointly trained wide linear models and deep neural networks architecture - Wide & Deep Learning. In the past, deep neural networks which is good at generalization and generalized linear models with nonlinear feature transformations methods which is good at memorization are widely used in the recommender system. However, combining the benefits of the two models can achieve both memorization and generalization at the same time in recommender system. With jointly training wide linear models and deep neural networks, this paper has demonstrated that a newly proposed Wide & Deep learning outperforms wide-only and deep-only models in recommender systems under the Google Play app with over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Generalized linear models''' like logistic regression are trained on binarized sparse features with one-hot encoding using cross-product transformation to achieve memorization, but these models don't generalize to unseen query-item feature pairs, which lacks in generalization.<br />
<br />
2. '''Embedding-based models''' like factorization machines (S. Rendle, 2012) or deep neural networks learns a low-dimensional dense embedding vector for each query and item feature to generalize on query-item feature pairs that have never been seen before by learning, but with less work on feature engineering. However, under a sparse and high rank query-item matrix, it is hard to learn the low-dimensional representation for the query-item matrix, which lacks in memorization.<br />
<br />
== Motivation ==<br />
<br />
Can we build a model to achieve both memorization and generalization? This question motivates the concept of joint training wide and deep models, specifically Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformation can be improved by adding features that are less granular. However, this requires lots of work in feature engineering. On the other hand, the performance of embedding-based models can be improved by linar models with cross-product feature transformations to memorize the rules with a few number of parameters.<br />
<br />
Thus, to handle both problems would be to combine the wide and deep models in the training phase. Therefore, the architecture was motivated by Heng-Tze et al. [1] that overcome these difficulties by jointly training wide models and deep models together. It takes the advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, Google Play app in two aspects: app acquisitions and serving performance. <br />
<br />
For app acquisition, the author conducted live online experiments in an A/B testing framework for 3 weeks, where in the control group, 1% users were randomly selected and presented with the previous recommendation models and in the experiment group, 1% users were randomly selected and presented with the Wide & Deep model using the same features as the wide model. Also, 1% users were randomly selected and presented with the deep part of the model with same network structure and features. In Table 1, the Wide & Deep model outperforms the wide model and deep model by 2.9 % and 3.9% respectively on online acquisition gain. And for offline experiments, the Wide & Deep model outperforms the wide model and deep model by 0.002 and 0.006 in terms of AUC. Note that the difference is relative small in offline compared to online since the labels in offline data are fixed while the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
[[File:abc.png|700px|thumb|center]]<br />
<br />
For serving performance, during the peak traffic, the author implemented multithreading and split each batch into smaller sizes which reduced the client-side latency from 31ms to 14ms as shown in Table 2.<br />
<br />
[[File:abc2.png|700px|thumb|center]]<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] Heng-Tze Cheng<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.abc<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=File:abc.png&diff=51189File:abc.png2021-12-01T07:06:48Z<p>J47pan: </p>
<hr />
<div></div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51188Wide and Deep Learning for Recommender Systems2021-12-01T07:06:24Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents a jointly trained wide linear models and deep neural networks architecture - Wide & Deep Learning. In the past, deep neural networks which is good at generalization and generalized linear models with nonlinear feature transformations methods which is good at memorization are widely used in the recommender system. However, combining the benefits of the two models can achieve both memorization and generalization at the same time in recommender system. With jointly training wide linear models and deep neural networks, this paper has demonstrated that a newly proposed Wide & Deep learning outperforms wide-only and deep-only models in recommender systems under the Google Play app with over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Generalized linear models''' like logistic regression are trained on binarized sparse features with one-hot encoding using cross-product transformation to achieve memorization, but these models don't generalize to unseen query-item feature pairs, which lacks in generalization.<br />
<br />
2. '''Embedding-based models''' such as factorization machines (S. Rendle, 2012) or deep neural networks learn a low-dimensional dense embedding vector for each query and item feature, which lets them generalize to previously unseen query-item pairs with less feature engineering. However, when the query-item matrix is sparse and high-rank, effective low-dimensional representations are hard to learn, so these models lack memorization.<br />
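To make the memorization/generalization contrast concrete, here is a minimal illustrative sketch (not code from the paper; the feature names, vocabulary, and embedding dimension are invented): a cross-product feature is active only for an exact co-occurrence seen at training time, while an embedding dot product assigns a score to any pair.<br />

```python
import numpy as np

# --- Memorization: cross-product transformation over one-hot features ---
def cross_feature(feature_values, names):
    """Binary cross-product feature: active only if every component is present."""
    if all(n in feature_values for n in names):
        return " AND ".join(f"{n}={feature_values[n]}" for n in names)
    return None

x = {"installed_app": "netflix", "impression_app": "pandora"}
key = cross_feature(x, ("installed_app", "impression_app"))
# A linear model memorizes one weight for this exact key; a key that never
# occurred in training has no learned weight, so the model cannot generalize.

# --- Generalization: dense embeddings score any query-item pair ---
rng = np.random.default_rng(0)
vocab = {"netflix": 0, "pandora": 1, "spotify": 2}
emb = rng.normal(size=(len(vocab), 8))  # 8-dim embeddings; learned in practice

def embedding_score(a, b):
    """Dot product of two item embeddings; defined even for unseen pairs."""
    return float(emb[vocab[a]] @ emb[vocab[b]])

s = embedding_score("netflix", "spotify")  # pair need not co-occur in training
```

The wide component learns one weight per cross-product key, so unseen keys contribute nothing; the embedding score stays defined for unseen pairs, which is exactly the generalization property described above.<br />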
<br />
== Motivation ==<br />
<br />
Can we build a single model that achieves both memorization and generalization? This question motivates jointly training wide and deep models: Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformations can be improved by adding coarser-grained features, but this requires substantial feature engineering. Conversely, embedding-based models can be improved by adding a linear model with cross-product feature transformations, which memorizes exception rules with a small number of parameters.<br />
<br />
A natural way to address both problems is therefore to combine the wide and deep models during training. The architecture follows Heng-Tze Cheng et al. [1], who overcome these difficulties by jointly training the wide and deep models, taking advantage of both memorization and generalization.<br />
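As a sketch of what joint training combines (a hedged illustration with made-up dimensions and random weights, not the paper's production model), the wide and deep parts contribute to a single logit that feeds one sigmoid output, so one loss trains both parts together, unlike an ensemble in which each model is trained separately:<br />

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n_wide, n_dense, n_hidden = 6, 4, 8        # toy sizes, chosen arbitrarily

w_wide = rng.normal(size=n_wide)           # linear weights on sparse/cross features
W1 = rng.normal(size=(n_dense, n_hidden))  # one deep hidden layer (ReLU)
w_deep = rng.normal(size=n_hidden)         # weights on the final hidden activations
b = 0.0

def wide_deep_forward(x_wide, x_dense):
    """P(Y=1 | x) = sigmoid(wide logit + deep logit + bias)."""
    hidden = np.maximum(0.0, x_dense @ W1)           # deep part
    logit = x_wide @ w_wide + hidden @ w_deep + b    # summed into one logit
    return sigmoid(logit)

p = wide_deep_forward(rng.normal(size=n_wide), rng.normal(size=n_dense))
```

Because the gradient of the single loss flows into both `w_wide` and the deep weights, each part only needs to complement the other rather than model everything on its own.<br />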
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, the Google Play app store, along two dimensions: app acquisitions and serving performance.<br />
<br />
For app acquisition, the authors ran live online experiments in an A/B testing framework for 3 weeks. In the control group, 1% of users were randomly selected and shown recommendations generated by the previous (wide) model; in the experiment group, another 1% of users were shown recommendations from the Wide & Deep model using the same features; a further 1% of users were shown recommendations from the deep part of the model alone, with the same network structure and features. As shown in Table 1, relative to the control, the deep-only model improved online app acquisition gain by 2.9% and the Wide & Deep model by 3.9%. In offline experiments, the Wide & Deep model's AUC exceeded that of the wide and deep models by 0.002 and 0.006, respectively. The offline differences are relatively small compared to the online gains because labels in the offline data are fixed, while the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
[[File:abc.png|700px|thumb|center]]<br />
<br />
For serving performance during peak traffic, the authors used multithreading and split each batch into smaller sizes, which reduced client-side latency from 31 ms to 14 ms, as shown in Table 2.<br />
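The batching described above can be sketched as follows (an illustrative outline with a trivial stand-in for model inference; the paper's serving system is not public):<br />

```python
from concurrent.futures import ThreadPoolExecutor

def score_batch(batch):
    # Stand-in for running the model on one small batch of candidates.
    return [x * 2.0 for x in batch]

def serve(candidates, batch_size=4, workers=4):
    """Split one large scoring request into small batches run in parallel threads."""
    batches = [candidates[i:i + batch_size]
               for i in range(0, len(candidates), batch_size)]
    scores = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves batch order, so results concatenate directly.
        for result in pool.map(score_batch, batches):
            scores.extend(result)
    return scores

scores = serve(list(range(10)))
```

Running smaller batches concurrently shortens the critical path of each request, consistent with the latency drop reported in Table 2.<br />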
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] H.-T. Cheng, L. Koc, J. Harmsen, T. Shaked, T. Chandra, H. Aradhye, G. Anderson, G. Corrado, W. Chai, M. Ispir, et al. Wide & deep learning for recommender systems. In Proc. 1st Workshop on Deep Learning for Recommender Systems (DLRS), pages 7–10, 2016.<br />
<br />
[2] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[3] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[4] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[5] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[6] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.<br />
<br />
[7] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807, 2014.<br />
<br />
[8] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[9] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents a jointly trained wide linear models and deep neural networks architecture - Wide & Deep Learning. In the past, deep neural networks which is good at generalization and generalized linear models with nonlinear feature transformations methods which is good at memorization are widely used in the recommender system. However, combining the benefits of the two models can achieve both memorization and generalization at the same time in recommender system. With jointly training wide linear models and deep neural networks, this paper has demonstrated that a newly proposed Wide & Deep learning outperforms wide-only and deep-only models in recommender systems under the Google Play app with over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Generalized linear models''' like logistic regression are trained on binarized sparse features with one-hot encoding using cross-product transformation to achieve memorization, but these models don't generalize to unseen query-item feature pairs, which lacks in generalization.<br />
<br />
2. '''Embedding-based models''' like factorization machines (S. Rendle, 2012) or deep neural networks learns a low-dimensional dense embedding vector for each query and item feature to generalize on query-item feature pairs that have never been seen before by learning, but with less work on feature engineering. However, under a sparse and high rank query-item matrix, it is hard to learn the low-dimensional representation for the query-item matrix, which lacks in memorization.<br />
<br />
== Motivation ==<br />
<br />
Can we build a model to achieve both memorization and generalization? This question motivates the concept of joint training wide and deep models, specifically Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformation can be improved by adding features that are less granular. However, this requires lots of work in feature engineering. On the other hand, the performance of embedding-based models can be improved by linar models with cross-product feature transformations to memorize the rules with a few number of parameters.<br />
<br />
Thus, to handle both problems would be to combine the wide and deep models in the training phase. Therefore, the architecture was motivated by Heng-Tze et al. [1] that overcome these difficulties by jointly training wide models and deep models together. It takes the advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, Google Play app in two aspects: app acquisitions and serving performance. <br />
<br />
For app acquisition, the author conducted live online experiments in an A/B testing framework for 3 weeks, where in the control group, 1% users were randomly selected and presented with the previous recommendation models and in the experiment group, 1% users were randomly selected and presented with the Wide & Deep model using the same features as the wide model. Also, 1% users were randomly selected and presented with the deep part of the model with same network structure and features. In Table 1, the Wide & Deep model outperforms the wide model and deep model by 2.9 % and 3.9% respectively on online acquisition gain. And for offline experiments, the Wide & Deep model outperforms the wide model and deep model by 0.002 and 0.006 in terms of AUC. Note that the difference is relative small in offline compared to online since the labels in offline data are fixed while the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
[[File:abc.jpg|200px|thumb|left|picforabc]]<br />
<br />
For serving performance, during the peak traffic, the author implemented multithreading and split each batch into smaller sizes which reduced the client-side latency from 31ms to 14ms as shown in Table 2.<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] Heng-Tze Cheng<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.abc<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51186Wide and Deep Learning for Recommender Systems2021-12-01T07:03:47Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents a jointly trained wide linear models and deep neural networks architecture - Wide & Deep Learning. In the past, deep neural networks which is good at generalization and generalized linear models with nonlinear feature transformations methods which is good at memorization are widely used in the recommender system. However, combining the benefits of the two models can achieve both memorization and generalization at the same time in recommender system. With jointly training wide linear models and deep neural networks, this paper has demonstrated that a newly proposed Wide & Deep learning outperforms wide-only and deep-only models in recommender systems under the Google Play app with over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Generalized linear models''' like logistic regression are trained on binarized sparse features with one-hot encoding using cross-product transformation to achieve memorization, but these models don't generalize to unseen query-item feature pairs, which lacks in generalization.<br />
<br />
2. '''Embedding-based models''' like factorization machines (S. Rendle, 2012) or deep neural networks learns a low-dimensional dense embedding vector for each query and item feature to generalize on query-item feature pairs that have never been seen before by learning, but with less work on feature engineering. However, under a sparse and high rank query-item matrix, it is hard to learn the low-dimensional representation for the query-item matrix, which lacks in memorization.<br />
<br />
== Motivation ==<br />
<br />
Can we build a model to achieve both memorization and generalization? This question motivates the concept of joint training wide and deep models, specifically Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformation can be improved by adding features that are less granular. However, this requires lots of work in feature engineering. On the other hand, the performance of embedding-based models can be improved by linar models with cross-product feature transformations to memorize the rules with a few number of parameters.<br />
<br />
Thus, to handle both problems would be to combine the wide and deep models in the training phase. Therefore, the architecture was motivated by Heng-Tze et al. [1] that overcome these difficulties by jointly training wide models and deep models together. It takes the advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, Google Play app in two aspects: app acquisitions and serving performance. <br />
<br />
For app acquisition, the author conducted live online experiments in an A/B testing framework for 3 weeks, where in the control group, 1% users were randomly selected and presented with the previous recommendation models and in the experiment group, 1% users were randomly selected and presented with the Wide & Deep model using the same features as the wide model. Also, 1% users were randomly selected and presented with the deep part of the model with same network structure and features. In Table 1, the Wide & Deep model outperforms the wide model and deep model by 2.9 % and 3.9% respectively on online acquisition gain. And for offline experiments, the Wide & Deep model outperforms the wide model and deep model by 0.002 and 0.006 in terms of AUC. Note that the difference is relative small in offline compared to online since the labels in offline data are fixed while the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
[[File:abc.jpg|200px|thumb|left|picforabc]]<br />
<br />
<br />
For serving performance, during the peak traffic, the author implemented multithreading and split each batch into smaller sizes which reduced the client-side latency from 31ms to 14ms as shown in Table 2.<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] Heng-Tze Cheng<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.abc<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51185Wide and Deep Learning for Recommender Systems2021-12-01T07:03:18Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents a jointly trained wide linear models and deep neural networks architecture - Wide & Deep Learning. In the past, deep neural networks which is good at generalization and generalized linear models with nonlinear feature transformations methods which is good at memorization are widely used in the recommender system. However, combining the benefits of the two models can achieve both memorization and generalization at the same time in recommender system. With jointly training wide linear models and deep neural networks, this paper has demonstrated that a newly proposed Wide & Deep learning outperforms wide-only and deep-only models in recommender systems under the Google Play app with over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Generalized linear models''' like logistic regression are trained on binarized sparse features with one-hot encoding using cross-product transformation to achieve memorization, but these models don't generalize to unseen query-item feature pairs, which lacks in generalization.<br />
<br />
2. '''Embedding-based models''' like factorization machines (S. Rendle, 2012) or deep neural networks learns a low-dimensional dense embedding vector for each query and item feature to generalize on query-item feature pairs that have never been seen before by learning, but with less work on feature engineering. However, under a sparse and high rank query-item matrix, it is hard to learn the low-dimensional representation for the query-item matrix, which lacks in memorization.<br />
<br />
== Motivation ==<br />
<br />
Can we build a model to achieve both memorization and generalization? This question motivates the concept of joint training wide and deep models, specifically Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformation can be improved by adding features that are less granular. However, this requires lots of work in feature engineering. On the other hand, the performance of embedding-based models can be improved by linar models with cross-product feature transformations to memorize the rules with a few number of parameters.<br />
<br />
Thus, to handle both problems would be to combine the wide and deep models in the training phase. Therefore, the architecture was motivated by Heng-Tze et al. [1] that overcome these difficulties by jointly training wide models and deep models together. It takes the advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, Google Play app in two aspects: app acquisitions and serving performance. <br />
<br />
For app acquisition, the author conducted live online experiments in an A/B testing framework for 3 weeks, where in the control group, 1% users were randomly selected and presented with the previous recommendation models and in the experiment group, 1% users were randomly selected and presented with the Wide & Deep model using the same features as the wide model. Also, 1% users were randomly selected and presented with the deep part of the model with same network structure and features. In Table 1, the Wide & Deep model outperforms the wide model and deep model by 2.9 % and 3.9% respectively on online acquisition gain. And for offline experiments, the Wide & Deep model outperforms the wide model and deep model by 0.002 and 0.006 in terms of AUC. Note that the difference is relative small in offline compared to online since the labels in offline data are fixed while the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
[[File:abc.jpg|200px|thumb|left|alt text]]<br />
<br />
For serving performance, during the peak traffic, the author implemented multithreading and split each batch into smaller sizes which reduced the client-side latency from 31ms to 14ms as shown in Table 2.<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] Heng-Tze Cheng<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.abc<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51184Wide and Deep Learning for Recommender Systems2021-12-01T06:59:59Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents a jointly trained wide linear models and deep neural networks architecture - Wide & Deep Learning. In the past, deep neural networks which is good at generalization and generalized linear models with nonlinear feature transformations methods which is good at memorization are widely used in the recommender system. However, combining the benefits of the two models can achieve both memorization and generalization at the same time in recommender system. With jointly training wide linear models and deep neural networks, this paper has demonstrated that a newly proposed Wide & Deep learning outperforms wide-only and deep-only models in recommender systems under the Google Play app with over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Generalized linear models''' like logistic regression are trained on binarized sparse features with one-hot encoding using cross-product transformation to achieve memorization, but these models don't generalize to unseen query-item feature pairs, which lacks in generalization.<br />
<br />
2. '''Embedding-based models''' like factorization machines (S. Rendle, 2012) or deep neural networks learns a low-dimensional dense embedding vector for each query and item feature to generalize on query-item feature pairs that have never been seen before by learning, but with less work on feature engineering. However, under a sparse and high rank query-item matrix, it is hard to learn the low-dimensional representation for the query-item matrix, which lacks in memorization.<br />
<br />
== Motivation ==<br />
<br />
Can we build a model to achieve both memorization and generalization? This question motivates the concept of joint training wide and deep models, specifically Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformation can be improved by adding features that are less granular. However, this requires lots of work in feature engineering. On the other hand, the performance of embedding-based models can be improved by linar models with cross-product feature transformations to memorize the rules with a few number of parameters.<br />
<br />
Thus, to handle both problems would be to combine the wide and deep models in the training phase. Therefore, the architecture was motivated by Heng-Tze et al. [1] that overcome these difficulties by jointly training wide models and deep models together. It takes the advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, Google Play app in two aspects: app acquisitions and serving performance. <br />
<br />
For app acquisition, the author conducted live online experiments in an A/B testing framework for 3 weeks, where in the control group, 1% users were randomly selected and presented with the previous recommendation models and in the experiment group, 1% users were randomly selected and presented with the Wide & Deep model using the same features as the wide model. Also, 1% users were randomly selected and presented with the deep part of the model with same network structure and features. In Table 1, the Wide & Deep model outperforms the wide model and deep model by 2.9 % and 3.9% respectively on online acquisition gain. And for offline experiments, the Wide & Deep model outperforms the wide model and deep model by 0.002 and 0.006 in terms of AUC. Note that the difference is relative small in offline compared to online since the labels in offline data are fixed while the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
[[File:abc.jpg]]<br />
<br />
For serving performance, during the peak traffic, the author implemented multithreading and split each batch into smaller sizes which reduced the client-side latency from 31ms to 14ms as shown in Table 2.<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] Heng-Tze Cheng<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.abc<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51183Wide and Deep Learning for Recommender Systems2021-12-01T05:19:10Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents a jointly trained wide linear models and deep neural networks architecture - Wide & Deep Learning. In the past, deep neural networks which is good at generalization and generalized linear models with nonlinear feature transformations methods which is good at memorization are widely used in the recommender system. However, combining the benefits of the two models can achieve both memorization and generalization at the same time in recommender system. With jointly training wide linear models and deep neural networks, this paper has demonstrated that a newly proposed Wide & Deep learning outperforms wide-only and deep-only models in recommender systems under the Google Play app with over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Generalized linear models''' like logistic regression are trained on binarized sparse features with one-hot encoding using cross-product transformation to achieve memorization, but these models don't generalize to unseen query-item feature pairs, which lacks in generalization.<br />
<br />
2. '''Embedding-based models''' like factorization machines (S. Rendle, 2012) or deep neural networks learns a low-dimensional dense embedding vector for each query and item feature to generalize on query-item feature pairs that have never been seen before by learning, but with less work on feature engineering. However, under a sparse and high rank query-item matrix, it is hard to learn the low-dimensional representation for the query-item matrix, which lacks in memorization.<br />
<br />
== Motivation ==<br />
<br />
Can we build a model to achieve both memorization and generalization? This question motivates the concept of joint training wide and deep models, specifically Wide & Deep Learning.<br />
<br />
The performance of generalized linear models with cross-product transformation can be improved by adding features that are less granular. However, this requires lots of work in feature engineering. On the other hand, the performance of embedding-based models can be improved by linar models with cross-product feature transformations to memorize the rules with a few number of parameters.<br />
<br />
Thus, to handle both problems would be to combine the wide and deep models in the training phase. Therefore, the architecture was motivated by Heng-Tze et al. [1] that overcome these difficulties by jointly training wide models and deep models together. It takes the advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
The proposed architecture was implemented and evaluated in a real-world recommender system, the Google Play app store, along two dimensions: app acquisitions and serving performance.<br />
<br />
For app acquisitions, the authors conducted live online experiments in an A/B testing framework for 3 weeks. In the control group, 1% of users were randomly selected and shown recommendations from the previous model; in the experiment group, 1% of users were shown recommendations from the Wide & Deep model using the same features as the wide model; and in a third group, 1% of users were shown recommendations from the deep part of the model alone, with the same network structure and features. As shown in Table 1, the Wide & Deep model improves the online acquisition gain by 3.9% relative to the wide-only control, while the deep-only model improves it by 2.9%. In offline experiments, the Wide & Deep model outperforms the wide and deep models by 0.002 and 0.006 in AUC, respectively. Note that the difference is relatively small offline compared to online, since the labels in the offline data are fixed, whereas the online system can generate new exploratory recommendations using both memorization and generalization.<br />
<br />
For serving performance, the authors handled peak traffic by using multithreading and splitting each batch into smaller sizes, which reduced the client-side latency from 31 ms to 14 ms, as shown in Table 2.<br />
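The batch-splitting trick can be illustrated with a thread pool (a schematic sketch with a fake scoring function, not Google's serving code):<br />

```python
from concurrent.futures import ThreadPoolExecutor

def score(candidate):
    # Stand-in for one forward pass of the model on a candidate app.
    return (candidate * 2654435761) % 997 / 997.0

def score_batch(batch, split_size=50, workers=4):
    """Split one large candidate batch into smaller sub-batches and
    score them on parallel threads, preserving the original order."""
    sub_batches = [batch[i:i + split_size] for i in range(0, len(batch), split_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda sb: [score(c) for c in sb], sub_batches)
    return [s for sub in results for s in sub]

scores = score_batch(list(range(200)))
```

Smaller sub-batches finish faster individually, so no single request waits for one huge matrix multiplication, which is consistent with the tail-latency reduction reported in the paper.<br />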
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] H.-T. Cheng, L. Koc, J. Harmsen, T. Shaked, et al. Wide & Deep Learning for Recommender Systems. In Proc. 1st Workshop on Deep Learning for Recommender Systems (DLRS), pages 7–10, 2016.<br />
<br />
[2] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[3] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[4] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[5] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[6] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.<br />
<br />
[7] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[8] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[9] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51182Wide and Deep Learning for Recommender Systems2021-12-01T04:19:44Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents Wide & Deep Learning, an architecture that jointly trains wide linear models and deep neural networks. Deep neural networks, which are good at generalization, and generalized linear models with nonlinear feature transformations, which are good at memorization, have both been widely used in recommender systems. Combining the strengths of the two models achieves memorization and generalization at the same time. By jointly training wide linear models and deep neural networks, the paper demonstrates that the proposed Wide & Deep Learning outperforms wide-only and deep-only models on the recommender system of the Google Play store, which serves over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Generalized linear models''' such as logistic regression are trained on binarized sparse features with one-hot encoding, using cross-product transformations to achieve memorization. However, these models cannot generalize to query-item feature pairs that never appeared in the training data.<br />
<br />
2. '''Embedding-based models''' such as factorization machines (S. Rendle, 2012) or deep neural networks learn a low-dimensional dense embedding vector for each query and item feature, which lets them generalize to previously unseen query-item feature pairs with less feature-engineering effort. However, when the query-item matrix is sparse and high-rank, it is hard to learn effective low-dimensional representations, so these models lack memorization.<br />
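The generalization mechanism can be illustrated with hypothetical learned embeddings (the names and vectors below are invented for illustration, not learned from data):<br />

```python
# Hypothetical 3-dimensional embeddings; in a real system these are learned.
query_emb = {"fitness app": [0.9, 0.1, 0.3], "workout tracker": [0.8, 0.2, 0.4]}
item_emb = {"GymBuddy": [1.0, 0.0, 0.5], "ChessMaster": [-0.2, 0.9, 0.1]}

def relevance(query, item):
    """Dot product of dense embeddings: defined for every query-item
    pair, even ones never co-observed in training."""
    return sum(q * i for q, i in zip(query_emb[query], item_emb[item]))

unseen_pair_score = relevance("workout tracker", "GymBuddy")
```

Because similar queries receive nearby embeddings, an unseen pair still gets a sensible score; the failure mode is that rare, niche interactions are over-generalized instead of being memorized.<br />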
<br />
== Motivation ==<br />
<br />
Can we build a model that achieves both memorization and generalization? This question motivates the idea of jointly training wide and deep models, namely Wide & Deep Learning.<br />
<br />
The generalization of linear models with cross-product transformations can be improved by adding features that are less granular, but this requires a lot of manual feature engineering. On the other hand, the performance of embedding-based models can be improved by adding linear models with cross-product feature transformations, which memorize specific rules with a small number of parameters.<br />
<br />
A natural way to address both problems is therefore to combine the wide and deep models in the training phase. The architecture proposed by Cheng et al. [1] overcomes these difficulties by jointly training a wide model and a deep model, taking advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
abc<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] H.-T. Cheng, L. Koc, J. Harmsen, T. Shaked, et al. Wide & Deep Learning for Recommender Systems. In Proc. 1st Workshop on Deep Learning for Recommender Systems (DLRS), pages 7–10, 2016.<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51181Wide and Deep Learning for Recommender Systems2021-12-01T04:16:52Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents Wide & Deep Learning, an architecture that jointly trains wide linear models and deep neural networks. Deep neural networks, which are good at generalization, and generalized linear models with nonlinear feature transformations, which are good at memorization, have both been widely used in recommender systems. Combining the strengths of the two models achieves memorization and generalization at the same time. By jointly training wide linear models and deep neural networks, the paper demonstrates that the proposed Wide & Deep Learning outperforms wide-only and deep-only models on the recommender system of the Google Play store, which serves over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Generalized linear models''' such as logistic regression are trained on binarized sparse features with one-hot encoding, using cross-product transformations to achieve memorization. However, these models cannot generalize to query-item feature pairs that never appeared in the training data.<br />
<br />
2. '''Embedding-based models''' such as factorization machines (S. Rendle, 2012) or deep neural networks learn a low-dimensional dense embedding vector for each query and item feature, which lets them generalize to previously unseen query-item feature pairs with less feature-engineering effort. However, when the query-item matrix is sparse and high-rank, it is hard to learn effective low-dimensional representations, so these models lack memorization.<br />
<br />
== Motivation ==<br />
<br />
Can we build a model that achieves both memorization and generalization? This question motivates the idea of jointly training wide and deep models, namely Wide & Deep Learning.<br />
<br />
The generalization of linear models with cross-product transformations can be improved by adding features that are less granular, but this requires a lot of manual feature engineering. On the other hand, the performance of embedding-based models can be improved by adding linear models with cross-product feature transformations, which memorize specific rules with a small number of parameters.<br />
<br />
A natural way to address both problems is therefore to combine the wide and deep models in the training phase. The architecture proposed by Cheng et al. [1] overcomes these difficulties by jointly training a wide model and a deep model, taking advantage of both memorization and generalization.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
abc<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] H.-T. Cheng, L. Koc, J. Harmsen, T. Shaked, et al. Wide & Deep Learning for Recommender Systems. In Proc. 1st Workshop on Deep Learning for Recommender Systems (DLRS), pages 7–10, 2016.<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51180Wide and Deep Learning for Recommender Systems2021-12-01T03:59:43Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents Wide & Deep Learning, an architecture that jointly trains wide linear models and deep neural networks. Deep neural networks, which are good at generalization, and generalized linear models with nonlinear feature transformations, which are good at memorization, have both been widely used in recommender systems. Combining the strengths of the two models achieves memorization and generalization at the same time. By jointly training wide linear models and deep neural networks, the paper demonstrates that the proposed Wide & Deep Learning outperforms wide-only and deep-only models on the recommender system of the Google Play store, which serves over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. '''Embedding-based models''' such as factorization machines (S. Rendle, 2012) or deep neural networks learn a low-dimensional dense embedding vector for each query and item feature, which lets them generalize to previously unseen query-item feature pairs with less feature-engineering effort. However, when the query-item matrix is sparse and high-rank, it is hard to learn effective low-dimensional representations, so these models lack memorization.<br />
<br />
2. '''Generalized linear models''' such as logistic regression are trained on binarized sparse features with one-hot encoding, using cross-product transformations to achieve memorization, but these models do not generalize to unseen query-item feature pairs.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
abc<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51179Wide and Deep Learning for Recommender Systems2021-12-01T03:59:03Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents Wide & Deep Learning, an architecture that jointly trains wide linear models and deep neural networks. Deep neural networks, which are good at generalization, and generalized linear models with nonlinear feature transformations, which are good at memorization, have both been widely used in recommender systems. Combining the strengths of the two models achieves memorization and generalization at the same time. By jointly training wide linear models and deep neural networks, the paper demonstrates that the proposed Wide & Deep Learning outperforms wide-only and deep-only models on the recommender system of the Google Play store, which serves over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. Embedding-based models such as factorization machines (S. Rendle, 2012) or deep neural networks learn a low-dimensional dense embedding vector for each query and item feature, which lets them generalize to previously unseen query-item feature pairs with less feature-engineering effort. However, when the query-item matrix is sparse and high-rank, it is hard to learn effective low-dimensional representations, so these models lack memorization.<br />
<br />
2. Generalized linear models such as logistic regression are trained on binarized sparse features with one-hot encoding, using cross-product transformations to achieve memorization, but these models do not generalize to unseen query-item feature pairs.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
abc<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51178Wide and Deep Learning for Recommender Systems2021-12-01T03:58:53Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents Wide & Deep Learning, an architecture that jointly trains wide linear models and deep neural networks. Deep neural networks, which are good at generalization, and generalized linear models with nonlinear feature transformations, which are good at memorization, have both been widely used in recommender systems. Combining the strengths of the two models achieves memorization and generalization at the same time. By jointly training wide linear models and deep neural networks, the paper demonstrates that the proposed Wide & Deep Learning outperforms wide-only and deep-only models on the recommender system of the Google Play store, which serves over one billion active users and over one million apps.<br />
<br />
== Related Work ==<br />
<br />
1. Embedding-based models such as factorization machines (S. Rendle, 2012) or deep neural networks learn a low-dimensional dense embedding vector for each query and item feature, which lets them generalize to previously unseen query-item feature pairs with less feature-engineering effort. However, when the query-item matrix is sparse and high-rank, it is hard to learn effective low-dimensional representations, so these models lack memorization.<br />
<br />
2. Generalized linear models such as logistic regression are trained on binarized sparse features with one-hot encoding, using cross-product transformations to achieve memorization, but these models do not generalize to unseen query-item feature pairs.<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
abc<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=51158stat441F212021-11-29T15:58:33Z<p>J47pan: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
=Paper presentation=<br />
{| class="wikitable" border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 || Ali Ghodsi || || || || ||<br />
|-<br />
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||<br />
|-<br />
|Week of Nov 29 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||<br />
|-<br />
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://proceedings.neurips.cc/paper/2021/file/652cf38361a209088302ba2b8b7f51e0-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||<br />
|-<br />
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||<br />
|-<br />
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||<br />
|-<br />
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||<br />
|-<br />
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks || [https://arxiv.org/pdf/1804.00792.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Poison_Frogs_Neural_Networks Summary] ||<br />
|-<br />
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || Deep Residual Learning for Image Recognition || [https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Double_Descent_Where_Bigger_Models_and_More_Data_Hurt Summary] ||<br />
|-<br />
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || XGBoost: A Scalable Tree Boosting System || [https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf Paper] || ||<br />
|-<br />
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||<br />
|-<br />
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||<br />
|-<br />
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||<br />
|-<br />
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || Robust Imitation Learning from Noisy Demonstrations || [http://proceedings.mlr.press/v130/tangkaratt21a/tangkaratt21a.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Robust_Imitation_Learning_from_Noisy_Demonstrations Summary] ||<br />
|-<br />
|Week of Nov 29 ||Kun Wang || || Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases|| [https://doi-org.proxy.lib.uwaterloo.ca/10.1111/exsy.12705 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Convolutional_neural_network_for_diagnosis_of_viral_pneumonia_and_COVID-19_alike_diseases Summary] ||<br />
|-<br />
|Week of Nov 29 ||Egemen Guray || || Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network || [https://www.researchgate.net/publication/344399165_Traffic_Sign_Recognition_System_TSRS_SVM_and_Convolutional_Neural_Network Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Traffic_Sign_Recognition_System_(TSRS):_SVM_and_Convolutional_Neural_Network Summary] ||<br />
|-<br />
|Week of Nov 29 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease ||[https://www.mdpi.com/2076-3425/11/2/150/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Bayesian_Network_as_a_Decision_Tool_for_Predicting_ALS_Disease Summary]||<br />
|-<br />
|Week of Nov 29 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||<br />
|-<br />
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || A Game Theoretic Approach to Class-wise Selective Rationalization || [https://arxiv.org/pdf/1910.12853.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=A_Game_Theoretic_Approach_to_Class-wise_Selective_Rationalization#How_does_CAR_work_intuitively Summary]||<br />
|-<br />
|Week of Nov 29 ||Aavinash Syamala, Dilmeet Malhi, Sohan Islam, Vansh Joshi || || Research on Multiple Classification Based on Improved SVM Algorithm for Balanced Binary Decision Tree || [https://www.hindawi.com/journals/sp/2021/5560465/ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Research_on_Multiple_Classification_Based_on_Improved_SVM_Algorithm_for_Balanced_Binary_Decision_Tree Summary]||<br />
|-<br />
|Week of Nov 29 ||Christian Mitrache, Alexandra Mossman, Jessica Saini, Aaron Renggli|| || U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging|| [https://proceedings.neurips.cc/paper/2019/file/57bafb2c2dfeefba931bb03a835b1fa9-Paper.pdf?fbclid=IwAR1dZpx9vU1pSPTSm_nwk6uBU7TYJ2HNTrsqjaH-9ZycE_PFpFjJoHg1zhQ Paper]||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=U-Time:A_Fully_Convolutional_Network_for_Time_Series_Segmentation_Applied_to_Sleep_Staging_Summary Summary]||<br />
|-<br />
|Week of Nov 29 ||Junbin Pan|| || Wide & Deep Learning for Recommender Systems || [https://arxiv.org/pdf/1606.07792v1.pdf Paper] || [Summary]||</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=51157stat441F212021-11-29T15:58:13Z<p>J47pan: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
=Paper presentation=<br />
{| class="wikitable" border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 || Ali Ghodsi || || || || ||<br />
|-<br />
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||<br />
|-<br />
|Week of Nov 29 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||<br />
|-<br />
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://proceedings.neurips.cc/paper/2021/file/652cf38361a209088302ba2b8b7f51e0-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||<br />
|-<br />
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||<br />
|-<br />
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||<br />
|-<br />
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||<br />
|-<br />
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks || [https://arxiv.org/pdf/1804.00792.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Poison_Frogs_Neural_Networks Summary] ||<br />
|-<br />
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || Deep Residual Learning for Image Recognition || [https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Double_Descent_Where_Bigger_Models_and_More_Data_Hurt Summary] ||<br />
|-<br />
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || XGBoost: A Scalable Tree Boosting System || [https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf Paper] || ||<br />
|-<br />
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||<br />
|-<br />
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||<br />
|-<br />
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||<br />
|-<br />
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || Robust Imitation Learning from Noisy Demonstrations || [http://proceedings.mlr.press/v130/tangkaratt21a/tangkaratt21a.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Robust_Imitation_Learning_from_Noisy_Demonstrations Summary] ||<br />
|-<br />
|Week of Nov 29 ||Kun Wang || || Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases|| [https://doi-org.proxy.lib.uwaterloo.ca/10.1111/exsy.12705 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Convolutional_neural_network_for_diagnosis_of_viral_pneumonia_and_COVID-19_alike_diseases Summary] ||<br />
|-<br />
|Week of Nov 29 ||Egemen Guray || || Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network || [https://www.researchgate.net/publication/344399165_Traffic_Sign_Recognition_System_TSRS_SVM_and_Convolutional_Neural_Network Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Traffic_Sign_Recognition_System_(TSRS):_SVM_and_Convolutional_Neural_Network Summary] ||<br />
|-<br />
|Week of Nov 29 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease ||[https://www.mdpi.com/2076-3425/11/2/150/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Bayesian_Network_as_a_Decision_Tool_for_Predicting_ALS_Disease Summary]||<br />
|-<br />
|Week of Nov 29 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||<br />
|-<br />
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || A Game Theoretic Approach to Class-wise Selective Rationalization || [https://arxiv.org/pdf/1910.12853.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=A_Game_Theoretic_Approach_to_Class-wise_Selective_Rationalization#How_does_CAR_work_intuitively Summary]||<br />
|-<br />
|Week of Nov 29 ||Aavinash Syamala, Dilmeet Malhi, Sohan Islam, Vansh Joshi || || Research on Multiple Classification Based on Improved SVM Algorithm for Balanced Binary Decision Tree || [https://www.hindawi.com/journals/sp/2021/5560465/ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Research_on_Multiple_Classification_Based_on_Improved_SVM_Algorithm_for_Balanced_Binary_Decision_Tree Summary]||<br />
|-<br />
|Week of Nov 29 ||Christian Mitrache, Alexandra Mossman, Jessica Saini, Aaron Renggli|| || U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging || [https://proceedings.neurips.cc/paper/2019/file/57bafb2c2dfeefba931bb03a835b1fa9-Paper.pdf?fbclid=IwAR1dZpx9vU1pSPTSm_nwk6uBU7TYJ2HNTrsqjaH-9ZycE_PFpFjJoHg1zhQ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=U-Time:A_Fully_Convolutional_Network_for_Time_Series_Segmentation_Applied_to_Sleep_Staging_Summary Summary] ||<br />
|-<br />
|Week of Nov 29 ||Junbin Pan|| || Wide & Deep Learning for Recommender Systems || [https://arxiv.org/pdf/1606.07792v1.pdf Paper] || []||</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51154Wide and Deep Learning for Recommender Systems2021-11-29T15:40:09Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
This paper presents Wide & Deep Learning, an architecture that jointly trains a wide linear model and a deep neural network. The jointly trained architecture achieves both memorization (through the wide component's sparse cross-product features) and generalization (through the deep component's dense embeddings) for recommender systems.<br />
<br />
<br />
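Since the summary body is still a stub, the joint wide-plus-deep prediction can be illustrated with a toy forward pass. This is a minimal sketch only: all weights, feature values, and layer sizes below are invented for illustration, and this is plain Python rather than the paper's TensorFlow implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(v):
    return [max(0.0, x) for x in v]

def dense(x, W, b):
    # One fully connected layer: y_i = sum_j W[i][j] * x[j] + b[i]
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def wide_and_deep(wide_x, deep_x, wide_w, deep_W1, deep_b1, deep_w2, b):
    # Wide part: linear model over sparse cross-product features (memorization).
    wide_logit = sum(w * x for w, x in zip(wide_w, wide_x))
    # Deep part: feed-forward network over dense embeddings (generalization).
    h = relu(dense(deep_x, deep_W1, deep_b1))
    deep_logit = sum(w * hi for w, hi in zip(deep_w2, h))
    # Both logits feed a single logistic output, so the two parts are
    # trained jointly rather than ensembled separately.
    return sigmoid(wide_logit + deep_logit + b)

# Toy example with made-up numbers (hypothetical weights/features).
p = wide_and_deep(
    wide_x=[1.0, 0.0, 1.0],            # binarized cross-product features
    deep_x=[0.2, -0.5],                # dense embedding vector
    wide_w=[0.3, -0.1, 0.4],
    deep_W1=[[0.5, -0.2], [0.1, 0.7]],
    deep_b1=[0.0, 0.1],
    deep_w2=[0.6, -0.3],
    b=-0.2,
)
assert 0.0 < p < 1.0  # output is a click/recommendation probability
```

The key design point the sketch mirrors is that the wide and deep logits are summed before the sigmoid, which is what distinguishes joint training from a two-model ensemble.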
== Related Work ==<br />
<br />
abc<br />
<br />
== Model Architecture ==<br />
<br />
abc<br />
<br />
== Model Results ==<br />
<br />
abc<br />
<br />
== Conclusion ==<br />
<br />
abc<br />
<br />
== Critiques ==<br />
<br />
abc<br />
<br />
== References ==<br />
<br />
[1] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, July 2011.<br />
<br />
[2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016.<br />
<br />
[3] H. B. McMahan. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization. In Proc. AISTATS, 2011.<br />
<br />
[4] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. H. Cernocky. Strategies for training large scale neural network language models. In IEEE Automatic Speech Recognition & Understanding Workshop, 2011.<br />
<br />
[5] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.<br />
<br />
[6] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, NIPS, pages 1799–1807. 2014.<br />
<br />
[7] H. Wang, N. Wang, and D.-Y. Yeung. Collaborative deep learning for recommender systems. In Proc. KDD, pages 1235–1244, 2015.<br />
<br />
[8] B. Yan and G. Chen. AppJoy: Personalized mobile application discovery. In MobiSys, pages 113–126, 2011.</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51152Wide and Deep Learning for Recommender Systems2021-11-29T10:34:54Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan<br />
<br />
== Introduction ==<br />
<br />
abc</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51151Wide and Deep Learning for Recommender Systems2021-11-29T10:34:26Z<p>J47pan: </p>
<hr />
<div>[http://www.example.com link title]<br />
<br />
== Presented by ==<br />
<br />
Junbin Pan</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51150Wide and Deep Learning for Recommender Systems2021-11-29T10:34:02Z<p>J47pan: </p>
<hr />
<div><br />
----<br />
<br />
== Presented by ==<br />
<br />
Junbin Pan</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51149Wide and Deep Learning for Recommender Systems2021-11-29T10:33:41Z<p>J47pan: </p>
<hr />
<div><br />
== Presented by ==<br />
<br />
Junbin Pan</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51148Wide and Deep Learning for Recommender Systems2021-11-29T10:33:26Z<p>J47pan: </p>
<hr />
<div># Presented by<br />
Junbin Pan</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=51147stat441F212021-11-29T10:32:34Z<p>J47pan: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
=Paper presentation=<br />
{| class="wikitable" border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 || Ali Ghodsi || || || || ||<br />
|-<br />
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||<br />
|-<br />
|Week of Nov 29 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||<br />
|-<br />
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://proceedings.neurips.cc/paper/2021/file/652cf38361a209088302ba2b8b7f51e0-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||<br />
|-<br />
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||<br />
|-<br />
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||<br />
|-<br />
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||<br />
|-<br />
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks || [https://arxiv.org/pdf/1804.00792.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Poison_Frogs_Neural_Networks Summary] ||<br />
|-<br />
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || Deep Residual Learning for Image Recognition || [https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Double_Descent_Where_Bigger_Models_and_More_Data_Hurt Summary] ||<br />
|-<br />
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || XGBoost: A Scalable Tree Boosting System || [https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf Paper] || ||<br />
|-<br />
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||<br />
|-<br />
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||<br />
|-<br />
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||<br />
|-<br />
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || Robust Imitation Learning from Noisy Demonstrations || [http://proceedings.mlr.press/v130/tangkaratt21a/tangkaratt21a.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Robust_Imitation_Learning_from_Noisy_Demonstrations Summary] ||<br />
|-<br />
|Week of Nov 29 ||Kun Wang || || Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases|| [https://doi-org.proxy.lib.uwaterloo.ca/10.1111/exsy.12705 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Convolutional_neural_network_for_diagnosis_of_viral_pneumonia_and_COVID-19_alike_diseases Summary] ||<br />
|-<br />
|Week of Nov 29 ||Egemen Guray || || Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network || [https://www.researchgate.net/publication/344399165_Traffic_Sign_Recognition_System_TSRS_SVM_and_Convolutional_Neural_Network Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Traffic_Sign_Recognition_System_(TSRS):_SVM_and_Convolutional_Neural_Network Summary] ||<br />
|-<br />
|Week of Nov 29 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease ||[https://www.mdpi.com/2076-3425/11/2/150/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Bayesian_Network_as_a_Decision_Tool_for_Predicting_ALS_Disease Summary]||<br />
|-<br />
|Week of Nov 29 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||<br />
|-<br />
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || A Game Theoretic Approach to Class-wise Selective Rationalization || [https://arxiv.org/pdf/1910.12853.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=A_Game_Theoretic_Approach_to_Class-wise_Selective_Rationalization#How_does_CAR_work_intuitively Summary]||<br />
|-<br />
|Week of Nov 29 ||Aavinash Syamala, Dilmeet Malhi, Sohan Islam, Vansh Joshi || || Research on Multiple Classification Based on Improved SVM Algorithm for Balanced Binary Decision Tree || [https://www.hindawi.com/journals/sp/2021/5560465/ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Research_on_Multiple_Classification_Based_on_Improved_SVM_Algorithm_for_Balanced_Binary_Decision_Tree Summary]||<br />
|-<br />
|Week of Nov 29 ||Christian Mitrache, Alexandra Mossman, Jessica Saini, Aaron Renggli|| || U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging || [https://proceedings.neurips.cc/paper/2019/file/57bafb2c2dfeefba931bb03a835b1fa9-Paper.pdf?fbclid=IwAR1dZpx9vU1pSPTSm_nwk6uBU7TYJ2HNTrsqjaH-9ZycE_PFpFjJoHg1zhQ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=U-Time:A_Fully_Convolutional_Network_for_Time_Series_Segmentation_Applied_to_Sleep_Staging_Summary Summary] ||<br />
|-<br />
|Week of Nov 29 ||Junbin Pan|| || Wide & Deep Learning for Recommender Systems || [https://arxiv.org/pdf/1606.07792v1.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems Summary]||</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=51146stat441F212021-11-29T10:32:18Z<p>J47pan: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
=Paper presentation=<br />
{| class="wikitable" border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 || Ali Ghodsi || || || || ||<br />
|-<br />
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||<br />
|-<br />
|Week of Nov 29 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||<br />
|-<br />
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://proceedings.neurips.cc/paper/2021/file/652cf38361a209088302ba2b8b7f51e0-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||<br />
|-<br />
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||<br />
|-<br />
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||<br />
|-<br />
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||<br />
|-<br />
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks || [https://arxiv.org/pdf/1804.00792.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Poison_Frogs_Neural_Networks Summary] ||<br />
|-<br />
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || Deep Residual Learning for Image Recognition || [https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Double_Descent_Where_Bigger_Models_and_More_Data_Hurt Summary] ||<br />
|-<br />
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || XGBoost: A Scalable Tree Boosting System || [https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf Paper] || ||<br />
|-<br />
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||<br />
|-<br />
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||<br />
|-<br />
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||<br />
|-<br />
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || Robust Imitation Learning from Noisy Demonstrations || [http://proceedings.mlr.press/v130/tangkaratt21a/tangkaratt21a.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Robust_Imitation_Learning_from_Noisy_Demonstrations Summary] ||<br />
|-<br />
|Week of Nov 29 ||Kun Wang || || Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases|| [https://doi-org.proxy.lib.uwaterloo.ca/10.1111/exsy.12705 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Convolutional_neural_network_for_diagnosis_of_viral_pneumonia_and_COVID-19_alike_diseases Summary] ||<br />
|-<br />
|Week of Nov 29 ||Egemen Guray || || Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network || [https://www.researchgate.net/publication/344399165_Traffic_Sign_Recognition_System_TSRS_SVM_and_Convolutional_Neural_Network Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Traffic_Sign_Recognition_System_(TSRS):_SVM_and_Convolutional_Neural_Network Summary] ||<br />
|-<br />
|Week of Nov 29 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease ||[https://www.mdpi.com/2076-3425/11/2/150/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Bayesian_Network_as_a_Decision_Tool_for_Predicting_ALS_Disease Summary]||<br />
|-<br />
|Week of Nov 29 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||<br />
|-<br />
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || A Game Theoretic Approach to Class-wise Selective Rationalization || [https://arxiv.org/pdf/1910.12853.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=A_Game_Theoretic_Approach_to_Class-wise_Selective_Rationalization#How_does_CAR_work_intuitively Summary]||<br />
|-<br />
|Week of Nov 29 ||Aavinash Syamala, Dilmeet Malhi, Sohan Islam, Vansh Joshi || || Research on Multiple Classification Based on Improved SVM Algorithm for Balanced Binary Decision Tree || [https://www.hindawi.com/journals/sp/2021/5560465/ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Research_on_Multiple_Classification_Based_on_Improved_SVM_Algorithm_for_Balanced_Binary_Decision_Tree Summary]||<br />
|-<br />
|Week of Nov 29 ||Christian Mitrache, Alexandra Mossman, Jessica Saini, Aaron Renggli|| || U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging || [https://proceedings.neurips.cc/paper/2019/file/57bafb2c2dfeefba931bb03a835b1fa9-Paper.pdf?fbclid=IwAR1dZpx9vU1pSPTSm_nwk6uBU7TYJ2HNTrsqjaH-9ZycE_PFpFjJoHg1zhQ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=U-Time:A_Fully_Convolutional_Network_for_Time_Series_Segmentation_Applied_to_Sleep_Staging_Summary Summary] ||<br />
|-<br />
|Week of Nov 29 ||Junbin Pan|| || Wide & Deep Learning for Recommender Systems || [https://arxiv.org/pdf/1606.07792v1.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems]||</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=Wide_and_Deep_Learning_for_Recommender_Systems&diff=51145Wide and Deep Learning for Recommender Systems2021-11-29T10:32:00Z<p>J47pan: Created page with "test"</p>
<hr />
<div>test</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:J47pan&diff=51144User:J47pan2021-11-29T10:30:09Z<p>J47pan: my title</p>
<hr />
<div>zXc</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=51143stat441F212021-11-29T10:10:28Z<p>J47pan: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
=Paper presentation=<br />
{| class="wikitable" border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 || Ali Ghodsi || || || || ||<br />
|-<br />
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||<br />
|-<br />
|Week of Nov 29 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||<br />
|-<br />
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://proceedings.neurips.cc/paper/2021/file/652cf38361a209088302ba2b8b7f51e0-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||<br />
|-<br />
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||<br />
|-<br />
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||<br />
|-<br />
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||<br />
|-<br />
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks || [https://arxiv.org/pdf/1804.00792.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Poison_Frogs_Neural_Networks Summary] ||<br />
|-<br />
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || Deep Residual Learning for Image Recognition || [https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Double_Descent_Where_Bigger_Models_and_More_Data_Hurt Summary] ||<br />
|-<br />
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || XGBoost: A Scalable Tree Boosting System || [https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf Paper] || ||<br />
|-<br />
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs Summary] ||<br />
|-<br />
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||<br />
|-<br />
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||<br />
|-<br />
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || Robust Imitation Learning from Noisy Demonstrations || [http://proceedings.mlr.press/v130/tangkaratt21a/tangkaratt21a.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Robust_Imitation_Learning_from_Noisy_Demonstrations Summary] ||<br />
|-<br />
|Week of Nov 29 ||Kun Wang || || Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases|| [https://doi-org.proxy.lib.uwaterloo.ca/10.1111/exsy.12705 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Convolutional_neural_network_for_diagnosis_of_viral_pneumonia_and_COVID-19_alike_diseases Summary] ||<br />
|-<br />
|Week of Nov 29 ||Egemen Guray || || Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network || [https://www.researchgate.net/publication/344399165_Traffic_Sign_Recognition_System_TSRS_SVM_and_Convolutional_Neural_Network Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Traffic_Sign_Recognition_System_(TSRS):_SVM_and_Convolutional_Neural_Network Summary] ||<br />
|-<br />
|Week of Nov 29 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease ||[https://www.mdpi.com/2076-3425/11/2/150/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Bayesian_Network_as_a_Decision_Tool_for_Predicting_ALS_Disease Summary]||<br />
|-<br />
|Week of Nov 29 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||<br />
|-<br />
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || A Game Theoretic Approach to Class-wise Selective Rationalization || [https://arxiv.org/pdf/1910.12853.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=A_Game_Theoretic_Approach_to_Class-wise_Selective_Rationalization#How_does_CAR_work_intuitively Summary]||<br />
|-<br />
|Week of Nov 29 ||Aavinash Syamala, Dilmeet Malhi, Sohan Islam, Vansh Joshi || || Research on Multiple Classification Based on Improved SVM Algorithm for Balanced Binary Decision Tree || [https://www.hindawi.com/journals/sp/2021/5560465/ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Research_on_Multiple_Classification_Based_on_Improved_SVM_Algorithm_for_Balanced_Binary_Decision_Tree Summary]||<br />
|-<br />
|Week of Nov 29 ||Christian Mitrache, Alexandra Mossman, Jessica Saini, Aaron Renggli|| || U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging || [https://proceedings.neurips.cc/paper/2019/file/57bafb2c2dfeefba931bb03a835b1fa9-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=U-Time:A_Fully_Convolutional_Network_for_Time_Series_Segmentation_Applied_to_Sleep_Staging_Summary Summary] ||<br />
|-<br />
|Week of Nov 29 ||Junbin Pan|| || Wide & Deep Learning for Recommender Systems || [https://arxiv.org/pdf/1606.07792v1.pdf Paper] || ||</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=51142stat441F212021-11-29T10:09:57Z<p>J47pan: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F21-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
=Paper presentation=<br />
{| class="wikitable" border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 || Ali Ghodsi || || || || ||<br />
|-<br />
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||<br />
|-<br />
|Week of Nov 29 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||<br />
|-<br />
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://proceedings.neurips.cc/paper/2021/file/652cf38361a209088302ba2b8b7f51e0-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||<br />
|-<br />
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||<br />
|-<br />
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||<br />
|-<br />
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||<br />
|-<br />
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks || [https://arxiv.org/pdf/1804.00792.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Poison_Frogs_Neural_Networks Summary] ||<br />
|-<br />
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || Deep Residual Learning for Image Recognition || [https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Double_Descent_Where_Bigger_Models_and_More_Data_Hurt Summary] ||<br />
|-<br />
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || XGBoost: A Scalable Tree Boosting System || [https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf Paper] || ||<br />
|-<br />
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs Summary] ||<br />
|-<br />
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||<br />
|-<br />
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||<br />
|-<br />
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || Robust Imitation Learning from Noisy Demonstrations || [http://proceedings.mlr.press/v130/tangkaratt21a/tangkaratt21a.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Robust_Imitation_Learning_from_Noisy_Demonstrations Summary] ||<br />
|-<br />
|Week of Nov 29 ||Kun Wang || || Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases|| [https://doi-org.proxy.lib.uwaterloo.ca/10.1111/exsy.12705 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Convolutional_neural_network_for_diagnosis_of_viral_pneumonia_and_COVID-19_alike_diseases Summary] ||<br />
|-<br />
|Week of Nov 29 ||Egemen Guray || || Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network || [https://www.researchgate.net/publication/344399165_Traffic_Sign_Recognition_System_TSRS_SVM_and_Convolutional_Neural_Network Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Traffic_Sign_Recognition_System_(TSRS):_SVM_and_Convolutional_Neural_Network Summary] ||<br />
|-<br />
|Week of Nov 29 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease ||[https://www.mdpi.com/2076-3425/11/2/150/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Bayesian_Network_as_a_Decision_Tool_for_Predicting_ALS_Disease Summary]||<br />
|-<br />
|Week of Nov 29 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||<br />
|-<br />
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || A Game Theoretic Approach to Class-wise Selective Rationalization || [https://arxiv.org/pdf/1910.12853.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=A_Game_Theoretic_Approach_to_Class-wise_Selective_Rationalization#How_does_CAR_work_intuitively Summary]||<br />
|-<br />
|Week of Nov 29 ||Aavinash Syamala, Dilmeet Malhi, Sohan Islam, Vansh Joshi || || Research on Multiple Classification Based on Improved SVM Algorithm for Balanced Binary Decision Tree || [https://www.hindawi.com/journals/sp/2021/5560465/ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Research_on_Multiple_Classification_Based_on_Improved_SVM_Algorithm_for_Balanced_Binary_Decision_Tree Summary]||<br />
|-<br />
|Week of Nov 29 ||Christian Mitrache, Alexandra Mossman, Jessica Saini, Aaron Renggli|| || U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging || [https://proceedings.neurips.cc/paper/2019/file/57bafb2c2dfeefba931bb03a835b1fa9-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=U-Time:A_Fully_Convolutional_Network_for_Time_Series_Segmentation_Applied_to_Sleep_Staging_Summary Summary] ||<br />
|-<br />
|Week of Nov 29 ||Junbin Pan|| || Wide & Deep Learning for Recommender Systems || [https://arxiv.org/pdf/1606.07792v1.pdf Paper] || ||</div>J47panhttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=F21-STAT_441/841_CM_763-Proposal&diff=49964F21-STAT 441/841 CM 763-Proposal2021-10-07T15:27:13Z<p>J47pan: </p>
<hr />
<div>Use this format (Don’t remove Project 0)<br />
<br />
Project # 0 Group members:<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Title: Making a String Telephone<br />
<br />
Description: In this science project, we use paper cups and string to build a string telephone, then talk with friends through it to learn how sound waves travel. (Explain your project in one or two paragraphs.)<br />
<br />
--------------------------------------------------------------------<br />
Project # 1 Group members:<br />
<br />
Feng, Jared<br />
<br />
Huang, Xipeng<br />
<br />
Xu, Mingwei<br />
<br />
Yu, Tingzhou<br />
<br />
Title: <br />
<br />
Description:<br />
--------------------------------------------------------------------<br />
Project # 2 Group members:<br />
<br />
Anderson, Eric<br />
<br />
Wang, Chengzhi<br />
<br />
Zhong, Kai<br />
<br />
Zhou, Yi Jing<br />
<br />
Title: Application of Neural Networks<br />
<br />
Description: To be filled in before Oct 8th.<br />
<br />
--------------------------------------------------------------------<br />
Project # 3 Group members:<br />
<br />
Chopra, Kanika<br />
<br />
Rajcoomar, Yush<br />
<br />
Title: TBD <br />
<br />
Description: TBD<br />
<br />
--------------------------------------------------------------------<br />
Project # 4 Group members:<br />
<br />
Zhang, Bowen<br />
<br />
Li, Shaozhong<br />
<br />
Kerr, Hannah<br />
<br />
Wong, Ann Gie<br />
<br />
Title: Classification<br />
<br />
Description: TBD<br />
<br />
--------------------------------------------------------------------<br />
Project # 5 Group members:<br />
<br />
Chin, Jessie Man Wai<br />
<br />
Ooi, Yi Lin<br />
<br />
Shi, Yaqi<br />
<br />
Ngew, Shwen Lyng<br />
<br />
Title: TBD<br />
<br />
Description: TBD<br />
<br />
--------------------------------------------------------------------<br />
Project # 6 Group members:<br />
<br />
Wang, Carolyn<br />
<br />
Cyrenne, Ethan<br />
<br />
Nguyen, Dieu Hoa<br />
<br />
Sin, Mary Jane<br />
<br />
Title: TBD<br />
<br />
Description: TBD<br />
<br />
--------------------------------------------------------------------<br />
Project # 7 Group members:<br />
<br />
Bhattacharya, Vaibhav<br />
<br />
Chatoor, Amanda<br />
<br />
Prathap Das, Sutej<br />
<br />
Title: PetFinder.my - Pawpularity Contest [https://www.kaggle.com/c/petfinder-pawpularity-score/overview]<br />
<br />
Description: In this competition, we will analyze raw images and metadata to predict the “Pawpularity” of pet photos. We'll train and test our model on PetFinder.my's thousands of pet profiles.<br />
<br />
--------------------------------------------------------------------<br />
Project # 8 Group members:<br />
<br />
Xu, Siming<br />
<br />
Yan, Xin<br />
<br />
Duan, Yishu<br />
<br />
Di, Xibei<br />
<br />
Title: TBD<br />
<br />
Description: TBD<br />
--------------------------------------------------------------------<br />
Project # 9 Group members:<br />
<br />
Loke, Chun Waan<br />
<br />
Chong, Peter<br />
<br />
Osmond, Clarice<br />
<br />
Title: TBD<br />
<br />
Description: TBD<br />
<br />
--------------------------------------------------------------------<br />
<br />
Project # 10 Group members:<br />
<br />
O'Farrell, Ethan<br />
<br />
D'Astous, Justin<br />
<br />
Hamed, Waqas<br />
<br />
Vladusic, Stefan<br />
<br />
Title: TBD<br />
<br />
Description: TBD<br />
--------------------------------------------------------------------<br />
Project # 11 Group members:<br />
<br />
Pan, Junbin<br />
<br />
Title: TBD<br />
<br />
Description: TBD<br />
--------------------------------------------------------------------</div>J47pan