statwiki - User contributions for T358wang (MediaWiki 1.28.3)
stat441F21 - 2020-11-24T02:30:09Z<p>T358wang: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
= Record your contributions here [https://docs.google.com/spreadsheets/d/10CHiJpAylR6kB9QLqN7lZHN79D9YEEW6CDTH27eAhbQ/edit?usp=sharing]=<br />
<br />
Use the following notations:<br />
<br />
P: You have written a summary/critique on the paper.<br />
<br />
T: You had a technical contribution on a paper (excluding the paper that you present).<br />
<br />
E: You had an editorial contribution on a paper (excluding the paper that you present).<br />
<br />
=Paper presentation=<br />
{| class="wikitable"<br />
<br />
{| border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 ||Sharman Bharat, Li Dylan, Lu Leonie, Li Mingdao || 1|| Risk prediction in life insurance industry using supervised learning algorithms || [https://rdcu.be/b780J Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Bsharman Summary] ||<br />
[https://www.youtube.com/watch?v=TVLpSFYgF0c&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Delaney Smith, Mohammad Assem Mahmoud || 2|| Influenza Forecasting Framework based on Gaussian Processes || [https://proceedings.icml.cc/static/paper_files/icml/2020/1239-Paper.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Influenza_Forecasting_Framework_based_on_Gaussian_Processes Summary]|| [https://www.youtube.com/watch?v=HZG9RAHhpXc&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Tatianna Krikella, Swaleh Hussain, Grace Tompkins || 3|| Processing of Missing Data by Neural Networks || [http://papers.nips.cc/paper/7537-processing-of-missing-data-by-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Gtompkin Summary] || [https://learn.uwaterloo.ca/d2l/ext/rp/577051/lti/framedlaunch/6ec1ebea-5547-46a2-9e4f-e3dc9d79fd54]<br />
|-<br />
|Week of Nov 16 ||Jonathan Chow, Nyle Dharani, Ildar Nasirov ||4 ||Streaming Bayesian Inference for Crowdsourced Classification ||[https://papers.nips.cc/paper/9439-streaming-bayesian-inference-for-crowdsourced-classification.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Streaming_Bayesian_Inference_for_Crowdsourced_Classification Summary] || [https://www.youtube.com/watch?v=UgVRzi9Ubws]<br />
|-<br />
|Week of Nov 16 || Matthew Hall, Johnathan Chalaturnyk || 5|| Neural Ordinary Differential Equations || [https://papers.nips.cc/paper/7892-neural-ordinary-differential-equations.pdf] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Neural_ODEs Summary]||<br />
|-<br />
|Week of Nov 16 || Luwen Chang, Qingyang Yu, Tao Kong, Tianrong Sun || 6|| Adversarial Attacks on Copyright Detection Systems || [https://proceedings.icml.cc/static/paper_files/icml/2020/1894-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Adversarial_Attacks_on_Copyright_Detection_Systems Summary] || [https://www.youtube.com/watch?v=bQI9S6bCo8o]<br />
|-<br />
|Week of Nov 16 || Casey De Vera, Solaiman Jawad || 7|| IPBoost – Non-Convex Boosting via Integer Programming || [https://arxiv.org/pdf/2002.04679.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=IPBoost Summary] || [https://www.youtube.com/watch?v=4DhJDGC4pyI&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Yuxin Wang, Evan Peters, Yifan Mou, Sangeeth Kalaichanthiran || 8|| What Game Are We Playing? End-to-end Learning in Normal and Extensive Form Games || [https://arxiv.org/pdf/1805.02777.pdf] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=what_game_are_we_playing Summary] || [https://www.youtube.com/watch?v=9qJoVxo3hnI&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Yuchuan Wu || 9|| || || ||<br />
|-<br />
|Week of Nov 16 || Zhou Zeping, Siqi Li, Yuqin Fang, Fu Rao || 10|| A survey of neural network-based cancer prediction models from microarray data || [https://www.sciencedirect.com/science/article/pii/S0933365717305067 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Y93fang Summary] || [https://youtu.be/B8pPUU8ypO0]<br />
|-<br />
|Week of Nov 23 ||Jinjiang Lian, Jiawen Hou, Yisheng Zhu, Mingzhe Huang || 11|| DROCC: Deep Robust One-Class Classification || [https://proceedings.icml.cc/static/paper_files/icml/2020/6556-Paper.pdf paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:J46hou Summary] || [https://www.youtube.com/watch?v=tvCEvvy54X8&ab_channel=JJLian]<br />
|-<br />
|Week of Nov 23 || Bushra Haque, Hayden Jones, Michael Leung, Cristian Mustatea || 12|| Combine Convolution with Recurrent Networks for Text Classification || [https://arxiv.org/pdf/2006.15795.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Cvmustat Summary] || [https://www.youtube.com/watch?v=or5RTxDnZDo]<br />
|-<br />
|Week of Nov 23 || Taohao Wang, Zeren Shen, Zihao Guo, Rui Chen || 13|| Large Scale Landmark Recognition via Deep Metric Learning || [https://arxiv.org/pdf/1908.10192.pdf paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:T358wang Summary] || [https://www.youtube.com/watch?v=K9NypDNPLJo Video]<br />
|-<br />
|Week of Nov 23 || Qianlin Song, William Loh, Junyue Bai, Phoebe Choi || 14|| Task Understanding from Confusing Multi-task Data || [https://proceedings.icml.cc/static/paper_files/icml/2020/578-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Task_Understanding_from_Confusing_Multi-task_Data Summary] || [https://youtu.be/i_5PQdfuH-Y]<br />
|-<br />
|Week of Nov 23 || Rui Gong, Xuetong Wang, Xinqi Ling, Di Ma || 15|| Semantic Relation Classification via Convolution Neural Network|| [https://www.aclweb.org/anthology/S18-1127.pdf paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Semantic_Relation_Classification——via_Convolution_Neural_Network Summary]|| [https://www.youtube.com/watch?v=m9o3NuMUKkU&ab_channel=DiMa video]<br />
|-<br />
|Week of Nov 23 || Xiaolan Xu, Robin Wen, Yue Weng, Beizhen Chang || 16|| Graph Structure of Neural Networks || [https://proceedings.icml.cc/paper/2020/file/757b505cfd34c64c85ca5b5690ee5293-Paper.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Graph_Structure_of_Neural_Networks Summary] || [https://youtu.be/x9eUgEwntcs Video]<br />
|-<br />
|Week of Nov 23 ||Hansa Halim, Sanjana Rajendra Naik, Samka Marfua, Shawrupa Proshasty || 17|| Superhuman AI for multiplayer poker || [https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Superhuman_AI_for_Multiplayer_Poker Summary]|| [https://www.youtube.com/watch?v=kazqcOwbtTI Video]<br />
|-<br />
|Week of Nov 23 ||Guanting Pan, Haocheng Chang, Zaiwei Zhang || 18|| Point-of-Interest Recommendation: Exploiting Self-Attentive Autoencoders with Neighbor-Aware Influence || [https://arxiv.org/pdf/1809.10770.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Point-of-Interest_Recommendation:_Exploiting_Self-Attentive_Autoencoders_with_Neighbor-Aware_Influence Summary] || [https://www.youtube.com/watch?v=aAwjaos_Hus Video]<br />
|-<br />
|Week of Nov 23 || Jerry Huang, Daniel Jiang, Minyan Dai || 19|| Neural Speed Reading Via Skim-RNN ||[https://arxiv.org/pdf/1711.02085.pdf?fbclid=IwAR3EeFsKM_b5p9Ox7X9mH-1oI3U3oOKPBy3xUOBN0XvJa7QW2ZeJJ9ypQVo Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Neural_Speed_Reading_via_Skim-RNN Summary]|| [https://youtu.be/vOENmt9jgVE Video]<br />
|-<br />
|Week of Nov 23 ||Ruixian Chin, Yan Kai Tan, Jason Ong, Wen Cheen Chiew || 20|| DivideMix: Learning with Noisy Labels as Semi-supervised Learning || [https://openreview.net/pdf?id=HJgExaVtwr Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Yktan Summary]|| [https://www.youtube.com/watch?v=48xYZXifjS0&ab_channel=SeakraChin]<br />
|-<br />
|Week of Nov 30 || Banno Dion, Battista Joseph, Kahn Solomon || 21|| Music Recommender System Based on Genre using Convolutional Recurrent Neural Networks || [https://www.sciencedirect.com/science/article/pii/S1877050919310646] || ||<br />
|-<br />
|Week of Nov 30 || Sai Arvind Budaraju, Isaac Ellmen, Dorsa Mohammadrezaei, Emilee Carson || 22|| A universal SNP and small-indel variant caller using deep neural networks||[https://www.nature.com/articles/nbt.4235.epdf?author_access_token=q4ZmzqvvcGBqTuKyKgYrQ9RgN0jAjWel9jnR3ZoTv0NuM3saQzpZk8yexjfPUhdFj4zyaA4Yvq0LWBoCYQ4B9vqPuv8e2HHy4vShDgEs8YxI_hLs9ov6Y1f_4fyS7kGZ Paper] || ||<br />
|-<br />
|Week of Nov 30 || Daniel Fagan, Cooper Brooke, Maya Perelman || 23|| Efficient kNN Classification With Different Number of Nearest Neighbors || [https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7898482 Paper] || ||<br />
|-<br />
|Week of Nov 30 || Karam Abuaisha, Evan Li, Jason Pu, Nicholas Vadivelu || 24|| Being Bayesian about Categorical Probability || [https://proceedings.icml.cc/static/paper_files/icml/2020/3560-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Anas Mahdi, Will Thibault, Jan Lau, Jiwon Yang || 25|| Loss Function Search for Face Recognition || [https://proceedings.icml.cc/static/paper_files/icml/2020/245-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 ||Zihui (Betty) Qin, Wenqi (Maggie) Zhao, Muyuan Yang, Amartya (Marty) Mukherjee || 26|| Deep Learning for Cardiologist-level Myocardial Infarction Detection in Electrocardiograms || [https://arxiv.org/pdf/1912.07618.pdf?fbclid=IwAR0RwATSn4CiT3qD9LuywYAbJVw8YB3nbex8Kl19OCExIa4jzWaUut3oVB0 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Cardiologist-level_Myocardial_Infarction_Detection_in_Electrocardiograms&fbclid=IwAR1Tad2DAM7LT6NXXuSYDZtHHBvN0mjZtDdCOiUFFq_XwVcQxG3hU-3XcaE Summary] ||<br />
|-<br />
|Week of Nov 30 || Stan Lee, Seokho Lim, Kyle Jung, Daehyun Kim || 27|| Bag of Tricks for Efficient Text Classification || [https://arxiv.org/pdf/1607.01759.pdf paper] || ||<br />
|-<br />
|Week of Nov 30 || Yawen Wang, Danmeng Cui, ZiJie Jiang, Mingkang Jiang, Haotian Ren, Haris Bin Zahid || 28|| A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques || [https://arxiv.org/pdf/1707.02919.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Describtion_of_Text_Mining Summary] ||<br />
|-<br />
|Week of Nov 30 || Qing Guo, XueGuang Ma, James Ni, Yuanxin Wang || 29|| Mask R-CNN || [https://arxiv.org/pdf/1703.06870.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Bertrand Sodjahin, Junyi Yang, Jill Yu Chieh Wang, Yu Min Wu, Calvin Li || 30|| Research paper classification systems based on TF‑IDF and LDA schemes || [https://hcis-journal.springeropen.com/articles/10.1186/s13673-019-0192-7?fbclid=IwAR3swO-eFrEbj1BUQfmomJazxxeFR6SPgr6gKayhs38Y7aBG-zX1G3XWYRM Paper] || ||<br />
|-<br />
|Week of Nov 30 || Daniel Zhang, Jacky Yao, Scholar Sun, Russell Parco, Ian Cheung || 31 || Speech2Face: Learning the Face Behind a Voice || [https://arxiv.org/pdf/1905.09773.pdf?utm_source=thenewstack&utm_medium=website&utm_campaign=platform Paper] || ||<br />
|-<br />
|Week of Nov 30 || Siyuan Xia, Jiaxiang Liu, Jiabao Dong, Yipeng Du || 32 || Evaluating Machine Accuracy on ImageNet || [https://proceedings.icml.cc/static/paper_files/icml/2020/6173-Paper.pdf] || ||<br />
|-<br />
|Week of Nov 30 || Mushi Wang, Siyuan Qiu, Yan Yu || 33 || Surround Vehicle Motion Prediction Using LSTM-RNN for Motion Planning of Autonomous Vehicles at Multi-Lane Turn Intersections || [https://ieeexplore.ieee.org/abstract/document/8957421 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Surround_Vehicle_Motion_Prediction Summary] ||</div>
User:T358wang - 2020-11-23T21:40:26Z<p>T358wang: /* Methodology */</p>
<hr />
<div><br />
== Group ==<br />
Rui Chen, Zeren Shen, Zihao Guo, Taohao Wang<br />
<br />
== Introduction ==<br />
<br />
Landmark recognition is an image retrieval task with its own specific challenges. This paper presents a new and effective method for recognizing landmark images that has been successfully applied to real-world image collections, allowing statues, buildings, and other characteristic objects to be identified reliably.<br />
<br />
Several difficulties arise in developing such a system:<br />
<br />
1. The concept of a landmark cannot be strictly defined, since a landmark can be almost any object or building.<br />
<br />
2. The same landmark can be photographed from different angles, producing images with very different characteristics, yet the system must still identify it accurately. <br />
<br />
3. The system must recognize a very large number of landmarks, and the recognition must achieve high accuracy. <br />
<br />
These demands require the system to combine a very low false alarm rate with high recognition accuracy. <br />
There are also three practical constraints:<br />
<br />
1. The training dataset is huge and not clean: it contains some erroneous and irrelevant content.<br />
<br />
2. The learning algorithm must be fast and scalable on such a training set.<br />
<br />
3. The system must make high-quality landmark judgments without relying on geographic information attached to the images.<br />
<br />
The article describes the deep convolutional neural network (CNN) architecture, the loss function, the training method, and the inference procedure. With this model, test metrics comparable to the state of the art were obtained while inference ran 15 times faster. Moreover, thanks to the efficient architecture, the system can serve in an online fashion. Quantitative experiments, together with an analysis of the deployed system, are presented to demonstrate the effectiveness of the model.<br />
<br />
== Related Work ==<br />
<br />
Landmark recognition can be regarded as a form of image retrieval, and a large body of literature focuses on image retrieval tasks. Over the past two decades this field has made significant progress, with the main methods falling into two categories. <br />
The first is classic retrieval based on local features: local feature descriptors organized in bag-of-words representations, spatial verification, Hamming embedding, and query expansion. These methods dominated image retrieval until the rise of deep convolutional neural networks (CNNs), which can generate global descriptors of input images.<br />
A related line of work extends Hamming embedding with selective match kernels. With the advent of deep CNNs, the most effective image retrieval methods are based on training CNNs for the specific task. Deep networks provide powerful semantic feature representations, which makes them effective for landmark recognition, though at additional memory and computational cost. <br />
DELF (DEep Local Feature) by Noh et al. showed promising results by combining the classic local-feature approach with deep learning: local features are extracted from the input image and then geometrically verified with RANSAC. <br />
The goal of this work is to describe a method for accurate and fast large-scale landmark recognition that exploits the advantages of deep CNNs.<br />
<br />
<br />
== Methodology ==<br />
<br />
This section will describe in detail the CNN architecture, loss function, training procedure, and inference implementation of the landmark recognition system. The figure below is an overview of the landmark recognition system.<br />
<br />
[[File:t358wang_landmark_recog_system.png |center|800px]]<br />
<br />
The landmark CNN consists of three parts: the main network, the embedding layer, and the classification layer. To obtain a main network suitable for the landmark recognition model, fine-tuning is applied, and several pre-trained backbones (residual networks) trained on similar datasets, including ResNet-50, ResNet-200, SE-ResNeXt-101, and Wide Residual Network (WRN-50-2), are evaluated for inference quality and efficiency. Based on these evaluations, WRN-50-2 is selected as the optimal backbone architecture.<br />
<br />
[[File:t358wang_backbones.png |center|600px]]<br />
<br />
For the embedding layer, as shown in the figure below, the last fully-connected layer after the average pooling is removed. In its place, a fully-connected 2048 <math>\times</math> 512 layer followed by batch normalization is added as the embedding layer. After the batch norm, a fully-connected 512 <math>\times</math> n layer is added as the classification layer. The figure below gives an overview of the CNN architecture of the landmark recognition system.<br />
<br />
[[File:t358wang_network_arch.png |center|800px]]<br />
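As a shape-level sketch (not the authors' implementation), the head described above can be written in a few lines of NumPy: a 2048-d pooled backbone feature passes through a 2048 <math>\times</math> 512 fully-connected layer, batch normalization, and a 512 <math>\times</math> n classifier. The batch size and the number of classes here are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes = 100                      # hypothetical number of landmark classes

# Backbone output after global average pooling: one 2048-d vector per image.
pooled = rng.standard_normal((8, 2048))          # batch of 8 images

# Embedding layer: fully-connected 2048 -> 512 followed by batch norm.
W_embed = 0.01 * rng.standard_normal((2048, 512))
embed = pooled @ W_embed
embed = (embed - embed.mean(axis=0)) / (embed.std(axis=0) + 1e-5)

# Classification layer: fully-connected 512 -> n.
W_cls = 0.01 * rng.standard_normal((512, n_classes))
logits = embed @ W_cls

print(embed.shape, logits.shape)     # (8, 512) (8, 100)
```

The 512-d `embed` vectors are what the centroids are later computed from; the `logits` feed the softmax part of the loss.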
<br />
To effectively determine the embedding vector for each landmark class (its centroid), the network needs to be trained so that the embeddings of each class lie as close as possible to their centroid. Several candidate loss functions are evaluated, including contrastive loss, Arcface, and center loss. Center loss is selected since it achieves the best test results: it learns a center for the embeddings of each class and penalizes the distances between image embeddings and their class centers. In addition, center loss is a simple addition to the softmax loss and is trivial to implement.<br />
<br />
When implementing the loss function, an additional class covering all non-landmark instances is added, and the center loss function is modified as follows. Let <math>n</math> be the number of landmark classes, <math>m</math> the mini-batch size, <math>x_i \in R^d</math> the i-th embedding, and <math>y_i</math> the corresponding label, where <math>y_i \in \{1,...,n,n+1\}</math> and <math>n+1</math> is the label of the non-landmark class. Denote by <math>W \in R^{d \times n}</math> the weights of the classifier layer and by <math>W_j</math> its j-th column. Let <math>c_{y_i}</math> be the center of the embeddings of class <math>y_i</math> from center loss and <math>\lambda</math> the balancing parameter of center loss. Then the final loss function is: <br />
<br />
[[File:t358wang_loss_function.png |center|600px]]<br />
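A minimal NumPy sketch of this kind of loss: standard softmax cross-entropy plus a <math>\lambda</math>-weighted center penalty. Excluding non-landmark instances from the center term is an assumption about how the extra class is handled, and all array sizes below are illustrative, not values from the paper.

```python
import numpy as np

def center_loss(x, y, W, centers, lam, non_landmark=None):
    """Softmax cross-entropy plus a lambda-weighted center penalty.

    x: (m, d) embeddings; y: (m,) integer labels; W: (d, n_total) classifier
    weights; centers: (n, d) centers for the n landmark classes. Instances
    labeled `non_landmark` are excluded from the center term (an assumption).
    """
    logits = x @ W
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(y)), y].mean()

    mask = np.ones(len(y), bool) if non_landmark is None else (y != non_landmark)
    diffs = x[mask] - centers[y[mask]]
    return ce + 0.5 * lam * (diffs ** 2).sum(axis=1).mean()

# Illustrative sizes: 6 samples, 16-d embeddings, 4 landmark classes + 1 non-landmark.
rng = np.random.default_rng(1)
x = rng.standard_normal((6, 16))
y = np.array([0, 1, 2, 2, 3, 4])     # label 4 = non-landmark class
W = rng.standard_normal((16, 5))
centers = rng.standard_normal((4, 16))
loss = center_loss(x, y, W, centers, lam=5e-5, non_landmark=4)
```

Setting `lam=0` recovers plain softmax cross-entropy, which shows why the authors describe center loss as a simple addition to softmax loss.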
<br />
In the training procedure, stochastic gradient descent (SGD) is used as the optimizer with momentum 0.9 and weight decay 5e-3. For the center loss, the parameter <math>\lambda</math> is set to 5e-5. Each image is resized to 256 <math>\times</math> 256, and several data augmentations are applied, including random resized crop, color jitter, and random flip. The training dataset is divided into four parts based on the geographical affiliation of the cities where the landmarks are located: Europe/Russia, North America/Australia/Oceania, Middle East/North Africa, and the Far East. <br />
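The update rule with these hyperparameters can be sketched as follows (a toy illustration; the exact momentum convention of the framework used in the paper may differ):

```python
import numpy as np

def sgd_step(w, grad, velocity, lr, momentum=0.9, weight_decay=5e-3):
    """One SGD update with momentum and L2 weight decay (the
    hyperparameters quoted above); weight decay enters the update
    as an extra gradient term."""
    g = grad + weight_decay * w
    velocity = momentum * velocity - lr * g
    return w + velocity, velocity

# Toy usage: minimize f(w) = w^2 (gradient 2w).
w, v = np.array([1.0]), np.array([0.0])
for _ in range(200):
    w, v = sgd_step(w, 2 * w, v, lr=0.05)
print(float(w[0]))   # close to 0
```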
<br />
The paper introduces curriculum learning for landmark recognition, shown in the figure below. The algorithm is trained for 30 epochs, and the learning rates <math>\alpha_1, \alpha_2, \alpha_3</math> are reduced by a factor of 10 at the 12th and 24th epochs.<br />
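The schedule above amounts to a simple step function, sketched here (the name base_lr stands in for whichever of <math>\alpha_1, \alpha_2, \alpha_3</math> the current stage uses):

```python
def learning_rate(epoch, base_lr):
    """Step schedule from the training procedure: divide the learning
    rate by 10 at the 12th epoch and again at the 24th (of 30 epochs)."""
    if epoch < 12:
        return base_lr
    if epoch < 24:
        return base_lr / 10
    return base_lr / 100

print(learning_rate(12, 0.1))
```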
<br />
[[File:t358wang_algorithm1.png |center|600px]]<br />
<br />
In the inference phase, the paper introduces “centroids”: embedding vectors, computed by averaging embeddings, that describe landmark classes. Computing good centroids is essential for reliably determining whether a query image contains a landmark, and the paper proposes two refinements. First, instead of using all training data for each landmark, data cleaning is performed to remove most of the redundant and irrelevant images. Second, since each landmark can be photographed from different shooting angles, it is more effective to calculate a separate centroid for each angle. Hence, a hierarchical agglomerative clustering algorithm is used to partition the training data of each landmark into several valid clusters, and the set of centroids for a landmark L is given by <math>\mu_{L_j} = \frac{1}{|C_j|} \sum_{i \in C_j} x_i, \; j \in \{1,...,v\}</math>, where <math>v</math> is the number of valid clusters for landmark L and <math>C_j</math> is the j-th cluster. <br />
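The per-cluster averaging can be sketched as follows, assuming the cluster labels have already been produced by some hierarchical agglomerative clustering implementation (the toy embeddings and labels are purely illustrative):

```python
import numpy as np

def cluster_centroids(embeddings, labels):
    """Return one centroid per cluster: the mean embedding of each
    cluster C_j, i.e. mu_j = (1/|C_j|) * sum of x_i over C_j."""
    return np.stack([embeddings[labels == j].mean(axis=0)
                     for j in np.unique(labels)])

# Toy example: two "shooting angles" of one landmark in a 2-D space.
emb = np.array([[1.0, 0.0], [0.9, 0.1],       # cluster 0
                [-1.0, 0.0], [-0.9, -0.1]])   # cluster 1
labels = np.array([0, 0, 1, 1])
print(cluster_centroids(emb, labels))
```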
<br />
Once the centroids are calculated for each landmark class, the system can decide whether an image contains a landmark. The query image is passed through the landmark CNN, and the resulting embedding vector is compared with all centroids by dot-product similarity using approximate k-nearest neighbors (AKNN). To distinguish landmarks from non-landmarks, a threshold <math>\eta</math> is set; if the maximum similarity exceeds it, the image is assigned the corresponding landmark.<br />
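A minimal sketch of this decision rule, with an exact search standing in for the AKNN index and all names and values purely illustrative:

```python
import numpy as np

def recognize(query_embedding, centroids, centroid_classes, eta):
    """Nearest-centroid decision: return the landmark of the most
    similar centroid if its dot-product similarity exceeds eta,
    otherwise report a non-landmark (None)."""
    sims = centroids @ query_embedding
    best = int(np.argmax(sims))
    return centroid_classes[best] if sims[best] > eta else None

centroids = np.array([[1.0, 0.0], [0.0, 1.0]])
classes = ["eiffel_tower", "big_ben"]     # hypothetical landmark labels
print(recognize(np.array([0.9, 0.1]), centroids, classes, eta=0.5))  # eiffel_tower
```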
<br />
The full inference algorithm is described in the below figure.<br />
<br />
[[File:t358wang_algorithm2.png |center|600px]]<br />
<br />
== Experiments and Analysis ==<br />
<br />
'''Offline test'''<br />
<br />
In order to measure the quality of the model, an offline test set was collected and manually labeled. According to the authors' estimates, photos containing landmarks make up 1 − 3% of the total number of photos on average. This distribution was emulated in the offline test, and neither geo-information nor landmark references were used. <br />
The results of this test are presented in the table below. Two metrics were used: Sensitivity, the accuracy of the model on images with landmarks (i.e., Recall), and Specificity, the accuracy of the model on images without landmarks. Several variants of DELF were evaluated, and the best results in terms of sensitivity and specificity are included in the table. The table also contains the results of the model trained with Softmax loss only and with Softmax and Center loss together, so it reflects how each added element improves the approach.<br />
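The two metrics reduce to simple ratios of confusion-matrix counts; a sketch with made-up counts (not the paper's numbers):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity (recall on landmark images) and specificity
    (accuracy on non-landmark images) from confusion counts."""
    return tp / (tp + fn), tn / (tn + fp)

# Made-up counts for illustration only.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=980, fp=20)
print(sens, spec)   # 0.9 0.98
```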
<br />
[[File:t358wang_models_eval.png |center|600px]]<br />
<br />
It is important to understand how the model performs on “rare” landmarks, since little data is available for them. The table below therefore examines the model separately on “rare” and “frequent” landmarks. The column “Part from total number” shows what percentage of the landmark examples in the offline test belongs to each type. The sensitivity on “frequent” landmarks is much higher than on “rare” ones.<br />
<br />
[[File:t358wang_rare_freq.png |center|600px]]<br />
<br />
The behavior of the model on different categories of landmarks in the offline test is analyzed in the table below. These results show that the model works successfully across various categories of landmarks. Predictably, even better results (92% sensitivity and 99.5% specificity) were obtained when the offline test was run with geo-information.<br />
<br />
[[File:t358wang_landmark_category.png |center|600px]]<br />
<br />
'''Revisited Paris dataset'''<br />
<br />
The Revisited Paris dataset (RPar) [2] was also used to measure the quality of the landmark recognition approach. Together with Revisited Oxford (ROxf), it is a standard benchmark for comparing image retrieval algorithms. For recognition, the task is to determine which landmark is contained in the query image; images of the same landmark can be taken from different shooting angles or inside/outside the building. It is therefore reasonable to measure the model's quality on the standard benchmark while adapting it to the task setting, meaning that not all classes from the queries are present in the landmark dataset. Images containing the correct landmark but taken from different shooting angles within the building were transferred to the “junk” category, which does not influence the final score and makes the test markup closer to the model's goal. Results on RPar with and without distractors in medium and hard modes are presented in the tables below. <br />
<br />
[[File:t358wang_methods_eval1.png |center|600px]]<br />
<br />
[[File:t358wang_methods_eval2.png |center|600px]]<br />
<br />
== Comparison ==<br />
<br />
The most efficient recent approaches to landmark recognition are built on fine-tuned CNNs. We chose to compare our method with DELF on how well each performs on recognition tasks. A brief summary is given below:<br />
<br />
[[File:t358wang_comparison.png |center|600px]]<br />
<br />
''' Offline test and timing '''<br />
<br />
Both approaches obtained similar results for image retrieval in the offline test (shown in the sensitivity/specificity table above), but the proposed approach is much faster at the inference stage and more memory-efficient.<br />
<br />
In more detail, during inference DELF requires more forward passes through the CNN, has to search the entire database, and performs RANSAC for geometric verification, all of which makes it much more time-consuming than the proposed approach. Because our approach relies mainly on centroids, it takes less time and stores fewer elements.<br />
<br />
== Conclusion ==<br />
<br />
In this paper we aimed to solve some of the difficulties that emerge when applying landmark recognition at the production level: there might not be a clean and sufficiently large database for interesting tasks, and algorithms should be fast, scalable, and aim for a low false-positive rate and high accuracy.<br />
<br />
While aiming for these goals, we presented a way of cleaning landmark data. Most importantly, we introduced the use of deep CNN embeddings, trained with curriculum learning techniques and a modified version of Center loss, to make recognition fast and scalable. Compared to state-of-the-art methods, this approach shows similar results but is much faster and is suitable for implementation at large scale.<br />
<br />
== References ==<br />
[1] Andrei Boiarov and Eduard Tyantov. 2019. Large Scale Landmark Recognition via Deep Metric Learning. In The 28th ACM International Conference on Information and Knowledge Management (CIKM ’19), November 3–7, 2019, Beijing, China. ACM, New York, NY, USA, 10 pages. https://arxiv.org/pdf/1908.10192.pdf https://doi.org/10.1145/3357384.3357956<br />
<br />
[2] Filip Radenović, Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, and Ondřej Chum. 2018. Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking. arXiv preprint arXiv:1803.11285 (2018).</div>T358wanghttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:T358wang&diff=46077User:T358wang2020-11-23T17:16:31Z<p>T358wang: /* Methodology */</p>
<hr />
<div><br />
== Group ==<br />
Rui Chen, Zeren Shen, Zihao Guo, Taohao Wang<br />
<br />
== Introduction ==<br />
<br />
Landmark recognition is an image retrieval task with its own specific challenges, and this paper provides a new, effective method for recognizing landmark images. The method has been successfully applied to real images, where it effectively identifies statues, buildings, and other characteristic objects.<br />
<br />
There are many difficulties encountered in the development process:<br />
<br />
1. The concept of a landmark cannot be strictly defined, since almost any object or building can serve as one.<br />
<br />
2. The same landmark can be photographed from many different angles, producing images with very different characteristics, yet the system must still identify it accurately. <br />
<br />
3. The landmark recognition system must recognize a very large number of landmarks, and the recognition must achieve high accuracy. <br />
<br />
These requirements mean the system must have a very low false alarm rate and high recognition accuracy. <br />
There are also three practical constraints:<br />
<br />
1. The training dataset is huge, not entirely clean, and contains some erroneous content.<br />
<br />
2. The algorithm for learning from the training set must be fast and scalable.<br />
<br />
3. The system must produce high-quality landmark predictions without relying on the images' geographic information.<br />
<br />
The article describes the deep convolutional neural network (CNN) architecture, loss function, training method, and inference procedure. Quantitative experiments, tests, and an analysis of the deployed system demonstrate the effectiveness of the model. Finally, the proposed method is compared with state-of-the-art models and conclusions are drawn.<br />
<br />
== Related Work ==<br />
<br />
Landmark recognition can be regarded as a form of image retrieval, and a large body of literature focuses on image retrieval tasks. Over the past two decades this field has made significant progress, and the main methods fall into two categories. <br />
The first is the classic retrieval approach based on local features: local feature descriptors organized in bag-of-words representations, spatial verification, Hamming embedding, and query expansion. These methods dominated image retrieval until the rise of deep convolutional neural networks (CNNs), which are now used to generate global descriptors of input images.<br />
A related line of work extends Hamming embedding with selective match kernels. With the advent of deep CNNs, the most effective image retrieval methods are based on training CNNs for the specific task. Deep networks are very powerful for semantic feature representation, which allows them to be used effectively for landmark recognition; this approach shows good results but brings additional memory and complexity costs. <br />
DELF (DEep Local Feature) by Noh et al. showed promising results by combining the classic local-feature approach with deep learning: local features are extracted from the input image and RANSAC is then used for geometric verification. <br />
The goal of this work is to describe a method for accurate and fast large-scale landmark recognition that exploits the advantages of deep convolutional neural networks.<br />
<br />
<br />
== Methodology ==<br />
<br />
This section describes in detail the CNN architecture, loss function, training procedure, and inference implementation of the landmark recognition system. The figure below gives an overview of the system.<br />
<br />
[[File:t358wang_landmark_recog_system.png |center|800px]]<br />
<br />
The landmark CNN consists of three parts: the main (backbone) network, the embedding layer, and the classification layer. To obtain a backbone suitable for the landmark recognition model, several pre-trained backbones (Residual Networks), including ResNet-50, ResNet-200, SE-ResNeXt-101, and Wide Residual Network (WRN-50-2), were fine-tuned on similar datasets and evaluated for inference quality and efficiency. Based on this evaluation, WRN-50-2 was selected as the optimal backbone architecture.<br />
<br />
== References ==</div>T358wanghttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=46076stat441F212020-11-23T17:12:31Z<p>T358wang: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
= Record your contributions here [https://docs.google.com/spreadsheets/d/10CHiJpAylR6kB9QLqN7lZHN79D9YEEW6CDTH27eAhbQ/edit?usp=sharing]=<br />
<br />
Use the following notations:<br />
<br />
P: You have written a summary/critique on the paper.<br />
<br />
T: You had a technical contribution on a paper (excluding the paper that you present).<br />
<br />
E: You had an editorial contribution on a paper (excluding the paper that you present).<br />
<br />
=Paper presentation=<br />
{| class="wikitable"<br />
<br />
{| border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 ||Sharman Bharat, Li Dylan, Lu Leonie, Li Mingdao || 1|| Risk prediction in life insurance industry using supervised learning algorithms || [https://rdcu.be/b780J Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Bsharman Summary] ||<br />
[https://www.youtube.com/watch?v=TVLpSFYgF0c&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Delaney Smith, Mohammad Assem Mahmoud || 2|| Influenza Forecasting Framework based on Gaussian Processes || [https://proceedings.icml.cc/static/paper_files/icml/2020/1239-Paper.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Influenza_Forecasting_Framework_based_on_Gaussian_Processes Summary]|| [https://www.youtube.com/watch?v=HZG9RAHhpXc&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Tatianna Krikella, Swaleh Hussain, Grace Tompkins || 3|| Processing of Missing Data by Neural Networks || [http://papers.nips.cc/paper/7537-processing-of-missing-data-by-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Gtompkin Summary] || [https://learn.uwaterloo.ca/d2l/ext/rp/577051/lti/framedlaunch/6ec1ebea-5547-46a2-9e4f-e3dc9d79fd54]<br />
|-<br />
|Week of Nov 16 ||Jonathan Chow, Nyle Dharani, Ildar Nasirov ||4 ||Streaming Bayesian Inference for Crowdsourced Classification ||[https://papers.nips.cc/paper/9439-streaming-bayesian-inference-for-crowdsourced-classification.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Streaming_Bayesian_Inference_for_Crowdsourced_Classification Summary] || [https://www.youtube.com/watch?v=UgVRzi9Ubws]<br />
|-<br />
|Week of Nov 16 || Matthew Hall, Johnathan Chalaturnyk || 5|| Neural Ordinary Differential Equations || [https://papers.nips.cc/paper/7892-neural-ordinary-differential-equations.pdf] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Neural_ODEs Summary]||<br />
|-<br />
|Week of Nov 16 || Luwen Chang, Qingyang Yu, Tao Kong, Tianrong Sun || 6|| Adversarial Attacks on Copyright Detection Systems || Paper [https://proceedings.icml.cc/static/paper_files/icml/2020/1894-Paper.pdf] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Adversarial_Attacks_on_Copyright_Detection_Systems Summary] || [https://www.youtube.com/watch?v=bQI9S6bCo8o]<br />
|-<br />
|Week of Nov 16 || Casey De Vera, Solaiman Jawad || 7|| IPBoost – Non-Convex Boosting via Integer Programming || [https://arxiv.org/pdf/2002.04679.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=IPBoost Summary] || [https://www.youtube.com/watch?v=4DhJDGC4pyI&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Yuxin Wang, Evan Peters, Yifan Mou, Sangeeth Kalaichanthiran || 8|| What Game Are We Playing? End-to-end Learning in Normal and Extensive Form Games || [https://arxiv.org/pdf/1805.02777.pdf] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=what_game_are_we_playing Summary] || [https://www.youtube.com/watch?v=9qJoVxo3hnI&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Yuchuan Wu || 9|| || || ||<br />
|-<br />
|Week of Nov 16 || Zhou Zeping, Siqi Li, Yuqin Fang, Fu Rao || 10|| A survey of neural network-based cancer prediction models from microarray data || [https://www.sciencedirect.com/science/article/pii/S0933365717305067 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Y93fang Summary] || [https://youtu.be/B8pPUU8ypO0]<br />
|-<br />
|Week of Nov 23 ||Jinjiang Lian, Jiawen Hou, Yisheng Zhu, Mingzhe Huang || 11|| DROCC: Deep Robust One-Class Classification || [https://proceedings.icml.cc/static/paper_files/icml/2020/6556-Paper.pdf paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:J46hou Summary] || [https://www.youtube.com/watch?v=tvCEvvy54X8&ab_channel=JJLian]<br />
|-<br />
|Week of Nov 23 || Bushra Haque, Hayden Jones, Michael Leung, Cristian Mustatea || 12|| Combine Convolution with Recurrent Networks for Text Classification || [https://arxiv.org/pdf/2006.15795.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Cvmustat Summary] || [https://www.youtube.com/watch?v=or5RTxDnZDo]<br />
|-<br />
|Week of Nov 23 || Taohao Wang, Zeren Shen, Zihao Guo, Rui Chen || 13|| Large Scale Landmark Recognition via Deep Metric Learning || [https://arxiv.org/pdf/1908.10192.pdf paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:T358wang Summary] ||<br />
|-<br />
|Week of Nov 23 || Qianlin Song, William Loh, Junyue Bai, Phoebe Choi || 14|| Task Understanding from Confusing Multi-task Data || [https://proceedings.icml.cc/static/paper_files/icml/2020/578-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Task_Understanding_from_Confusing_Multi-task_Data Summary] || [https://youtu.be/i_5PQdfuH-Y]<br />
|-<br />
|Week of Nov 23 || Rui Gong, Xuetong Wang, Xinqi Ling, Di Ma || 15|| Semantic Relation Classification via Convolution Neural Network|| [https://www.aclweb.org/anthology/S18-1127.pdf paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Semantic_Relation_Classification——via_Convolution_Neural_Network Summary]||<br />
|-<br />
|Week of Nov 23 || Xiaolan Xu, Robin Wen, Yue Weng, Beizhen Chang || 16|| Graph Structure of Neural Networks || [https://proceedings.icml.cc/paper/2020/file/757b505cfd34c64c85ca5b5690ee5293-Paper.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Graph_Structure_of_Neural_Networks Summary] || [https://youtu.be/x9eUgEwntcs Video]<br />
|-<br />
|Week of Nov 23 ||Hansa Halim, Sanjana Rajendra Naik, Samka Marfua, Shawrupa Proshasty || 17|| Superhuman AI for multiplayer poker || [https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Superhuman_AI_for_Multiplayer_Poker Summary]|| [https://www.youtube.com/watch?v=kazqcOwbtTI Video]<br />
|-<br />
|Week of Nov 23 ||Guanting Pan, Haocheng Chang, Zaiwei Zhang || 18|| Point-of-Interest Recommendation: Exploiting Self-Attentive Autoencoders with Neighbor-Aware Influence || [https://arxiv.org/pdf/1809.10770.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Point-of-Interest_Recommendation:_Exploiting_Self-Attentive_Autoencoders_with_Neighbor-Aware_Influence Summary] || [https://www.youtube.com/watch?v=aAwjaos_Hus]<br />
|-<br />
|Week of Nov 23 || Jerry Huang, Daniel Jiang, Minyan Dai || 19|| Neural Speed Reading Via Skim-RNN ||[https://arxiv.org/pdf/1711.02085.pdf?fbclid=IwAR3EeFsKM_b5p9Ox7X9mH-1oI3U3oOKPBy3xUOBN0XvJa7QW2ZeJJ9ypQVo Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Neural_Speed_Reading_via_Skim-RNN Summary]|| [https://youtu.be/vOENmt9jgVE Video]<br />
|-<br />
|Week of Nov 23 ||Ruixian Chin, Yan Kai Tan, Jason Ong, Wen Cheen Chiew || 20|| DivideMix: Learning with Noisy Labels as Semi-supervised Learning || [https://openreview.net/pdf?id=HJgExaVtwr Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Yktan Summary]||<br />
|-<br />
|Week of Nov 30 || Banno Dion, Battista Joseph, Kahn Solomon || 21|| Music Recommender System Based on Genre using Convolutional Recurrent Neural Networks || [https://www.sciencedirect.com/science/article/pii/S1877050919310646] || ||<br />
|-<br />
|Week of Nov 30 || Sai Arvind Budaraju, Isaac Ellmen, Dorsa Mohammadrezaei, Emilee Carson || 22|| A universal SNP and small-indel variant caller using deep neural networks||[https://www.nature.com/articles/nbt.4235.epdf?author_access_token=q4ZmzqvvcGBqTuKyKgYrQ9RgN0jAjWel9jnR3ZoTv0NuM3saQzpZk8yexjfPUhdFj4zyaA4Yvq0LWBoCYQ4B9vqPuv8e2HHy4vShDgEs8YxI_hLs9ov6Y1f_4fyS7kGZ Paper] || ||<br />
|-<br />
|Week of Nov 30 || Daniel Fagan, Cooper Brooke, Maya Perelman || 23|| Efficient kNN Classification With Different Number of Nearest Neighbors || [https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7898482 Paper] || ||<br />
|-<br />
|Week of Nov 30 || Karam Abuaisha, Evan Li, Jason Pu, Nicholas Vadivelu || 24|| Being Bayesian about Categorical Probability || [https://proceedings.icml.cc/static/paper_files/icml/2020/3560-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Anas Mahdi, Will Thibault, Jan Lau, Jiwon Yang || 25|| Loss Function Search for Face Recognition || [https://proceedings.icml.cc/static/paper_files/icml/2020/245-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 ||Zihui (Betty) Qin, Wenqi (Maggie) Zhao, Muyuan Yang, Amartya (Marty) Mukherjee || 26|| Deep Learning for Cardiologist-level Myocardial Infarction Detection in Electrocardiograms || [https://arxiv.org/pdf/1912.07618.pdf?fbclid=IwAR0RwATSn4CiT3qD9LuywYAbJVw8YB3nbex8Kl19OCExIa4jzWaUut3oVB0 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Cardiologist-level_Myocardial_Infarction_Detection_in_Electrocardiograms&fbclid=IwAR1Tad2DAM7LT6NXXuSYDZtHHBvN0mjZtDdCOiUFFq_XwVcQxG3hU-3XcaE Summary] ||<br />
|-<br />
|Week of Nov 30 || Stan Lee, Seokho Lim, Kyle Jung, Daehyun Kim || 27|| Bag of Tricks for Efficient Text Classification || [https://arxiv.org/pdf/1607.01759.pdf paper] || ||<br />
|-<br />
|Week of Nov 30 || Yawen Wang, Danmeng Cui, ZiJie Jiang, Mingkang Jiang, Haotian Ren, Haris Bin Zahid || 28|| A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques || [https://arxiv.org/pdf/1707.02919.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Describtion_of_Text_Mining Summary] ||<br />
|-<br />
|Week of Nov 30 || Qing Guo, XueGuang Ma, James Ni, Yuanxin Wang || 29|| Mask R-CNN || [https://arxiv.org/pdf/1703.06870.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Bertrand Sodjahin, Junyi Yang, Jill Yu Chieh Wang, Yu Min Wu, Calvin Li || 30|| Research paper classification systems based on TF‑IDF and LDA schemes || [https://hcis-journal.springeropen.com/articles/10.1186/s13673-019-0192-7?fbclid=IwAR3swO-eFrEbj1BUQfmomJazxxeFR6SPgr6gKayhs38Y7aBG-zX1G3XWYRM Paper] || ||<br />
|-<br />
|Week of Nov 30 || Daniel Zhang, Jacky Yao, Scholar Sun, Russell Parco, Ian Cheung || 31 || Speech2Face: Learning the Face Behind a Voice || [https://arxiv.org/pdf/1905.09773.pdf?utm_source=thenewstack&utm_medium=website&utm_campaign=platform Paper] || ||<br />
|-<br />
|Week of Nov 30 || Siyuan Xia, Jiaxiang Liu, Jiabao Dong, Yipeng Du || 32 || Evaluating Machine Accuracy on ImageNet || [https://proceedings.icml.cc/static/paper_files/icml/2020/6173-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Mushi Wang, Siyuan Qiu, Yan Yu || 33 || Surround Vehicle Motion Prediction Using LSTM-RNN for Motion Planning of Autonomous Vehicles at Multi-Lane Turn Intersections || [https://ieeexplore.ieee.org/abstract/document/8957421 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Surround_Vehicle_Motion_Prediction Summary] ||</div>T358wanghttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:T358wang&diff=46074User:T358wang2020-11-23T17:11:40Z<p>T358wang: </p>
<hr />
<div><br />
== Group ==<br />
Rui Chen, Zeren Shen, Zihao Guo, Taohao Wang<br />
<br />
== Introduction ==<br />
<br />
Landmark recognition is an image retrieval task with its own specific challenges, and this paper provides a new and effective method for recognizing landmark images. The method has been successfully applied to real images, allowing statues, buildings, and other characteristic objects to be identified effectively.<br />
<br />
There are many difficulties encountered in the development process:<br />
<br />
1. The concept of a landmark cannot be strictly defined, because almost any object or building can serve as one.<br />
<br />
2. The same landmark can be photographed from many different angles, and multi-angle shots of the same landmark can have very different visual characteristics; nevertheless, the system must identify each landmark accurately. <br />
<br />
3. The system must recognize a very large number of landmarks while maintaining high accuracy. <br />
<br />
These requirements mean the system must combine a very low false-alarm rate with high recognition accuracy. <br />
There are also three potential problems:<br />
<br />
1. The available dataset is huge and noisy: it contains some mislabeled images, and its content is not clean.<br />
<br />
2. The algorithm that learns from the training set must be fast and scalable.<br />
<br />
3. The system must produce high-quality landmark predictions without relying on any geographic information attached to the images.<br />
<br />
The article describes the deep convolutional neural network (CNN) architecture, loss function, training method, and inference procedure. Quantitative experiments, along with analysis of testing and deployment results, demonstrate the effectiveness of the model. Finally, the proposed method is compared with state-of-the-art models and conclusions are drawn.<br />
<br />
== Related Work ==<br />
<br />
Landmark recognition can be regarded as a special case of image retrieval, and a large body of literature focuses on image retrieval tasks. Over the past two decades, the field of image retrieval has made significant progress, and the main methods can be divided into two categories. <br />
The first is the classic retrieval approach using local features: methods based on local feature descriptors organized in bag-of-words models, spatial verification, Hamming embedding, and query expansion. These methods dominated image retrieval until the rise of deep convolutional neural networks (CNNs), which are now used to generate global descriptors of input images.<br />
The second direction extends Hamming embedding with selective match kernels. With the advent of deep CNNs, the most effective image retrieval methods are based on training CNNs for the specific task. Deep networks are very powerful for semantic feature representation, which makes them effective for landmark recognition, although they bring additional memory and complexity costs. <br />
DELF (DEep Local Feature) by Noh et al. showed promising results by combining the classic local-feature approach with deep learning: local features are extracted from the input image and then verified geometrically with RANSAC. <br />
The goal of this work is to describe a method for accurate and fast large-scale landmark recognition that exploits the advantages of deep convolutional neural networks.<br />
<br />
<br />
== Methodology ==<br />
<br />
This section will describe in detail the CNN architecture, loss function, training procedure, and inference implementation of the landmark recognition system. The figure below is an overview of the landmark recognition system.<br />
<br />
[[File:t358wang_landmark_recog_system.png |center|800px]]<br />
<br />
The landmark CNN consists of three parts: the main network, the embedding layer, and the classification layer. To obtain a main network suitable for the landmark recognition model, several backbones (residual networks) pre-trained on similar datasets, including ResNet-50, ResNet-200, SE-ResNeXt-101, and Wide Residual Network (WRN-50-2), are fine-tuned and evaluated on inference quality and efficiency. Based on this evaluation, WRN-50-2 is selected as the optimal backbone architecture.<br />
<br />
[[File:t358wang_backbones.png |center|600px]]<br />
<br />
For the embedding layer, as shown in the figure below, the last fully-connected layer after the average pooling is removed. Instead, a fully-connected 2048 <math>\times</math> 512 layer followed by batch normalization is added as the embedding layer. After the batch norm, a fully-connected 512 <math>\times</math> n layer is added as the classification layer. The figure below gives an overview of the CNN architecture of the landmark recognition system.<br />
<br />
[[File:t358wang_network_arch.png |center|800px]]<br />
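To make the layer dimensions concrete, the head described above can be sketched in a few lines of numpy: a 2048-d pooled feature is mapped to a 512-d embedding (fully connected plus batch norm in its inference form), then to n class logits. Function and parameter names here are illustrative, not taken from the paper.<br />

```python
import numpy as np

def embedding_head(pooled, W_embed, bn_gamma, bn_beta, bn_mean, bn_var, W_cls, eps=1e-5):
    """Hypothetical forward pass of the head described above:
    (batch, 2048) pooled features -> (batch, 512) embeddings -> (batch, n) logits."""
    z = pooled @ W_embed                                            # FC 2048 x 512
    z = bn_gamma * (z - bn_mean) / np.sqrt(bn_var + eps) + bn_beta  # batch norm (inference form)
    logits = z @ W_cls                                              # FC 512 x n (classifier)
    return z, logits
```

At inference time only the 512-d embedding <code>z</code> is used (for comparison against centroids); the classifier layer matters only during training.<br />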
<br />
To effectively determine the embedding vectors (centroids) for each landmark class, the network needs to be trained so that the members of each class lie as close as possible to their centroid. Several suitable loss functions are evaluated, including contrastive loss, Arcface, and center loss. Center loss is selected since it achieves the best test results: it learns a center of embeddings for each class and penalizes the distances between image embeddings and their class centers. In addition, center loss is a simple addition to softmax loss and is trivial to implement.<br />
<br />
When implementing the loss function, a new additional class containing all non-landmark instances is added, and the center loss function is modified as follows. Let <math>n</math> be the number of landmark classes, <math>m</math> the mini-batch size, <math>x_i \in R^d</math> the <math>i</math>-th embedding and <math>y_i</math> the corresponding label, where <math>y_i \in \{1,...,n,n+1\}</math> and <math>n+1</math> is the label of the non-landmark class. Denote by <math>W \in R^{d \times n}</math> the weights of the classifier layer and by <math>W_j</math> its <math>j</math>-th column. Let <math>c_{y_i}</math> be the <math>y_i</math>-th embedding center from the center loss and <math>\lambda</math> the balancing parameter of the center loss. The final loss function is then: <br />
<br />
[[File:t358wang_loss_function.png |center|600px]]<br />
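The combined objective (softmax cross-entropy plus the center-loss penalty weighted by <math>\lambda</math>) can be sketched from scratch in numpy as below. This is an illustration of the formula, not the authors' implementation; averaging both terms over the mini-batch is an assumed convention.<br />

```python
import numpy as np

def center_softmax_loss(X, y, W, centers, lam):
    """Softmax cross-entropy plus the center-loss penalty described above.
    X: (m, d) embeddings; y: (m,) labels in {0..n} (class n = non-landmark);
    W: (d, n+1) classifier weights; centers: (n+1, d); lam: balancing parameter."""
    logits = X @ W
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(y)), y].mean()          # softmax cross-entropy
    center_term = 0.5 * ((X - centers[y]) ** 2).sum(axis=1).mean()  # (1/2)||x_i - c_{y_i}||^2
    return ce + lam * center_term
```

With <code>lam = 0</code> this reduces to plain softmax loss, matching the remark that center loss is a simple addition to it.<br />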
<br />
In the training procedure, stochastic gradient descent (SGD) is used as the optimizer with momentum = 0.9 and weight decay = 5e-3. For the center loss function, the parameter <math>\lambda</math> is set to 5e-5. Each image is resized to 256 <math>\times</math> 256, and several data augmentations are applied to the dataset, including random resized crop, color jitter, and random flip. The training dataset is divided into four parts based on the geographical affiliation of the cities where the landmarks are located: Europe/Russia, North America/Australia/Oceania, Middle East/North Africa, and the Far East. <br />
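The optimizer settings quoted above (momentum 0.9, weight decay 5e-3) correspond to a classical SGD-with-momentum update. A minimal scalar sketch, assuming the common convention of folding weight decay into the gradient as an L2 term (the paper does not spell this detail out):<br />

```python
def sgd_step(w, grad, velocity, lr, momentum=0.9, weight_decay=5e-3):
    """One SGD update with momentum and L2 weight decay, using the
    hyperparameters quoted above; a sketch, not the authors' code."""
    g = grad + weight_decay * w              # weight decay folded into the gradient
    velocity = momentum * velocity - lr * g  # momentum accumulation
    return w + velocity, velocity
```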
<br />
The paper introduces curriculum learning for landmark recognition, shown in the figure below. The network is trained for 30 epochs, and the learning rates <math>\alpha_1, \alpha_2, \alpha_3</math> are each reduced by a factor of 10 at the 12th and 24th epochs.<br />
<br />
[[File:t358wang_algorithm1.png |center|600px]]<br />
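The learning-rate schedule used in the algorithm above (drops by a factor of 10 at the 12th and 24th of 30 epochs) is a simple step function; 0-based epoch indexing is an assumption here:<br />

```python
def learning_rate(epoch, base_lr):
    """Step schedule implied above: base LR for epochs 0-11,
    base/10 for epochs 12-23, base/100 for epochs 24-29."""
    if epoch < 12:
        return base_lr
    if epoch < 24:
        return base_lr / 10
    return base_lr / 100
```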
<br />
In the inference phase, the paper introduces the term “centroids”: embedding vectors, calculated by averaging embeddings, that describe landmark classes. Calculating good centroids is essential for effectively determining whether a query image contains a landmark. The paper proposes two refinements for computing them. First, instead of using the entire training data for each landmark, data cleaning is applied to remove most of the redundant and irrelevant images. Second, since each landmark can be photographed from different angles, it is more effective to calculate a separate centroid for each shooting angle. Hence, a hierarchical agglomerative clustering algorithm is used to partition the training data of each landmark into several valid clusters, and the set of centroids for a landmark L is given by <math>\mu_{l_j} = \frac{1}{|C_j|} \sum_{i \in C_j} x_i, \ j \in \{1,...,v\}</math>, where <math>v</math> is the number of valid clusters for landmark L and <math>C_j</math> is the set of images in the <math>j</math>-th cluster. <br />
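Given a clustering of a landmark's images into valid clusters, each centroid is just the mean embedding of one cluster. A small numpy sketch (the hierarchical agglomerative clustering step itself is assumed to have been run already; names are illustrative):<br />

```python
import numpy as np

def landmark_centroids(X, cluster_ids):
    """Average the embeddings in each valid cluster to obtain one centroid
    per shooting angle. X: (num_images, d); cluster_ids[i]: cluster of X[i]."""
    cluster_ids = np.asarray(cluster_ids)
    return np.stack([X[cluster_ids == c].mean(axis=0)
                     for c in np.unique(cluster_ids)])
```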
<br />
Once the centroids are calculated for each landmark class, the system can make decisions whether there is any landmark in an image. The query image is passed through the landmark CNN and the resulting embedding vector is compared with all centroids by dot product similarity using approximate k-nearest neighbors (AKNN). To distinguish landmark classes from non-landmark, a threshold <math>\eta</math> is set and it will be compared with the maximum similarity to determine if the image contains any landmarks.<br />
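The decision rule above can be sketched exactly as a brute-force scan over all centroids; the paper uses approximate k-nearest neighbors instead of this exhaustive comparison for scalability. Function and variable names are illustrative:<br />

```python
import numpy as np

def recognize(query_emb, centroids, labels, eta):
    """Exact version of the inference rule described above.
    Returns the matched landmark label, or None if no similarity exceeds eta."""
    sims = centroids @ query_emb       # dot-product similarity to every centroid
    best = int(np.argmax(sims))
    return labels[best] if sims[best] > eta else None
```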
<br />
The full inference algorithm is described in the below figure.<br />
<br />
[[File:t358wang_algorithm1.png |center|600px]]<br />
<br />
== Experiments and Analysis ==<br />
<br />
'''Offline test'''<br />
<br />
In order to measure the quality of the model, an offline test set was collected and manually labeled. According to the calculations, photos containing landmarks make up 1–3% of the total number of photos on average. This distribution was emulated in the offline test, and geo-information and landmark references were not used. <br />
The results of this test are presented in Table 4. Two metrics were used to measure the experimental results: sensitivity, the accuracy of the model on images with landmarks (also called recall), and specificity, the accuracy of the model on images without landmarks. Several variants of DELF were evaluated, and the best results in terms of sensitivity and specificity are included in Table 4. The table also contains results for the model trained with softmax loss only, and with softmax and center loss. Thus, Table 4 reflects the improvement gained from each element added to the approach.<br />
<br />
[[File:t358wang_models_eval.png |center|600px]]<br />
<br />
It is particularly important to understand how the model performs on “rare” landmarks, since only a small amount of data is available for them. Therefore, the behavior of the model was examined separately on “rare” and “frequent” landmarks in Table 5. The column “Part from total number” shows what percentage of the landmark examples in the offline test belong to the corresponding type of landmark.<br />
<br />
[[File:t358wang_rare_freq.png |center|600px]]<br />
<br />
The behavior of the model across different categories of landmarks in the offline test is presented in Table 6. These results show that the model works successfully on a wide variety of landmark categories. As expected, even better results (92% sensitivity and 99.5% specificity) were obtained when the offline test was run with geo-information.<br />
<br />
[[File:t358wang_landmark_category.png |center|600px]]<br />
<br />
'''Revisited Paris dataset'''<br />
<br />
The Revisited Paris dataset (RPar) was also used to measure the quality of the landmark recognition approach. This dataset, together with Revisited Oxford (ROxf), is a standard benchmark for comparing image retrieval algorithms. In recognition, the goal is to determine which landmark is contained in the query image. Images of the same landmark can have different shooting angles or be taken inside or outside the building. It is therefore reasonable to measure the quality of the model on the standard benchmark while adapting it to the task setting; this means that not all classes from the queries are present in the landmark dataset. Images that contain the correct landmark but are taken from different shooting angles inside the building were moved to the “junk” category, which does not influence the final score and makes the test markup closer to the model’s goal. Results on RPar with and without distractors in medium and hard modes are presented in Tables 7 and 8. <br />
<br />
[[File:t358wang_methods_eval1.png |center|600px]]<br />
<br />
[[File:t358wang_methods_eval2.png |center|600px]]<br />
<br />
<br />
== Comparison ==<br />
<br />
The most efficient recent approaches to landmark recognition are built on fine-tuned CNNs. We compare our method with DELF on how well each performs on recognition tasks. A brief summary is given below:<br />
<br />
[[File:t358wang_comparison.png |center|600px]]<br />
<br />
<br />
'''Offline test and timing'''<br />
<br />
Both approaches obtained similar results for image retrieval in the offline test (shown in Table 4), but the proposed approach is much faster at the inference stage and more memory-efficient.<br />
<br />
In more detail, during the inference stage DELF needs more forward passes through the CNN, has to search the entire database, and performs the RANSAC method for geometric verification, all of which make it much more time-consuming than the proposed approach. Our approach mainly relies on centroids, which takes less time and requires storing fewer elements.<br />
<br />
<br />
== Conclusion ==<br />
<br />
In this paper we aimed to solve several difficulties that emerge when applying landmark recognition at the production level: there might not be a clean and sufficiently large database for interesting tasks, and algorithms should be fast, scalable, and aim for a low false-positive rate and high accuracy.<br />
<br />
While pursuing these goals, we presented a way of cleaning landmark data. Most importantly, we introduced the use of deep CNN embeddings, trained with curriculum learning techniques and a modified version of center loss, to make recognition fast and scalable. Compared to state-of-the-art methods, this approach shows similar results but is much faster and suitable for implementation at large scale.<br />
<br />
== References ==</div>
<hr />
<div><br />
== Group ==<br />
Rui Chen, Zeren Shen, Zihao Guo, Taohao Wang<br />
<br />
== Introduction ==<br />
<br />
Landmark recognition is an image retrieval task with its own specific challenges and this paper provides a new and effective method to recognize landmark images. And this method has been successfully applied to actual images. In this way, statue buildings and characteristic objects can be effectively identified.<br />
<br />
There are many difficulties encountered in the development process:<br />
<br />
1. The first problem is that the concept of landmarks cannot be strictly defined, because landmarks can be any object and building.<br />
<br />
2. The second problem is that the same landmark can be photographed from different angles. The results of the multi-angle shooting will result in very different picture characteristics. But the system needs to accurately identify different landmarks. <br />
<br />
3. The third problem is that the landmark recognition system must recognize a large number of landmarks, and the recognition must achieve high accuracy. <br />
<br />
These problems require that the system should have a very low false alarm rate and high recognition accuracy. <br />
There are also three potential problems:<br />
<br />
1. The processed data set contains a little error content, the image content is not clean and the quantity is huge.<br />
<br />
2. The learning algorithm must be fast and scalable.<br />
<br />
3. The system must deliver high-quality landmark judgments without relying on the images' geographic metadata.<br />
<br />
The article describes the deep convolutional neural network (CNN) architecture, the loss function, the training procedure, and inference. Quantitative experiments, together with an analysis of the deployed system, demonstrate the effectiveness of the model. Finally, the proposed method is compared against state-of-the-art models and conclusions are drawn.<br />
<br />
== Related Work ==<br />
<br />
Landmark recognition can be regarded as one of the tasks of image retrieval, and a large body of literature is devoted to image retrieval. Over the past two decades the field has made significant progress, and the main methods fall into two categories. <br />
The first is classic retrieval using local features: local feature descriptors organized in a bag-of-words representation, combined with spatial verification, Hamming embedding, and query expansion. These methods dominated image retrieval until the rise of deep convolutional neural networks (CNNs), which can generate global descriptors of input images.<br />
The second extends Hamming embedding with selective match kernels. With the advent of deep CNNs, the most effective image retrieval methods are based on training CNNs for the specific task. Deep networks are very powerful for semantic feature representation, which makes them effective for landmark recognition, though at the cost of additional memory and complexity. <br />
The DELF (DEep Local Feature) approach by Noh et al. showed promising results. It combines classic local features with deep learning: local features are extracted from the input image and RANSAC is then used for geometric verification. <br />
The goal of this project is to describe a method for accurate and fast large-scale landmark recognition that exploits the advantages of deep convolutional neural networks.<br />
<br />
<br />
== Methodology ==<br />
<br />
This section will describe in detail the CNN architecture, loss function, training procedure, and inference implementation of the landmark recognition system. The figure below is an overview of the landmark recognition system.<br />
<br />
The landmark CNN consists of three parts: the main network, the embedding layer, and the classification layer. To obtain a main network suitable for the landmark recognition model, fine-tuning is applied to several backbones (residual networks) pre-trained on other similar datasets, including ResNet-50, ResNet-200, SE-ResNeXt-101, and Wide Residual Network (WRN-50-2), which are evaluated for inference quality and efficiency. Based on this evaluation, WRN-50-2 is selected as the optimal backbone architecture.<br />
<br />
For the embedding layer, as shown in figure x, the last fully-connected layer after the average pooling is removed. Instead, a fully-connected 2048 <math>\times</math> 512 layer and a batch-normalization layer are added as the embedding layer. After the batch norm, a fully-connected 512 <math>\times</math> n layer is added as the classification layer. The figure below shows the overview of the CNN architecture of the landmark recognition system.<br />
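The head described above can be sketched numerically (a minimal NumPy illustration, not the authors' implementation; the weight shapes follow the text, while the omission of bias terms and the use of per-batch statistics for batch norm are simplifications):<br />

```python
import numpy as np

def head_forward(pooled, W_embed, gamma, beta, W_cls):
    """Sketch of the embedding + classification head described above.

    pooled : (m, 2048) average-pooled backbone features
    W_embed: (2048, 512) fully-connected embedding weights
    gamma, beta: (512,) batch-norm scale and shift
    W_cls  : (512, n) classification-layer weights
    Batch norm is shown in its training form (per-batch statistics);
    bias terms are omitted for brevity.
    """
    z = pooled @ W_embed                                   # 2048 x 512 FC layer
    z = (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + 1e-5)
    embedding = gamma * z + beta                           # batch-normalized embedding
    logits = embedding @ W_cls                             # 512 x n classification layer
    return embedding, logits
```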
<br />
To effectively determine the embedding vectors for each landmark class (centroids), the network is trained so that the members of each class lie as close as possible to their centroid. Several suitable loss functions are evaluated, including contrastive loss, Arcface, and Center loss. Center loss is selected since it achieves the best test results: it learns a center of embeddings for each class and penalizes the distances between image embeddings and their class centers. In addition, Center loss is a simple addition to softmax loss and is trivial to implement.<br />
<br />
When implementing the loss function, a new additional class that includes all non-landmark instances needs to be added, and the Center loss function needs to be modified as follows. Let n be the number of landmark classes and m the mini-batch size, let <math>x_i \in R^d</math> be the i-th embedding and <math>y_i</math> the corresponding label, where <math>y_i \in \{1,...,n,n+1\}</math> and n+1 is the label of the non-landmark class. Denote by <math>W \in R^{d \times n}</math> the weights of the classifier layer and by <math>W_j</math> its j-th column. Let <math>c_{y_i}</math> be the <math>y_i</math>-th embedding center from Center loss and <math>\lambda</math> the balancing parameter of Center loss. Then the final loss function will be: xxx<br />
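The final loss is only given as a placeholder above. A plausible reconstruction, assuming the standard Center loss of Wen et al. is added to softmax loss, that the center term is restricted to the n landmark classes (the non-landmark class has no center), and that W is extended with a column for the extra class, is<br />
<math>L = -\sum_{i=1}^{m} \log \frac{e^{W_{y_i}^T x_i}}{\sum_{j=1}^{n+1} e^{W_j^T x_i}} + \frac{\lambda}{2} \sum_{i=1}^{m} \mathbb{1}[y_i \le n] \, \|x_i - c_{y_i}\|_2^2.</math><br />
A NumPy sketch of this reconstructed loss (labels are 0-indexed here, with the last index playing the role of the non-landmark class; this is an illustration, not the paper's code):<br />

```python
import numpy as np

def landmark_loss(X, y, W, centers, lam):
    """Softmax loss plus the modified Center loss (hypothetical sketch).

    X: (m, d) embeddings; y: (m,) labels in {0, ..., n}, where index n is
    the extra non-landmark class; W: (d, n+1) classifier weights;
    centers: (n, d) learned class centers; lam: balancing parameter lambda.
    """
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)            # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    softmax_loss = -log_probs[np.arange(len(y)), y].mean()

    is_landmark = y < centers.shape[0]                     # no center for non-landmark class
    if is_landmark.any():
        diffs = X[is_landmark] - centers[y[is_landmark]]
        center_loss = 0.5 * (diffs ** 2).sum(axis=1).mean()
    else:
        center_loss = 0.0
    return softmax_loss + lam * center_loss
```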
<br />
In the training procedure, stochastic gradient descent (SGD) is used as the optimizer with momentum = 0.9 and weight decay = 5e-3. For the Center loss term, the parameter <math>\lambda</math> is set to 5e-5. Each image is resized to 256 <math>\times</math> 256, and several data augmentations are applied to the dataset, including random resized crop, color jitter, and random flip. The training dataset is divided into four parts based on the geographical affiliation of the cities where the landmarks are located: Europe/Russia, North America/Australia/Oceania, Middle East/North Africa, and Far East. <br />
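The quoted optimizer settings can be illustrated with a single-parameter update (a sketch; the exact convention for combining momentum and weight decay is an assumption):<br />

```python
def sgd_step(w, grad, velocity, lr, momentum=0.9, weight_decay=5e-3):
    """One SGD update with the momentum and weight decay quoted above.

    Sketch for a single parameter; weight decay is folded into the
    gradient as an L2 penalty (an assumed convention).
    """
    g = grad + weight_decay * w          # L2 weight decay
    velocity = momentum * velocity - lr * g
    return w + velocity, velocity
```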
<br />
The paper introduces curriculum learning for landmark recognition, which is shown in figure xxx. The algorithm is trained for 30 epochs, and the learning rates <math>\alpha_1, \alpha_2, \alpha_3</math> are reduced by a factor of 10 at the 12th and 24th epochs.<br />
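The learning-rate part of this schedule can be written as follows (a sketch; 0-indexed epochs are an assumption):<br />

```python
def curriculum_lr(base_lr, epoch):
    """Step schedule over the 30 training epochs: the learning rate is
    reduced by a factor of 10 at the 12th and the 24th epoch.
    Epochs are 0-indexed here, which is an assumption."""
    if epoch >= 24:
        return base_lr / 100.0
    if epoch >= 12:
        return base_lr / 10.0
    return base_lr
```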
<br />
In the inference phase, the paper introduces “centroids”: embedding vectors, computed by averaging embeddings, that describe landmark classes. Computing good centroids is essential for determining whether a query image contains a landmark. The paper proposes two refinements of the centroid calculation. First, instead of using all training data for each landmark, data cleaning is performed to remove most redundant and irrelevant images. Second, since each landmark can be photographed from several shooting angles, it is more effective to calculate a separate centroid for each angle. Hence, a hierarchical agglomerative clustering algorithm is used to partition each landmark's training data into several valid clusters, and the set of centroids for a landmark L is given by <math>\mu_{L,j} = \frac{1}{|C_j|} \sum_{i \in C_j} x_i, \ j \in \{1,...,v\}</math>, where <math>C_j</math> is the j-th valid cluster and v is the number of valid clusters for landmark L. <br />
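The per-cluster centroid computation can be sketched as follows (a hypothetical helper; in the paper the cluster assignments come from hierarchical agglomerative clustering on the cleaned data):<br />

```python
import numpy as np

def cluster_centroids(embeddings, cluster_ids):
    """Centroids of one landmark's valid clusters (hypothetical helper).

    embeddings : (N, d) cleaned training embeddings of a single landmark
    cluster_ids: (N,) cluster index for each embedding, e.g. produced by
                 hierarchical agglomerative clustering
    Returns a (v, d) array: the mean embedding of each cluster.
    """
    clusters = np.unique(cluster_ids)
    return np.stack([embeddings[cluster_ids == c].mean(axis=0) for c in clusters])
```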
<br />
Once the centroids are calculated for each landmark class, the system can decide whether an image contains a landmark. The query image is passed through the landmark CNN, and the resulting embedding vector is compared with all centroids by dot-product similarity using approximate k-nearest neighbors (AKNN). To distinguish landmark classes from non-landmarks, a threshold <math>\eta</math> is set; the image is considered to contain a landmark only if the maximum similarity exceeds this threshold.<br />
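The decision rule can be sketched as follows (exact nearest-centroid search is shown for clarity; at scale the paper uses approximate k-nearest neighbors instead):<br />

```python
import numpy as np

def recognize(query_emb, centroids, labels, eta):
    """Decide whether a query image contains a landmark (sketch).

    centroids: (K, d) stacked centroids of all landmark clusters
    labels   : (K,) landmark id of each centroid
    eta      : similarity threshold; below it the image is non-landmark
    """
    sims = centroids @ query_emb               # dot-product similarity
    best = int(np.argmax(sims))
    return labels[best] if sims[best] >= eta else None   # None = non-landmark
```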
<br />
The full inference algorithm is described in figure xxx.<br />
<br />
== Experiments and Analysis ==<br />
<br />
=== Offline test ===<br />
<br />
In order to measure the quality of the model, an offline test set was collected and manually labeled. According to the calculations, photos containing landmarks make up 1–3% of the total number of photos on average. This distribution was emulated in the offline test, and geo-information and landmark references were not used. <br />
The results of this test are presented in Table 4. Two metrics were used: Sensitivity, the accuracy of the model on images with landmarks (also called Recall), and Specificity, its accuracy on images without landmarks. Several variants of DELF were evaluated, and the best results in terms of sensitivity and specificity are included in Table 4. The table also contains results for the model trained with Softmax loss alone and with Softmax plus Center loss, so Table 4 reflects the improvement contributed by each added element of our approach.<br />
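For concreteness, the two metrics can be computed from binary predictions as below (a simplified sketch: the real offline test additionally requires the predicted landmark class to be correct):<br />

```python
def sensitivity_specificity(pred_is_landmark, true_is_landmark):
    """Binary sketch of the two offline-test metrics.

    sensitivity = fraction of landmark images recognized as landmarks (recall)
    specificity = fraction of non-landmark images correctly rejected
    """
    pairs = list(zip(pred_is_landmark, true_is_landmark))
    tp = sum(1 for p, t in pairs if p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    pos = sum(1 for _, t in pairs if t)
    neg = len(pairs) - pos
    return tp / pos, tn / neg
```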
<br />
It is very important to understand how the model performs on “rare” landmarks, which have little data. Therefore, its behavior was examined separately on “rare” and “frequent” landmarks in Table 5. The column “Part from total number” shows the percentage of landmark examples of each type in the offline test.<br />
<br />
An analysis of the model's behavior across different categories of landmarks in the offline test is presented in Table 6. These results show that the model works successfully across various categories of landmarks. Predictably, better results (92% sensitivity and 99.5% specificity) were obtained when the offline test was run with geo-information.<br />
<br />
=== Revisited Paris dataset ===<br />
<br />
The Revisited Paris dataset (RPar) was also used to measure the quality of the landmark recognition approach. This dataset, together with Revisited Oxford (ROxf), is a standard benchmark for comparing image retrieval algorithms. For recognition, it is important to determine which landmark is contained in the query image. Images of the same landmark can have different shooting angles or be taken inside or outside the building. Thus, it is reasonable to measure the quality of the model on the standard benchmark adapted to the task setting, meaning that not all classes from the queries are present in the landmark dataset. Images containing the correct landmark but taken from different shooting angles within the building were transferred to the “junk” category, which does not influence the final score and makes the test markup closer to our model's goal. Results on RPar with and without distractors in medium and hard modes are presented in Tables 7 and 8. <br />
<br />
== Comparison ==<br />
<br />
The most efficient recent approaches to landmark recognition are built on fine-tuned CNNs. We chose to compare our method with DELF on how well each performs on recognition tasks. A brief summary is given below:<br />
<br />
Figure xxx<br />
<br />
=== Offline test and timing ===<br />
<br />
Both approaches obtained similar results for image retrieval in the offline test (Table 4), but the proposed approach is much faster at the inference stage and more memory-efficient.<br />
<br />
In more detail, during inference DELF requires more forward passes through the CNN, has to search the entire database, and performs RANSAC for geometric verification, all of which make it much more time-consuming than the proposed approach. Because our approach relies mainly on centroids, it takes less time and stores fewer elements.<br />
<br />
<br />
== Conclusion ==<br />
<br />
In this paper we aimed to solve several difficulties that emerge when deploying landmark recognition at the production level: there may not be a clean and sufficiently large database for the tasks of interest, and algorithms must be fast, scalable, and achieve a low false-positive rate together with high accuracy.<br />
<br />
Toward these goals, we presented a way of cleaning landmark data and, most importantly, introduced deep CNN embeddings to make recognition fast and scalable, trained with curriculum learning techniques and a modified version of Center loss. Compared to state-of-the-art methods, this approach achieves similar results but is much faster and better suited to large-scale deployment.<br />
<br />
== References ==</div>T358wang
<hr />
<div><br />
== Group ==<br />
Rui Chen, Zeren Shen, Zihao Guo, Taohao Wang<br />
<br />
== Introduction ==<br />
<br />
Landmark recognition is an image retrieval task with its own specific challenges and this paper provides a new and effective method to recognize landmark images. And this method has been successfully applied to actual images. In this way, statue buildings and characteristic objects can be effectively identified.<br />
<br />
There are many difficulties encountered in the development process: <br />
1. The first problem is that the concept of landmarks cannot be strictly defined, because landmarks can be any object and building. <br />
2. The second problem is that the same landmark can be photographed from different angles. The results of the multi-angle shooting will result in very different picture characteristics. But the system needs to accurately identify different landmarks. <br />
3. The third problem is that the landmark recognition system must recognize a large number of landmarks, and the recognition must achieve high accuracy. <br />
<br />
These problems require that the system should have a very low false alarm rate and high recognition accuracy. <br />
There are also three potential problems:<br />
1. The processed data set contains a little error content, the image content is not clean and the quantity is huge.<br />
2. The algorithm for learning the training set must be fast and scalable.<br />
3. While displaying high-quality judgment landmarks, there is no image geographic information mixed.<br />
The article describes the deep convolutional neural network (CNN) architecture, loss function, training method, and inference aspects. The results of quantitative experiments will be displayed through testing and deployment effect analysis to prove the effectiveness of our model. Finally, it introduces the comparison results between the method proposed in this paper and the latest model and gives a conclusion.<br />
<br />
== Related Work ==<br />
<br />
Landmark recognition can be regarded as one of the tasks of image retrieval, and a large number of documents are concentrated on image retrieval tasks. In the past two decades, the field of image retrieval has made significant progress, and the main methods can be divided into two categories. <br />
The first is a classic retrieval method using local features, a method based on local feature descriptors organized in bag-of-words, spatial verification, Hamming embedding, and query expansion. These methods are dominant in image retrieval. Later, until the rise of deep convolutional neural networks (CNN), deep convolutional neural networks (CNN) were used to generate global descriptors of input images.<br />
Another method is to selectively match the kernel Hamming embedding method extension. With the advent of deep convolutional neural networks, the most effective image retrieval method is based on training CNNs for specific tasks. Deep networks are very powerful for semantic feature representation, which allows us to effectively use them for landmark recognition. This method shows good results but brings additional memory and complexity costs. <br />
The DELF (DEep local feature) by Noh et al. proved promising results. This method combines the classic local feature method with deep learning. This allows us to extract local features from the input image and then use RANSAC for geometric verification. <br />
The goal for the project is to describe a method for accurate and fast large-scale landmark recognition using the advantages of deep convolutional neural networks.<br />
<br />
<br />
== Methodology ==<br />
<br />
This section will describe in detail the CNN architecture, loss function, training procedure, and inference implementation of the landmark recognition system. The picture below is an overview of the landmark recognition system. <br />
<br />
The landmark CNN consists of three parts including the main network, the embedding layer, and the classification layer. To obtain a CNN main network suitable for training landmark recognition model, fine-tuning is applied and several pre-trained backbones (Residual Networks) based on other similar datasets, including ResNet-50, ResNet-200, SE-ResNext-101, and Wide Residual Network (WRN-50-2), are evaluated based on inference quality and efficiency. Based on the evaluation results, WRN-50-2 is selected as the optimal backbone architecture.<br />
<br />
For the embedding layer, as shown in figure x, the last fully-connected layer after the average pooling is removed. Instead, a fully-connected 2048 <math>\times</math> 512 layer and a batch normalization layer are added as the embedding layer. After the batch norm, a fully-connected 512 <math>\times</math> n layer is added as the classification layer. The figure below shows an overview of the CNN architecture of the landmark recognition system.<br />
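The head described above amounts to two matrix multiplications with a batch normalization in between. Below is a minimal NumPy sketch of the shapes involved (the dimensions come from the text; the random weights, toy class count, and simplified batch norm are illustrative, not the trained model):<br />

```python
import numpy as np

rng = np.random.default_rng(0)
d_backbone, d_embed, n_classes = 2048, 512, 10  # n_classes is a toy value

# Average-pooled backbone features for a batch of 4 images (random stand-ins).
features = rng.normal(size=(4, d_backbone))

# Embedding layer: fully-connected 2048 x 512 followed by batch normalization
# (simplified: batch statistics only, no learned scale/shift).
W_embed = rng.normal(scale=0.01, size=(d_backbone, d_embed))
z = features @ W_embed
z = (z - z.mean(axis=0)) / (z.std(axis=0) + 1e-5)

# Classification layer: fully-connected 512 x n.
W_cls = rng.normal(scale=0.01, size=(d_embed, n_classes))
logits = z @ W_cls
print(z.shape, logits.shape)
```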
<br />
To effectively determine the embedding vectors that represent each landmark class (centroids), the network needs to be trained so that the members of each class lie as close as possible to their centroid. Several suitable loss functions are evaluated, including contrastive loss, ArcFace, and center loss. Center loss is selected since it achieves the best test results: it learns a center for the embeddings of each class and penalizes the distances between image embeddings and their class centers. In addition, center loss is a simple addition to the softmax loss and is trivial to implement.<br />
<br />
When implementing the loss function, an additional class that includes all non-landmark instances is added, and the center loss function is modified as follows. Let n be the number of landmark classes and m the mini-batch size, let <math>x_i \in R^d</math> be the i-th embedding and <math>y_i</math> the corresponding label, where <math>y_i \in \{1,...,n,n+1\}</math> and n+1 is the label of the non-landmark class. Denote by <math>W \in R^{d \times (n+1)}</math> the weights of the classifier layer and by <math>W_j</math> its j-th column. Let <math>c_{y_i}</math> be the <math>y_i</math>-th embedding center from center loss and <math>\lambda</math> the balancing parameter of center loss. The final loss function combines the softmax loss over the n+1 classes with the center-loss term: <math>L = \sum_{i=1}^{m} \left( -\log \frac{e^{W_{y_i}^T x_i}}{\sum_{j=1}^{n+1} e^{W_j^T x_i}} + \frac{\lambda}{2} \left\| x_i - c_{y_i} \right\|_2^2 \right)</math><br />
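As a concrete illustration, the softmax loss plus center-loss term over one mini-batch can be computed as follows (a minimal NumPy sketch with made-up dimensions and random values; in real training the weights and centers are learned alongside the network):<br />

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, d = 5, 8, 512            # landmark classes, mini-batch size, embedding dim
lam = 5e-5                     # balancing parameter of center loss

x = rng.normal(size=(m, d))                  # mini-batch embeddings
y = rng.integers(0, n + 1, size=m)           # labels; index n is the non-landmark class
W = rng.normal(scale=0.01, size=(d, n + 1))  # classifier weights
centers = rng.normal(size=(n + 1, d))        # one embedding center per class

# Softmax cross-entropy over the n + 1 classes.
logits = x @ W
logits -= logits.max(axis=1, keepdims=True)  # numerical stability
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
softmax_loss = -log_probs[np.arange(m), y].sum()

# Center term: squared distance of each embedding to its class center.
center_loss = 0.5 * ((x - centers[y]) ** 2).sum()

total = softmax_loss + lam * center_loss
print(total > 0)
```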
<br />
In the training procedure, stochastic gradient descent (SGD) is used as the optimizer with momentum = 0.9 and weight decay = 5e-3. For the center loss function, the parameter <math>\lambda</math> is set to 5e-5. Each image is resized to 256 <math>\times</math> 256, and several data augmentations are applied to the dataset, including random resized crop, color jitter, and random flip. The training dataset is divided into four parts based on the geographical affiliation of the cities where the landmarks are located: Europe/Russia, North America/Australia/Oceania, Middle East/North Africa, and Far East. <br />
<br />
The paper introduces curriculum learning for landmark recognition, which is shown in figure xxx. The algorithm is trained for 30 epochs, and the learning rates <math>\alpha_1, \alpha_2, \alpha_3</math> are reduced by a factor of 10 at the 12th and 24th epochs.<br />
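The step schedule (dividing each learning rate by 10 at the 12th and 24th of the 30 epochs) can be written directly; the base rate below is an illustrative value, since the paper uses separate rates <math>\alpha_1, \alpha_2, \alpha_3</math> per curriculum stage:<br />

```python
def learning_rate(epoch, base_lr=0.01):
    """Step schedule: divide the rate by 10 at the 12th and 24th epochs.

    base_lr is an illustrative stand-in for the per-stage rates alpha_1..alpha_3.
    """
    factor = 1.0
    if epoch >= 12:
        factor /= 10.0
    if epoch >= 24:
        factor /= 10.0
    return base_lr * factor

schedule = [learning_rate(e) for e in range(30)]
print(schedule[0], schedule[12], schedule[24])
```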
<br />
In the inference phase, the paper introduces “centroids”: embedding vectors, calculated by averaging embeddings, that describe the landmark classes. Computing good centroids is essential for determining whether a query image contains a landmark. The paper proposes two refinements for calculating them. First, instead of using the entire training data for each landmark, data cleaning removes most of the redundant and irrelevant images. Second, since each landmark can be photographed from different shooting angles, it is more effective to calculate a separate centroid for each shooting angle. Hence, a hierarchical agglomerative clustering algorithm partitions the training data into several valid clusters for each landmark, and the set of centroids for a landmark L is given by <math>\mu_{L_j} = \frac{1}{|C_j|} \sum_{i \in C_j} x_i, \; j \in 1,...,v</math>, where v is the number of valid clusters for landmark L and <math>C_j</math> is the j-th cluster. <br />
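Given cluster assignments for one landmark's cleaned training embeddings, each centroid is simply the mean of its cluster (a NumPy sketch; the hierarchical agglomerative clustering itself is assumed to be precomputed and is stood in for here by random labels):<br />

```python
import numpy as np

rng = np.random.default_rng(2)
d = 512
embeddings = rng.normal(size=(20, d))     # cleaned embeddings of one landmark L
cluster_of = rng.integers(0, 3, size=20)  # valid-cluster label per image (stand-in)

# One centroid per valid cluster: the mean embedding of that cluster.
centroids = np.stack([
    embeddings[cluster_of == j].mean(axis=0)
    for j in np.unique(cluster_of)
])
print(centroids.shape)  # (v, 512): one row per valid cluster
```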
<br />
Once the centroids are calculated for each landmark class, the system can decide whether there is a landmark in an image. The query image is passed through the landmark CNN, and the resulting embedding vector is compared with all centroids by dot-product similarity using approximate k-nearest neighbors (AKNN). To distinguish landmark classes from non-landmarks, a threshold <math>\eta</math> is set and compared with the maximum similarity to determine whether the image contains any landmark.<br />
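The decision rule then reduces to taking the maximum dot product against all centroids and comparing it with the threshold <math>\eta</math>. Below is a brute-force sketch with illustrative names and values; the real system replaces the exhaustive argmax with an approximate k-nearest-neighbor search:<br />

```python
import numpy as np

def recognize(query_embedding, centroids, centroid_class, eta):
    """Return the predicted landmark class, or None for a non-landmark image.

    centroids: stacked (num_centroids, d) matrix over all landmark classes.
    centroid_class: landmark label of each centroid row (several per landmark,
    one per shooting-angle cluster). All names here are illustrative.
    """
    sims = centroids @ query_embedding       # dot-product similarity
    best = int(np.argmax(sims))
    if sims[best] < eta:                     # below threshold: no landmark
        return None
    return centroid_class[best]

rng = np.random.default_rng(3)
centroids = rng.normal(size=(6, 512))
labels = [0, 0, 1, 1, 2, 2]                  # two shooting-angle centroids each
query = centroids[3] + 0.01 * rng.normal(size=512)  # query close to landmark 1
print(recognize(query, centroids, labels, eta=10.0))
```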
<br />
The full inference algorithm is described in figure xxx.<br />
<br />
== Experiments and Analysis ==<br />
<br />
=== Offline test ===<br />
<br />
In order to measure the quality of the model, an offline test set was collected and manually labeled. According to the calculations, photos containing landmarks make up 1–3% of the total number of photos on average. This distribution was emulated in the offline test, and geo information and landmark references were not used. <br />
The results of this test are presented in Table 4. Two metrics were used to measure the results of the experiments: sensitivity, the accuracy of the model on images with landmarks (also called recall), and specificity, the accuracy of the model on images without landmarks. Several variants of DELF were evaluated, and the best results in terms of sensitivity and specificity are included in Table 4. The table also contains the results of the model trained with softmax loss only and with softmax and center loss. Thus, Table 4 reflects how the approach improves as new elements are added to it.<br />
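The two metrics can be computed from binary labels as follows (a minimal sketch with toy data, where 1 marks an image containing a landmark):<br />

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = accuracy on landmark images; specificity = accuracy on
    non-landmark images. Labels: 1 = contains a landmark, 0 = does not."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return tp / pos, tn / neg

y_true = [1, 1, 1, 0, 0, 0, 0, 0]   # toy ground truth
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]   # toy model output
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 2/3 of landmarks found; 4/5 of non-landmarks rejected
```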
<br />
It is very important to understand how the model works on “rare” landmarks, given the small amount of data available for them. Therefore, the behavior of the model was examined separately on “rare” and “frequent” landmarks in Table 5. The column “Part from total number” shows what percentage of the landmark examples in the offline test belong to the corresponding type of landmark.<br />
<br />
An analysis of the behavior of the model on different categories of landmarks in the offline test is presented in Table 6. These results show that the model works successfully with various categories of landmarks. Predictably, better results (92% sensitivity and 99.5% specificity) were obtained when the offline test was run with geo information.<br />
<br />
=== Revisited Paris dataset ===<br />
<br />
The Revisited Paris dataset (RPar) was also used to measure the quality of the landmark recognition approach. Together with Revisited Oxford (ROxf), it is a standard benchmark for comparing image retrieval algorithms. In recognition, it is important to determine which landmark is contained in the query image; images of the same landmark can have different shooting angles or be taken inside or outside the building. Thus, it is reasonable to measure the quality of the model on the standard benchmark adapted to the task setting, which means not all classes from the queries are present in the landmark dataset. Images containing correct landmarks but taken from different shooting angles inside the building were moved to the “junk” category, which does not influence the final score and makes the test markup closer to the model's goal. Results on RPar with and without distractors in medium and hard modes are presented in Tables 7 and 8. <br />
<br />
== Comparison ==<br />
<br />
The most effective recent approaches to landmark recognition are built on fine-tuned CNNs. We chose to compare our method to DELF on how well each performs on recognition tasks. A brief summary is given below:<br />
<br />
Figure xxx<br />
<br />
=== Offline test and timing ===<br />
<br />
Both approaches obtained similar results for image retrieval in the offline test (shown in Table 4), but the proposed approach is much faster at the inference stage and more memory-efficient.<br />
<br />
In more detail, during the inference stage DELF needs more forward passes through the CNN, has to search the entire database, and performs RANSAC for geometric verification, all of which make it much more time-consuming than the proposed approach. Our approach relies mainly on centroids, so it takes less time and stores fewer elements.<br />
<br />
<br />
== Conclusion ==<br />
<br />
In this paper we aimed to address the difficulties that emerge when bringing landmark recognition to the production level: there may be no clean and sufficiently large database for the tasks of interest, and algorithms must be fast and scalable while achieving a low false-positive rate and high accuracy.<br />
<br />
While pursuing these goals, we presented a way of cleaning landmark data and, most importantly, introduced the use of deep CNN embeddings, trained with curriculum learning techniques and a modified version of center loss, to make recognition fast and scalable. Compared to state-of-the-art methods, this approach shows similar results but is much faster and suitable for implementation at a large scale.<br />
<br />
== References ==</div>T358wanghttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=45618stat441F212020-11-22T00:40:23Z<p>T358wang: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
= Record your contributions here [https://docs.google.com/spreadsheets/d/10CHiJpAylR6kB9QLqN7lZHN79D9YEEW6CDTH27eAhbQ/edit?usp=sharing]=<br />
<br />
Use the following notations:<br />
<br />
P: You have written a summary/critique on the paper.<br />
<br />
T: You had a technical contribution on a paper (excluding the paper that you present).<br />
<br />
E: You had an editorial contribution on a paper (excluding the paper that you present).<br />
<br />
=Paper presentation=<br />
{| class="wikitable"<br />
<br />
{| border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 ||Sharman Bharat, Li Dylan,Lu Leonie, Li Mingdao || 1|| Risk prediction in life insurance industry using supervised learning algorithms || [https://rdcu.be/b780J Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Bsharman Summary] ||<br />
[https://www.youtube.com/watch?v=TVLpSFYgF0c&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Delaney Smith, Mohammad Assem Mahmoud || 2|| Influenza Forecasting Framework based on Gaussian Processes || [https://proceedings.icml.cc/static/paper_files/icml/2020/1239-Paper.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Influenza_Forecasting_Framework_based_on_Gaussian_Processes Summary]|| [https://www.youtube.com/watch?v=HZG9RAHhpXc&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Tatianna Krikella, Swaleh Hussain, Grace Tompkins || 3|| Processing of Missing Data by Neural Networks || [http://papers.nips.cc/paper/7537-processing-of-missing-data-by-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Gtompkin Summary] || [https://learn.uwaterloo.ca/d2l/ext/rp/577051/lti/framedlaunch/6ec1ebea-5547-46a2-9e4f-e3dc9d79fd54]<br />
|-<br />
|Week of Nov 16 ||Jonathan Chow, Nyle Dharani, Ildar Nasirov ||4 ||Streaming Bayesian Inference for Crowdsourced Classification ||[https://papers.nips.cc/paper/9439-streaming-bayesian-inference-for-crowdsourced-classification.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Streaming_Bayesian_Inference_for_Crowdsourced_Classification Summary] || [https://www.youtube.com/watch?v=UgVRzi9Ubws]<br />
|-<br />
|Week of Nov 16 || Matthew Hall, Johnathan Chalaturnyk || 5|| Neural Ordinary Differential Equations || [https://papers.nips.cc/paper/7892-neural-ordinary-differential-equations.pdf] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Neural_ODEs Summary]||<br />
|-<br />
|Week of Nov 16 || Luwen Chang, Qingyang Yu, Tao Kong, Tianrong Sun || 6|| Adversarial Attacks on Copyright Detection Systems || Paper [https://proceedings.icml.cc/static/paper_files/icml/2020/1894-Paper.pdf] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Adversarial_Attacks_on_Copyright_Detection_Systems Summary] || [https://www.youtube.com/watch?v=bQI9S6bCo8o]<br />
|-<br />
|Week of Nov 16 || Casey De Vera, Solaiman Jawad || 7|| IPBoost – Non-Convex Boosting via Integer Programming || [https://arxiv.org/pdf/2002.04679.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=IPBoost Summary] || [https://www.youtube.com/watch?v=4DhJDGC4pyI&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Yuxin Wang, Evan Peters, Yifan Mou, Sangeeth Kalaichanthiran || 8|| What Game Are We Playing? End-to-end Learning in Normal and Extensive Form Games || [https://arxiv.org/pdf/1805.02777.pdf] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=what_game_are_we_playing Summary] || [https://www.youtube.com/watch?v=9qJoVxo3hnI&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Yuchuan Wu || 9|| || || ||<br />
|-<br />
|Week of Nov 16 || Zhou Zeping, Siqi Li, Yuqin Fang, Fu Rao || 10|| A survey of neural network-based cancer prediction models from microarray data || [https://www.sciencedirect.com/science/article/pii/S0933365717305067 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Y93fang Summary] || [https://youtu.be/B8pPUU8ypO0]<br />
|-<br />
|Week of Nov 23 ||Jinjiang Lian, Jiawen Hou, Yisheng Zhu, Mingzhe Huang || 11|| DROCC: Deep Robust One-Class Classification || [https://proceedings.icml.cc/static/paper_files/icml/2020/6556-Paper.pdf paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:J46hou Summary] || [https://www.youtube.com/watch?v=tvCEvvy54X8&ab_channel=JJLian]<br />
|-<br />
|Week of Nov 23 || Bushra Haque, Hayden Jones, Michael Leung, Cristian Mustatea || 12|| Combine Convolution with Recurrent Networks for Text Classification || [https://arxiv.org/pdf/2006.15795.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Cvmustat Summary] || [https://www.youtube.com/watch?v=or5RTxDnZDo]<br />
|-<br />
|Week of Nov 23 || Taohao Wang, Zeren Shen, Zihao Guo, Rui Chen || 13|| Large Scale Landmark Recognition via Deep Metric Learning || [https://arxiv.org/pdf/1908.10192.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Qianlin Song, William Loh, Junyue Bai, Phoebe Choi || 14|| Task Understanding from Confusing Multi-task Data || [https://proceedings.icml.cc/static/paper_files/icml/2020/578-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Task_Understanding_from_Confusing_Multi-task_Data Summary] ||<br />
|-<br />
|Week of Nov 23 || Rui Gong, Xuetong Wang, Xinqi Ling, Di Ma || 15|| Semantic Relation Classification via Convolution Neural Network|| [https://www.aclweb.org/anthology/S18-1127.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Xiaolan Xu, Robin Wen, Yue Weng, Beizhen Chang || 16|| Graph Structure of Neural Networks || [https://proceedings.icml.cc/paper/2020/file/757b505cfd34c64c85ca5b5690ee5293-Paper.pdf Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title= Summary] ||<br />
|-<br />
|Week of Nov 23 ||Hansa Halim, Sanjana Rajendra Naik, Samka Marfua, Shawrupa Proshasty || 17|| Superhuman AI for multiplayer poker || [https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Guanting Pan, Haocheng Chang, Zaiwei Zhang || 18|| Point-of-Interest Recommendation: Exploiting Self-Attentive Autoencoders with Neighbor-Aware Influence || [https://arxiv.org/pdf/1809.10770.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 || Jerry Huang, Daniel Jiang, Minyan Dai || 19|| Neural Speed Reading Via Skim-RNN ||[https://arxiv.org/pdf/1711.02085.pdf?fbclid=IwAR3EeFsKM_b5p9Ox7X9mH-1oI3U3oOKPBy3xUOBN0XvJa7QW2ZeJJ9ypQVo Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Neural_Speed_Reading_via_Skim-RNN Summary]||<br />
|-<br />
|Week of Nov 23 ||Ruixian Chin, Yan Kai Tan, Jason Ong, Wen Cheen Chiew || 20|| DivideMix: Learning with Noisy Labels as Semi-supervised Learning || [https://openreview.net/pdf?id=HJgExaVtwr Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Yktan Summary]||<br />
|-<br />
|Week of Nov 30 || Banno Dion, Battista Joseph, Kahn Solomon || 21|| Music Recommender System Based on Genre using Convolutional Recurrent Neural Networks || [https://www.sciencedirect.com/science/article/pii/S1877050919310646] || ||<br />
|-<br />
|Week of Nov 30 || Sai Arvind Budaraju, Isaac Ellmen, Dorsa Mohammadrezaei, Emilee Carson || 22|| A universal SNP and small-indel variant caller using deep neural networks||[https://www.nature.com/articles/nbt.4235.epdf?author_access_token=q4ZmzqvvcGBqTuKyKgYrQ9RgN0jAjWel9jnR3ZoTv0NuM3saQzpZk8yexjfPUhdFj4zyaA4Yvq0LWBoCYQ4B9vqPuv8e2HHy4vShDgEs8YxI_hLs9ov6Y1f_4fyS7kGZ Paper] || ||<br />
|-<br />
|Week of Nov 30 || Daniel Fagan, Cooper Brooke, Maya Perelman || 23|| Efficient kNN Classification With Different Number of Nearest Neighbors || [https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7898482 Paper] || ||<br />
|-<br />
|Week of Nov 30 || Karam Abuaisha, Evan Li, Jason Pu, Nicholas Vadivelu || 24|| Being Bayesian about Categorical Probability || [https://proceedings.icml.cc/static/paper_files/icml/2020/3560-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Anas Mahdi Will Thibault Jan Lau Jiwon Yang || 25|| Loss Function Search for Face Recognition<br />
|| [https://proceedings.icml.cc/static/paper_files/icml/2020/245-Paper.pdf] paper || ||<br />
|-<br />
|Week of Nov 30 ||Zihui (Betty) Qin, Wenqi (Maggie) Zhao, Muyuan Yang, Amartya (Marty) Mukherjee || 26|| Deep Learning for Cardiologist-level Myocardial Infarction Detection in Electrocardiograms || [https://arxiv.org/pdf/1912.07618.pdf?fbclid=IwAR0RwATSn4CiT3qD9LuywYAbJVw8YB3nbex8Kl19OCExIa4jzWaUut3oVB0 Paper] || Summary [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Cardiologist-level_Myocardial_Infarction_Detection_in_Electrocardiograms&fbclid=IwAR1Tad2DAM7LT6NXXuSYDZtHHBvN0mjZtDdCOiUFFq_XwVcQxG3hU-3XcaE] ||<br />
|-<br />
|Week of Nov 30 || Stan Lee, Seokho Lim, Kyle Jung, Daehyun Kim || 27|| Bag of Tricks for Efficient Text Classification || [https://arxiv.org/pdf/1607.01759.pdf paper] || ||<br />
|-<br />
|Week of Nov 30 || Yawen Wang, Danmeng Cui, ZiJie Jiang, Mingkang Jiang, Haotian Ren, Haris Bin Zahid || 28|| A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques || [https://arxiv.org/pdf/1707.02919.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Describtion_of_Text_Mining Summary] ||<br />
|-<br />
|Week of Nov 30 || Qing Guo, XueGuang Ma, James Ni, Yuanxin Wang || 29|| Mask R-CNN || [https://arxiv.org/pdf/1703.06870.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Bertrand Sodjahin, Junyi Yang, Jill Yu Chieh Wang, Yu Min Wu, Calvin Li || 30|| Research paper classifcation systems based on TF‑IDF and LDA schemes || [https://hcis-journal.springeropen.com/articles/10.1186/s13673-019-0192-7?fbclid=IwAR3swO-eFrEbj1BUQfmomJazxxeFR6SPgr6gKayhs38Y7aBG-zX1G3XWYRM Paper] || ||<br />
|-<br />
|Week of Nov 30 || Daniel Zhang, Jacky Yao, Scholar Sun, Russell Parco, Ian Cheung || 31 || Speech2Face: Learning the Face Behind a Voice || [https://arxiv.org/pdf/1905.09773.pdf?utm_source=thenewstack&utm_medium=website&utm_campaign=platform Paper] || ||<br />
|-<br />
|Week of Nov 30 || Siyuan Xia, Jiaxiang Liu, Jiabao Dong, Yipeng Du || 32 || Evaluating Machine Accuracy on ImageNet || [https://proceedings.icml.cc/static/paper_files/icml/2020/6173-Paper.pdf] || ||<br />
|-<br />
|Week of Nov 30 || Msuhi Wang, Siyuan Qiu, Yan Yu || 33 || Surround Vehicle Motion Prediction Using LSTM-RNN for Motion Planning of Autonomous Vehicles at Multi-Lane Turn Intersections || [https://ieeexplore.ieee.org/abstract/document/8957421 paper] || ||</div>T358wanghttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=43664stat441F212020-11-10T00:53:10Z<p>T358wang: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
= Record your contributions here [https://docs.google.com/spreadsheets/d/10CHiJpAylR6kB9QLqN7lZHN79D9YEEW6CDTH27eAhbQ/edit?usp=sharing]=<br />
<br />
Use the following notations:<br />
<br />
P: You have written a summary/critique on the paper.<br />
<br />
T: You had a technical contribution on a paper (excluding the paper that you present).<br />
<br />
E: You had an editorial contribution on a paper (excluding the paper that you present).<br />
<br />
=Paper presentation=<br />
{| class="wikitable"<br />
<br />
{| border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 ||Sharman Bharat, Li Dylan,Lu Leonie, Li Mingdao || 1|| Risk prediction in life insurance industry using supervised learning algorithms || [https://rdcu.be/b780J Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Bsharman Summary] ||<br />
[https://www.youtube.com/watch?v=TVLpSFYgF0c&feature=youtu.be]<br />
|-<br />
|Week of Nov 16 || Delaney Smith, Mohammad Assem Mahmoud || 2|| Influenza Forecasting Framework based on Gaussian Processes || [https://proceedings.icml.cc/static/paper_files/icml/2020/1239-Paper.pdf] paper || ||<br />
|-<br />
|Week of Nov 16 || Tatianna Krikella, Swaleh Hussain, Grace Tompkins || 3|| Processing of Missing Data by Neural Networks || [http://papers.nips.cc/paper/7537-processing-of-missing-data-by-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Gtompkin Summary] ||<br />
|-<br />
|Week of Nov 16 ||Jonathan Chow, Nyle Dharani, Ildar Nasirov ||4 ||Streaming Bayesian Inference for Crowdsourced Classification ||[https://papers.nips.cc/paper/9439-streaming-bayesian-inference-for-crowdsourced-classification.pdf Paper] || ||<br />
|-<br />
|Week of Nov 16 || Matthew Hall, Johnathan Chalaturnyk || 5|| Neural Ordinary Differential Equations || [https://papers.nips.cc/paper/7892-neural-ordinary-differential-equations.pdf] || ||<br />
|-<br />
|Week of Nov 16 || Luwen Chang, Qingyang Yu, Tao Kong, Tianrong Sun || 6|| Adversarial Attacks on Copyright Detection Systems || Paper [https://proceedings.icml.cc/static/paper_files/icml/2020/1894-Paper.pdf] || ||<br />
|-<br />
|Week of Nov 16 || Casey De Vera, Solaiman Jawad, Jihoon Han || 7|| || || ||<br />
|-<br />
|Week of Nov 16 || Yuxin Wang, Evan Peters, Cynthia Mou, Sangeeth Kalaichanthiran || 8|| Uniform convergence may be unable to explain generalization in deep learning || [https://papers.nips.cc/paper/9336-uniform-convergence-may-be-unable-to-explain-generalization-in-deep-learning.pdf] || ||<br />
|-<br />
|Week of Nov 16 || Yuchuan Wu || 9|| || || ||<br />
|-<br />
|Week of Nov 16 || Zhou Zeping, Siqi Li, Yuqin Fang, Fu Rao || 10|| The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network || [http://people.cs.uchicago.edu/~pworah/rmt2.pdf] || ||<br />
|-<br />
|Week of Nov 23 ||Jinjiang Lian, Jiawen Hou, Yisheng Zhu, Mingzhe Huang || 11|| DROCC: Deep Robust One-Class Classification || [https://proceedings.icml.cc/static/paper_files/icml/2020/6556-Paper.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Bushra Haque, Hayden Jones, Michael Leung, Cristian Mustatea || 12|| Combine Convolution with Recurrent Networks for Text Classification || [https://arxiv.org/pdf/2006.15795.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 || Taohao Wang, Zeren Shen, Zihao Guo, Rui Chen || 13|| Deep multiple instance learning for image classification and auto-annotation || [https://jiajunwu.com/papers/dmil_cvpr.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Qianlin Song, William Loh, Junyue Bai, Phoebe Choi || 14|| Task Understanding from Confusing Multi-task Data || [https://proceedings.icml.cc/static/paper_files/icml/2020/578-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Task_Understanding_from_Confusing_Multi-task_Data Summary] ||<br />
|-<br />
|Week of Nov 23 || Rui Gong, Xuetong Wang, Xinqi Ling, Di Ma || 15|| Semantic Relation Classification via Convolution Neural Network|| [https://www.aclweb.org/anthology/S18-1127.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Xiaolan Xu, Robin Wen, Yue Weng, Beizhen Chang || 16|| Graph Structure of Neural Networks || [https://proceedings.icml.cc/paper/2020/file/757b505cfd34c64c85ca5b5690ee5293-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Hansa Halim, Sanjana Rajendra Naik, Samka Marfua, Shawrupa Proshasty || 17|| Superhuman AI for multiplayer poker || [https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Guanting Pan, Haocheng Chang, Zaiwei Zhang || 18|| Point-of-Interest Recommendation: Exploiting Self-Attentive Autoencoders with Neighbor-Aware Influence || [https://arxiv.org/pdf/1809.10770.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 || Jerry Huang, Daniel Jiang, Minyan Dai || 19|| Neural Speed Reading Via Skim-RNN ||[https://arxiv.org/pdf/1711.02085.pdf?fbclid=IwAR3EeFsKM_b5p9Ox7X9mH-1oI3U3oOKPBy3xUOBN0XvJa7QW2ZeJJ9ypQVo Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Neural_Speed_Reading_via_Skim-RNN Summary]||<br />
|-<br />
|Week of Nov 23 ||Ruixian Chin, Yan Kai Tan, Jason Ong, Wen Cheen Chiew || 20|| DivideMix: Learning with Noisy Labels as Semi-supervised Learning || [https://openreview.net/pdf?id=HJgExaVtwr] || ||<br />
|-<br />
|Week of Nov 30 || Banno Dion, Battista Joseph, Kahn Solomon || 21|| Music Recommender System Based on Genre using Convolutional Recurrent Neural Networks || [https://www.sciencedirect.com/science/article/pii/S1877050919310646] || ||<br />
|-<br />
|Week of Nov 30 || Sai Arvind Budaraju, Isaac Ellmen, Dorsa Mohammadrezaei, Emilee Carson || 22|| A universal SNP and small-indel variant caller using deep neural networks||[https://www.nature.com/articles/nbt.4235.epdf?author_access_token=q4ZmzqvvcGBqTuKyKgYrQ9RgN0jAjWel9jnR3ZoTv0NuM3saQzpZk8yexjfPUhdFj4zyaA4Yvq0LWBoCYQ4B9vqPuv8e2HHy4vShDgEs8YxI_hLs9ov6Y1f_4fyS7kGZ Paper] || ||<br />
|-<br />
|Week of Nov 30 || Daniel Fagan, Cooper Brooke, Maya Perelman || 23|| Efficient kNN Classification With Different Number of Nearest Neighbors || [https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7898482 Paper] || ||<br />
|-<br />
|Week of Nov 30 || Karam Abuaisha, Evan Li, Jason Pu, Nicholas Vadivelu || 24|| Being Bayesian about Categorical Probability || [https://proceedings.icml.cc/static/paper_files/icml/2020/3560-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Anas Mahdi Will Thibault Jan Lau Jiwon Yang || 25|| Loss Function Search for Face Recognition<br />
|| [https://proceedings.icml.cc/static/paper_files/icml/2020/245-Paper.pdf] paper || ||<br />
|-<br />
|Week of Nov 30 ||Zihui (Betty) Qin, Wenqi (Maggie) Zhao, Muyuan Yang, Amartya (Marty) Mukherjee || 26|| Deep Learning for Cardiologist-level Myocardial Infarction Detection in Electrocardiograms || [https://arxiv.org/pdf/1912.07618.pdf?fbclid=IwAR0RwATSn4CiT3qD9LuywYAbJVw8YB3nbex8Kl19OCExIa4jzWaUut3oVB0 Paper] || Summary [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Cardiologist-level_Myocardial_Infarction_Detection_in_Electrocardiograms&fbclid=IwAR1Tad2DAM7LT6NXXuSYDZtHHBvN0mjZtDdCOiUFFq_XwVcQxG3hU-3XcaE] ||<br />
|-<br />
|Week of Nov 30 || Stan Lee, Seokho Lim, Kyle Jung, Daehyun Kim || 27|| Bag of Tricks for Efficient Text Classification || [https://arxiv.org/pdf/1607.01759.pdf paper] || ||<br />
|-<br />
|Week of Nov 30 || Yawen Wang, Danmeng Cui, ZiJie Jiang, Mingkang Jiang, Haotian Ren, Haris Bin Zahid || 28|| A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques || [https://arxiv.org/pdf/1707.02919.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Qing Guo, XueGuang Ma, James Ni, Yuanxin Wang || 29|| Mask R-CNN || [https://arxiv.org/pdf/1703.06870.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Bertrand Sodjahin, Junyi Yang, Jill Yu Chieh Wang, Yu Min Wu, Calvin Li || 30|| Research paper classifcation systems based on TF‑IDF and LDA schemes || [https://hcis-journal.springeropen.com/articles/10.1186/s13673-019-0192-7?fbclid=IwAR3swO-eFrEbj1BUQfmomJazxxeFR6SPgr6gKayhs38Y7aBG-zX1G3XWYRM Paper] || ||<br />
|-<br />
|Week of Nov 30 || Daniel Zhang, Jacky Yao, Scholar Sun, Russell Parco, Ian Cheung || 31 || Speech2Face: Learning the Face Behind a Voice || [https://arxiv.org/pdf/1905.09773.pdf?utm_source=thenewstack&utm_medium=website&utm_campaign=platform Paper] || ||<br />
|-<br />
|Week of Nov 30 || Siyuan Xia, Jiaxiang Liu, Jiabao Dong, Yipeng Du || 32 || Evaluating Machine Accuracy on ImageNet || [https://proceedings.icml.cc/static/paper_files/icml/2020/6173-Paper.pdf] || ||<br />
|-<br />
|Week of Nov 30 || Msuhi Wang, Siyuan Qiu, Yan Yu || 33 || Surround Vehicle Motion Prediction Using LSTM-RNN for Motion Planning of Autonomous Vehicles at Multi-Lane Turn Intersections || [https://ieeexplore.ieee.org/abstract/document/8957421 paper] || ||</div>T358wanghttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=F21-STAT_441/841_CM_763-Proposal&diff=42784F21-STAT 441/841 CM 763-Proposal2020-10-18T18:57:17Z<p>T358wang: </p>
<hr />
<div>Use this format (Don’t remove Project 0)<br />
<br />
Project # 0 Group members:<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Last name, First name<br />
<br />
Title: Making a String Telephone<br />
<br />
Description: We use paper cups to make a string phone and talk with friends while learning about sound waves with this science project. (Explain your project in one or two paragraphs).<br />
<br />
--------------------------------------------------------------------<br />
<br />
'''Project # 1 Group members:'''<br />
<br />
Song, Quinn<br />
<br />
Loh, William<br />
<br />
Bai, Junyue<br />
<br />
Choi, Phoebe<br />
<br />
'''Title:''' APTOS 2019 Blindness Detection<br />
<br />
'''Description:'''<br />
<br />
Our team chose the APTOS 2019 Blindness Detection Challenge from Kaggle. The goal of this challenge is to build a machine learning model that detects diabetic retinopathy by screening retina images.<br />
<br />
Millions of people suffer from diabetic retinopathy, the leading cause of blindness among working-aged adults. It is caused by damage to the blood vessels of the light-sensitive tissue at the back of the eye (retina). In rural areas where medical screening is difficult to conduct, it is challenging to detect the disease efficiently. Aravind Eye Hospital hopes to utilize machine learning techniques to gain the ability to automatically screen images for disease and provide information on how severe the condition may be.<br />
<br />
Our team plans to solve this problem by applying our knowledge in image processing and classification.<br />
<br />
<br />
----<br />
<br />
'''Project # 2 Group members:'''<br />
<br />
Li, Dylan<br />
<br />
Li, Mingdao<br />
<br />
Lu, Leonie<br />
<br />
Sharman, Bharat<br />
<br />
'''Title:''' Risk prediction in life insurance industry using supervised learning algorithms<br />
<br />
'''Description:'''<br />
<br />
In this project, we aim to replicate and possibly improve upon the work of Jayabalan et al. in their paper “Risk prediction in life insurance industry using supervised learning algorithms”. We will be using the Prudential Life Insurance dataset that the authors used and have shared with us. We will pre-process the data to replace missing values, apply feature selection using CFS and feature reduction using PCA, and use the processed data to perform classification via four algorithms: Neural Networks, Random Tree, REPTree, and Multiple Linear Regression. We will compare the performance of these algorithms using the MAE and RMSE metrics and produce visualizations that can explain the results even to a non-quantitative audience. <br />
<br />
Our goal in this project is to apply the algorithms we learned in class to an industry dataset and produce results that aid better, data-driven decision making.<br />
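As a rough, hypothetical sketch of the planned pipeline (mean imputation, PCA reduction, a linear model, and MAE/RMSE comparison) on synthetic stand-in data — the Prudential dataset, the CFS step, and the paper's exact models are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the insurance data: low-rank features with noise
L = rng.standard_normal((200, 3))            # hidden factors
W = rng.standard_normal((3, 10))
X = L @ W + 0.01 * rng.standard_normal((200, 10))
y = 2.0 * L[:, 0] + L[:, 1] + 0.1 * rng.standard_normal(200)
X[rng.random(X.shape) < 0.03] = np.nan       # simulate missing values

# 1. Impute missing values with column means
col_means = np.nanmean(X, axis=0)
X = np.where(np.isnan(X), col_means, X)

# 2. Reduce dimensionality with PCA (keep 5 components) via SVD
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:5].T

# 3. Fit multiple linear regression on the reduced features
Z1 = np.hstack([Z, np.ones((Z.shape[0], 1))])
coef, *_ = np.linalg.lstsq(Z1, y, rcond=None)
pred = Z1 @ coef

# 4. Compare with MAE and RMSE
mae = np.mean(np.abs(y - pred))
rmse = np.sqrt(np.mean((y - pred) ** 2))
print(round(mae, 3), round(rmse, 3))
```

The same MAE/RMSE comparison would then be repeated across the four candidate models.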
<br />
----<br />
<br />
'''Project # 3 Group members:'''<br />
<br />
Parco, Russel<br />
<br />
Sun, Scholar<br />
<br />
Yao, Jacky<br />
<br />
Zhang, Daniel<br />
<br />
'''Title:''' Lyft Motion Prediction for Autonomous Vehicles<br />
<br />
'''Description:''' <br />
<br />
Our team has decided to participate in the Lyft Motion Prediction for Autonomous Vehicles Kaggle competition. The aim of this competition is to build a model which, given a set of objects on the road (pedestrians, other cars, etc.), predicts the future movement of these objects.<br />
<br />
Autonomous vehicles (AVs) are expected to dramatically redefine the future of transportation. However, there are still significant engineering challenges to be solved before one can fully realize the benefits of self-driving cars. One such challenge is building models that reliably predict the movement of traffic agents around the AV, such as cars, cyclists, and pedestrians.<br />
<br />
Our aim is to apply classification techniques learned in class to optimally predict how these objects move.<br />
<br />
----<br />
<br />
'''Project # 4 Group members:'''<br />
<br />
Chow, Jonathan<br />
<br />
Dharani, Nyle<br />
<br />
Nasirov, Ildar<br />
<br />
'''Title:''' Classification with Abstinence<br />
<br />
'''Description:''' <br />
<br />
We seek to implement the algorithm described in [https://papers.nips.cc/paper/9247-deep-gamblers-learning-to-abstain-with-portfolio-theory.pdf Deep Gamblers: Learning to Abstain with Portfolio Theory]. The paper describes augmenting classification problems to include the option of abstaining from making a prediction when confidence is low.<br />
<br />
Medical imaging diagnostics is a field in which classification could assist professionals and improve life expectancy for patients through increased accuracy. However, there are also severe consequences to incorrect predictions. As such, we also hope to apply the algorithm implemented to the classification of medical images, specifically instances of normal and pneumonia [https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia? chest x-rays]. <br />
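As a minimal sketch of the core idea (not the authors' full training setup, and this form of the loss is our reading of the paper): the softmax gets an (m+1)-th "abstain" output, and a per-example loss of the form -log(p_y + p_abstain/o) lets the model hedge by shifting mass to abstention, with payoff parameter 1 < o &le; m controlling how costly abstaining is:

```python
import numpy as np

def gamblers_loss(logits, y, o=2.0):
    """Per-example gambler's loss with an extra 'abstain' output.

    logits: (m + 1,) scores; the last entry is the abstention class.
    y: true label index in [0, m).
    o: payoff for a correct prediction (1 < o <= m); smaller o makes
       abstention cheaper, encouraging the model to abstain when unsure.
    """
    z = logits - logits.max()           # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[y] + p[-1] / o)

# 3 classes + abstain
confident = np.array([5.0, 0.0, 0.0, 0.0])   # correct and confident
abstaining = np.array([0.0, 0.0, 0.0, 5.0])  # mass on the abstain output

# Abstaining costs more than a confident correct answer,
# but far less than a confident wrong one.
print(gamblers_loss(confident, y=0) < gamblers_loss(abstaining, y=0))
```

Abstention thus sits between a correct and an incorrect confident prediction in loss, which is exactly what makes it attractive on low-confidence medical images.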
<br />
----<br />
<br />
'''Project # 5 Group members:'''<br />
<br />
Jones, Hayden<br />
<br />
Leung, Michael<br />
<br />
Haque, Bushra<br />
<br />
Mustatea, Cristian<br />
<br />
'''Title:''' Combine Convolution with Recurrent Networks for Text Classification<br />
<br />
'''Description:''' <br />
<br />
Our team chose to reproduce the paper [https://arxiv.org/pdf/2006.15795.pdf Combine Convolution with Recurrent Networks for Text Classification] on arXiv. The goal of this paper is to combine CNN and RNN architectures in a way that merges the output of both architectures more flexibly than simple concatenation, through the use of a “neural tensor layer”, for the purpose of improving text classification. In particular, the paper claims that this novel architecture excels at the following types of text classification: sentiment analysis, news categorization, and topical classification. Our team plans to recreate this paper by working in two pairs: one pair will implement the CNN pipeline and the other the RNN pipeline. We will be working with TensorFlow 2 and Google Colab, and will reproduce the paper’s experimental results by training on the same 6 publicly available datasets found in the paper.<br />
<br />
----<br />
<br />
'''Project # 6 Group members:'''<br />
<br />
Chin, Ruixian<br />
<br />
Ong, Jason<br />
<br />
Chiew, Wen Cheen<br />
<br />
Tan, Yan Kai<br />
<br />
'''Title:''' Mechanisms of Action (MoA) Prediction<br />
<br />
'''Description:''' <br />
<br />
Our team chose to participate in the Kaggle research challenge "Mechanisms of Action (MoA) Prediction". The Broad Institute of MIT and Harvard, the Laboratory for Innovation Science at Harvard (LISH), and the NIH Common Fund's Library of Integrated Network-Based Cellular Signatures (LINCS) present this challenge with the goal of advancing drug development through improvements to MoA prediction algorithms.<br />
----<br />
<br />
'''Project # 7 Group members:'''<br />
<br />
Ren, Haotian <br />
<br />
Cheung, Ian Long Yat<br />
<br />
Hussain, Swaleh <br />
<br />
Zahid, Bin, Haris <br />
<br />
'''Title:''' Transaction Fraud Detection <br />
<br />
'''Description:''' <br />
<br />
Protecting people from fraudulent transactions is an important topic for all banks and internet security companies. This Kaggle project is based on a dataset from the IEEE Computational Intelligence Society (IEEE-CIS). Our objective is to build a more efficient model that recognizes fraudulent transactions with higher accuracy and speed.<br />
----<br />
<br />
'''Project # 8 Group members:'''<br />
<br />
ZiJie, Jiang<br />
<br />
Yawen, Wang<br />
<br />
DanMeng, Cui<br />
<br />
MingKang, Jiang<br />
<br />
'''Title:''' Lyft Motion Prediction for Autonomous Vehicles <br />
<br />
'''Description:'''<br />
<br />
Our team chose to participate in the Kaggle challenge "Lyft Motion Prediction for Autonomous Vehicles". We will apply our data science skills to build motion prediction models for self-driving vehicles. The goal of this competition is to predict the trajectories of the traffic agents around the AV, such as cars, cyclists, and pedestrians.<br />
<br />
----------------------------------------------------------------------<br />
<br />
<br />
'''Project # 9 Group members:'''<br />
<br />
Banno, Dion <br />
<br />
Battista, Joseph<br />
<br />
Kahn, Solomon <br />
<br />
'''Title:''' Increasing Spotify user engagement through predictive personalization<br />
<br />
'''Description:''' <br />
<br />
Our project is an application of classification to the domain of predictive personalization. The goal of the project is to increase Spotify user engagement through data-driven methods. Given a set of users’ demographic data, listening preferences and behaviour, our goal is to build a recommendation system that suggests new songs to users. From a potential pool of songs to suggest, the final song recommendations will be driven by a classification algorithm that measures a given user’s propensity to like a song. We plan on leveraging the Spotify Web API to gather data about songs and collecting user data from consenting peers.<br />
<br />
<br />
-----------------------------------------------------------------------<br />
<br />
'''Project # 10 Group members:'''<br />
<br />
Qing, Guo <br />
<br />
Wang, Yuanxin<br />
<br />
James, Ni<br />
<br />
Xueguang, Ma<br />
<br />
'''Title:''' Mechanisms of Action (MoA) Prediction<br />
<br />
'''Description:''' <br />
<br />
Our team has decided to participate in the Mechanisms of Action (MoA) Prediction Kaggle competition. This is a challenge with the goal of advancing drug development through improvements to MoA prediction algorithms.<br />
Our team plans to develop an algorithm to predict a compound’s MoA given its cellular signature, and our goal is to learn the various algorithms taught in this course.<br />
<br />
<br />
-----------------------------------------------------------------------<br />
<br />
'''Project # 11 Group members:'''<br />
<br />
Yang, Jiwon <br />
<br />
Mahdi, Anas<br />
<br />
Thibault, Will<br />
<br />
Lau, Jan<br />
<br />
'''Title:''' Application of classification in human fatigue analysis<br />
<br />
'''Description:''' <br />
<br />
The goal of this project is to classify different levels of fatigue based on motion capture (Vicon) and force plate data. First, we plan to collect data from 4 to 6 participants performing squats (with or without weights), with each participant doing at least 50 to 100 reps, and rate them on a fatigue scale. We will collect data with EMG, IMU, force plates, and Vicon. While the participants are squatting, we will ask them about their fatigue level and compare their feedback against the fatigue level recorded by EMG. The fatigue level will be on a scale of 1 to 10 (1 being not fatigued at all and 10 being unable to continue). Once the data is collected, we will classify the motion capture and force plate data into the different levels of fatigue.<br />
<br />
<br />
------------------------------------------------------------------------<br />
<br />
'''Project # 12 Group members:'''<br />
<br />
Xiaolan Xu, <br />
<br />
Robin Wen, <br />
<br />
Yue Weng, <br />
<br />
Beizhen Chang<br />
<br />
'''Title:''' Identification (Classification) of Submillimetre Galaxies Based on Multiwavelength Data in Astronomy<br />
<br />
'''Description:''' <br />
<br />
Identifying the counterparts of submillimetre galaxies (SMGs) in multiwavelength images is important to the study of galaxy evolution in astronomy. However, obtaining a statistically significant sample of robust associations is very challenging because of the poor angular resolution of single-dish submm facilities; that is, we cannot tell which galaxy is actually responsible for the submillimetre emission among a group of possible candidates. Recently, a labelled dataset was obtained from ALMA, a millimetre/submillimetre telescope array with sufficient resolution to pin down the exact source of submillimetre emission. However, applying such an array to a large fraction of the sky is not feasible, so it is of practical interest to develop algorithms that identify SMGs based on the other available data. With this newly labelled dataset from ALMA, it is possible to test and develop new algorithms and apply them to unlabelled data to detect submillimetre galaxies.<br />
<br />
In our work, we primarily build on the work of Liu et al. (https://arxiv.org/abs/1901.09594), which tested a set of standard classification algorithms on this dataset. We aim to first reproduce their work and test other classification algorithms from a more statistics-centred perspective. Next, we hope to extend their work in one or more of the following directions: (1) incorporating other relevant features to augment the dimensions of the available dataset for a better classification rate; (2) taking measurement error into account in the classification algorithms, possibly via a Bayesian approach (all features in astronomy datasets come from actual physical measurements, which come with error bars, but it is not clear how to incorporate this error into the classification task); (3) combining traditional astronomy approaches with algorithms from ML.<br />
<br />
------------------------------------------------------------------------<br />
<br />
'''Project # 13 Group members:'''<br />
<br />
<br />
Zihui (Betty) Qin,<br />
<br />
Wenqi (Maggie) Zhao,<br />
<br />
Muyuan Yang,<br />
<br />
Amartya (Marty) Mukherjee,<br />
<br />
'''Title:''' Insider Trading Roles Classification Prediction on United States conventional stock or non-derivative transaction<br />
<br />
'''Description:'''<br />
<br />
Background (why we were interested in classifying based on insiders): <br />
The United States hosts some of the most actively traded financial markets in the world. The dataset captures all insider activities as reported on SEC (U.S. Securities and Exchange Commission) forms 3, 4, 5, and 144. We believe that using variables such as transaction date, security type, and transaction amount, we can predict the role code for a new transaction. The reason for this choice is that the role of the insider gives investors signals about potential internal activities and private information. It is crucial for investors to detect important market signals from insider trading activities so that they can benefit from the market. <br />
<br />
Goal: To classify the role of an insider in a company based on the data of their trades.<br />
<br />
<br />
------------------------------------------------------------------------<br />
<br />
'''Project # 14 Group members:'''<br />
<br />
Jung, Kyle<br />
<br />
Kim, Dae Hyun<br />
<br />
Lee, Stan<br />
<br />
Lim, Seokho<br />
<br />
'''Title:''' Mechanisms of Action (MoA) Prediction Competition<br />
<br />
'''Description:''' The main objective of this Kaggle competition is to help develop an algorithm to predict a compound's MoA given its cellular signature, helping scientists advance the drug discovery process. Our execution plan is to apply concepts and algorithms learned in STAT 441 to this multi-label classification problem. Through the process, our team will learn the biological knowledge necessary to complete and enhance our classification thought process. https://www.kaggle.com/c/lish-moa<br />
<br />
------------------------------------------------------------------------<br />
<br />
'''Project # 15 Group Members:'''<br />
<br />
Li, Evan<br />
<br />
Abuaisha, Karam<br />
<br />
Vadivelu, Nicholas<br />
<br />
Pu, Jason<br />
<br />
'''Title:''' Predict Students Answering Ability Kaggle Competition<br />
<br />
'''Description:'''<br />
<br />
https://www.kaggle.com/c/riiid-test-answer-prediction<br />
We plan on tackling this Kaggle competition that revolves around classifying whether students are able to answer their next questions correctly. The data provided consists of the student’s historic performance, the performance of other students on the same question, metadata about the question itself, and more. The theme of the competition is to tailor education to a student’s ability as an AI tutor.<br />
<br />
------------------------------------------------------------------------<br />
<br />
'''Project # 16 Group members:'''<br />
<br />
Hall, Matthew<br />
<br />
Chalaturnyk, Johnathan<br />
<br />
'''Title:''' Predicting CO and NOx emissions from gas turbines: novel data and a benchmark PEMS<br />
<br />
'''Description:'''<br />
<br />
Predictive emission monitoring systems (PEMS) are used in conjunction with measurement instruments to predict the amount of emissions produced by gas turbine engines. The implementation of such a system relies on the availability of proper measurements and ecological data points. We will attempt to adjust the novel PEMS implementation from this paper in the hope of improving the prediction of CO and NOx emission levels from the turbines. Using data points collected over the previous five years, we will apply a number of machine learning algorithms and discuss possible future research areas. Finally, we will compare our methods against the benchmark presented in the paper in order to measure the effectiveness of our solutions.<br />
<br />
------------------------------------------------------------------------<br />
<br />
'''Project # 17 Group members:'''<br />
<br />
Yang, Junyi<br />
<br />
Wang, Jill Yu Chieh<br />
<br />
Wu, Yu Min<br />
<br />
Li, Calvin<br />
<br />
'''Title:''' Humpback Whale Identification<br />
<br />
'''Description:'''<br />
<br />
Our team will participate in the Kaggle challenge Humpback Whale Identification. The main objective is to build a multi-class classification model that identifies a whale's class based on an image of its tail. There are a total of over 3000 classes and 25361 training images. The challenge is that each class has, on average, only about 8 training images. <br />
<br />
------------------------------------------------------------------------<br />
'''Project # 18 Group members:''' <br />
<br />
Lian, Jinjiang <br />
<br />
Zhu, Yisheng <br />
<br />
Huang, Mingzhe <br />
<br />
Hou, Jiawen <br />
<br />
'''Title:''' Mechanisms of Action (MoA) Prediction <br />
<br />
'''Description:''' <br />
<br />
The final project of our team is the ongoing Kaggle competition Mechanisms of Action (MoA) Prediction. The goal is to improve MoA prediction algorithms to assist and advance drug development. MoA prediction helps scientists design more targeted medicine molecules based on the biological mechanism of a disease, which would greatly shorten the drug development cycle. Here, MoA is measured by applying different drugs to human cells and analyzing the corresponding reactions; the dataset provides simultaneous measurements across 100 types of human cells and 5000 drugs. <br />
<br />
To tackle this competition, after data cleaning and feature engineering, we are going to try a selection of ML algorithms such as logistic regression, tree-based methods, and SVM, and identify the one that best completes the task. Depending on how we perform, we might utilize other techniques such as model ensembling or stacking.<br />
<br />
------------------------------------------------------------------------<br />
'''Project # 19 Group members:''' <br />
<br />
Fagan, Daniel <br />
<br />
Brooke, Cooper <br />
<br />
Perelman, Maya <br />
<br />
'''Title:''' Mechanisms of Action (MoA) Prediction (https://www.kaggle.com/c/lish-moa/overview/description)<br />
<br />
'''Description:''' <br />
<br />
For our final project, we will be competing in the Mechanisms of Action (MoA) Prediction research challenge on Kaggle. MoA refers to the description of the biological activity of a given molecule, and scientists have a particular interest in the MoA of molecules as it pertains to the advancement of drugs. This is because, under new frameworks, scientists are looking to develop molecules that can modulate protein targets associated with given diseases. Our task will be to analyze a dataset containing human cellular responses to more than 5,000 drugs and to classify these responses with one or more MoAs.<br />
<br />
For this competition, we plan to use various classification algorithms taught in STAT 441 followed by model validation techniques to ultimately select the most accurate model based on the logarithmic loss function which was specified by Kaggle.<br />
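The competition's logarithmic loss is, in essence, binary cross-entropy averaged over every (sample, target) pair; a small sketch of how that evaluation could look (our own illustration, not Kaggle's exact scoring code):

```python
import numpy as np

def mean_log_loss(y_true, y_prob, eps=1e-15):
    """Binary log loss averaged over all (sample, target) entries,
    the metric style used for multi-label MoA-type competitions.
    Probabilities are clipped away from 0 and 1 so the log is finite."""
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    y_true = np.asarray(y_true, dtype=float)
    return -np.mean(y_true * np.log(y_prob)
                    + (1 - y_true) * np.log(1 - y_prob))

# Two samples, two MoA targets each
y_true = np.array([[1, 0], [0, 1]])
y_prob = np.array([[0.9, 0.1], [0.2, 0.8]])
print(mean_log_loss(y_true, y_prob))
```

Because the metric averages over targets, improving calibration on the many rare targets matters as much as on the common ones.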
<br />
------------------------------------------------------------------------<br />
'''Project # 20 Group members:''' <br />
Cheng, Leyan<br />
<br />
Dai, Mingyan<br />
<br />
Jiang, Daniel <br />
<br />
Huang, Jerry<br />
<br />
'''Title:''' Riiid! Answer Correctness Prediction<br />
<br />
'''Description:'''<br />
<br />
We will be competing in the Riiid! Kaggle challenge. The goal of this challenge is to create algorithms for "Knowledge Tracing," the modelling of student knowledge over time, in order to accurately predict how students will perform on future interactions.<br />
<br />
We plan on using the classification techniques and model validation techniques learned in the course in order to design an algorithm that can accurately predict the actions of students.<br />
<br />
------------------------------------------------------------------------<br />
'''Project # 21 Group members:''' <br />
<br />
Carson, Emilee<br />
<br />
Ellmen, Isaac<br />
<br />
Mohammadrezaei, Dorsa<br />
<br />
Budaraju, Sai Arvind<br />
<br />
<br />
'''Title:''' Classifying SARS-CoV-2 region of origin based on DNA/RNA sequence<br />
<br />
'''Description:'''<br />
<br />
Determining the location of origin for a viral sequence is an important tool for epidemiological tracking. Knowing where a virus comes from allows epidemiologists to track how a virus is spreading. There are significant efforts to track the spread of SARS-CoV-2. As an RNA virus, SARS-CoV-2 mutates frequently. Most of these mutations carry negligible changes to the function of the virus but act as “barcodes” for specific strains. As the virus spreads in a region, it picks up mutations which allow researchers to identify which sequences correspond to which regions.<br />
<br />
The standard method for classifying viruses based on location is to:<br />
<br />
- Perform a multiple sequence alignment (MSA)<br />
<br />
- Build a phylogenetic tree of the MSA<br />
<br />
- Empirically determine which regions have which sections of the tree<br />
<br />
Phylogenetic trees are an excellent tool for tracking evolutionary changes over time but we wonder if there are better methods for classifying the region of origin for a virus using machine learning techniques.<br />
<br />
Our plan is to perform PCA on the MSA which is available through GISAID. We will determine an appropriate encoding for sequence alignments to vectors and map the aligned sequences onto a much lower dimensional space. We will then use LDA or QDA to classify points based on region (continent). We will also examine if the same technique works well for classifying sequences based on state of origin for samples from the United States. We may try other classification techniques such as logistic regression or neural nets. Finally, we know that projecting data to a small number of principal components and then projecting back to the original space can reduce noise in certain datasets. In the case of mutations, this might correspond to removing insignificant mutations. It is possible that there are certain mutations which induce functional changes in the virus which would be of greater medical interest. Our hope is that we could detect these using PCA.<br />
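As a toy illustration of the encoding and projection steps (the alphabet, the tiny alignment, and the two "regions" below are invented for the example; the real GISAID MSA is far larger, and LDA/QDA would then be fit in the reduced space):

```python
import numpy as np

ALPHABET = "ACGT-"  # aligned bases plus the gap character

def one_hot(seqs):
    """Encode equal-length aligned sequences as a flat one-hot matrix."""
    idx = {c: i for i, c in enumerate(ALPHABET)}
    n, length = len(seqs), len(seqs[0])
    X = np.zeros((n, length * len(ALPHABET)))
    for r, s in enumerate(seqs):
        for p, c in enumerate(s):
            X[r, p * len(ALPHABET) + idx[c]] = 1.0
    return X

# Toy alignment: two "regions" differing at a few barcode positions
region_a = ["ACGTAC", "ACGTAC", "ACGTAT"]
region_b = ["TCGAAC", "TCGA-C", "TCGAAC"]
X = one_hot(region_a + region_b)

# PCA via SVD on the centered matrix; keep 2 components
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T

# The first principal component separates the two regions,
# since the between-region "barcode" mutations dominate the variance.
print(Z[:3, 0], Z[3:, 0])
```

Projecting back through only the top components (`Z @ Vt[:2]` plus the mean) is the denoising step described above: within-region noise positions largely disappear.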
<br />
------------------------------------------------------------------------<br />
'''Project # 22 Group members:''' <br />
<br />
Chang, Luwen<br />
<br />
Yu, Qingyang<br />
<br />
Kong, Tao <br />
<br />
Sun, Tianrong<br />
<br />
'''Title:''' Riiid! Answer Correctness Prediction<br />
<br />
'''Description:'''<br />
<br />
For the final project, we chose the featured Kaggle competition Riiid! Answer Correctness Prediction (https://www.kaggle.com/c/riiid-test-answer-prediction). The purpose of this challenge is to build a machine learning model that predicts students' performance on future interactions.<br />
<br />
We plan to use the classification and regression techniques learned in this course to build the model, and will evaluate it using the area under the ROC curve.<br />
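For reference, the area under the ROC curve equals the probability that a randomly chosen positive is scored above a randomly chosen negative (the Mann-Whitney view); a tiny sketch of that computation on toy labels and scores (ties ignored for simplicity):

```python
import numpy as np

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; assumes no tied scores."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    return (pos[:, None] > neg[None, :]).mean()

y = [1, 1, 0, 0, 1]          # toy correctness labels
s = [0.9, 0.8, 0.3, 0.4, 0.2]  # toy model scores
print(roc_auc(y, s))
```

An AUC of 0.5 corresponds to random ranking, 1.0 to a perfect ordering of students' answers.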
<br />
------------------------------------------------------------------------<br />
'''Project # 23 Group members:''' <br />
<br />
Han, Jihoon<br />
<br />
Vera De Casey<br />
<br />
Jawad Solaiman<br />
<br />
'''Title:''' Lyft Motion Prediction for Autonomous Vehicles<br />
<br />
'''Description:'''<br />
<br />
We are planning to compete in the Lyft Motion Prediction for Autonomous Vehicles challenge on Kaggle. Our goal is to build a motion prediction model for a self-driving car by applying our machine learning knowledge to the provided training and testing datasets. The model will predict the motion of traffic agents around the car, such as cars, cyclists, and pedestrians. We are not sure whether we will have to classify the agents into the three categories (cars, cyclists, pedestrians) ourselves; if so, we will start with a single-shot detector and improve upon it.<br />
<br />
------------------------------------------------------------------------<br />
'''Project # 24 Group members:''' <br />
<br />
Guanting Pan<br />
<br />
Haocheng Chang <br />
<br />
Zaiwei Zhang<br />
<br />
'''Title:''' Reproducing result in Accelerated Stochastic Power Iteration<br />
<br />
'''Description:'''<br />
<br />
As our final project, we will reproduce the stochastic PCA algorithm designed by De Sa, He, Mitliagkas, Ré, and Xu, which accelerates the iteration complexity of power iteration. In doing so, we aim to achieve the final rate of 𝒪(1/sqrt(Δ)) in our reproduction. If there is additional time, we also hope to explore and discuss the potential for applying such an acceleration method to other non-convex optimization problems, as mentioned in the original paper. Link to the paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6557638/pdf/nihms-993807.pdf<br />
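The central update in the paper, as we read it, is power iteration with a heavy-ball momentum term, x_{t+1} = A x_t - β x_{t-1}, with β tuned near λ₂²/4; a minimal deterministic sketch (the paper's stochastic and variance-reduced variants are not shown):

```python
import numpy as np

def power_iteration_momentum(A, beta, iters=100, seed=0):
    """Power iteration with momentum: x_{t+1} = A x_t - beta * x_{t-1}.

    With beta close to (lambda_2 / 2)**2 this accelerates convergence
    to the top eigenvector of a symmetric matrix A."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    x_prev = np.zeros_like(x)
    for _ in range(iters):
        x_next = A @ x - beta * x_prev
        nrm = np.linalg.norm(x_next)
        x_prev = x / nrm          # rescale both iterates by the same factor
        x = x_next / nrm          # so the momentum recurrence is preserved
    return x

A = np.diag([1.0, 0.5, 0.25])     # top eigenvector is e1, gap = 0.5
v = power_iteration_momentum(A, beta=0.5**2 / 4)
print(abs(v[0]))                  # close to 1
```

Rescaling `x` and `x_prev` by the same norm is what keeps the two-term recurrence intact while preventing overflow.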
<br />
------------------------------------------------------------------------<br />
'''Project # 25 Group members:''' <br />
<br />
Haoran Dong<br />
<br />
Mushi Wang<br />
<br />
Siyuan Qiu<br />
<br />
Yan Yu<br />
<br />
'''Title:''' Lyft Motion Prediction for Autonomous Vehicles<br />
<br />
'''Description:'''<br />
<br />
We want to participate in the Kaggle challenge "Lyft Motion Prediction for Autonomous Vehicles". The goal is to build a motion prediction model for self-driving cars via machine learning, using the datasets provided.<br />
<br />
------------------------------------------------------------------------<br />
'''Project # 26 Group members:''' <br />
<br />
Sangeeth Kalaichanthiran<br />
<br />
Evan Peters<br />
<br />
Cynthia Mou<br />
<br />
Yuxin Wang<br />
<br />
'''Title:''' Mechanisms of Action (MoA) Prediction<br />
<br />
'''Description:'''<br />
<br />
Our team chose the "Mechanisms of Action (MoA) Prediction" challenge on Kaggle. A mechanism of action (MoA) describes the biological response of human cells to a particular molecule (the drug). The goal is to develop an algorithm that can predict the biological response to a drug based on its similarities to other known drugs. <br />
<br />
Our team hopes to develop a superior algorithm by using our knowledge of supervised learning methods.<br />
<br />
------------------------------------------------------------------------<br />
'''Project # 27 Group members:''' <br />
<br />
Delaney Smith<br />
<br />
Mohammad Assem Mahmoud<br />
<br />
'''Title:''' Replicating "Electrocardiogram heartbeat classification based on a deep convolutional neural network and focal loss"<br />
<br />
'''Description:'''<br />
<br />
For our project, we intend to replicate and, hopefully, extend the work of Romdhane et al.’s 2020 paper “Electrocardiogram heartbeat classification based on a deep convolutional neural network and focal loss”. In this paper, the authors develop a deep convolutional neural network that exploits a novel loss function, focal loss, to classify heartbeats into five arrhythmia categories (N, S, V, Q and F) based on the AAMI standard. The network was trained and tested against two ECG datasets, MIT-BIH and INCART, and returned a 98.41% overall accuracy, a 98.38% overall F1-score, a 98.37% overall precision, and a 98.41% overall recall, which we intend to replicate. <br />
Interestingly, focal loss was implemented to prevent bias towards the larger classes (normal heartbeats) without needing to augment the smaller class data (diseased heartbeats); however, the authors did not determine which method actually performs better. We therefore hope to extend their work by answering this question in this project.<br />
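Focal loss scales the cross-entropy on the true class by (1 - p_t)^γ, so confident, well-classified (mostly normal-beat) examples are down-weighted while hard minority examples keep a large loss; a minimal sketch (γ = 2 and the α-free form are common choices, not necessarily the paper's exact configuration):

```python
import numpy as np

def focal_loss(p_true, gamma=2.0):
    """Focal loss on the probability assigned to the true class.

    Reduces to plain cross-entropy when gamma = 0; larger gamma shrinks
    the loss on well-classified examples (p_true near 1)."""
    p_true = np.asarray(p_true, dtype=float)
    return -((1.0 - p_true) ** gamma) * np.log(p_true)

easy = focal_loss(0.9)   # well-classified (e.g. an abundant normal beat)
hard = focal_loss(0.1)   # misclassified minority example
ce_easy = -np.log(0.9)   # plain cross-entropy, for comparison
print(easy < ce_easy, hard / easy)
```

The large ratio between `hard` and `easy` is what shifts gradient mass toward the rare arrhythmia classes without oversampling them.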
------------------------------------------------------------------------<br />
'''Project # 28 Group members:''' <br />
<br />
Fang Yuqin<br />
<br />
Fu Rao<br />
<br />
Li Siqi<br />
<br />
Zhou Zeping<br />
<br />
'''Title:''' The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network<br />
<br />
'''Description:'''<br />
Our group aims to study the single-hidden-layer neural network in more depth, building on what we have learned in class. We will focus on Gaussian-distributed data and weights so that we can derive an expression for the spectrum of the Fisher information matrix in the limit of infinite width. We believe we can improve the efficiency of first-order optimization methods by applying this spectral information. <br />
------------------------------------------------------------------------<br />
'''Project # 29 Group members:''' <br />
<br />
Rui Gong<br />
<br />
Xuetong Wang<br />
<br />
Xinqi Ling<br />
<br />
Di Ma<br />
<br />
'''Title:''' Convolution Neural Network for Rainy day Prediction<br />
<br />
'''Description:'''<br />
<br />
Our project applies a convolutional neural network to rainy-day prediction. The goal is to predict whether tomorrow will be a rainy day using historical data from the past week and indicators such as temperature. We plan to obtain the past weather data through a Yahoo web API.<br />
------------------------------------------------------------------------<br />
'''Project # 30 Group members:''' <br />
<br />
Jiabao Dong<br />
<br />
Jiaxiang Liu<br />
<br />
Siyuan Xia<br />
<br />
Yipeng Du<br />
<br />
'''Title:''' Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation<br />
<br />
'''Description:'''<br />
We aim to replicate the work demonstrated in [https://papers.nips.cc/paper/8632-privacy-preserving-classification-of-personal-text-messages-with-secure-multi-party-computation.pdf Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation]. <br />
<br />
Personal text classification has many useful applications, such as mental health care and security surveillance, but also raises concerns about personal privacy. The method proposed in this paper is based on Secure Multiparty Computation (SMC) and avoids (un)intentional privacy violations. It extracts features from texts and classifies them with logistic regression and tree ensembles. This paper claims to propose the first privacy-preserving (PP) solution for text classification that is provably secure.<br />
<br />
------------------------------------------------------------------------<br />
<br />
'''Project # 31 Group members:''' <br />
<br />
Tompkins, Grace<br />
<br />
Krikella, Tatiana<br />
<br />
'''Title:''' An application of Adapting Neural Networks for the Estimation of Treatment Effects (Shi, Blei, and Veitch 2019)<br />
<br />
'''Description:'''<br />
We will be using the methodology presented in "Adapting Neural Networks for the Estimation of Treatment Effects" by Claudia Shi, David M. Blei, and Victor Veitch, applying it to a new dataset and simulated data. This method estimates treatment effects from observational data via an architecture called "Dragonnet", which uses propensity scoring for estimation adjustment and targeted regularization. The method has been shown to outperform existing methods on benchmark datasets, and we will apply it to a new dataset (TBD) and simulated data to evaluate its performance for classification and prediction.<br />
<br />
We will use R for analysis.<br />
<br />
Link to paper: [http://papers.nips.cc/paper/8520-adapting-neural-networks-for-the-estimation-of-treatment-effects]<br />
<br />
------------------------------------------------------------------------<br />
<br />
'''Project # 32 Group members:''' <br />
<br />
Taohao Wang<br />
Zeren Shen<br />
Zihao Guo<br />
Rui Chen<br />
<br />
'''Title:''' Google Landmark Recognition 2020<br />
<br />
'''Description:'''<br />
Our team decided to try the "Google Landmark Recognition 2020" Kaggle competition, in which competitors are asked to build a model that detects any existing landmarks within the provided test images. This competition is challenging in its own way: the data contains more than 81K classes, where a traditional CNN would very likely fail (too many parameters to train, especially when taking convolutional layers into account). We would like to implement several algorithms/frameworks that can utilize a large amount of data with noisy labels, apply them to the provided dataset, and compare their performance (training time, number of parameters trained, multiple metrics for accuracy/loss evaluation, etc.) for our report.</div>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
= Record your contributions here [https://docs.google.com/spreadsheets/d/10CHiJpAylR6kB9QLqN7lZHN79D9YEEW6CDTH27eAhbQ/edit?usp=sharing]=<br />
<br />
Use the following notations:<br />
<br />
P: You have written a summary/critique on the paper.<br />
<br />
T: You had a technical contribution on a paper (excluding the paper that you present).<br />
<br />
E: You had an editorial contribution on a paper (excluding the paper that you present).<br />
<br />
=Paper presentation=<br />
{| class="wikitable"<br />
<br />
{| border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 ||Sharman Bharat, Li Dylan, Lu Leonie, Li Mingdao || 1|| Risk prediction in life insurance industry using supervised learning algorithms || [https://rdcu.be/b780J Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Bsharman Summary] ||<br />
|-<br />
|Week of Nov 16 || Delaney Smith, Mohammad Assem Mahmoud || 2|| Influenza Forecasting Framework based on Gaussian Processes || [https://proceedings.icml.cc/static/paper_files/icml/2020/1239-Paper.pdf] paper || ||<br />
|-<br />
|Week of Nov 16 || Tatianna Krikella, Swaleh Hussain, Grace Tompkins || 3|| Processing of Missing Data by Neural Networks || [http://papers.nips.cc/paper/7537-processing-of-missing-data-by-neural-networks] || ||<br />
|-<br />
|Week of Nov 16 ||Jonathan Chow, Nyle Dharani, Ildar Nasirov ||4 ||Streaming Bayesian Inference for Crowdsourced Classification ||[https://papers.nips.cc/paper/9439-streaming-bayesian-inference-for-crowdsourced-classification.pdf Paper] || ||<br />
|-<br />
|Week of Nov 16 || Matthew Hall, Johnathan Chalaturnyk || 5|| Neural Ordinary Differential Equations || [https://papers.nips.cc/paper/7892-neural-ordinary-differential-equations.pdf] || ||<br />
|-<br />
|Week of Nov 16 || Luwen Chang, Qingyang Yu, Tao Kong, Tianrong Sun || 6|| || || ||<br />
|-<br />
|Week of Nov 16 || Casey De Vera, Solaiman Jawad, Jihoon Han || 7|| || || ||<br />
|-<br />
|Week of Nov 16 || Yuxin Wang, Evan Peters, Cynthia Mou, Sangeeth Kalaichanthiran || 8|| Uniform convergence may be unable to explain generalization in deep learning || [https://papers.nips.cc/paper/9336-uniform-convergence-may-be-unable-to-explain-generalization-in-deep-learning.pdf] || ||<br />
|-<br />
|Week of Nov 16 || Yuchuan Wu || 9|| || || ||<br />
|-<br />
|Week of Nov 16 || Zhou Zeping || 10|| || || ||<br />
|-<br />
|Week of Nov 23 ||Jinjiang Lian, Jiawen Hou, Yisheng Zhu, Mingzhe Huang || 11|| DROCC: Deep Robust One-Class Classification || [https://proceedings.icml.cc/static/paper_files/icml/2020/6556-Paper.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Bushra Haque, Hayden Jones, Michael Leung, Cristian Mustatea || 12|| Combine Convolution with Recurrent Networks for Text Classification || [https://arxiv.org/pdf/2006.15795.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 || Taohao Wang, Zeren Shen, Zihao Guo, Rui Chen || 13|| Deep multiple instance learning for image classification and auto-annotation || [https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Wu_Deep_Multiple_Instance_2015_CVPR_paper.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Qianlin Song, William Loh, Junyue Bai, Phoebe Choi || 14|| Task Understanding from Confusing Multi-task Data || [https://proceedings.icml.cc/static/paper_files/icml/2020/578-Paper.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Rui Gong, Xuetong Wang, Xinqi Ling, Di Ma || 15|| || || ||<br />
|-<br />
|Week of Nov 23 || Xiaolan Xu, Robin Wen, Yue Weng, Beizhen Chang || 16|| Graph Structure of Neural Networks || [https://proceedings.icml.cc/paper/2020/file/757b505cfd34c64c85ca5b5690ee5293-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Hansa Halim, Sanjana Rajendra Naik, Samka Marfua, Shawrupa Proshasty || 17|| Superhuman AI for multiplayer poker || [https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Guanting Pan, Haocheng Chang, Zaiwei Zhang || 18|| Point-of-Interest Recommendation: Exploiting Self-Attentive Autoencoders with Neighbor-Aware Influence || [https://arxiv.org/pdf/1809.10770.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 || Jerry Huang, Daniel Jiang, Minyan Dai, Leyan Cheng || 19|| Neural Speed Reading Via Skim-RNN ||[https://arxiv.org/pdf/1711.02085.pdf?fbclid=IwAR3EeFsKM_b5p9Ox7X9mH-1oI3U3oOKPBy3xUOBN0XvJa7QW2ZeJJ9ypQVo Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Ruixian Chin, Yan Kai Tan, Jason Ong, Wen Cheen Chiew || 20|| DivideMix: Learning with Noisy Labels as Semi-supervised Learning || [https://openreview.net/pdf?id=HJgExaVtwr] || ||<br />
|-<br />
|Week of Nov 30 || Banno Dion, Battista Joseph, Kahn Solomon || 21|| Toward Improving the Prediction Accuracy of Product Recommendation System Using Extreme Gradient Boosting and Encoding Approaches|| [https://www.mdpi.com/2073-8994/12/9/1566] || ||<br />
|-<br />
|Week of Nov 30 || Sai Arvind Budaraju, Isaac Ellmen, Dorsa Mohammadrezaei, Emilee Carson || 22|| A universal SNP and small-indel variant caller using deep neural networks||[https://www.nature.com/articles/nbt.4235.epdf?author_access_token=q4ZmzqvvcGBqTuKyKgYrQ9RgN0jAjWel9jnR3ZoTv0NuM3saQzpZk8yexjfPUhdFj4zyaA4Yvq0LWBoCYQ4B9vqPuv8e2HHy4vShDgEs8YxI_hLs9ov6Y1f_4fyS7kGZ Paper] || ||<br />
|-<br />
|Week of Nov 30 || Daniel Fagan, Cooper Brooke, Maya Perelman || 23|| Efficient kNN Classification With Different Number of Nearest Neighbors || [https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7898482 Paper] || ||<br />
|-<br />
|Week of Nov 30 || Karam Abuaisha, Evan Li, Jason Pu, Nicholas Vadivelu || 24|| Being Bayesian about Categorical Probability || [https://proceedings.icml.cc/static/paper_files/icml/2020/3560-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Anas Mahdi Will Thibault Jan Lau Jiwon Yang || 25|| Loss Function Search for Face Recognition<br />
|| [https://proceedings.icml.cc/static/paper_files/icml/2020/245-Paper.pdf] paper || ||<br />
|-<br />
|Week of Nov 30 ||Zihui (Betty) Qin, Wenqi (Maggie) Zhao, Muyuan Yang, Amartya (Marty) Mukherjee || 26|| Deep Learning for Cardiologist-level Myocardial Infarction Detection in Electrocardiograms || [https://arxiv.org/pdf/1912.07618.pdf?fbclid=IwAR0RwATSn4CiT3qD9LuywYAbJVw8YB3nbex8Kl19OCExIa4jzWaUut3oVB0 Paper] || ||<br />
|-<br />
|Week of Nov 30 || Stan Lee, Seokho Lim, Kyle Jung, Daehyun Kim || 27|| Bag of Tricks for Efficient Text Classification || [https://arxiv.org/pdf/1607.01759.pdf paper] || ||<br />
|-<br />
|Week of Nov 30 || Yawen Wang, DanMeng Cui, ZiJie Jiang, MingKang Jiang || 28|| A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques || [https://arxiv.org/pdf/1707.02919.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Qing Guo, XueGuang Ma, James Ni, Yuanxin Wang || 29|| Mask R-CNN || [https://arxiv.org/pdf/1703.06870.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Bertrand Sodjahin, Junyi Yang, Jill Yu Chieh Wang, Yu Min Wu, Calvin Li || 30|| Research paper classification systems based on TF‑IDF and LDA schemes || [https://hcis-journal.springeropen.com/articles/10.1186/s13673-019-0192-7?fbclid=IwAR3swO-eFrEbj1BUQfmomJazxxeFR6SPgr6gKayhs38Y7aBG-zX1G3XWYRM Paper] || ||<br />
|-</div>T358wanghttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=42689stat441F212020-10-09T17:47:21Z<p>T358wang: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
= Record your contributions here [https://docs.google.com/spreadsheets/d/10CHiJpAylR6kB9QLqN7lZHN79D9YEEW6CDTH27eAhbQ/edit?usp=sharing]=<br />
<br />
Use the following notations:<br />
<br />
P: You have written a summary/critique on the paper.<br />
<br />
T: You had a technical contribution on a paper (excluding the paper that you present).<br />
<br />
E: You had an editorial contribution on a paper (excluding the paper that you present).<br />
<br />
=Paper presentation=<br />
{| class="wikitable"<br />
<br />
{| border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 ||Sharman Bharat, Li Dylan, Lu Leonie, Li Mingdao || 1|| Risk prediction in life insurance industry using supervised learning algorithms || [https://rdcu.be/b780J Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Bsharman Summary] ||<br />
|-<br />
|Week of Nov 16 || Delaney Smith, Mohammad Assem Mahmoud || 2|| Influenza Forecasting Framework based on Gaussian Processes || [https://proceedings.icml.cc/static/paper_files/icml/2020/1239-Paper.pdf] paper || ||<br />
|-<br />
|Week of Nov 16 || Tatianna Krikella, Swaleh Hussain, Grace Tompkins || 3|| Processing of Missing Data by Neural Networks || [http://papers.nips.cc/paper/7537-processing-of-missing-data-by-neural-networks] || ||<br />
|-<br />
|Week of Nov 16 ||Jonathan Chow, Nyle Dharani, Ildar Nasirov ||4 ||Streaming Bayesian Inference for Crowdsourced Classification ||[https://papers.nips.cc/paper/9439-streaming-bayesian-inference-for-crowdsourced-classification.pdf Paper] || ||<br />
|-<br />
|Week of Nov 16 || Matthew Hall, Johnathan Chalaturnyk || 5|| Neural Ordinary Differential Equations || [https://papers.nips.cc/paper/7892-neural-ordinary-differential-equations.pdf] || ||<br />
|-<br />
|Week of Nov 16 || Luwen Chang, Qingyang Yu, Tao Kong, Tianrong Sun || 6|| || || ||<br />
|-<br />
|Week of Nov 16 || Casey De Vera, Solaiman Jawad, Jihoon Han || 7|| || || ||<br />
|-<br />
|Week of Nov 16 || || 8|| || || ||<br />
|-<br />
|Week of Nov 16 || || 9|| || || ||<br />
|-<br />
|Week of Nov 16 || Zhou Zeping || 10|| || || ||<br />
|-<br />
|Week of Nov 23 ||Jinjiang Lian, Jiawen Hou, Yisheng Zhu, Mingzhe Huang || 11|| DROCC: Deep Robust One-Class Classification || [https://proceedings.icml.cc/static/paper_files/icml/2020/6556-Paper.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Bushra Haque, Hayden Jones, Michael Leung, Cristian Mustatea || 12|| Combine Convolution with Recurrent Networks for Text Classification || [https://arxiv.org/pdf/2006.15795.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 || Taohao Wang, Zeren Shen, || 13|| Deep multiple instance learning for image classification and auto-annotation || [https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Wu_Deep_Multiple_Instance_2015_CVPR_paper.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Qianlin Song, William Loh, Junyue Bai, Phoebe Choi || 14|| Task Understanding from Confusing Multi-task Data || [https://proceedings.icml.cc/static/paper_files/icml/2020/578-Paper.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Rui Gong, Xuetong Wang, Xinqi Ling, Di Ma || 15|| || || ||<br />
|-<br />
|Week of Nov 23 || Xiaolan Xu, Robin Wen, Yue Weng, Beizhen Chang || 16|| Graph Structure of Neural Networks || [https://proceedings.icml.cc/paper/2020/file/757b505cfd34c64c85ca5b5690ee5293-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Hansa Halim, Sanjana Rajendra Naik, Samka Marfua, Shawrupa Proshasty || 17|| Superhuman AI for multiplayer poker || [https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Guanting Pan, Haocheng Chang, Zaiwei Zhang || 18|| Point-of-Interest Recommendation: Exploiting Self-Attentive Autoencoders with Neighbor-Aware Influence || [https://arxiv.org/pdf/1809.10770.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 || Jerry Huang, Daniel Jiang, Minyan Dai, Leyan Cheng || 19|| Neural Speed Reading Via Skim-RNN ||[https://arxiv.org/pdf/1711.02085.pdf?fbclid=IwAR3EeFsKM_b5p9Ox7X9mH-1oI3U3oOKPBy3xUOBN0XvJa7QW2ZeJJ9ypQVo Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Ruixian Chin, Yan Kai Tan, Jason Ong, Wen Cheen Chiew || 20|| DivideMix: Learning with Noisy Labels as Semi-supervised Learning || [https://openreview.net/pdf?id=HJgExaVtwr] || ||<br />
|-<br />
|Week of Nov 30 || Banno Dion, Battista Joseph, Kahn Solomon || 21|| Toward Improving the Prediction Accuracy of Product Recommendation System Using Extreme Gradient Boosting and Encoding Approaches|| [https://www.mdpi.com/2073-8994/12/9/1566] || ||<br />
|-<br />
|Week of Nov 30 || Sai Arvind Budaraju || 22|| Mining insights from large-scale corpora using fine-tuned language models ||[https://ecai2020.eu/papers/1572_paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Daniel Fagan, Cooper Brooke || 23|| || || ||<br />
|-<br />
|Week of Nov 30 || Karam Abuaisha, Evan Li, Jason Pu, Nicholas Vadivelu || 24|| Being Bayesian about Categorical Probability || [https://proceedings.icml.cc/static/paper_files/icml/2020/3560-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Anas Mahdi Will Thibault Jan Lau Jiwon Yang || 25|| Loss Function Search for Face Recognition<br />
|| [https://proceedings.icml.cc/static/paper_files/icml/2020/245-Paper.pdf] paper || ||<br />
|-<br />
|Week of Nov 30 ||Zihui (Betty) Qin, Wenqi (Maggie) Zhao, Muyuan Yang, Amartya (Marty) Mukherjee || 26|| Deep Learning for Cardiologist-level Myocardial Infarction Detection in Electrocardiograms || [https://arxiv.org/pdf/1912.07618.pdf?fbclid=IwAR0RwATSn4CiT3qD9LuywYAbJVw8YB3nbex8Kl19OCExIa4jzWaUut3oVB0 Paper] || ||<br />
|-<br />
|Week of Nov 30 || Stan Lee, Seokho Lim, Kyle Jung, Daehyun Kim || 27|| Bag of Tricks for Efficient Text Classification || [https://arxiv.org/pdf/1607.01759.pdf paper] || ||<br />
|-<br />
|Week of Nov 30 || Yawen Wang, DanMeng Cui, ZiJie Jiang, MingKang Jiang || 28|| A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques || [https://arxiv.org/pdf/1707.02919.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Qing Guo, XueGuang Ma, James Ni, Yuanxin Wang || 29|| Mask R-CNN || [https://arxiv.org/pdf/1703.06870.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Bertrand Sodjahin, Junyi Yang, Jill Yu Chieh Wang, Yu Min Wu, Calvin Li || 30|| Research paper classification systems based on TF‑IDF and LDA schemes || [https://hcis-journal.springeropen.com/articles/10.1186/s13673-019-0192-7?fbclid=IwAR3swO-eFrEbj1BUQfmomJazxxeFR6SPgr6gKayhs38Y7aBG-zX1G3XWYRM Paper] || ||<br />
|-</div>T358wanghttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat441F21&diff=42647stat441F212020-10-09T01:59:48Z<p>T358wang: /* Paper presentation */</p>
<hr />
<div><br />
<br />
== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==<br />
<br />
<!--[https://goo.gl/forms/apurag4dr9kSR76X2 Your feedback on presentations]--><br />
<br />
= Record your contributions here [https://docs.google.com/spreadsheets/d/10CHiJpAylR6kB9QLqN7lZHN79D9YEEW6CDTH27eAhbQ/edit?usp=sharing]=<br />
<br />
Use the following notations:<br />
<br />
P: You have written a summary/critique on the paper.<br />
<br />
T: You had a technical contribution on a paper (excluding the paper that you present).<br />
<br />
E: You had an editorial contribution on a paper (excluding the paper that you present).<br />
<br />
=Paper presentation=<br />
{| class="wikitable"<br />
<br />
{| border="1" cellpadding="3"<br />
|-<br />
|width="60pt"|Date<br />
|width="250pt"|Name <br />
|width="15pt"|Paper number <br />
|width="700pt"|Title<br />
|width="15pt"|Link to the paper<br />
|width="30pt"|Link to the summary<br />
|width="30pt"|Link to the video<br />
|-<br />
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]<br />
|-<br />
|Week of Nov 16 ||Sharman Bharat, Li Dylan, Lu Leonie, Li Mingdao || 1|| Risk prediction in life insurance industry using supervised learning algorithms || [https://rdcu.be/b780J Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=User:Bsharman Summary] ||<br />
|-<br />
|Week of Nov 16 || Delaney Smith, Mohammad Assem Mahmoud || 2|| Influenza Forecasting Framework based on Gaussian Processes || [https://proceedings.icml.cc/static/paper_files/icml/2020/1239-Paper.pdf] paper || ||<br />
|-<br />
|Week of Nov 16 || Tatianna Krikella, Swaleh Hussain, Grace Tompkins || 3|| Processing of Missing Data by Neural Networks || [http://papers.nips.cc/paper/7537-processing-of-missing-data-by-neural-networks] || ||<br />
|-<br />
|Week of Nov 16 ||Jonathan Chow, Nyle Dharani, Ildar Nasirov ||4 ||Streaming Bayesian Inference for Crowdsourced Classification ||[https://papers.nips.cc/paper/9439-streaming-bayesian-inference-for-crowdsourced-classification.pdf Paper] || ||<br />
|-<br />
|Week of Nov 16 || Matthew Hall, Johnathan Chalaturnyk || 5|| || || ||<br />
|-<br />
|Week of Nov 16 || Luwen Chang, Qingyang Yu, Tao Kong, Tianrong Sun || 6|| || || ||<br />
|-<br />
|Week of Nov 16 || || 7|| || || ||<br />
|-<br />
|Week of Nov 16 || || 8|| || || ||<br />
|-<br />
|Week of Nov 16 || || 9|| || || ||<br />
|-<br />
|Week of Nov 16 || || 10|| || || ||<br />
|-<br />
|Week of Nov 23 ||Jinjiang Lian, Jiawen Hou, Yisheng Zhu, Mingzhe Huang || 11|| DROCC: Deep Robust One-Class Classification || [https://proceedings.icml.cc/static/paper_files/icml/2020/6556-Paper.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Bushra Haque, Hayden Jones, Michael Leung, Cristian Mustatea || 12|| Combine Convolution with Recurrent Networks for Text Classification || [https://arxiv.org/pdf/2006.15795.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 || Taohao Wang, || 13|| || || ||<br />
|-<br />
|Week of Nov 23 || Qianlin Song, William Loh, Junyue Bai, Phoebe Choi || 14|| Task Understanding from Confusing Multi-task Data || [https://proceedings.icml.cc/static/paper_files/icml/2020/578-Paper.pdf paper] || ||<br />
|-<br />
|Week of Nov 23 || Rui Gong, Xuetong Wang, Xinqi Ling, Di Ma || 15|| || || ||<br />
|-<br />
|Week of Nov 23 || Xiaolan Xu, Robin Wen, Yue Weng, Beizhen Chang || 16|| Graph Structure of Neural Networks || [https://proceedings.icml.cc/paper/2020/file/757b505cfd34c64c85ca5b5690ee5293-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Hansa Halim, Sanjana Rajendra Naik, Samka Marfua, Shawrupa Proshasty || 17|| Superhuman AI for multiplayer poker || [https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Guanting Pan, Haocheng Chang, Zaiwei Zhang || 18|| Point-of-Interest Recommendation: Exploiting Self-Attentive Autoencoders with Neighbor-Aware Influence || [https://arxiv.org/pdf/1809.10770.pdf Paper] || ||<br />
|-<br />
|Week of Nov 23 || Jerry Huang, Daniel Jiang, Minyan Dai, Leyan Cheng || 19|| Neural Speed Reading Via Skim-RNN ||[https://arxiv.org/pdf/1711.02085.pdf?fbclid=IwAR3EeFsKM_b5p9Ox7X9mH-1oI3U3oOKPBy3xUOBN0XvJa7QW2ZeJJ9ypQVo Paper] || ||<br />
|-<br />
|Week of Nov 23 ||Ruixian Chin, Yan Kai Tan, Jason Ong, Wen Cheen Chiew || 20|| DivideMix: Learning with Noisy Labels as Semi-supervised Learning || [https://openreview.net/pdf?id=HJgExaVtwr] || ||<br />
|-<br />
|Week of Nov 30 || || 21|| || || ||<br />
|-<br />
|Week of Nov 30 || || 22|| || || ||<br />
|-<br />
|Week of Nov 30 || || 23|| || || ||<br />
|-<br />
|Week of Nov 30 || Karam Abuaisha, Evan Li, Jason Pu, Nicholas Vadivelu || 24|| Being Bayesian about Categorical Probability || [https://proceedings.icml.cc/static/paper_files/icml/2020/3560-Paper.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Anas Mahdi Will Thibault Jan Lau Jiwon Yang || 25|| Loss Function Search for Face Recognition<br />
|| [https://proceedings.icml.cc/static/paper_files/icml/2020/245-Paper.pdf] paper || ||<br />
|-<br />
|Week of Nov 30 ||Zihui (Betty) Qin, Wenqi (Maggie) Zhao, Muyuan Yang, Amartya (Marty) Mukherjee || 26|| Deep Learning for Cardiologist-level Myocardial Infarction Detection in Electrocardiograms || [https://arxiv.org/pdf/1912.07618.pdf?fbclid=IwAR0RwATSn4CiT3qD9LuywYAbJVw8YB3nbex8Kl19OCExIa4jzWaUut3oVB0 Paper] || ||<br />
|-<br />
|Week of Nov 30 || || 27|| || || ||<br />
|-<br />
|Week of Nov 30 || Yawen Wang, DanMeng Cui, ZiJie Jiang, MingKang Jiang || 28|| A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques || [https://arxiv.org/pdf/1707.02919.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Qing Guo, XueGuang Ma, James Ni, Yuanxin Wang || 29|| Mask R-CNN || [https://arxiv.org/pdf/1703.06870.pdf Paper] || ||<br />
|-<br />
|Week of Nov 30 || Bertrand Sodjahin, Junyi Yang, Jill Yu Chieh Wang, Yu Min Wu, Calvin Li || 30|| Research paper classification systems based on TF‑IDF and LDA schemes || [https://hcis-journal.springeropen.com/articles/10.1186/s13673-019-0192-7?fbclid=IwAR3swO-eFrEbj1BUQfmomJazxxeFR6SPgr6gKayhs38Y7aBG-zX1G3XWYRM Paper] || ||<br />
|-</div>T358wang