statwiki - User contributions [US]

F21-STAT 441/841 CM 763-Proposal

2021-12-12T02:26:51Z

Y664huan:

Use this format (Don’t remove Project 0)

Project # 0 Group members:

Last name, First name

Last name, First name

Last name, First name

Last name, First name

Title: Making a String Telephone

Description: We use paper cups to make a string phone and talk with friends while learning about sound waves with this science project. (Explain your project in one or two paragraphs).

--------------------------------------------------------------------
Project # 1 Group members:

Feng, Jared

Huang, Xipeng

Xu, Mingwei

Yu, Tingzhou

Title: Patch-Based Convolutional Neural Network for Cancers Classification

Description: In this project, we consider classifying three classes (tumor types) of cancers based on pathological data. We will follow the paper ''Patch-based Convolutional Neural Network for Whole Slide Tissue Image Classification''.
--------------------------------------------------------------------
Project # 2 Group members:

Anderson, Eric

Wang, Chengzhi

Zhong, Kai

Zhou, Yi Jing

Title: Data Poison Attacks

Description: Attempting to create a successful data poisoning attack

--------------------------------------------------------------------
Project # 3 Group members:

Chopra, Kanika

Rajcoomar, Yush

Bhattacharya, Vaibhav

Title: Cancer Classification

Description: We will be classifying three tumour types based on pathological data.

--------------------------------------------------------------------
Project # 4 Group members:

Li, Shao Zhong

Kerr, Hannah

Wong, Ann Gie

Title: Classification of text

Description: Being to automatically grade answers on tests can save a lot of time and teaching resources. But unlike a multiple-choice format where grading can be automated, the other formats involving text answers is more through in testing knowledge but still requires human evaluation and marking which is a bottleneck of teaching resources and personnel on a large scale with thousands of students. We will use classification techniques and machine learning to develop an automated way to predict the rightness of text answers with good accuracy that can be used by and suppport graders to reduce the time and manual effort needed in the grading process.

--------------------------------------------------------------------
Project # 5 Group members:

Chin, Jessie Man Wai

Ooi, Yi Lin

Shi, Yaqi

Ngew, Shwen Lyng

Title: The Application of Classification in Accelerated Underwriting (Insurance)

Description: Accelerated Underwriting (AUW), also called “express underwriting,” is a faster and easier process for people with good health condition to obtain life insurance. The traditional underwriting process is often painful for both customers and insurers. From the customer's perspective, they have to complete different types of questionnaires and provide different medical tests involving blood, urine, saliva and other medical results. Underwriters on the other hand have to manually go through every single policy to access the risk of each applicant. AUW allows people, who are deemed “healthy” to forgo medical exams. Since COVID-19, it has become a more concerning topic as traditional underwriting cannot be performed due to the stay-at-home order. However, this imposes a burden on the insurance company to better estimate the risk associated with less testing results.

This is where data science kicks in. With different classification methods, we can address the underwriting process’ five pain points: labor, speed, efficiency, pricing and mortality. This allows us to better estimate the risk and classify the clients for whether they are eligible for accelerated underwriting. For the final project, we use the data from one of the leading US insurers to analyze how we can classify our clients for AUW using the method of classification. We will be using factors such as health data, medical history, family history as well as insurance history to determine the eligibility.

--------------------------------------------------------------------
Project # 6 Group members:

Wang, Carolyn

Cyrenne, Ethan

Nguyen, Dieu Hoa

Sin, Mary Jane

Title: Pawpularity (PetFinder Kaggle Competition)

Description: Using images and metadata on the images to predict the popularity of pet photos, which is calculated based on page view statistics and other metrics from the PetFinder website.

--------------------------------------------------------------------
Project # 7 Group members:

Bhattacharya, Vaibhav

Chatoor, Amanda

Prathap Das, Sutej

Title: PetFinder.my - Pawpularity Contest [https://www.kaggle.com/c/petfinder-pawpularity-score/overview]

Description: In this competition, we will analyze raw images and metadata to predict the “Pawpularity” of pet photos. We'll train and test our model on PetFinder.my's thousands of pet profiles.

--------------------------------------------------------------------
Project # 8 Group members:

Yan, Xin

Duan, Yishu

Di, Xibei

Title: The application of classification on company bankruptcy prediction

Description: TBD
--------------------------------------------------------------------
Project # 9 Group members:

Loke, Chun Waan

Chong, Peter

Osmond, Clarice

Li, Zhilong

Title: TBD

Description: TBD

--------------------------------------------------------------------

Project # 10 Group members:

O'Farrell, Ethan

D'Astous, Justin

Hamed, Waqas

Vladusic, Stefan

Title: Pawpularity (Kaggle)

Description: Predicting the popularity of animal photos based on photo metadata
--------------------------------------------------------------------
Project # 11 Group members:

JunBin, Pan

Title: TBD

Description: TBD
--------------------------------------------------------------------
Project # 12 Group members:

Kar Lok, Ng

Muhan (Iris), Li

Wu, Mingze

Title: NFL Health & Safety - Helmet Assignment competition (Kaggle Competition)

Description: Assigning players to the helmet in a given footage of head collision in football play.
--------------------------------------------------------------------
Project # 13 Group members:

Livochka, Anastasiia

Wong, Cassandra

Evans, David

Yalsavar, Maryam

Title: TBD

Description: TBD
--------------------------------------------------------------------
Project # 14 Group Members:

Zeng, Mingde

Lin, Xiaoyu

Fan, Joshua

Rao, Chen Min

Title: Toxic Comment Classification, Kaggle

Description: Using Wikipedia comments labeled for toxicity to train a model that detects toxicity in comments.
--------------------------------------------------------------------
Project # 15 Group Members:

Huang, Yuying

Anugu, Ankitha

Chen, Yushan

Title: Implementation of the classification task between crop and weeds

Description: Our work will be based on the paper ''Crop and Weeds Classification for Precision Agriculture using Context-Independent Pixel-Wise Segmentation''.
--------------------------------------------------------------------
Project # 16 Group Members:

Wang, Lingshan

Li, Yifan

Liu, Ziyi

Title: Implement and Improve CNN in Multi-Class Text Classification

Description: We are going to apply Bidirectional Encoder Representations from Transformers (BERT) to classify real-world data (application to build an efficient case study interview materials classifier) and improve it algorithm-wise in the context of text classification, being supported with real-world data set. With the implementation of BERT, it allows us to further analyze the efficiency and practicality of the algorithm when dealing with imbalanced dataset in the data input level and modelling level.
The dataset is composed of case study HTML files containing case information that can be classified into multiple industry categories. We will implement a multi-class classification to break down the information contained in each case material into some pre-determined subcategories (eg, behavior questions, consulting questions, questions for new business/market entry, etc.). We will attempt to process the complicated data into several data types(e.g. HTML, JSON, pandas data frames, etc.) and choose the most efficient raw data processing logic based on runtime and algorithm optimization.
--------------------------------------------------------------------
Project # 17 Group members:

Malhi, Dilmeet

Joshi, Vansh

Syamala, Aavinash

Islam, Sohan

Title: Kaggle project: PetFinder.my - Pawpularity Contest

Description: In this competition, we will analyze raw images provided by PetFinder.my to predict the “Pawpularity” of pet photos.
--------------------------------------------------------------------

Project # 18 Group members:

Yuwei, Liu

Daniel, Mao

Title: Sartorius - Cell Instance Segmentation (Kaggle) [https://www.kaggle.com/c/sartorius-cell-instance-segmentation]

Description: Detect single neuronal cells in microscopy images

--------------------------------------------------------------------

Project #19 Group members:

Samuel, Senko

Tyler, Verhaar

Zhang, Bowen

Title: NBA Game Prediction

Description: We will build a win/loss classifier for NBA games using player and game data and also incorporating alternative data (ex. sports betting data).

-------------------------------------------------------------------

Project #20 Group members:

Mitrache, Christian

Renggli, Aaron

Saini, Jessica

Mossman, Alexandra

Title: Classification and Deep Learning for Healthcare Provider Fraud Detection Analysis

Description: TBD

--------------------------------------------------------------------

Project # 21 Group members:

Wang, Kun

Title: TBD

Description : TBD

--------------------------------------------------------------------

Project # 22 Group members:

Guray, Egemen

Title: Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network

Description : I will build a prediction system to predict road signs in the German Traffic Sign Dataset using CNN.
--------------------------------------------------------------------

Project # 23 Group members:

Bsodjahi

Title: Modeling Pseudomonas aeruginosa bacteria state through its genes expression activity

Description : Label Pseudomonas aeruginosa gene expression data through unsupervised learning (eg., EM algorithm) and then model the bacterial state as function of its genes expression

F21-STAT 441/841 CM 763-Proposal

2021-12-12T02:25:35Z

Y664huan:

Use this format (Don’t remove Project 0)

Project # 0 Group members:

Last name, First name

Last name, First name

Last name, First name

Last name, First name

Title: Making a String Telephone

Description: We use paper cups to make a string phone and talk with friends while learning about sound waves with this science project. (Explain your project in one or two paragraphs).

--------------------------------------------------------------------
Project # 1 Group members:

Feng, Jared

Huang, Xipeng

Xu, Mingwei

Yu, Tingzhou

Title: Patch-Based Convolutional Neural Network for Cancers Classification

Description: In this project, we consider classifying three classes (tumor types) of cancers based on pathological data. We will follow the paper ''Patch-based Convolutional Neural Network for Whole Slide Tissue Image Classification''.
--------------------------------------------------------------------
Project # 2 Group members:

Anderson, Eric

Wang, Chengzhi

Zhong, Kai

Zhou, Yi Jing

Title: Data Poison Attacks

Description: Attempting to create a successful data poisoning attack

--------------------------------------------------------------------
Project # 3 Group members:

Chopra, Kanika

Rajcoomar, Yush

Bhattacharya, Vaibhav

Title: Cancer Classification

Description: We will be classifying three tumour types based on pathological data.

--------------------------------------------------------------------
Project # 4 Group members:

Li, Shao Zhong

Kerr, Hannah

Wong, Ann Gie

Title: Classification of text

Description: Being to automatically grade answers on tests can save a lot of time and teaching resources. But unlike a multiple-choice format where grading can be automated, the other formats involving text answers is more through in testing knowledge but still requires human evaluation and marking which is a bottleneck of teaching resources and personnel on a large scale with thousands of students. We will use classification techniques and machine learning to develop an automated way to predict the rightness of text answers with good accuracy that can be used by and suppport graders to reduce the time and manual effort needed in the grading process.

--------------------------------------------------------------------
Project # 5 Group members:

Chin, Jessie Man Wai

Ooi, Yi Lin

Shi, Yaqi

Ngew, Shwen Lyng

Title: The Application of Classification in Accelerated Underwriting (Insurance)

Description: Accelerated Underwriting (AUW), also called “express underwriting,” is a faster and easier process for people with good health condition to obtain life insurance. The traditional underwriting process is often painful for both customers and insurers. From the customer's perspective, they have to complete different types of questionnaires and provide different medical tests involving blood, urine, saliva and other medical results. Underwriters on the other hand have to manually go through every single policy to access the risk of each applicant. AUW allows people, who are deemed “healthy” to forgo medical exams. Since COVID-19, it has become a more concerning topic as traditional underwriting cannot be performed due to the stay-at-home order. However, this imposes a burden on the insurance company to better estimate the risk associated with less testing results.

This is where data science kicks in. With different classification methods, we can address the underwriting process’ five pain points: labor, speed, efficiency, pricing and mortality. This allows us to better estimate the risk and classify the clients for whether they are eligible for accelerated underwriting. For the final project, we use the data from one of the leading US insurers to analyze how we can classify our clients for AUW using the method of classification. We will be using factors such as health data, medical history, family history as well as insurance history to determine the eligibility.

--------------------------------------------------------------------
Project # 6 Group members:

Wang, Carolyn

Cyrenne, Ethan

Nguyen, Dieu Hoa

Sin, Mary Jane

Title: Pawpularity (PetFinder Kaggle Competition)

Description: Using images and metadata on the images to predict the popularity of pet photos, which is calculated based on page view statistics and other metrics from the PetFinder website.

--------------------------------------------------------------------
Project # 7 Group members:

Bhattacharya, Vaibhav

Chatoor, Amanda

Prathap Das, Sutej

Title: PetFinder.my - Pawpularity Contest [https://www.kaggle.com/c/petfinder-pawpularity-score/overview]

Description: In this competition, we will analyze raw images and metadata to predict the “Pawpularity” of pet photos. We'll train and test our model on PetFinder.my's thousands of pet profiles.

--------------------------------------------------------------------
Project # 8 Group members:

Yan, Xin

Duan, Yishu

Di, Xibei

Title: The application of classification on company bankruptcy prediction

Description: TBD
--------------------------------------------------------------------
Project # 9 Group members:

Loke, Chun Waan

Chong, Peter

Osmond, Clarice

Li, Zhilong

Title: TBD

Description: TBD

--------------------------------------------------------------------

Project # 10 Group members:

O'Farrell, Ethan

D'Astous, Justin

Hamed, Waqas

Vladusic, Stefan

Title: Pawpularity (Kaggle)

Description: Predicting the popularity of animal photos based on photo metadata
--------------------------------------------------------------------
Project # 11 Group members:

JunBin, Pan

Title: TBD

Description: TBD
--------------------------------------------------------------------
Project # 12 Group members:

Kar Lok, Ng

Muhan (Iris), Li

Wu, Mingze

Title: NFL Health & Safety - Helmet Assignment competition (Kaggle Competition)

Description: Assigning players to the helmet in a given footage of head collision in football play.
--------------------------------------------------------------------
Project # 13 Group members:

Livochka, Anastasiia

Wong, Cassandra

Evans, David

Yalsavar, Maryam

Title: TBD

Description: TBD
--------------------------------------------------------------------
Project # 14 Group Members:

Zeng, Mingde

Lin, Xiaoyu

Fan, Joshua

Rao, Chen Min

Title: Toxic Comment Classification, Kaggle

Description: Using Wikipedia comments labeled for toxicity to train a model that detects toxicity in comments.
--------------------------------------------------------------------
Project # 15 Group Members:

Huang, Yuying

Anugu, Ankitha

Dave, Meet Hemang

Chen, Yushan

Title: Implementation of the classification task between crop and weeds

Description: Our work will be based on the paper Crop and Weeds Classification for Precision Agriculture using Context-Independent Pixel-Wise Segmentation
--------------------------------------------------------------------
Project # 16 Group Members:

Wang, Lingshan

Li, Yifan

Liu, Ziyi

Title: Implement and Improve CNN in Multi-Class Text Classification

Description: We are going to apply Bidirectional Encoder Representations from Transformers (BERT) to classify real-world data (application to build an efficient case study interview materials classifier) and improve it algorithm-wise in the context of text classification, being supported with real-world data set. With the implementation of BERT, it allows us to further analyze the efficiency and practicality of the algorithm when dealing with imbalanced dataset in the data input level and modelling level.
The dataset is composed of case study HTML files containing case information that can be classified into multiple industry categories. We will implement a multi-class classification to break down the information contained in each case material into some pre-determined subcategories (eg, behavior questions, consulting questions, questions for new business/market entry, etc.). We will attempt to process the complicated data into several data types(e.g. HTML, JSON, pandas data frames, etc.) and choose the most efficient raw data processing logic based on runtime and algorithm optimization.
--------------------------------------------------------------------
Project # 17 Group members:

Malhi, Dilmeet

Joshi, Vansh

Syamala, Aavinash

Islam, Sohan

Title: Kaggle project: PetFinder.my - Pawpularity Contest

Description: In this competition, we will analyze raw images provided by PetFinder.my to predict the “Pawpularity” of pet photos.
--------------------------------------------------------------------

Project # 18 Group members:

Yuwei, Liu

Daniel, Mao

Title: Sartorius - Cell Instance Segmentation (Kaggle) [https://www.kaggle.com/c/sartorius-cell-instance-segmentation]

Description: Detect single neuronal cells in microscopy images

--------------------------------------------------------------------

Project #19 Group members:

Samuel, Senko

Tyler, Verhaar

Zhang, Bowen

Title: NBA Game Prediction

Description: We will build a win/loss classifier for NBA games using player and game data and also incorporating alternative data (ex. sports betting data).

-------------------------------------------------------------------

Project #20 Group members:

Mitrache, Christian

Renggli, Aaron

Saini, Jessica

Mossman, Alexandra

Title: Classification and Deep Learning for Healthcare Provider Fraud Detection Analysis

Description: TBD

--------------------------------------------------------------------

Project # 21 Group members:

Wang, Kun

Title: TBD

Description : TBD

--------------------------------------------------------------------

Project # 22 Group members:

Guray, Egemen

Title: Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network

Description : I will build a prediction system to predict road signs in the German Traffic Sign Dataset using CNN.
--------------------------------------------------------------------

Project # 23 Group members:

Bsodjahi

Title: Modeling Pseudomonas aeruginosa bacteria state through its genes expression activity

Description : Label Pseudomonas aeruginosa gene expression data through unsupervised learning (eg., EM algorithm) and then model the bacterial state as function of its genes expression

A Game Theoretic Approach to Class-wise Selective Rationalization

2021-11-25T17:14:27Z

Y664huan: /* Experiments */

== Presented by ==
Yushan Chen, Yuying Huang, Ankitha Anugu

== Introduction ==
The selection of input features can be optimised for trained models or can be directly incorporated into methods. But an overall selection does not properly capture the useful rationales. So, a new game theoretic approach to class-dependent rationalization is introduced where the method is specifically trained to highlight evidence supporting alternative conclusions. Each class consists of three players who compete to find evidence for both factual and counterfactual circumstances. In a simple example, we explain how the game drives the solution towards relevant class-dependent rationales theoretically.

== Previous Work ==

There are two directions of research on generating interpretable features of neural networks. The first is to include the interpretations directly in the models, often known as self-explaining models [6, 7, 8, 9]. The alternative option is to generate interpretations in post-hoc manner. There are also few research works attempting to increase the fidelity of post hoc explanations by including the explanation mechanism into the training approach [10, 11]. Although none of these works can perform class-wise rationalisation, gradient based methods can be intuitively modified for this purpose, generating explanations for a specific class by probing the importance with regard to the relevant class logit.

== Motivation ==

Extending how rationales are defined and calculated is one of the primary questions motivating this research. To date, the typical approach has been to choose an overall feature subset that best explains the output/decision. The greatest mutual information criterion [12, 13], for example, selects an overall subset of features so that mutual information between the feature subset and the target output decision is maximised, or the entropy of the target output decision conditional on this subset is minimised. Rationales, on the other hand, can be multi-faceted, involving support for a variety of outcomes in varying degrees. Existing rationale algorithms have one limitation. They only look for rationales that support the label class. So, we propose CAR algorithm to find rationales of any given class.

== How does CAR work intuitively? ==
Suppose X is a random vector representing a string of text, Y represents the class that X is in, <math>Y\in \mathbb Y=\{0,1\} .</math>
Z(t) provides evidence supporting class <math>t\in \{0,1\}.</math>
<math>g_t^f(X), g_t^c(X), t\in \{0,1\}</math> are factual rationale generators, and <math>d_t(Z), t\in \{0,1\}</math> are discriminators, f and c are short for "factual" and "counterfactual", respectively.

'''Discriminator''': In our adversarial game, <math>d_0(\cdot)</math> takes a rationale Z generated by either <math>g_0^f(\cdot)</math> or <math>g_0^c(\cdot)</math>
as input, and outputs the probability that Z is generated by the factual generator <math>g_0^f(\cdot)</math>. The training
target for <math>d_0(\cdot) </math> is similar to the generative adversarial network (GAN) [14]:

<math> d(\cdot)=\underset{d(\cdot)}{argmin} -p_Y(0)\mathrm{E}[\log d(g_0^f(X))|Y=0]-p_Y(1)\mathrm{E}[\log(1- d(g_0^c(X)))|Y=1] </math>

'''Generators''': The factual generator <math>g_0^f(\cdot)</math> is trained to generate rationales from text labeled Y = 0. The counterfactual generator <math>g_0^c(\cdot)</math>, in contrast, learns from text labeled Y = 1. Both generators try to convince the discriminator that they are factual generators for Y = 0.

<math> g_0^f(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_0(d_0(g(X)))|Y=0] </math>, and <math>g_0^c(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_1(d_0(g(X)))|Y=1] </math>,

s.t. <math>g_0^f(\cdot)</math> and <math>g_0^c(\cdot)</math> satisfy some sparsity and continuity constraints.

In the above formula, <math>h_0(x), h_1(x)</math> are both monotonically-increasing functions. <math>h_0(x)=h_1(x)=x</math>. The goal of the counterfactual generator is to fool the discriminator. Therefore, its optimal strategy is to match the counterfactual rationale distribution with the factual rationale distribution. The goal of the factual generator is to help the discriminator. Therefore, its optimal strategy, given the optimized counterfactual generator, is to “steer” the factual rationale distribution away from the counterfactual rationale distribution. Both the generator and the discriminator eventually achieve an optimal balance where the class-wise rationales can be successfully assigned.

[[File:CAR_framework.png]]

Figure 1: CAR training and inference procedures of the class-0 case. (a) The training procedure. (b) During
inference, there is no ground truth label. In this case, we will always trigger the factual generators.

== Experiments ==
The authors evaluated factual and counterfactual rational generation in both single- and multi-aspect classification tasks. The method was tested on the following three binary classification datasets: Amazon reviews (single-aspect) [1], Beer reviews (multi-aspect) [2], and Hotel reviews (multi-aspect) [3]. Note that the Amazon reviews contain both positive and negative sentiments. Beer reviews and hotel reviews contain only factual annotations.

The CAR model is compared with two existing methods, RNP and Post-exp.

- '''RNP''': this is a framework for rationalizing neural prediction proposed by Lei ''et al.'' [4]. It combines two modular components, a generator, and a predictor. RNP is only able to generate factual rationales.

- '''POST-EXP''': Post-explanation approach contains two generators and one predictor. Given the pre-trained predictor, one generator is trained to generate positive rationales, and the other generator is trained to generate negative rationales.

To make reasonable comparisons among the three methods, the predictors are of the same architecture, so as the generators. Their sparsity and continuity constraints are also of the same form.

Two types are experiments are set: objective evaluation and subjective evaluation.

In the objective evaluation, rationales are generated using the three models, and are compared with human annotations. Precision, recall and F1 score are reported to evaluate the performance of the models. (Note that the algorithms are conditioned on a similar actual sparsity level in factual rationales.)

In the subjective evaluations, people were asked to choose a sentiment (positive or negative) based on the given rationales. In the single-aspect case, a success is credited when the person correctly guesses the ground-truth sentiment given the factual rationales, or the person is convinced to choose the opposite sentiment to the ground-truth given the counterfactual rationale. In the multi-aspect case, the person also needs to guess the aspect correctly.

The results are as below.

[[File:Picture1.png]]

Table 1: Objective performances on the Amazon review dataset. The numbers in each column represent precision / recall / F1 score. [5]

[[File:Picture2.png]]

Table 2: Objective performances on the beer review and hotel review datasets. (Columns: P - precision / R - recall / F1 - F1 score.) [5]

[[File:Picture3.png]]

Figure 2: Subjective performances summary. [5]

== Conclusions ==
The authors evaluated the method in single- and multi-aspect sentiment classification tasks. The result tables above show that the proposed method is able to identify both factual (justifying the ground truth label) and counterfactual (countering the ground truth label) rationales consistent with human rationalization. When comparing with other existing methods (RNP and POST-EXP), CAR achieves better accuracy in finding factual rationales. For the counterfactual case, although the CAR method has lower recall and F1 score, human evaluators still favor the CAR generated counterfactual rationales. The subjective experiment result shows that CAR is able to correctly convince the subjects with factual rationales and fool the subjects with counterfactual rationales.

== Critiques ==
Although the CAR method performs better than the two other methods in most cases, it still has many limitations. In the case where the reviews have annotated ground truth containing mixed sentiments (aroma aspect in beer reviews), CAR often has low recalls. Also, CAR sometimes selects irrelevant aspect words with the desired sentiment to fulfill the sparsity constraint. This reduces the precision. When the reviews are very short without a mix of sentiments (cleanliness aspect in hotel reviews), it is hard for CAR to generate counterfactual rationales to trick a human.

-

== References ==
[1] John Blitzer, Mark Dredze, and Fernando Pereira. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th annual meeting of the association of computational linguistics, pages 440–447, 2007.

[2] Julian McAuley, Jure Leskovec, and Dan Jurafsky. Learning attitudes and attributes from multi-aspect reviews. In 2012 IEEE 12th International Conference on Data Mining, pages 1020–1025. IEEE, 2012.

[3] Hongning Wang, Yue Lu, and Chengxiang Zhai. Latent aspect rating analysis on review text data: a rating regression approach. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 783–792. ACm, 2010.

[4] Tao Lei, Regina Barzilay, and Tommi Jaakkola. Rationalizing neural predictions. arXiv preprint arXiv:1606.04155, 2016

[5] Shiyu Chang, Yang Zhang, Mo Yu, and Tommi Jaakkola. A game theoretic approach to class-wise selective rationalization. arXiv: 1910.12853, 2019

[6] David Alvarez-Melis and Tommi S Jaakkola. Towards robust interpretability with self-explaining neural networks. arXiv preprint arXiv:1806.07538, 2018.

[7] Jacob Andreas, Marcus Rohrbach, Trevor Darrell, and Dan Klein. Learning to compose neural networks for question answering. arXiv preprint arXiv:1601.01705, 2016.

[8] Jacob Andreas, Marcus Rohrbach, Trevor Darrell, and Dan Klein. Neural module networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 39–48, 2016.

[9] Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C Lawrence Zitnick, and Ross Girshick. Inferring and executing programs for visual reasoning. In Proceedings of the IEEE International Conference on Computer Vision, pages 2989–2998, 2017.

[10] Guang-He Lee, David Alvarez-Melis, and Tommi S Jaakkola. Towards robust, locally linear deep networks. arXiv preprint arXiv:1907.03207, 2019.

[11] Guang-He Lee, Wengong Jin, David Alvarez-Melis, and Tommi S Jaakkola. Functional transparency for structured data: a game-theoretic approach. arXiv preprint arXiv:1902.09737, 2019.

[12] Jianbo Chen, Le Song, Martin J Wainwright, and Michael I Jordan. Learning to explain: An information theoretic perspective on model interpretation. arXiv preprint arXiv:1802.07814, 2018.

[13] Tao Lei, Regina Barzilay, and Tommi Jaakkola. Rationalizing neural predictions. arXiv preprint arXiv:1606.04155, 2016.

[14] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. Advances in neural information processing systems. 2014;27.

stat441F21

2021-11-25T15:51:13Z

Y664huan: /* Paper presentation */

== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==



=Paper presentation=
{| class="wikitable"

{| border="1" cellpadding="3"
|-
|width="60pt"|Date
|width="250pt"|Name
|width="15pt"|Paper number
|width="700pt"|Title
|width="15pt"|Link to the paper
|width="30pt"|Link to the summary
|width="30pt"|Link to the video
|-
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]
|-
|Week of Nov 16 || Ali Ghodsi || || || || ||
|-
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||
|-
|Week of Nov 29 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||
|-
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://proceedings.neurips.cc/paper/2021/file/652cf38361a209088302ba2b8b7f51e0-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||
|-
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||
|-
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||
|-
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||
|-
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || || || ||
|-
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || || || ||
|-
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Double_Descent_Where_Bigger_Models_and_More_Data_Hurt Summary] ||
|-
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || XGBoost: A Scalable Tree Boosting System || [https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf Paper] || ||
|-
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||
|-
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||
|-
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||
|-
|-
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || || || ||
|-
|Week of Nov 29 ||Kun Wang || || Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases|| [https://doi-org.proxy.lib.uwaterloo.ca/10.1111/exsy.12705 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Convolutional_neural_network_for_diagnosis_of_viral_pneumonia_and_COVID-19_alike_diseases Summary] ||
|-
|Week of Nov 29 ||Egemen Guray || || Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network || [https://www.researchgate.net/publication/344399165_Traffic_Sign_Recognition_System_TSRS_SVM_and_Convolutional_Neural_Network Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Traffic_Sign_Recognition_System_(TSRS):_SVM_and_Convolutional_Neural_Network Summary] ||
|-
|Week of Nov 29 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease || https://www.mdpi.com/2076-3425/11/2/150/pdf || ||
|-
|Week of Nov 29 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||
|-
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || A Game Theoretic Approach to Class-wise Selective Rationalization || [https://arxiv.org/pdf/1910.12853.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=A_Game_Theoretic_Approach_to_Class-wise_Selective_Rationalization#How_does_CAR_work_intuitively Summary]||
|-
|Week of Nov 29 ||Aavinash Syamala, Dilmeet Malhi, Sohan Islam, Vansh Joshi || || Research on Multiple Classification Based on Improved SVM Algorithm for Balanced Binary Decision Tree || [https://www.hindawi.com/journals/sp/2021/5560465/ Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Research_on_Multiple_Classification_Based_on_Improved_SVM_Algorithm_for_Balanced_Binary_Decision_Tree Summary]||
|-
|Week of Nov 29 ||Christian Mitrache, Alexandra Mossman, Jessica Saini, Aaron Renggli|| || U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging|| [https://proceedings.neurips.cc/paper/2019/file/57bafb2c2dfeefba931bb03a835b1fa9-Paper.pdf?fbclid=IwAR1dZpx9vU1pSPTSm_nwk6uBU7TYJ2HNTrsqjaH-9ZycE_PFpFjJoHg1zhQ]||

A Game Theoretic Approach to Class-wise Selective Rationalization

2021-11-25T15:48:11Z

Y664huan: /* Model Architecture */

== Presented by ==
Yushan Chen, Yuying Huang, Ankitha Anugu

== Introduction ==
The selection of input features can be optimised for trained models or can be directly incorporated into methods. But an overall selection does not properly capture the useful rationales. So, a new game theoretic approach to class-dependent rationalization is introduced where the method is specifically trained to highlight evidence supporting alternative conclusions. Each class consists of three players who compete to find evidence for both factual and counterfactual circumstances. In a simple example, we explain how the game drives the solution towards relevant class-dependent rationales theoretically.

== Previous Work ==

There are two directions of research on generating interpretable features of neural networks. The first is to include the interpretations directly in the models, often known as self-explaining models. The alternative option is to generate interpretations in post-hoc manner. There are also few research works attempting to increase the fidelity of post hoc explanations by including the explanation mechanism into the training approach. Although none of these works can perform class-wise rationalisation, gradient based methods can be intuitively modified for this purpose, generating explanations for a specific class by probing the importance with regard to the relevant class logit.

== Motivation ==

Extending how rationales are defined and calculated is one of the primary questions motivating this research. To date, the typical approach has been to choose an overall feature subset that best explains the output/decision. The greatest mutual information criterion, for example, selects an overall subset of features so that mutual information between the feature subset and the target output decision is maximised, or the entropy of the target output decision conditional on this subset is minimised. Rationales, on the other hand, can be multi-faceted, involving support for a variety of outcomes in varying degrees. Existing rationale algorithms have one limitation. They only look for rationales that support the label class. So, we propose CAR algorithm to find rationales of any given class.

== How does CAR work intuitively? ==
Suppose X is a random vector representing a string of text, Y represents the class that X is in, <math>Y\in \mathbb Y=\{0,1\} .</math>
Z(t) provides evidence supporting class <math>t\in \{0,1\}.</math>
<math>g_t^f(X), g_t^c(X), t\in \{0,1\}</math> are factual rationale generators, and <math>d_t(Z), t\in \{0,1\}</math> are discriminators, f and c are short for "factual" and "counterfactual", respectively.

'''Discriminator''': In our adversarial game, <math>d_0(\cdot)</math> takes a rationale Z generated by either <math>g_0^f(\cdot)</math> or <math>g_0^c(\cdot)</math>
as input, and outputs the probability that Z is generated by the factual generator <math>g_0^f(\cdot)</math>. The training
target for <math>d_0(\cdot) </math> is similar to the generative adversarial network (GAN) [6]:

<math> d(\cdot)=\underset{d(\cdot)}{argmin} -p_Y(0)\mathrm{E}[\log d(g_0^f(X))|Y=0]-p_Y(1)\mathrm{E}[\log(1- d(g_0^c(X)))|Y=1] </math>

'''Generators''': The factual generator <math>g_0^f(\cdot)</math> is trained to generate rationales from text labeled Y = 0. The counterfactual generator <math>g_0^c(\cdot)</math>, in contrast, learns from text labeled Y = 1. Both generators try to convince the discriminator that they are factual generators for Y = 0.

<math> g_0^f(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_0(d_0(g(X)))|Y=0] </math>, and <math>g_0^c(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_1(d_0(g(X)))|Y=1] </math>,

s.t. <math>g_0^f(\cdot)</math> and <math>g_0^c(\cdot)</math> satisfy some sparsity and continuity constraints.

In the above formula, <math>h_0(x), h_1(x)</math> are both monotonically-increasing functions. <math>h_0(x)=h_1(x)=x</math>. The goal of the counterfactual generator is to fool the discriminator. Therefore, its optimal strategy is to match the counterfactual rationale distribution with the factual rationale distribution. The goal of the factual generator is to help the discriminator. Therefore, its optimal strategy, given the optimized counterfactual generator, is to “steer” the factual rationale distribution away from the counterfactual rationale distribution. Both the generator and the discriminator eventually achieve an optimal balance where the class-wise rationales can be successfully assigned.

[[File:CAR_framework.png]]

Figure 1: CAR training and inference procedures of the class-0 case. (a) The training procedure. (b) During
inference, there is no ground truth label. In this case, we will always trigger the factual generators.

== Experiments ==
The authors evaluated factual and counterfactual rational generation in both single- and multi-aspect classification tasks. The method was tested on the following three binary classification datasets: Amazon reviews (single-aspect) [1], Beer reviews (multi-aspect) [2], and Hotel reviews (multi-aspect) [3]. Note that the Amazon reviews contain both positive and negative sentiments. Beer reviews and hotel reviews contain only factual annotations.

The CAR model is compared with two existing methods, RNP and Post-exp.

- '''RNP''': this is a framework for rationalizing neural prediction proposed by Lei ''et al.'' [4]. It combines two modular components, a generator, and a predictor. RNP is only able to generate factual rationales.

- '''POST-EXP''': Post-explanation approach contains two generators and one predictor. Given the pre-trained predictor, one generator is trained to generate positive rationales, and the other generator is trained to generate negative rationales.

To make reasonable comparisons among the three methods, the predictors are of the same architecture, so as the generators. Their sparsity and continuity constraints are also of the same form.

Two types are experiments are set: objective evaluation and subjective evaluation.

In the objective evaluation, rationales are generated using the three models, and are compared with human annotations. Precision, recall and F1 score are reported to evaluate the performance of the models. (Note that the algorithms are conditioned on a similar actual sparsity level in factual rationales.)

In the subjective evaluations, people were asked to choose a sentiment (positive or negative) based on the given rationales. In the single-aspect case, a success is credited when the person correctly guesses the ground-truth sentiment given the factual rationales, or the person is convinced to choose the opposite sentiment to the ground-truth given the counterfactual rationale. In the multi-aspect case, the person also needs to guess the aspect correctly.

The results are as below.

[[File:Picture1.png]]

Table 1: Objective performances on the Amazon review dataset. The numbers in each column represent precision / recall / F1 score. [5]

[[File:Picture2.png]]

Table 2: Objective performances on the beer review and hotel review datasets. (Columns: P - precision / R - recall / F1 - F1 score.) [5]

[[File:Picture3.png]]

Figure 3: Subjective performances summary. [5]

== Conclusions ==
The authors evaluated the method in single- and multi-aspect sentiment classification tasks. The result tables above show that the proposed method is able to identify both factual (justifying the ground truth label) and counterfactual (countering the ground truth label) rationales consistent with human rationalization. When comparing with other existing methods (RNP and POST-EXP), CAR achieves better accuracy in finding factual rationales. For the counterfactual case, although the CAR method has lower recall and F1 score, human evaluators still favor the CAR generated counterfactual rationales. The subjective experiment result shows that CAR is able to correctly convince the subjects with factual rationales and fool the subjects with counterfactual rationales.

== Critiques ==
Although the CAR method performs better than the two other methods in most cases, it still has many limitations. In the case where the reviews have annotated ground truth containing mixed sentiments (aroma aspect in beer reviews), CAR often has low recalls. Also, CAR sometimes selects irrelevant aspect words with the desired sentiment to fulfill the sparsity constraint. This reduces the precision. When the reviews are very short without a mix of sentiments (cleanliness aspect in hotel reviews), it is hard for CAR to generate counterfactual rationales to trick a human.

-

== References ==
[1] John Blitzer, Mark Dredze, and Fernando Pereira. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th annual meeting of the association of computational linguistics, pages 440–447, 2007.

[2] Julian McAuley, Jure Leskovec, and Dan Jurafsky. Learning attitudes and attributes from multi-aspect reviews. In 2012 IEEE 12th International Conference on Data Mining, pages 1020–1025. IEEE, 2012.

[3] Hongning Wang, Yue Lu, and Chengxiang Zhai. Latent aspect rating analysis on review text data: a rating regression approach. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 783–792. ACm, 2010.

[4] Tao Lei, Regina Barzilay, and Tommi Jaakkola. Rationalizing neural predictions. arXiv preprint arXiv:1606.04155, 2016

[5] Shiyu Chang, Yang Zhang, Mo Yu, and Tommi Jaakkola. A game theoretic approach to class-wise selective rationalization. arXiv: 1910.12853, 2019

[6] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. Advances in neural information processing systems. 2014;27.

A Game Theoretic Approach to Class-wise Selective Rationalization

2021-11-25T15:44:58Z

Y664huan: /* Model Architecture */

== Presented by ==
Yushan Chen, Yuying Huang, Ankitha Anugu

== Introduction ==
The selection of input features can be optimised for trained models or can be directly incorporated into methods. But an overall selection does not properly capture the useful rationales. So, a new game theoretic approach to class-dependent rationalization is introduced where the method is specifically trained to highlight evidence supporting alternative conclusions. Each class consists of three players who compete to find evidence for both factual and counterfactual circumstances. In a simple example, we explain how the game drives the solution towards relevant class-dependent rationales theoretically.

== Previous Work ==

There are two directions of research on generating interpretable features of neural networks. The first is to include the interpretations directly in the models, often known as self-explaining models. The alternative option is to generate interpretations in post-hoc manner. There are also few research works attempting to increase the fidelity of post hoc explanations by including the explanation mechanism into the training approach. Although none of these works can perform class-wise rationalisation, gradient based methods can be intuitively modified for this purpose, generating explanations for a specific class by probing the importance with regard to the relevant class logit.

== Motivation ==

Extending how rationales are defined and calculated is one of the primary questions motivating this research. To date, the typical approach has been to choose an overall feature subset that best explains the output/decision. The greatest mutual information criterion, for example, selects an overall subset of features so that mutual information between the feature subset and the target output decision is maximised, or the entropy of the target output decision conditional on this subset is minimised. Rationales, on the other hand, can be multi-faceted, involving support for a variety of outcomes in varying degrees. Existing rationale algorithms have one limitation. They only look for rationales that support the label class. So, we propose CAR algorithm to find rationales of any given class.

== Model Architecture ==
Suppose X is a random vector representing a string of text, Y represents the class that X is in, <math>Y\in \mathbb Y=\{0,1\} .</math>
Z(t) provides evidence supporting class <math>t\in \{0,1\} .</math>
<math>g_t^f(X), g_t^c(X), t\in \{0,1\}</math> are factual rationale generators, and <math>d_t(Z), t\in \{0,1\}</math> are discriminators, f and c are short for "factual" and "counterfactual", respectively.

'''Discriminator''': In our adversarial game, <math>d_0(\cdot)</math> takes a rationale Z generated by either <math>g_0^f(\cdot)</math> or <math>g_0^c(\cdot)</math>
as input, and outputs the probability that Z is generated by the factual generator <math>g_0^f(\cdot)</math>. The training
target for <math>d_0(\cdot) </math> is similar to the generative adversarial network (GAN) [6]:

<math> d(\cdot)=\underset{d(\cdot)}{argmin} -p_Y(0)\mathrm{E}[\log d(g_0^f(X))|Y=0]-p_Y(1)\mathrm{E}[\log(1- d(g_0^c(X)))|Y=1] </math>

'''Generators''': The factual generator <math>g_0^f(\cdot)</math> is trained to generate rationales from text labeled Y = 0. The counterfactual generator <math>g_0^c(\cdot)</math>, in contrast, learns from text labeled Y = 1. Both generators try to convince the discriminator that they are factual generators for Y = 0.

<math> g_0^f(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_0(d_0(g(X)))|Y=0] </math>, and <math>g_0^c(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_1(d_0(g(X)))|Y=1] </math>,

s.t. <math>g_0^f(\cdot)</math> and <math>g_0^c(\cdot)</math> satisfy some sparsity and continuity constraints.

The goal of the counterfactual generator is to fool the discriminator. Therefore, its optimal strategy is to match the counterfactual rationale distribution with the factual
rationale distribution. The goal of the factual generator is to help the discriminator. Therefore, its optimal strategy, given the optimized counterfactual generator, is to
“steer” the factual rationale distribution away from the counterfactual rationale distribution. Both the generator and the discriminator eventually achieve an optimal balance where the class-wise rationales can be successfully assigned.

[[File:CAR_framework.png]]

Figure 1: CAR training and inference procedures of the class-0 case. (a) The training procedure. (b) During
inference, there is no ground truth label. In this case, we will always trigger the factual generators.

== Experiments ==
The authors evaluated factual and counterfactual rational generation in both single- and multi-aspect classification tasks. The method was tested on the following three binary classification datasets: Amazon reviews (single-aspect) [1], Beer reviews (multi-aspect) [2], and Hotel reviews (multi-aspect) [3]. Note that the Amazon reviews contain both positive and negative sentiments. Beer reviews and hotel reviews contain only factual annotations.

The CAR model is compared with two existing methods, RNP and Post-exp.

- '''RNP''': this is a framework for rationalizing neural prediction proposed by Lei ''et al.'' [4]. It combines two modular components, a generator, and a predictor. RNP is only able to generate factual rationales.

- '''POST-EXP''': Post-explanation approach contains two generators and one predictor. Given the pre-trained predictor, one generator is trained to generate positive rationales, and the other generator is trained to generate negative rationales.

To make reasonable comparisons among the three methods, the predictors are of the same architecture, so as the generators. Their sparsity and continuity constraints are also of the same form.

Two types are experiments are set: objective evaluation and subjective evaluation.

In the objective evaluation, rationales are generated using the three models, and are compared with human annotations. Precision, recall and F1 score are reported to evaluate the performance of the models. (Note that the algorithms are conditioned on a similar actual sparsity level in factual rationales.)

In the subjective evaluations, people were asked to choose a sentiment (positive or negative) based on the given rationales. In the single-aspect case, a success is credited when the person correctly guesses the ground-truth sentiment given the factual rationales, or the person is convinced to choose the opposite sentiment to the ground-truth given the counterfactual rationale. In the multi-aspect case, the person also needs to guess the aspect correctly.

The results are as below.

[[File:Picture1.png]]

Table 1: Objective performances on the Amazon review dataset. The numbers in each column represent precision / recall / F1 score. [5]

[[File:Picture2.png]]

Table 2: Objective performances on the beer review and hotel review datasets. (Columns: P - precision / R - recall / F1 - F1 score.) [5]

[[File:Picture3.png]]

Figure 3: Subjective performances summary. [5]

== Conclusions ==
The authors evaluated the method in single- and multi-aspect sentiment classification tasks. The result tables above show that the proposed method is able to identify both factual (justifying the ground truth label) and counterfactual (countering the ground truth label) rationales consistent with human rationalization. When comparing with other existing methods (RNP and POST-EXP), CAR achieves better accuracy in finding factual rationales. For the counterfactual case, although the CAR method has lower recall and F1 score, human evaluators still favor the CAR generated counterfactual rationales. The subjective experiment result shows that CAR is able to correctly convince the subjects with factual rationales and fool the subjects with counterfactual rationales.

== Critiques ==
Although the CAR method performs better than the two other methods in most cases, it still has many limitations. In the case where the reviews have annotated ground truth containing mixed sentiments (aroma aspect in beer reviews), CAR often has low recalls. Also, CAR sometimes selects irrelevant aspect words with the desired sentiment to fulfill the sparsity constraint. This reduces the precision. When the reviews are very short without a mix of sentiments (cleanliness aspect in hotel reviews), it is hard for CAR to generate counterfactual rationales to trick a human.

-

== References ==
[1] John Blitzer, Mark Dredze, and Fernando Pereira. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th annual meeting of the association of computational linguistics, pages 440–447, 2007.

[2] Julian McAuley, Jure Leskovec, and Dan Jurafsky. Learning attitudes and attributes from multi-aspect reviews. In 2012 IEEE 12th International Conference on Data Mining, pages 1020–1025. IEEE, 2012.

[3] Hongning Wang, Yue Lu, and Chengxiang Zhai. Latent aspect rating analysis on review text data: a rating regression approach. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 783–792. ACm, 2010.

[4] Tao Lei, Regina Barzilay, and Tommi Jaakkola. Rationalizing neural predictions. arXiv preprint arXiv:1606.04155, 2016

[5] Shiyu Chang, Yang Zhang, Mo Yu, and Tommi Jaakkola. A game theoretic approach to class-wise selective rationalization. arXiv: 1910.12853, 2019

[6] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. Advances in neural information processing systems. 2014;27.

A Game Theoretic Approach to Class-wise Selective Rationalization

2021-11-25T15:40:08Z

Y664huan: /* Model Architecture */

== Presented by ==
Yushan Chen, Yuying Huang, Ankitha Anugu

== Introduction ==
The selection of input features can be optimised for trained models or can be directly incorporated into methods. But an overall selection does not properly capture the useful rationales. So, a new game theoretic approach to class-dependent rationalization is introduced where the method is specifically trained to highlight evidence supporting alternative conclusions. Each class consists of three players who compete to find evidence for both factual and counterfactual circumstances. In a simple example, we explain how the game drives the solution towards relevant class-dependent rationales theoretically.

== Previous Work ==

There are two directions of research on generating interpretable features of neural networks. The first is to include the interpretations directly in the models, often known as self-explaining models. The alternative option is to generate interpretations in post-hoc manner. There are also few research works attempting to increase the fidelity of post hoc explanations by including the explanation mechanism into the training approach. Although none of these works can perform class-wise rationalisation, gradient based methods can be intuitively modified for this purpose, generating explanations for a specific class by probing the importance with regard to the relevant class logit.

== Motivation ==

Extending how rationales are defined and calculated is one of the primary questions motivating this research. To date, the typical approach has been to choose an overall feature subset that best explains the output/decision. The greatest mutual information criterion, for example, selects an overall subset of features so that mutual information between the feature subset and the target output decision is maximised, or the entropy of the target output decision conditional on this subset is minimised. Rationales, on the other hand, can be multi-faceted, involving support for a variety of outcomes in varying degrees. Existing rationale algorithms have one limitation. They only look for rationales that support the label class. So, we propose CAR algorithm to find rationales of any given class.

== Model Architecture ==
Suppose X is a random vector representing a string of text, Y represents the class that X is in, <math>Y\in \mathbb Y=\{0,1\} .</math>
Z(t) provides evidence supporting class <math>t\in \{0,1\} .</math>
<math>g_t^f(X), g_t^c(X), t\in \{0,1\}</math> are factual rationale generators, and <math>d_t(Z), t\in \{0,1\}</math> are discriminators.

'''Discriminator''': In our adversarial game, <math>d_0(\cdot)</math> takes a rationale Z generated by either <math>g_0^f(\cdot)</math> or <math>g_0^c(\cdot)</math>
as input, and outputs the probability that Z is generated by the factual generator <math>g_0^f(\cdot)</math>. The training
target for <math>d_0(\cdot) </math> is similar to the generative adversarial network (GAN) [6]:

<math> d(\cdot)=\underset{d(\cdot)}{argmin} -p_Y(0)\mathrm{E}[\log d(g_0^f(X))|Y=0]-p_Y(1)\mathrm{E}[\log(1- d(g_0^c(X)))|Y=1] </math>

'''Generators''': The factual generator <math>g_0^f(\cdot)</math> is trained to generate rationales from text labeled Y = 0. The counterfactual generator <math>g_0^c(\cdot)</math>, in contrast, learns from text labeled Y = 1. Both generators try to convince the discriminator that they are factual generators for Y = 0.

<math> g_0^f(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_0(d_0(g(X)))|Y=0] </math>, and <math>g_0^c(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_1(d_0(g(X)))|Y=1] </math>,

s.t. <math>g_0^f(\cdot)</math> and <math>g_0^c(\cdot)</math> satisfy some sparsity and continuity constraints.

The goal of the counterfactual generator is to fool the discriminator. Therefore, its optimal strategy is to match the counterfactual rationale distribution with the factual
rationale distribution. The goal of the factual generator is to help the discriminator. Therefore, its optimal strategy, given the optimized counterfactual generator, is to
“steer” the factual rationale distribution away from the counterfactual rationale distribution.

[[File:CAR_framework.png]]

Figure 1: CAR training and inference procedures of the class-0 case. (a) The training procedure. (b) During
inference, there is no ground truth label. In this case, we will always trigger the factual generators.

== Experiments ==
The authors evaluated factual and counterfactual rational generation in both single- and multi-aspect classification tasks. The method was tested on the following three binary classification datasets: Amazon reviews (single-aspect) [1], Beer reviews (multi-aspect) [2], and Hotel reviews (multi-aspect) [3]. Note that the Amazon reviews contain both positive and negative sentiments. Beer reviews and hotel reviews contain only factual annotations.

The CAR model is compared with two existing methods, RNP and Post-exp.

- '''RNP''': this is a framework for rationalizing neural prediction proposed by Lei ''et al.'' [4]. It combines two modular components, a generator, and a predictor. RNP is only able to generate factual rationales.

- '''POST-EXP''': Post-explanation approach contains two generators and one predictor. Given the pre-trained predictor, one generator is trained to generate positive rationales, and the other generator is trained to generate negative rationales.

To make reasonable comparisons among the three methods, the predictors are of the same architecture, so as the generators. Their sparsity and continuity constraints are also of the same form.

Two types are experiments are set: objective evaluation and subjective evaluation.

In the objective evaluation, rationales are generated using the three models, and are compared with human annotations. Precision, recall and F1 score are reported to evaluate the performance of the models. (Note that the algorithms are conditioned on a similar actual sparsity level in factual rationales.)

In the subjective evaluations, people were asked to choose a sentiment (positive or negative) based on the given rationales. In the single-aspect case, a success is credited when the person correctly guesses the ground-truth sentiment given the factual rationales, or the person is convinced to choose the opposite sentiment to the ground-truth given the counterfactual rationale. In the multi-aspect case, the person also needs to guess the aspect correctly.

The results are as below.

[[File:Picture1.png]]

Table 1: Objective performances on the Amazon review dataset. The numbers in each column represent precision / recall / F1 score. [5]

[[File:Picture2.png]]

Table 2: Objective performances on the beer review and hotel review datasets. (Columns: P - precision / R - recall / F1 - F1 score.) [5]

[[File:Picture3.png]]

Figure 3: Subjective performances summary. [5]

== Conclusions ==
The authors evaluated the method in single- and multi-aspect sentiment classification tasks. The result tables above show that the proposed method is able to identify both factual (justifying the ground truth label) and counterfactual (countering the ground truth label) rationales consistent with human rationalization. When comparing with other existing methods (RNP and POST-EXP), CAR achieves better accuracy in finding factual rationales. For the counterfactual case, although the CAR method has lower recall and F1 score, human evaluators still favor the CAR generated counterfactual rationales. The subjective experiment result shows that CAR is able to correctly convince the subjects with factual rationales and fool the subjects with counterfactual rationales.

== Critiques ==
Although the CAR method performs better than the two other methods in most cases, it still has many limitations. In the case where the reviews have annotated ground truth containing mixed sentiments (aroma aspect in beer reviews), CAR often has low recalls. Also, CAR sometimes selects irrelevant aspect words with the desired sentiment to fulfill the sparsity constraint. This reduces the precision. When the reviews are very short without a mix of sentiments (cleanliness aspect in hotel reviews), it is hard for CAR to generate counterfactual rationales to trick a human.

-

== References ==
[1] John Blitzer, Mark Dredze, and Fernando Pereira. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th annual meeting of the association of computational linguistics, pages 440–447, 2007.

[2] Julian McAuley, Jure Leskovec, and Dan Jurafsky. Learning attitudes and attributes from multi-aspect reviews. In 2012 IEEE 12th International Conference on Data Mining, pages 1020–1025. IEEE, 2012.

[3] Hongning Wang, Yue Lu, and Chengxiang Zhai. Latent aspect rating analysis on review text data: a rating regression approach. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 783–792. ACm, 2010.

[4] Tao Lei, Regina Barzilay, and Tommi Jaakkola. Rationalizing neural predictions. arXiv preprint arXiv:1606.04155, 2016

[5] Shiyu Chang, Yang Zhang, Mo Yu, and Tommi Jaakkola. A game theoretic approach to class-wise selective rationalization. arXiv: 1910.12853, 2019

[6] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. Advances in neural information processing systems. 2014;27.

A Game Theoretic Approach to Class-wise Selective Rationalization

2021-11-25T15:36:13Z

Y664huan: /* Model Architecture */

== Presented by ==
Yushan Chen, Yuying Huang, Ankitha Anugu

== Introduction ==
The selection of input features can be optimised for trained models or can be directly incorporated into methods. But an overall selection does not properly capture the useful rationales. So, a new game theoretic approach to class-dependent rationalization is introduced where the method is specifically trained to highlight evidence supporting alternative conclusions. Each class consists of three players who compete to find evidence for both factual and counterfactual circumstances. In a simple example, we explain how the game drives the solution towards relevant class-dependent rationales theoretically.

== Previous Work ==

There are two directions of research on generating interpretable features of neural networks. The first is to include the interpretations directly in the models, often known as self-explaining models. The alternative option is to generate interpretations in post-hoc manner. There are also few research works attempting to increase the fidelity of post hoc explanations by including the explanation mechanism into the training approach. Although none of these works can perform class-wise rationalisation, gradient based methods can be intuitively modified for this purpose, generating explanations for a specific class by probing the importance with regard to the relevant class logit.

== Motivation ==

Extending how rationales are defined and calculated is one of the primary questions motivating this research. To date, the typical approach has been to choose an overall feature subset that best explains the output/decision. The greatest mutual information criterion, for example, selects an overall subset of features so that mutual information between the feature subset and the target output decision is maximised, or the entropy of the target output decision conditional on this subset is minimised. Rationales, on the other hand, can be multi-faceted, involving support for a variety of outcomes in varying degrees. Existing rationale algorithms have one limitation. They only look for rationales that support the label class. So, we propose CAR algorithm to find rationales of any given class.

== Model Architecture ==
Suppose X is a random vector representing a string of text, Y represents the class that X is in, <math>Y\in \mathbb Y=\{0,1\} .</math>
Z(t) provides evidence supporting class <math>t\in \{0,1\} .</math>
<math>g_t^f(X), g_t^c(X), t\in \{0,1\}</math> are factual rationale generators, and <math>d_t(Z), t\in \{0,1\}</math> are discriminators.

'''Discriminator''': In our adversarial game, <math>d_0(\cdot)</math> takes a rationale Z generated by either <math>g_0^f(\cdot)</math> or <math>g_0^c(\cdot)</math>
as input, and outputs the probability that Z is generated by the factual generator <math>g_0^f(\cdot)</math>. The training
target for <math>d_0(\cdot) </math> is similar to the generative adversarial network (GAN) [6]:

<math> d(\cdot)=\underset{d(\cdot)}{argmin} -p_Y(0)\mathrm{E}[\log d(g_0^f(X))|Y=0]-p_Y(1)\mathrm{E}[\log(1- d(g_0^c(X)))|Y=1] </math>

'''Generators''': The factual generator <math>g_0^f(\cdot)</math> is trained to generate rationales from text labeled Y = 0. The counterfactual generator <math>g_0^c(\cdot)</math>, in contrast, learns from text labeled Y = 1. Both generators try to convince the discriminator that they are factual generators for Y = 0.

<math> g_0^f(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_0(d_0(g(X)))|Y=0] </math>, and <math>g_0^c(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_1(d_0(g(X)))|Y=1] </math>,

s.t. <math>g_0^f(\cdot)</math> and <math>g_0^c(\cdot)</math> satisfy some sparsity and continuity constraints.

[[File:CAR_framework.png]]

Figure 1: CAR training and inference procedures of the class-0 case. (a) The training procedure. (b) During
inference, there is no ground truth label. In this case, we will always trigger the factual generators.

== Experiments ==
The authors evaluated factual and counterfactual rational generation in both single- and multi-aspect classification tasks. The method was tested on the following three binary classification datasets: Amazon reviews (single-aspect) [1], Beer reviews (multi-aspect) [2], and Hotel reviews (multi-aspect) [3]. Note that the Amazon reviews contain both positive and negative sentiments. Beer reviews and hotel reviews contain only factual annotations.

The CAR model is compared with two existing methods, RNP and Post-exp.

- '''RNP''': this is a framework for rationalizing neural prediction proposed by Lei ''et al.'' [4]. It combines two modular components, a generator, and a predictor. RNP is only able to generate factual rationales.

- '''POST-EXP''': Post-explanation approach contains two generators and one predictor. Given the pre-trained predictor, one generator is trained to generate positive rationales, and the other generator is trained to generate negative rationales.

To make reasonable comparisons among the three methods, the predictors are of the same architecture, so as the generators. Their sparsity and continuity constraints are also of the same form.

Two types are experiments are set: objective evaluation and subjective evaluation.

In the objective evaluation, rationales are generated using the three models, and are compared with human annotations. Precision, recall and F1 score are reported to evaluate the performance of the models. (Note that the algorithms are conditioned on a similar actual sparsity level in factual rationales.)

In the subjective evaluations, people were asked to choose a sentiment (positive or negative) based on the given rationales. In the single-aspect case, a success is credited when the person correctly guesses the ground-truth sentiment given the factual rationales, or the person is convinced to choose the opposite sentiment to the ground-truth given the counterfactual rationale. In the multi-aspect case, the person also needs to guess the aspect correctly.

The results are as below.

[[File:Picture1.png]]

Table 1: Objective performances on the Amazon review dataset. The numbers in each column represent precision / recall / F1 score. [5]

[[File:Picture2.png]]

Table 2: Objective performances on the beer review and hotel review datasets. (Columns: P - precision / R - recall / F1 - F1 score.) [5]

[[File:Picture3.png]]

Figure 3: Subjective performances summary. [5]

== Conclusions ==
The authors evaluated the method in single- and multi-aspect sentiment classification tasks. The result tables above show that the proposed method is able to identify both factual (justifying the ground truth label) and counterfactual (countering the ground truth label) rationales consistent with human rationalization. When comparing with other existing methods (RNP and POST-EXP), CAR achieves better accuracy in finding factual rationales. For the counterfactual case, although the CAR method has lower recall and F1 score, human evaluators still favor the CAR generated counterfactual rationales. The subjective experiment result shows that CAR is able to correctly convince the subjects with factual rationales and fool the subjects with counterfactual rationales.

== Critiques ==
Although the CAR method performs better than the two other methods in most cases, it still has many limitations. In the case where the reviews have annotated ground truth containing mixed sentiments (aroma aspect in beer reviews), CAR often has low recalls. Also, CAR sometimes selects irrelevant aspect words with the desired sentiment to fulfill the sparsity constraint. This reduces the precision. When the reviews are very short without a mix of sentiments (cleanliness aspect in hotel reviews), it is hard for CAR to generate counterfactual rationales to trick a human.

-

== References ==
[1] John Blitzer, Mark Dredze, and Fernando Pereira. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th annual meeting of the association of computational linguistics, pages 440–447, 2007.

[2] Julian McAuley, Jure Leskovec, and Dan Jurafsky. Learning attitudes and attributes from multi-aspect reviews. In 2012 IEEE 12th International Conference on Data Mining, pages 1020–1025. IEEE, 2012.

[3] Hongning Wang, Yue Lu, and Chengxiang Zhai. Latent aspect rating analysis on review text data: a rating regression approach. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 783–792. ACm, 2010.

[4] Tao Lei, Regina Barzilay, and Tommi Jaakkola. Rationalizing neural predictions. arXiv preprint arXiv:1606.04155, 2016

[5] Shiyu Chang, Yang Zhang, Mo Yu, and Tommi Jaakkola. A game theoretic approach to class-wise selective rationalization. arXiv: 1910.12853, 2019

[6] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. Advances in neural information processing systems. 2014;27.

A Game Theoretic Approach to Class-wise Selective Rationalization

2021-11-25T15:36:03Z

Y664huan: /* References */

== Presented by ==
Yushan Chen, Yuying Huang, Ankitha Anugu

== Introduction ==
The selection of input features can be optimised for trained models or can be directly incorporated into methods. But an overall selection does not properly capture the useful rationales. So, a new game theoretic approach to class-dependent rationalization is introduced where the method is specifically trained to highlight evidence supporting alternative conclusions. Each class consists of three players who compete to find evidence for both factual and counterfactual circumstances. In a simple example, we explain how the game drives the solution towards relevant class-dependent rationales theoretically.

== Previous Work ==

There are two directions of research on generating interpretable features of neural networks. The first is to include the interpretations directly in the models, often known as self-explaining models. The alternative option is to generate interpretations in post-hoc manner. There are also few research works attempting to increase the fidelity of post hoc explanations by including the explanation mechanism into the training approach. Although none of these works can perform class-wise rationalisation, gradient based methods can be intuitively modified for this purpose, generating explanations for a specific class by probing the importance with regard to the relevant class logit.

== Motivation ==

Extending how rationales are defined and calculated is one of the primary questions motivating this research. To date, the typical approach has been to choose an overall feature subset that best explains the output/decision. The greatest mutual information criterion, for example, selects an overall subset of features so that mutual information between the feature subset and the target output decision is maximised, or the entropy of the target output decision conditional on this subset is minimised. Rationales, on the other hand, can be multi-faceted, involving support for a variety of outcomes in varying degrees. Existing rationale algorithms have one limitation. They only look for rationales that support the label class. So, we propose CAR algorithm to find rationales of any given class.

== Model Architecture ==
Suppose X is a random vector representing a string of text, Y represents the class that X is in, <math>Y\in \mathbb Y=\{0,1\} .</math>
Z(t) provides evidence supporting class <math>t\in \{0,1\} .</math>
<math>g_t^f(X), g_t^c(X), t\in \{0,1\}</math> are factual rationale generators, and <math>d_t(Z), t\in \{0,1\}</math> are discriminators.

'''Discriminator''': In our adversarial game, <math>d_0(\cdot)</math> takes a rationale Z generated by either <math>g_0^f(\cdot)</math> or <math>g_0^c(\cdot)</math>
as input, and outputs the probability that Z is generated by the factual generator <math>g_0^f(\cdot)</math>. The training
target for <math>d_0(\cdot) </math> is similar to the generative adversarial network (GAN) [5]:

<math> d(\cdot)=\underset{d(\cdot)}{argmin} -p_Y(0)\mathrm{E}[\log d(g_0^f(X))|Y=0]-p_Y(1)\mathrm{E}[\log(1- d(g_0^c(X)))|Y=1] </math>

'''Generators''': The factual generator <math>g_0^f(\cdot)</math> is trained to generate rationales from text labeled Y = 0. The counterfactual generator <math>g_0^c(\cdot)</math>, in contrast, learns from text labeled Y = 1. Both generators try to convince the discriminator that they are factual generators for Y = 0.

<math> g_0^f(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_0(d_0(g(X)))|Y=0] </math>, and <math>g_0^c(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_1(d_0(g(X)))|Y=1] </math>,

s.t. <math>g_0^f(\cdot)</math> and <math>g_0^c(\cdot)</math> satisfy some sparsity and continuity constraints.

[[File:CAR_framework.png]]

Figure 1: CAR training and inference procedures of the class-0 case. (a) The training procedure. (b) During
inference, there is no ground truth label. In this case, we will always trigger the factual generators.

== Experiments ==
The authors evaluated factual and counterfactual rational generation in both single- and multi-aspect classification tasks. The method was tested on the following three binary classification datasets: Amazon reviews (single-aspect) [1], Beer reviews (multi-aspect) [2], and Hotel reviews (multi-aspect) [3]. Note that the Amazon reviews contain both positive and negative sentiments. Beer reviews and hotel reviews contain only factual annotations.

The CAR model is compared with two existing methods, RNP and Post-exp.

- '''RNP''': this is a framework for rationalizing neural prediction proposed by Lei ''et al.'' [4]. It combines two modular components, a generator, and a predictor. RNP is only able to generate factual rationales.

- '''POST-EXP''': Post-explanation approach contains two generators and one predictor. Given the pre-trained predictor, one generator is trained to generate positive rationales, and the other generator is trained to generate negative rationales.

To make reasonable comparisons among the three methods, the predictors are of the same architecture, so as the generators. Their sparsity and continuity constraints are also of the same form.

Two types are experiments are set: objective evaluation and subjective evaluation.

In the objective evaluation, rationales are generated using the three models, and are compared with human annotations. Precision, recall and F1 score are reported to evaluate the performance of the models. (Note that the algorithms are conditioned on a similar actual sparsity level in factual rationales.)

In the subjective evaluations, people were asked to choose a sentiment (positive or negative) based on the given rationales. In the single-aspect case, a success is credited when the person correctly guesses the ground-truth sentiment given the factual rationales, or the person is convinced to choose the opposite sentiment to the ground-truth given the counterfactual rationale. In the multi-aspect case, the person also needs to guess the aspect correctly.

The results are as below.

[[File:Picture1.png]]

Table 1: Objective performances on the Amazon review dataset. The numbers in each column represent precision / recall / F1 score. [5]

[[File:Picture2.png]]

Table 2: Objective performances on the beer review and hotel review datasets. (Columns: P - precision / R - recall / F1 - F1 score.) [5]

[[File:Picture3.png]]

Figure 3: Subjective performances summary. [5]

== Conclusions ==
The authors evaluated the method in single- and multi-aspect sentiment classification tasks. The result tables above show that the proposed method is able to identify both factual (justifying the ground truth label) and counterfactual (countering the ground truth label) rationales consistent with human rationalization. When comparing with other existing methods (RNP and POST-EXP), CAR achieves better accuracy in finding factual rationales. For the counterfactual case, although the CAR method has lower recall and F1 score, human evaluators still favor the CAR generated counterfactual rationales. The subjective experiment result shows that CAR is able to correctly convince the subjects with factual rationales and fool the subjects with counterfactual rationales.

== Critiques ==
Although the CAR method performs better than the two other methods in most cases, it still has many limitations. In the case where the reviews have annotated ground truth containing mixed sentiments (aroma aspect in beer reviews), CAR often has low recalls. Also, CAR sometimes selects irrelevant aspect words with the desired sentiment to fulfill the sparsity constraint. This reduces the precision. When the reviews are very short without a mix of sentiments (cleanliness aspect in hotel reviews), it is hard for CAR to generate counterfactual rationales to trick a human.

-

== References ==
[1] John Blitzer, Mark Dredze, and Fernando Pereira. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th annual meeting of the association of computational linguistics, pages 440–447, 2007.

[2] Julian McAuley, Jure Leskovec, and Dan Jurafsky. Learning attitudes and attributes from multi-aspect reviews. In 2012 IEEE 12th International Conference on Data Mining, pages 1020–1025. IEEE, 2012.

[3] Hongning Wang, Yue Lu, and Chengxiang Zhai. Latent aspect rating analysis on review text data: a rating regression approach. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 783–792. ACm, 2010.

[4] Tao Lei, Regina Barzilay, and Tommi Jaakkola. Rationalizing neural predictions. arXiv preprint arXiv:1606.04155, 2016

[5] Shiyu Chang, Yang Zhang, Mo Yu, and Tommi Jaakkola. A game theoretic approach to class-wise selective rationalization. arXiv: 1910.12853, 2019

[6] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. Advances in neural information processing systems. 2014;27.

A Game Theoretic Approach to Class-wise Selective Rationalization

2021-11-25T15:32:42Z

Y664huan: /* Model Architecture */

== Presented by ==
Yushan Chen, Yuying Huang, Ankitha Anugu

== Introduction ==
The selection of input features can be optimised for trained models or can be directly incorporated into methods. But an overall selection does not properly capture the useful rationales. So, a new game theoretic approach to class-dependent rationalization is introduced where the method is specifically trained to highlight evidence supporting alternative conclusions. Each class consists of three players who compete to find evidence for both factual and counterfactual circumstances. In a simple example, we explain how the game drives the solution towards relevant class-dependent rationales theoretically.

== Previous Work ==

There are two directions of research on generating interpretable features of neural networks. The first is to include the interpretations directly in the models, often known as self-explaining models. The alternative option is to generate interpretations in post-hoc manner. There are also few research works attempting to increase the fidelity of post hoc explanations by including the explanation mechanism into the training approach. Although none of these works can perform class-wise rationalisation, gradient based methods can be intuitively modified for this purpose, generating explanations for a specific class by probing the importance with regard to the relevant class logit.

== Motivation ==

Extending how rationales are defined and calculated is one of the primary questions motivating this research. To date, the typical approach has been to choose an overall feature subset that best explains the output/decision. The greatest mutual information criterion, for example, selects an overall subset of features so that mutual information between the feature subset and the target output decision is maximised, or the entropy of the target output decision conditional on this subset is minimised. Rationales, on the other hand, can be multi-faceted, involving support for a variety of outcomes in varying degrees. Existing rationale algorithms have one limitation. They only look for rationales that support the label class. So, we propose CAR algorithm to find rationales of any given class.

== Model Architecture ==
Suppose X is a random vector representing a string of text, Y represents the class that X is in, <math>Y\in \mathbb Y=\{0,1\} .</math>
Z(t) provides evidence supporting class <math>t\in \{0,1\} .</math>
<math>g_t^f(X), g_t^c(X), t\in \{0,1\}</math> are factual rationale generators, and <math>d_t(Z), t\in \{0,1\}</math> are discriminators.

'''Discriminator''': In our adversarial game, <math>d_0(\cdot)</math> takes a rationale Z generated by either <math>g_0^f(\cdot)</math> or <math>g_0^c(\cdot)</math>
as input, and outputs the probability that Z is generated by the factual generator <math>g_0^f(\cdot)</math>. The training
target for <math>d_0(\cdot) </math> is similar to the generative adversarial network (GAN) [5]:

<math> d(\cdot)=\underset{d(\cdot)}{argmin} -p_Y(0)\mathrm{E}[\log d(g_0^f(X))|Y=0]-p_Y(1)\mathrm{E}[\log(1- d(g_0^c(X)))|Y=1] </math>

'''Generators''': The factual generator <math>g_0^f(\cdot)</math> is trained to generate rationales from text labeled Y = 0. The counterfactual generator <math>g_0^c(\cdot)</math>, in contrast, learns from text labeled Y = 1. Both generators try to convince the discriminator that they are factual generators for Y = 0.

<math> g_0^f(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_0(d_0(g(X)))|Y=0] </math>, and <math>g_0^c(\cdot)=\underset{g(\cdot)}{argmax}\mathrm{E}[h_1(d_0(g(X)))|Y=1] </math>,

s.t. <math>g_0^f(\cdot)</math> and <math>g_0^c(\cdot)</math> satisfy some sparsity and continuity constraints.

[[File:CAR_framework.png]]

Figure 1: CAR training and inference procedures of the class-0 case. (a) The training procedure. (b) During
inference, there is no ground truth label. In this case, we will always trigger the factual generators.

== Experiments ==
The authors evaluated factual and counterfactual rational generation in both single- and multi-aspect classification tasks. The method was tested on the following three binary classification datasets: Amazon reviews (single-aspect) [1], Beer reviews (multi-aspect) [2], and Hotel reviews (multi-aspect) [3]. Note that the Amazon reviews contain both positive and negative sentiments. Beer reviews and hotel reviews contain only factual annotations.

The CAR model is compared with two existing methods, RNP and Post-exp.

- '''RNP''': this is a framework for rationalizing neural prediction proposed by Lei ''et al.'' [4]. It combines two modular components, a generator, and a predictor. RNP is only able to generate factual rationales.

- '''POST-EXP''': Post-explanation approach contains two generators and one predictor. Given the pre-trained predictor, one generator is trained to generate positive rationales, and the other generator is trained to generate negative rationales.

To make reasonable comparisons among the three methods, the predictors are of the same architecture, so as the generators. Their sparsity and continuity constraints are also of the same form.

Two types are experiments are set: objective evaluation and subjective evaluation.

In the objective evaluation, rationales are generated using the three models, and are compared with human annotations. Precision, recall and F1 score are reported to evaluate the performance of the models. (Note that the algorithms are conditioned on a similar actual sparsity level in factual rationales.)

In the subjective evaluations, people were asked to choose a sentiment (positive or negative) based on the given rationales. In the single-aspect case, a success is credited when the person correctly guesses the ground-truth sentiment given the factual rationales, or the person is convinced to choose the opposite sentiment to the ground-truth given the counterfactual rationale. In the multi-aspect case, the person also needs to guess the aspect correctly.

The results are as below.

[[File:Picture1.png]]

Table 1: Objective performances on the Amazon review dataset. The numbers in each column represent precision / recall / F1 score. [5]

[[File:Picture2.png]]

Table 2: Objective performances on the beer review and hotel review datasets. (Columns: P - precision / R - recall / F1 - F1 score.) [5]

[[File:Picture3.png]]

Figure 3: Subjective performances summary. [5]

== Conclusions ==
The authors evaluated the method in single- and multi-aspect sentiment classification tasks. The result tables above show that the proposed method is able to identify both factual (justifying the ground truth label) and counterfactual (countering the ground truth label) rationales consistent with human rationalization. When comparing with other existing methods (RNP and POST-EXP), CAR achieves better accuracy in finding factual rationales. For the counterfactual case, although the CAR method has lower recall and F1 score, human evaluators still favor the CAR generated counterfactual rationales. The subjective experiment result shows that CAR is able to correctly convince the subjects with factual rationales and fool the subjects with counterfactual rationales.

== Critiques ==
Although the CAR method performs better than the two other methods in most cases, it still has many limitations. In the case where the reviews have annotated ground truth containing mixed sentiments (aroma aspect in beer reviews), CAR often has low recalls. Also, CAR sometimes selects irrelevant aspect words with the desired sentiment to fulfill the sparsity constraint. This reduces the precision. When the reviews are very short without a mix of sentiments (cleanliness aspect in hotel reviews), it is hard for CAR to generate counterfactual rationales to trick a human.

-

== References ==
[1] John Blitzer, Mark Dredze, and Fernando Pereira. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th annual meeting of the association of computational linguistics, pages 440–447, 2007.

[2] Julian McAuley, Jure Leskovec, and Dan Jurafsky. Learning attitudes and attributes from multi-aspect reviews. In 2012 IEEE 12th International Conference on Data Mining, pages 1020–1025. IEEE, 2012.

[3] Hongning Wang, Yue Lu, and Chengxiang Zhai. Latent aspect rating analysis on review text data: a rating regression approach. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 783–792. ACm, 2010.

[4] Tao Lei, Regina Barzilay, and Tommi Jaakkola. Rationalizing neural predictions. arXiv preprint arXiv:1606.04155, 2016

[5] Shiyu Chang, Yang Zhang, Mo Yu, and Tommi Jaakkola. A game theoretic approach to class-wise selective rationalization. arXiv: 1910.12853, 2019

A Game Theoretic Approach to Class-wise Selective Rationalization

2021-11-25T15:05:53Z

Y664huan: /* Model Architecture */

File:CAR framework.png

2021-11-25T15:01:27Z

Y664huan: The framework of CAR, for illustration only

The framework of CAR, for illustration only

stat441F21

2021-11-17T15:45:59Z

Y664huan: /* Paper presentation */

== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==



=Paper presentation=
{| class="wikitable"

{| border="1" cellpadding="3"
|-
|width="60pt"|Date
|width="250pt"|Name
|width="15pt"|Paper number
|width="700pt"|Title
|width="15pt"|Link to the paper
|width="30pt"|Link to the summary
|width="30pt"|Link to the video
|-
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]
|-
|Week of Nov 16 || Ali Ghodsi || || || || ||
|-
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||
|-
|Week of Nov 22 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||
|-
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://arxiv.org/pdf/2108.08810v1.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||
|-
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||
|-
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||
|-
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||
|-
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || || || ||
|-
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || || || ||
|-
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || ||
|-
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || || || ||
|-
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||
|-
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||
|-
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||
|-
|-
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || || || ||
|-
|Week of Nov 29 ||Kun Wang || || || || ||
|-
|Week of Nov 29 ||Egemen Guray || || || || ||
|-
|Week of Nov 22 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease || https://www.mdpi.com/2076-3425/11/2/150/pdf || ||
|-
|Week of Nov 22 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] ||
[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||
|-
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || || || ||
|-

stat441F21

2021-11-17T15:45:48Z

Y664huan: /* Paper presentation */

== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==



=Paper presentation=
{| class="wikitable"

{| border="1" cellpadding="3"
|-
|width="60pt"|Date
|width="250pt"|Name
|width="15pt"|Paper number
|width="700pt"|Title
|width="15pt"|Link to the paper
|width="30pt"|Link to the summary
|width="30pt"|Link to the video
|-
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]
|-
|Week of Nov 16 || Ali Ghodsi || || || || ||
|-
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||
|-
|Week of Nov 22 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||
|-
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://arxiv.org/pdf/2108.08810v1.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||
|-
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||
|-
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||
|-
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||
|-
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || || || ||
|-
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || || || ||
|-
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || ||
|-
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || || || ||
|-
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||
|-
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||
|-
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||
|-
|-
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || || || ||
|-
|Week of Nov 29 ||Kun Wang || || || || ||
|-
|Week of Nov 29 ||Egemen Guray || || || || ||
|-
|Week of Nov 22 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease || https://www.mdpi.com/2076-3425/11/2/150/pdf || ||
|-
|Week of Nov 22 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] ||
[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary] ||
|-
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || || || ||
|-

stat441F21

2021-11-17T15:45:27Z

Y664huan: /* Paper presentation */

== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==



=Paper presentation=
{| class="wikitable"

{| border="1" cellpadding="3"
|-
|width="60pt"|Date
|width="250pt"|Name
|width="15pt"|Paper number
|width="700pt"|Title
|width="15pt"|Link to the paper
|width="30pt"|Link to the summary
|width="30pt"|Link to the video
|-
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]
|-
|Week of Nov 16 || Ali Ghodsi || || || || ||
|-
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||
|-
|Week of Nov 22 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||
|-
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://arxiv.org/pdf/2108.08810v1.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||
|-
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||
|-
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||
|-
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||
|-
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || || || ||
|-
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || || || ||
|-
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || ||
|-
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || || || ||
|-
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||
|-
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||
|-
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||
|-
|-
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || || || ||
|-
|Week of Nov 29 ||Kun Wang || || || || ||
|-
|Week of Nov 29 ||Egemen Guray || || || || ||
|-
|Week of Nov 22 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease || https://www.mdpi.com/2076-3425/11/2/150/pdf || ||
|-
|Week of Nov 22 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] ||
[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||
|-
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || || || ||
|-

stat441F21

2021-11-17T15:44:38Z

Y664huan: /* Paper presentation */

== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==



=Paper presentation=
{| class="wikitable"

{| border="1" cellpadding="3"
|-
|width="60pt"|Date
|width="250pt"|Name
|width="15pt"|Paper number
|width="700pt"|Title
|width="15pt"|Link to the paper
|width="30pt"|Link to the summary
|width="30pt"|Link to the video
|-
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]
|-
|Week of Nov 16 || Ali Ghodsi || || || || ||
|-
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||
|-
|Week of Nov 22 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||
|-
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://arxiv.org/pdf/2108.08810v1.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||
|-
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||
|-
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||
|-
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||
|-
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || || || ||
|-
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || || || ||
|-
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || ||
|-
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || || || ||
|-
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||
|-
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||
|-
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||
|-
|-
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || || || ||
|-
|Week of Nov 29 ||Kun Wang || || || || ||
|-
|Week of Nov 29 ||Egemen Guray || || || || ||
|-
|Week of Nov 22 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease || https://www.mdpi.com/2076-3425/11/2/150/pdf || ||
|-
|Week of Nov 22 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] ||
[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]|| ||
|-
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || || || ||
|-

stat441F21

2021-11-17T14:56:00Z

Y664huan: /* Paper presentation */

== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==



=Paper presentation=
{| class="wikitable"

{| border="1" cellpadding="3"
|-
|width="60pt"|Date
|width="250pt"|Name
|width="15pt"|Paper number
|width="700pt"|Title
|width="15pt"|Link to the paper
|width="30pt"|Link to the summary
|width="30pt"|Link to the video
|-
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]
|-
|Week of Nov 16 || Ali Ghodsi || || || || ||
|-
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||
|-
|Week of Nov 22 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||
|-
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://arxiv.org/pdf/2108.08810v1.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||
|-
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||
|-
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||
|-
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||
|-
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || || || ||
|-
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || || || ||
|-
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || ||
|-
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || || || ||
|-
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||
|-
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||
|-
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||
|-
|-
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || || || ||
|-
|Week of Nov 29 ||Kun Wang || || || || ||
|-
|Week of Nov 29 ||Egemen Guray || || || || ||
|-
|Week of Nov 22 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease || https://www.mdpi.com/2076-3425/11/2/150/pdf || ||
|-
|Week of Nov 22 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] ||
[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||
|-
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || || || ||
|-

stat441F21

2021-11-17T14:55:33Z

Y664huan: /* Paper presentation */

== [[F20-STAT 441/841 CM 763-Proposal| Project Proposal ]] ==



=Paper presentation=
{| class="wikitable"

{| border="1" cellpadding="3"
|-
|width="60pt"|Date
|width="250pt"|Name
|width="15pt"|Paper number
|width="700pt"|Title
|width="15pt"|Link to the paper
|width="30pt"|Link to the summary
|width="30pt"|Link to the video
|-
|Sep 15 (example)||Ri Wang || ||Sequence to sequence learning with neural networks.||[http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Going_Deeper_with_Convolutions Summary] || [https://youtu.be/JWozRg_X-Vg?list=PLehuLRPyt1HzXDemu7K4ETcF0Ld_B5adG&t=539]
|-
|Week of Nov 16 || Ali Ghodsi || || || || ||
|-
|Week of Nov 22 || Jared Feng, Xipeng Huang, Mingwei Xu, Tingzhou Yu|| || Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification || [http://proceedings.mlr.press/v139/bai21c/bai21c.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Don%27t_Just_Blame_Over-parametrization Summary] ||
|-
|Week of Nov 22 || Kanika Chopra, Yush Rajcoomar || || Automatic Bank Fraud Detection Using Support Vector Machines || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.5804&rep=rep1&type=pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Automatic_Bank_Fraud_Detection_Using_Support_Vector_Machines Summary] ||
|-
|Week of Nov 22 || Zeng Mingde, Lin Xiaoyu, Fan Joshua, Rao Chen Min || || Do Vision Transformers See Like Convolutional Neural Networks? || [https://arxiv.org/pdf/2108.08810v1.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Do_Vision_Transformers_See_Like_CNN Summary] ||
|-
|Week of Nov 22 || Justin D'Astous, Waqas Hamed, Stefan Vladusic, Ethan O'Farrell || || A Probabilistic Approach to Neural Network Pruning || [http://proceedings.mlr.press/v139/qian21a/qian21a.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Summary_of_A_Probabilistic_Approach_to_Neural_Network_Pruning Summary] ||
|-
|Week of Nov 22 || Cassandra Wong, Anastasiia Livochka, Maryam Yalsavar, David Evans || || Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification || [https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Patch_Based_Convolutional_Neural_Network_for_Whole_Slide_Tissue_Image_Classification Summary] ||
|-
|Week of Nov 29 || Jessie Man Wai Chin, Yi Lin Ooi, Yaqi Shi, Shwen Lyng Ngew || || CatBoost: unbiased boosting with categorical features || [https://proceedings.neurips.cc/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=CatBoost:_unbiased_boosting_with_categorical_features Summary] ||
|-
|Week of Nov 29 || Eric Anderson, Chengzhi Wang, Kai Zhong, YiJing Zhou || || || || ||
|-
|Week of Nov 29 || Ethan Cyrenne, Dieu Hoa Nguyen, Mary Jane Sin, Carolyn Wang || || || || ||
|-
|Week of Nov 29 || Bowen Zhang, Tyler Magnus Verhaar, Sam Senko || || Deep Double Descent: Where Bigger Models and More Data Hurt || [https://arxiv.org/pdf/1912.02292.pdf Paper] || ||
|-
|Week of Nov 29 || Chun Waan Loke, Peter Chong, Clarice Osmond, Zhilong Li|| || || || ||
|-
|Week of Nov 22 || Ann Gie Wong, Curtis Li, Hannah Kerr || || The Detection of Black Ice Accidents for Preventative Automated Vehicles Using Convolutional Neural Networks || [https://www.mdpi.com/2079-9292/9/12/2178/htm Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=The_Detection_of_Black_Ice_Accidents_Using_CNNs&fbclid=IwAR0K4YdnL_hdRnOktmJn8BI6-Ra3oitjJof0YwluZgUP1LVFHK5jyiBZkvQ Summary] ||
|-
|Week of Nov 22 || Yuwei Liu, Daniel Mao|| || Depthwise Convolution Is All You Need for Learning Multiple Visual Domains || [https://arxiv.org/abs/1902.00927 Paper] ||[https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Depthwise_Convolution_Is_All_You_Need_for_Learning_Multiple_Visual_Domains Summary] ||
|-
|Week of Nov 29 || Lingshan Wang, Yifan Li, Ziyi Liu || || Deep Learning for Extreme Multi-label Text Classification || [https://dl.acm.org/doi/pdf/10.1145/3077136.3080834 Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Deep_Learning_for_Extreme_Multi-label_Text_Classification Summary]||
|-
|-
|Week of Nov 29 || Kar Lok Ng, Muhan (Iris) Li || || || || ||
|-
|Week of Nov 29 ||Kun Wang || || || || ||
|-
|Week of Nov 29 ||Egemen Guray || || || || ||
|-
|Week of Nov 22 ||Bsodjahi || || Bayesian Network as a Decision Tool for Predicting ALS Disease || https://www.mdpi.com/2076-3425/11/2/150/pdf || ||
|-
|Week of Nov 22 ||Xin Yan, Yishu Duan, Xibei Di || || Predicting Hurricane Trajectories Using a Recurrent Neural Network || [https://arxiv.org/pdf/1802.02548.pdf Paper] || [https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Predicting_Hurricane_Trajectories_Using_a_Recurrent_Neural_Network Summary]||
|Week of Nov 29 ||Ankitha Anugu, Yushan Chen, Yuying Huang || || || || ||
|-

F21-STAT 441/841 CM 763-Proposal

2021-10-08T15:46:27Z

Y664huan: