F21-STAT 441/841 CM 763-Proposal: Difference between revisions

Revision as of 17:28, 28 November 2021

Use this format (Don’t remove Project 0)

Project # 0 Group members:

Last name, First name

Last name, First name

Last name, First name

Last name, First name

Title: Making a String Telephone

Description: We use paper cups to make a string phone and talk with friends while learning about sound waves with this science project. (Explain your project in one or two paragraphs).


Project # 1 Group members:

Feng, Jared

Huang, Xipeng

Xu, Mingwei

Yu, Tingzhou

Title: Patch-Based Convolutional Neural Network for Cancers Classification

Description: In this project, we classify cancers into three classes (tumor types) based on pathological data. We will follow the paper "Patch-based Convolutional Neural Network for Whole Slide Tissue Image Classification".
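
As a rough illustration of the patch-based idea, the sketch below cuts an image into fixed-size patches and combines per-patch predictions into a single slide-level label. It is a minimal sketch under stated assumptions, not the method of the paper or of the group: the names extract_patches and majority_vote, the list-of-lists image representation, and the majority-vote aggregation are all hypothetical choices made for this example, and the per-patch labels are assumed to come from some separately trained classifier.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumption: a grayscale image stored as rows of pixel values


def extract_patches(image: Image, patch_size: int, stride: int) -> List[Tuple[int, int, Image]]:
    """Return (top, left, patch) for every full patch_size x patch_size window."""
    height = len(image)
    width = len(image[0]) if height else 0
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            # Slice patch_size rows, then patch_size columns from each row.
            patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
            patches.append((top, left, patch))
    return patches


def majority_vote(patch_labels: List[int]) -> int:
    """Combine patch-level predictions into one image-level label.

    Assumption: a plain majority vote stands in for whatever aggregation
    step the project eventually adopts.
    """
    return max(set(patch_labels), key=patch_labels.count)


# Toy 4x4 "slide" with pixel values 0..15, cut into four 2x2 patches.
toy_image = [[r * 4 + c for c in range(4)] for r in range(4)]
toy_patches = extract_patches(toy_image, patch_size=2, stride=2)

# Hypothetical patch-level predictions, one per patch.
toy_prediction = majority_vote([0, 1, 1, 2])
```

A stride smaller than the patch size would give overlapping patches, which is one way to get more training examples out of each slide.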


Project # 2 Group members:

Anderson, Eric

Wang, Chengzhi

Zhong, Kai

Zhou, Yi Jing

Title: Application of Neural Networks

Description: Using neural networks to determine content/intent of emails.


Project # 3 Group members:

Chopra, Kanika

Rajcoomar, Yush

Bhattacharya, Vaibhav

Title: Cancer Classification

Description: We will be classifying three tumour types based on pathological data.


Project # 4 Group members:

Li, Shao Zhong

Kerr, Hannah

Wong, Ann Gie

Title: Classification of text

Description: Being able to grade test answers automatically can save a lot of time and teaching resources. Multiple-choice questions can already be graded automatically, but formats with free-text answers, although more thorough in testing knowledge, still require human evaluation and marking. At the scale of thousands of students, this becomes a bottleneck for teaching resources and personnel. We will use classification techniques and machine learning to develop an automated way to predict the correctness of text answers with good accuracy, which graders can use as support to reduce the time and manual effort needed in the grading process.


Project # 5 Group members:

Chin, Jessie Man Wai

Ooi, Yi Lin

Shi, Yaqi

Ngew, Shwen Lyng

Title: The Application of Classification in Accelerated Underwriting (Insurance)

Description: Accelerated Underwriting (AUW), also called “express underwriting,” is a faster and easier process for people in good health to obtain life insurance. The traditional underwriting process is often painful for both customers and insurers. Customers have to complete different types of questionnaires and provide medical tests involving blood, urine, saliva and other medical results. Underwriters, on the other hand, have to manually go through every single policy to assess the risk of each applicant. AUW allows people who are deemed “healthy” to forgo medical exams. Since COVID-19, AUW has become a more pressing topic, as traditional underwriting cannot be performed under stay-at-home orders. However, this places a burden on the insurance company to estimate risk well with fewer test results.

This is where data science comes in. With different classification methods, we can address the underwriting process’ five pain points: labor, speed, efficiency, pricing and mortality. This allows us to better estimate risk and classify clients by whether they are eligible for accelerated underwriting. For the final project, we use data from one of the leading US insurers to analyze how classification methods can identify clients eligible for AUW. We will use factors such as health data, medical history, family history and insurance history to determine eligibility.


Project # 6 Group members:

Wang, Carolyn

Cyrenne, Ethan

Nguyen, Dieu Hoa

Sin, Mary Jane

Title: TBD

Description: TBD


Project # 7 Group members:

Bhattacharya, Vaibhav

Chatoor, Amanda

Prathap Das, Sutej

Title: PetFinder.my - Pawpularity Contest [1]

Description: In this competition, we will analyze raw images and metadata to predict the “Pawpularity” of pet photos. We'll train and test our model on PetFinder.my's thousands of pet profiles.


Project # 8 Group members:

Xu, Siming

Yan, Xin

Duan, Yishu

Di, Xibei

Title: TBD

Description: TBD


Project # 9 Group members:

Loke, Chun Waan

Chong, Peter

Osmond, Clarice

Li, Zhilong

Title: TBD

Description: TBD


Project # 10 Group members:

O'Farrell, Ethan

D'Astous, Justin

Hamed, Waqas

Vladusic, Stefan

Title: Pawpularity (Kaggle)

Description: Predicting the popularity of animal photos based on photo metadata


Project # 11 Group members:

JunBin, Pan

Title: TBD

Description: TBD


Project # 12 Group members:

Kar Lok, Ng

Muhan (Iris), Li

Wu, Mingze

Title: NFL Health & Safety - Helmet Assignment competition (Kaggle Competition)

Description: Assigning players to helmets in given footage of head collisions in football plays.


Project # 13 Group members:

Livochka, Anastasiia

Wong, Cassandra

Evans, David

Yalsavar, Maryam

Title: TBD

Description: TBD


Project # 14 Group members:

Zeng, Mingde

Lin, Xiaoyu

Fan, Joshua

Rao, Chen Min

Title: TBD

Description: TBD


Project # 15 Group members:

Huang, Yuying

Anugu, Ankitha

Dave, Meet Hemang

Chen, Yushan

Title: TBD

Description: TBD


Project # 16 Group members:

Wang, Lingshan

Li, Yifan

Liu, Ziyi

Title: Implement and Improve CNN in Multi-Class Text Classification

Description: We will apply a Convolutional Neural Network (CNN) to classify real-world data, building an efficient classifier for case study interview materials, and will try to improve the CNN algorithm in the context of text classification using a real-world data set. Implementing the CNN also allows us to analyze the efficiency and practicality of the algorithm.

The dataset is composed of case study HTML files containing case information that can be classified into multiple industry categories. We will implement multi-class classification to sort the information contained in each case material into pre-determined subcategories (e.g., behavior questions, consulting questions, questions for new business/market entry, etc.). We will attempt to process the raw data into several data types (e.g., HTML, JSON, pandas data frames, etc.) and choose the most efficient raw data processing logic based on runtime and algorithm optimization.


Project # 17 Group members:

Malhi, Dilmeet

Joshi, Vansh

Syamala, Aavinash

Islam, Sohan

Title: Kaggle project: Brain Tumor Radiogenomic Classification

Description: In this project, we will predict the genetic subtype of glioblastoma from MRI (magnetic resonance imaging) scans, training and testing our model to detect the presence of MGMT promoter methylation.


Project # 18 Group members:

Yuwei, Liu

Daniel, Mao

Title: Sartorius - Cell Instance Segmentation (Kaggle) [https://www.kaggle.com/c/sartorius-cell-instance-segmentation]

Description: Detect single neuronal cells in microscopy images


Project # 19 Group members:

Samuel, Senko

Tyler, Verhaar

Zhang, Bowen

Title: NBA Game Prediction

Description: We will build a win/loss classifier for NBA games using player and game data, also incorporating alternative data (e.g., sports betting data).


Project # 20 Group members:

Mitrache, Christian

Renggli, Aaron

Saini, Jessica

Mossman, Alexandra

Title: Classification and Deep Learning for Healthcare Provider Fraud Detection Analysis

Description: TBD


Project # 21 Group members:

Wang, Kun

Title: TBD

Description: TBD


Project # 22 Group members:

Guray, Egemen

Title: Traffic Sign Recognition System (TSRS): SVM and Convolutional Neural Network

Description: I will build a system to predict road signs in the German Traffic Sign Dataset using a CNN.


Project # 23 Group members:

Bsodjahi

Title: Modeling Pseudomonas aeruginosa bacteria state through its genes expression activity

Description: Label Pseudomonas aeruginosa gene expression data through unsupervised learning (e.g., the EM algorithm) and then model the bacterial state as a function of its gene expression.
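
To make the labeling step more concrete, here is a minimal sketch of EM used as an unsupervised labeler. It assumes, purely for illustration, that a single gene's expression values follow a mixture of two 1-D Gaussians (for example a "low" and a "high" state); the function names, the initialization, and the toy values are hypothetical and not taken from the project or from real data.

```python
import math
from typing import List, Tuple


def normal_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(values: List[float], iterations: int = 50) -> Tuple[List[int], List[float]]:
    """Fit a two-component 1-D Gaussian mixture with EM and label each value.

    Assumptions: at least one iteration, means initialised at the min and max,
    and both components start with the overall variance and equal weights.
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n or 1.0
    means = [min(values), max(values)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    resp: List[List[float]] = []

    for _ in range(iterations):
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in values:
            p = [weights[k] * normal_pdf(v, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against division by zero
            resp.append([pk / total for pk in p])

        # M-step: re-estimate weights, means and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)  # floor keeps the density finite

    # Hard labels from the posteriors of the final E-step.
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return labels, means


# Made-up expression values: four "low" and four "high" measurements.
toy_values = [0.1, 0.2, 0.0, 0.15, 5.0, 5.1, 4.9, 5.2]
toy_labels, toy_means = em_two_gaussians(toy_values)
```

The resulting labels could then serve as the response variable when modeling the bacterial state, although real expression data would likely need more components, more dimensions, or a different mixture family.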