F18-STAT841-Proposal: Difference between revisions


Revision as of 19:16, 6 October 2018

Use this format (Don’t remove Project 0)

Project # 0 Group members:

Last name, First name

Last name, First name

Last name, First name

Last name, First name

Title: Making a String Telephone

Description: We use paper cups to make a string phone and talk with friends while learning about sound waves with this science project. (Explain your project in one or two paragraphs).


Project # 1 Group members:

Weng, Jiacheng

Li, Keqi

Qian, Yi

Liu, Bomeng

Title: RSNA Pneumonia Detection Challenge

Description:

Our team’s project is the RSNA Pneumonia Detection Challenge, a Kaggle competition. The primary goal of this project is to develop a machine learning tool that detects pneumonia in patients from their chest radiographs (CXR).

Pneumonia is an infection that inflames the air sacs of the lungs, with symptoms such as chest pain, cough, and fever [1]. It can be very dangerous, especially for infants and the elderly. In 2015, 920,000 children under the age of 5 died from this disease [2]. Because it is so often fatal to children, diagnosing pneumonia is a high priority. A common diagnostic method is to obtain the patient’s chest radiograph (CXR), a gray-scale X-ray image of the chest. The region infected by pneumonia usually appears as one or more areas of increased opacity [3] on the CXR. However, many other factors can also increase opacity on a CXR, which makes the diagnosis very challenging. The diagnosis also requires highly skilled clinicians and a lot of time spent screening CXRs. The Radiological Society of North America (RSNA®) sees an opportunity to use machine learning to accelerate the initial CXR screening process.

Within the scope of this project, our team plans to contribute to solving this problem by applying our knowledge of machine learning to image processing and classification. Team members will apply techniques including, but not limited to, logistic regression, random forest, SVM, kNN, and CNN to detect CXRs showing pneumonia.
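As a rough illustration of one of the listed techniques, the following is a minimal kNN sketch in plain Python. It assumes each image has already been flattened into a vector of pixel intensities; the function name `knn_predict`, the four-pixel toy vectors, and the labels "normal" and "opacity" are hypothetical and are not taken from the competition data.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance from the query to every training vector.
    distances = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # most_common(1) returns the label with the highest vote count.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy "images": 4-pixel gray-scale vectors with made-up values (assumption).
train_x = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.1, 0.0, 0.1],
    [0.9, 0.8, 0.9, 1.0],
    [0.8, 0.9, 1.0, 0.9],
]
train_y = ["normal", "normal", "opacity", "opacity"]

prediction = knn_predict(train_x, train_y, [0.85, 0.9, 0.8, 0.95], k=3)
print(prediction)
```

On real radiographs the raw pixel distance is unlikely to be a good similarity measure by itself, so this only shows the mechanics of the method.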


[1] (Accessed 2018, Oct. 4). Pneumonia [Online]. MAYO CLINIC. Available from: https://www.mayoclinic.org/diseases-conditions/pneumonia/symptoms-causes/syc-20354204

[2] (Accessed 2018, Oct. 4). RSNA Pneumonia Detection Challenge [Online]. Kaggle. Available from: https://www.kaggle.com/c/rsna-pneumonia-detection-challenge

[3] Franquet T. Imaging of community-acquired pneumonia. J Thorac Imaging 2018 (epub ahead of print). PMID 30036297


Project # 2 Group members:

Hou, Zhaoran

Zhang, Chi

Title:

Description:


Project # 3 Group members:

Hanzhen Yang

Jing Pu Sun

Ganyuan Xuan

Yu Su

Title: Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge

Description:

Our team chose the Quick, Draw! Doodle Recognition Challenge from Kaggle. The goal of the competition is to build an image recognition tool that can classify hand-drawn doodles into one of 340 categories.

The main challenge of the project is that the training set is very noisy. Hand-drawn artwork may deviate substantially from the actual object and almost certainly differs from person to person. Mislabeled images also present a problem, since they create outlier points when we train our models.

We plan to study some of the mature image recognition algorithms currently available to inspire and develop our own model.


Project # 4 Group members:

Snaith, Mitchell

Title: Reproducibility report: *Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks*

Description:

The paper *Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks* [1] has been submitted to ICLR 2019. It aims to "fix" variational Bayes and turn it into a robust inference tool through two innovations.

Goals are to:

- reproduce the deterministic variational inference scheme as described in the paper without referencing the original authors' code, providing a third-party implementation

- reproduce the experimental results with our own implementation, using the same NN framework for reference implementations of the compared methods described in the paper

- reproduce the experimental results with the authors' own implementation

- explore other possible applications of variational Bayes besides heteroscedastic regression

[1] OpenReview location: https://openreview.net/forum?id=B1l08oAct7


Project # 5 Group members:

Rebecca, Chen

Susan,

Mike, Li

Ted, Wang

Title: Kaggle Challenge: Quick, Draw! Doodle Recognition Challenge

Description:

Classification has become increasingly prominent, especially with the rise of machine learning in recent years. Our team is particularly interested in machine learning algorithms optimized for a specific type of image classification.

In this project, we will dig into the base classifiers we learned in class and try to combine them to find an optimal solution for a certain type of image dataset. Currently, we are looking into a dataset from Kaggle: the Quick, Draw! Doodle Recognition Challenge. The dataset in this competition contains 50M drawings across 340 categories. It is a subset of the world’s largest doodling dataset, which is continually updated by players of a real drawing game. Anyone can contribute by joining it (quickdraw.withgoogle.com).

As machine learning students, we are eager to help find a better classification method. By “better”, we mean one that strikes a balance between simplicity and accuracy. We will start with neural networks using different activation functions in each layer, and we will also combine base classifiers through ensemble learning with bagging, random forest, and boosting. We will also regularize our parameters to avoid overfitting the training dataset. Finally, we will summarize the features of this type of image dataset, formulate our solutions, and standardize our steps for solving this kind of problem.

Hopefully, we can not only finish our project successfully but also make a small contribution to the field of machine learning research.
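The two basic ingredients of bagging, resampling the training data and aggregating the base classifiers' votes, might be sketched as follows. This is a plain-Python illustration under stated assumptions, not the team's implementation: the helper names `bootstrap_sample` and `majority_vote` and the three toy prediction lists are hypothetical, and the training of the base classifiers themselves is left out.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement (the bagging resample)."""
    return [rng.choice(data) for _ in data]


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by plurality.

    If several labels tie, Counter keeps the one that was seen first.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three base classifiers on four doodles.
model_a = ["cat", "dog", "cat", "bird"]
model_b = ["cat", "cat", "cat", "bird"]
model_c = ["dog", "dog", "cat", "cat"]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)

# Each base classifier would be trained on its own resample of the data.
rng = random.Random(0)
resample = bootstrap_sample(list(range(10)), rng)
```

Random forest and boosting build on the same idea of aggregating many weak learners, but differ in how the learners are trained and weighted.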


Project # 6 Group members:

Ngo, Jameson

Xu, Amy

Title: Kaggle Challenge: Human Protein Atlas Image Classification

Description:

We will participate in the Human Protein Atlas Image Classification competition featured on Kaggle. We will classify proteins based on patterns seen in microscopic images.

Historically, the work done to classify proteins has only used single patterns in very few cell types. The goal of this challenge is to develop methods to classify proteins based on multiple/mixed patterns and with a larger range of cell types.
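Because an image may show multiple or mixed patterns, the task is naturally a multi-label one: each image gets a set of labels and not a single class. A minimal sketch of that representation follows, assuming a model that outputs one score per pattern; the function names, the 0.5 threshold, and the four-class example are hypothetical.

```python
def scores_to_labels(scores, threshold=0.5):
    """Return the set of class indices whose score reaches the threshold."""
    return {i for i, score in enumerate(scores) if score >= threshold}


def multi_hot(labels, num_classes):
    """Encode a set of class indices as a 0/1 vector of length num_classes."""
    return [1 if i in labels else 0 for i in range(num_classes)]


# Hypothetical per-pattern scores for one image with four possible patterns.
scores = [0.9, 0.1, 0.7, 0.3]
predicted = scores_to_labels(scores)   # patterns 0 and 2 pass the threshold
encoded = multi_hot(predicted, num_classes=len(scores))
print(predicted, encoded)
```

Unlike a single-label classifier, which picks only the highest score, this keeps every pattern that clears the threshold, so one image can carry several labels or none.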


Project # 7 Group members:

Qianying Zhao

Hui Huang

Meiyu Zhou

Gezhou Zhang

Title: Google Analytics Customer Revenue Prediction

Description: Our group will participate in the featured Kaggle competition Google Analytics Customer Revenue Prediction. In this competition, we will analyze a customer dataset from a Google Merchandise Store, which sells swag, to predict revenue per customer using RStudio. Our presentation report will cover not only the conclusions we reach by classifying and analyzing the provided data with appropriate models, but also how we performed in the contest.