F21-STAT 940-Proposal

From statwiki

Latest revision as of 12:50, 16 December 2021

Use this format (Don’t remove Project 0)

Project # 0 Group members:

Abdelkareem, Youssef

Nasr, Islam

Huang, Xuanzhi


Title: Automatic Covid-19 Self-Test Supervision using Deep Learning

Description:

The current health regulations in Canada mandate that all arriving travelers who have to quarantine take a COVID-19 self-test at home. However, the process currently involves a human agent joining a video call to walk the traveler through the steps. The idea is to create a deep learning pipeline that takes a real-time video stream of the user and guides them through the process from start to end with no human intervention. To our knowledge, we would be the first to try to automate such a process using deep learning.

The steps of the test are as follows:
 - Pick up the swab
 - Place the swab in the nose up to a particular depth
 - Rotate the swab in place for a certain period of time
 - Return the swab to the required area
 - Reference video: https://www.youtube.com/watch?v=jDIUFDMmBDo

Our pipeline will perform the following steps:
 - Use a real-time face detector (such as SSD-MobileNetV2) to detect a bounding box around the user's face.
 - Use a real-time landmark detector to accurately detect the location of the nose.
 - Use a novel model to classify the swab position as one of: Inside Left Nostril, Inside Right Nostril, Outside Nose.
 - Once the swab is detected to be inside the nose, track a marker attached to the swab (such as an ArUco marker) in real time using OpenCV. We will detect the location of the marker relative to the nose and make sure the swab is correctly placed and rotated for the required period.
 - The whole pipeline will be developed as a state machine: whenever the user violates the rule of a certain step, they go back to the corresponding step and repeat it.
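The state-machine idea in the last step could be sketched roughly as follows. This is only an illustration: the step names, the `step_ok` flag (standing in for the output of the vision models) and the default fallback rule (repeat the current step on a violation) are assumptions, not part of the proposal.

```python
from enum import Enum, auto


class Step(Enum):
    PICK_UP_SWAB = auto()
    INSERT_SWAB = auto()
    ROTATE_SWAB = auto()
    RETURN_SWAB = auto()
    DONE = auto()


# Order of the test steps as listed in the proposal.
ORDER = [Step.PICK_UP_SWAB, Step.INSERT_SWAB, Step.ROTATE_SWAB,
         Step.RETURN_SWAB, Step.DONE]


def next_step(current, step_ok, fallback=None):
    """Advance when the current step is satisfied; otherwise go back.

    `step_ok` is a hypothetical boolean produced by the vision models.
    `fallback` is the step to return to on a violation; by default the
    user simply repeats the current step (an assumption).
    """
    if current is Step.DONE:
        return Step.DONE
    if step_ok:
        return ORDER[ORDER.index(current) + 1]
    return fallback if fallback is not None else current


def run(observations):
    """Feed a sequence of per-check results through the state machine."""
    state = Step.PICK_UP_SWAB
    for ok in observations:
        state = next_step(state, ok)
    return state
```

For example, `run([True, True, False, True, True])` repeats the rotation step once after the failed check and then finishes in `Step.DONE`.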
Challenges:
 - There is no available dataset for training our novel model (to classify whether the swab is inside the nose or not), so we will need to build our own dataset using both synthetic and real data. We will build a small dataset (5000-6000 images) and rely on transfer learning to fine-tune an existing face attribute classification model (for attributes such as moustache and glasses) on it.
 - The whole pipeline will need to operate at a minimum of 30 FPS on CPU in order to match the frame rate of a real-time video stream.
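To make the 30 FPS requirement concrete: it leaves about 1000 / 30 ≈ 33.3 ms per frame for all stages combined. The small helper below checks a set of per-stage latencies against that budget. The stage names and the idea of summing sequential stage latencies are assumptions for illustration; no latencies have been measured here.

```python
TARGET_FPS = 30
FRAME_BUDGET_MS = 1000.0 / TARGET_FPS  # about 33.3 ms per frame


def total_latency_ms(stage_latencies_ms):
    """Total per-frame latency, assuming the stages run one after another."""
    return sum(stage_latencies_ms.values())


def meets_budget(stage_latencies_ms, budget_ms=FRAME_BUDGET_MS):
    """True if the stages fit within the per-frame time budget."""
    return total_latency_ms(stage_latencies_ms) <= budget_ms


def achieved_fps(stage_latencies_ms):
    """Frame rate implied by the total per-frame latency."""
    return 1000.0 / total_latency_ms(stage_latencies_ms)
```

A call would look like `meets_budget({"face": t1, "landmarks": t2, "swab": t3, "marker": t4})`, where each `t` is a latency in milliseconds measured on the target CPU.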



Project # 1 Group Members:

Feng, Jared

Salm, Veronica

Jones-McCormick, Taj

Sarangian, Varnan

Title: An Arsenal of Augmentation Strategies for Natural Language Processing

Description: Data augmentation is an effective technique for improving the generalization power of modern neural architectures, especially when trained on smaller datasets. Strategies for modifying training examples in domains such as computer vision and speech processing are typically straightforward; however, this becomes more challenging in text processing, where such generalized approaches are non-trivial. Unlike with images, where most geometric operations typically preserve the images' significant features, incorporating augmentation in text at the syntactic level can reduce the data quality, as it may produce noisy examples that are no longer human-readable.

The purpose of this project is to investigate data augmentation techniques in NLP that support training robust neural networks on smaller datasets. Ideally, this may act as an alternative to relying on fine-tuning computationally demanding pretrained models. We will start with a survey of existing text augmentation techniques and evaluate them on several downstream tasks (using existing benchmark datasets and possibly novel ones). We will further attempt to adapt popular augmentation approaches from other domains (e.g., computer vision) to an NLP setting. This will include an empirical investigation to determine whether it is more appropriate to augment at the syntactic (i.e., raw text) or semantic (i.e., encoded vectors) level.
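As a small illustration of augmentation at the syntactic (raw text) level, the sketch below implements two simple token-level operations, random swap and random deletion. These are examples only, not necessarily the techniques the project will evaluate, and they also show how such edits can produce the noisy, less readable examples mentioned above.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Swap two randomly chosen token positions, n_swaps times."""
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p (keep at least one)."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


rng = random.Random(0)  # fixed seed so the augmentation is reproducible
sentence = "data augmentation improves generalization on small datasets".split()
swapped = random_swap(sentence, n_swaps=1, rng=rng)
shortened = random_deletion(sentence, p=0.2, rng=rng)
```

Both functions return new lists and leave the input untouched, so several augmented variants can be drawn from the same original sentence.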


Project # 2 Group members:

Shi, Yuliang

Liang, Wei


Title: Classification of COVID-19 Cases via Fine tuning model on X-Ray Chest Images

Description:

COVID-19 is a contagious disease that emerged in Wuhan, China in December 2019 and has been spreading rapidly worldwide, presenting unprecedented challenges. Infected people may have a wide range of symptoms such as fever, cough, fatigue, etc. Researchers are working tirelessly to better understand the mechanisms of death caused by COVID-19. However, whether COVID-19 can be detected automatically from chest X-ray images still requires further investigation. This motivates us to consider deep learning methods for classifying X-ray images of COVID-19 patients. In this study, we are going to use data collected from multiple sources of X-ray images, with three categories: Normal, COVID-19, and Pneumonia. In total, it contains 6432 X-ray images, of which 20% are used as test data. The images are published on the Kaggle website.
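With 6432 images, a 20% test split corresponds to roughly 1286 images. One way to produce such a split while keeping the three class proportions similar in both parts is a stratified split, sketched below. Whether the Kaggle data is already split this way is not stated, so the stratification and the toy label counts in the example are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


# Toy labels (not the real class counts of the dataset).
toy_labels = ["Normal"] * 10 + ["COVID-19"] * 5 + ["Pneumonia"] * 15
train_idx, test_idx = stratified_split(toy_labels, test_fraction=0.2)
```

In the toy example each class contributes 20% of its images to the test set, so a small class such as COVID-19 is still represented there.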

Purpose:

The objective of this project is to classify chest X-ray images into three categories: COVID-19, Pneumonia, and Normal. This is an image classification problem, and we build our model on convolutional neural networks (CNNs), with the chest X-ray images as inputs and the disease categories as outputs. CNNs are powerful tools in computer vision and have achieved extensive success. A typical CNN architecture consists of input layers, convolutional layers, pooling layers and fully connected output layers. We will tailor our networks based on this classical CNN architecture and adopt some techniques for more accurate detection of COVID-19 from the X-ray images.

Challenges and Improvements:

We will develop a fine-tuning procedure to reduce the computational burden, and compare the structures and performance of self-built and pre-trained models on this large collection of images.



Link To Proposal: STAT 940 Proposal (https://www.overleaf.com/read/tzqtxzykhggc)


Project # 3 Group Members:

Leung, Alice

Yalsavar, Maryam

Jamialahmadi, Benyamin


Title: MLP-Mixer: MLP Architecture for NLP

Description: Although transformers and other attention-based architectures are common components of most state-of-the-art models in both vision and natural language processing (NLP), the recent paper “MLP-Mixer: An all-MLP Architecture for Vision” (https://arxiv.org/pdf/2105.01601.pdf) introduced a model based entirely on multi-layer perceptrons (MLPs) that is competitive with these advanced methods in both accuracy and computational cost in the field of vision. The paper claims that neither attention nor convolutions are necessary, a claim supported by its well-established experiments and results. Despite these strong results, the application of this model to NLP has not yet been addressed. This project therefore aims to implement this model for NLP tasks. The overall outline and steps are:

1. Paper and code assessment

2. Applying the model to a named entity recognition problem

3. Data preparation

4. Training, fine-tuning, and testing

Challenges: The main challenges will be finding a good dataset for NER and figuring out how to adapt this architecture from a computer vision problem to an NLP problem.
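To make the adaptation question more concrete, here is a minimal NumPy sketch of a single Mixer layer applied to a sequence of token embeddings instead of image patches. It is a simplified illustration, not the project's implementation: layer normalization has no learnable scale or shift, GELU uses the tanh approximation, and treating a padded, fixed-length sentence as the "token" axis is an assumption. The token-mixing weights have a fixed input size, so the sequence length must be fixed, which is one of the issues an NLP adaptation has to handle.

```python
import numpy as np


def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def layer_norm(x, eps=1e-6):
    # Normalize over the last axis (no learnable parameters in this sketch).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron applied along the last axis.
    return gelu(x @ w1 + b1) @ w2 + b2


def mixer_layer(x, params):
    """x has shape (num_tokens, channels); the output has the same shape."""
    # Token mixing: transpose so the MLP acts across the token axis.
    y = mlp(layer_norm(x).T, *params["token"]).T
    x = x + y  # skip connection
    # Channel mixing: the MLP acts on each token's feature vector.
    return x + mlp(layer_norm(x), *params["channel"])


def init_params(num_tokens, channels, token_hidden, channel_hidden, rng):
    def dense(n_in, n_out):
        return rng.normal(0.0, 0.02, (n_in, n_out)), np.zeros(n_out)

    return {
        "token": dense(num_tokens, token_hidden) + dense(token_hidden, num_tokens),
        "channel": dense(channels, channel_hidden) + dense(channel_hidden, channels),
    }


rng = np.random.default_rng(0)
embeddings = rng.normal(size=(8, 16))  # 8 tokens, 16-dimensional embeddings
params = init_params(8, 16, token_hidden=4, channel_hidden=32, rng=rng)
mixed = mixer_layer(embeddings, params)
```

For a task such as NER, a per-token classification head could be placed on the rows of `mixed`, since the layer preserves one output vector per input token.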



Project # 4 Group Members:

Mina Kebriaee


Title: Short-term Forecast for Solar Photovoltaic Farm Output

Description:

Synopsis: Electric power generation at a solar photovoltaic (PV) farm is predicted for the near future based on historical data and weather features. A long short-term memory (LSTM) network is used to perform time-series prediction.

Solar energy is becoming increasingly popular as a viable renewable energy source. It offers many environmental advantages; however, fluctuations with changing weather patterns have always been a problem. While solar may not entirely replace fossil fuels, it has its place in the power generation portfolio. Many electricity companies are looking to add solar energy to their mix of power generation, and to do so they require accurate solar production forecasts. Errors in forecasting could lead to large expenses such as excess fuel consumption or emergency purchases of electricity from competitors, and might force utility operators to perform load shedding, which directly affects electricity end-users. The major challenge in solar energy generation is the intermittency of photovoltaic power generation, mainly due to weather conditions, which are highly nonlinear; as a result, analytical model-based approaches lack the flexibility and complexity required to capture the underlying patterns and behavior.

Applied machine learning (ML) techniques can help improve the prediction accuracy and effectively increase social benefits by using multi-dimensional feature-based networks that digest tens of relevant input data streams, including historical data and local weather conditions. In other words, ML-based tools are becoming indispensable for reducing the effect of uncertainty and energy costs in modern power systems and smart grids. The goal of this project is to implement a machine learning algorithm to predict the electric energy output at a solar PV farm based on local weather and temporal parameters.

This project consists of three main parts:

1. Data: Pre-processing of the raw data files (input) and historical power generation data files (output) from the solar farm to get meaningful numeric values on an hourly basis.

2. Machine Learning: Designing and implementing a deep neural network (a Long Short-Term Memory (LSTM) model) to accurately forecast short-term photovoltaic solar power. This approach exploits the desirable properties of LSTMs, which are a powerful tool for modeling dependencies in data.

3. Application/reporting: Applying the model and reporting the findings, including the accuracy of the results, using the available dataset.
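Between the data and machine learning parts, the hourly series has to be turned into supervised examples for a sequence model. A common way to do this is a sliding window, sketched below. The window length, forecast horizon and the toy series are assumptions for illustration; in practice each element could be a feature vector combining power output and weather variables.

```python
def make_windows(series, n_lags, horizon=1):
    """Turn a series into (input window, target) pairs.

    Each input is `n_lags` consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - n_lags - horizon + 1):
        inputs.append(series[start:start + n_lags])
        targets.append(series[start + n_lags + horizon - 1])
    return inputs, targets


# Toy hourly values (not real solar farm data).
toy_series = [0, 1, 2, 3, 4, 5]
windows, next_values = make_windows(toy_series, n_lags=3, horizon=1)
```

Here the first example uses hours 0 to 2 to predict hour 3; a larger `horizon` shifts the target further into the future.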



Project # 5 Group Members:


Shervin Hakimi

Mehrshad Sadria


Title: Kidney Function Analysis Using Deep Neural Networks

Description:

The kidney is one of the most intriguing organs of our body: a single kidney alone has enough resources for the body to function. It filters the blood and regulates fluids in the body, and is one of the most essential organs. Therein lies the problem: when this organ becomes diseased, multiple issues can follow, such as chronic kidney disease, renal cell carcinoma, etc. In these cases, the kidney progressively loses its function, which would eventually lead to the person's death. Several factors can be used to assess the well-being of a kidney, for instance the number of glomeruli, cell size, urine secretion, etc. In this project, we aim to use clinical data focusing on kidneys to understand these diseases more thoroughly and, in the end, build a model that can assess the aforementioned clinical features.

Challenges:


To answer this question, we expect to face a few challenges, as follows:

Clinical data annotation is a time-consuming and tedious task that requires the knowledge of an expert in the field, and it may still end up being inaccurate because of human error.

Cleaning biological data can be hard, since the data might have batch effects, missing values, alignment problems, and low image resolution.

Biological processes are complex, which might require different types of datasets at different levels (such as genomics, epigenomics, proteomics, etc.) in order to gain a good understanding of the system.


Project # 6 Group Members:

Edward Bangala

Maruf Sazed

Title: Maximizing Classification Accuracy for a Fixed Model via Generative Adversarial Networks


Description:

In machine learning, models are trained on the training set and inference is done on the test set. To avoid overfitting, a validation set is used. In this approach, several models are tried (with different architectures or through hyperparameter tuning) to identify the best possible model based on the validation set error. That is, the dataset is kept fixed and the models are changed. Even though different data augmentation techniques may be applied to the training set, a model selection process is still needed. In this project, we want to explore whether we can fix a model and find the subset of the training dataset that could result in greater accuracy. We will use a Generative Adversarial Network (GAN) to reconstruct images from the test dataset. If the model is able to learn the distribution of the test set images, then the discriminator might be able to identify the images from the training set that are similar to the test set. The idea is borrowed from Andrew Ng, who recently launched a competition where he asked the participants to identify the best dataset given a fixed model. The idea of using GANs to solve this problem is our own. However, it is well known that GANs are difficult to train, and there is no certainty that a subset of the images will perform better than the full set of training images.
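The selection step described above could look roughly like the sketch below, once a trained discriminator is available. Everything here is hypothetical: `score_fn` stands in for the discriminator's estimate of how similar a training image is to the test set, and keeping a fixed top fraction of the training set is just one possible selection rule.

```python
def select_subset(train_items, score_fn, keep_fraction=0.5):
    """Keep the training items that the scoring function ranks highest.

    `score_fn` is a hypothetical stand-in for a trained GAN discriminator:
    higher scores mean "looks more like the test set".
    """
    ranked = sorted(train_items, key=score_fn, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Toy items with made-up scores, only to show the mechanics.
toy_items = [
    {"id": "img_a", "score": 0.91},
    {"id": "img_b", "score": 0.15},
    {"id": "img_c", "score": 0.67},
    {"id": "img_d", "score": 0.40},
]
subset = select_subset(toy_items, score_fn=lambda item: item["score"], keep_fraction=0.5)
```

The fixed model would then be trained on `subset` and on the full training set, and the two accuracies compared.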