stat940W25-presentation

From statwiki
Revision as of 11:57, 24 March 2025 by Rtymkow (talk | contribs)

Group 24 Presentation: Mitigating the Missing Fragmentation Problem in De Novo Peptide Sequencing With A Two-Stage Graph-Based Deep Learning Model

Background

- Proteins are crucial for biological functions

- Proteins are built from peptides, which are sequences of amino acids

- Mass spectrometry is used to analyze peptide sequences

- De novo sequencing reconstructs peptide sequences directly from the mass spectrum when the sequences are missing from established protein databases

- Deep learning is now commonly used for de novo peptide sequencing

- When a peptide fails to fragment in the expected manner, the resulting missing data makes it difficult to reconstruct the sequence

- One error in the predicted sequence can propagate to errors throughout the rest of the sequence
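
The missing fragmentation problem described above can be illustrated with a small sketch. It is only a toy: the peptide `SAGVL`, the helper names `prefix_masses` and `mass_gaps`, and the choice to drop one cleavage site are assumptions made for illustration, and ion-type mass offsets are ignored. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (small subset, for illustration only).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203,
                "V": 99.06841, "L": 113.08406}


def prefix_masses(peptide):
    """Cumulative residue mass after each cleavage site (ion offsets omitted)."""
    total, out = 0.0, []
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        out.append(round(total, 5))
    return out


def mass_gaps(observed, total_mass):
    """Differences between consecutive observed masses, from 0 up to the total mass."""
    points = [0.0] + sorted(observed) + [total_mass]
    return [round(b - a, 5) for a, b in zip(points, points[1:])]


peptide = "SAGVL"  # hypothetical example peptide
total = round(sum(RESIDUE_MASS[aa] for aa in peptide), 5)

full = prefix_masses(peptide)
# Simulate missing fragmentation: the cleavage between A and G yields no peak.
observed = [m for i, m in enumerate(full) if i != 1]

gaps_full = mass_gaps(full, total)         # one gap per amino acid
gaps_missing = mass_gaps(observed, total)  # one gap now spans two amino acids
```

With every cleavage observed, each gap matches exactly one residue mass. Once the peak between A and G is dropped, the second gap is roughly 128.06 Da, which fits A+G in either order and is also very close to the mass of a single glutamine residue, so the spectrum alone no longer determines the sequence at that position.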

Paper Contributions

- GraphNovo, a two-stage graph-based model, was developed to handle the incomplete segments caused by missing fragmentation

- GraphNovo-PathSearcher, instead of predicting amino acids directly, uses a path search over a graph to build up the sequence step by step

- GraphNovo-SeqFiller then fills in the amino acids for the parts of the sequence that the path search could not resolve

- The input is a mass spectrum produced by mass spectrometry

- A graph is constructed in which nodes represent possible fragments and edges represent the possible amino acids between them (used by the PathSearcher module)

- PathSearcher uses machine learning to find the optimal path on the generated graph

- SeqFiller fills in missing amino acids that may not have been recovered by the PathSearcher module because the mass spectrometry input lacked the necessary data
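
The graph construction, path search, and gap filling steps above can be sketched in a few lines. This is not the GraphNovo implementation: both PathSearcher and SeqFiller are learned models, and here they are swapped for a plain longest-path search and a brute-force enumeration so the data flow is visible. The node masses, the tolerance `TOL`, and the function names `build_graph`, `longest_path`, and `candidate_sequences` are all assumptions made for illustration.

```python
from itertools import product

# Monoisotopic residue masses in daltons (small subset, for illustration only).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203,
                "V": 99.06841, "L": 113.08406}
TOL = 0.01  # assumed mass tolerance in daltons


def build_graph(node_masses, max_gap_len=2):
    """Connect node i to node j when their mass difference matches one residue
    or a combination of up to max_gap_len residues (a "mass gap")."""
    combos = {}
    for k in range(1, max_gap_len + 1):
        for combo in product(RESIDUE_MASS, repeat=k):
            combos["".join(combo)] = sum(RESIDUE_MASS[a] for a in combo)
    edges = {}
    for i, mi in enumerate(node_masses):
        for j in range(i + 1, len(node_masses)):
            diff = node_masses[j] - mi
            labels = [c for c, m in combos.items() if abs(m - diff) <= TOL]
            if labels:
                edges[(i, j)] = labels
    return edges


def longest_path(n_nodes, edges):
    """Stand-in for the learned path search: the path from the first to the
    last node that passes through the most nodes (nodes are sorted by mass)."""
    best = {0: [0]}
    for j in range(1, n_nodes):
        cands = [best[i] + [j] for (i, jj) in edges if jj == j and i in best]
        if cands:
            best[j] = max(cands, key=len)
    return best.get(n_nodes - 1)


def candidate_sequences(path, edges):
    """Stand-in for the learned gap filler: list every sequence consistent
    with the edge labels along the path."""
    options = [edges[(a, b)] for a, b in zip(path, path[1:])]
    return ["".join(parts) for parts in product(*options)]


# Hypothetical nodes: start, three fragment masses, one noise peak (150.0), end.
nodes = [0.0, 87.03203, 150.0, 215.0906, 314.15901, 427.24307]
edges = build_graph(nodes)
path = longest_path(len(nodes), edges)
candidates = candidate_sequences(path, edges)
```

On this toy input the path skips the noise peak, and the one edge that spans two residues leaves two candidates, `SGAVL` and `SAGVL`. Choosing between such candidates is the part that a learned filler would handle using evidence from the spectrum, which this sketch does not attempt.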