A Fair Comparison of Graph Neural Networks for Graph Classification

From statwiki
Revision as of 19:06, 9 November 2020

Presented By

Jaskirat Singh Bhatia

Background

Experimental reproducibility and replicability are critical topics in machine learning. Authors have often raised concerns about the lack of both in scientific publications, with the aim of improving the quality of the field. Recently, graph representation learning has attracted the attention of a wide research community, resulting in a large stream of work. Several Graph Neural Network (GNN) models have been developed to tackle graph classification. However, the experimental procedures often lack rigor and are hard to reproduce. The authors of the paper attempted to reproduce the results of such experiments in order to address the ambiguity of the experimental procedures and the impossibility of reproducing results. They also standardized the experimental environment so that results can be reproduced when using it.

Graph Neural Networks

1. A neural network that takes a graph as input.

2. Tasks include classifying the whole graph or predicting a missing edge or node in the graph.
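To make the idea of a network that consumes a graph more concrete, here is a minimal sketch of one common pattern (message passing followed by a sum readout). It is an illustration only, not one of the models compared in the paper: the function names, the mean aggregation, and the hand-picked `weights` and `bias` are all assumptions made for this example, and nothing is trained.

```python
def message_passing(adj, feats):
    """One round: replace each node's features by the mean over itself and its neighbours."""
    new_feats = []
    for v, neighbours in enumerate(adj):
        group = [v] + neighbours
        dim = len(feats[v])
        new_feats.append(
            [sum(feats[u][d] for u in group) / len(group) for d in range(dim)]
        )
    return new_feats


def readout(feats):
    """Sum node features into one graph-level vector (independent of node order)."""
    return [sum(column) for column in zip(*feats)]


def classify(adj, feats, weights, bias=0.0, rounds=2):
    """Hypothetical binary graph classifier: message passing, readout, linear threshold."""
    for _ in range(rounds):
        feats = message_passing(adj, feats)
    graph_vector = readout(feats)
    logit = sum(w * x for w, x in zip(weights, graph_vector)) + bias
    return 1 if logit > 0 else 0


# Toy input: a triangle graph given as adjacency lists, one scalar feature per node.
triangle_adj = [[1, 2], [0, 2], [0, 1]]
triangle_feats = [[1.0], [2.0], [3.0]]
label = classify(triangle_adj, triangle_feats, weights=[1.0], bias=-5.0)
```

The readout step is what turns a variable-sized set of node features into a fixed-size vector, so graphs with different numbers of nodes can be fed to the same classifier.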

Problems in Papers

Some of the most common reproducibility problems in this field concern hyperparameter selection and the correct use of data splits for model selection versus model assessment. Moreover, the evaluation code is sometimes missing or incomplete, and experiments are not standardized across different works in terms of node and edge features.

These issues easily create doubt and confusion among practitioners who need a fully transparent and reproducible experimental setting. The evaluation of a model goes through two distinct phases: model selection on the validation set and model assessment on the test set. Failing to keep these phases well separated can lead to over-optimistic and biased estimates of the true performance of a model, making it hard for other researchers to present competitive results without following the same ambiguous evaluation procedures.
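The separation of the two phases can be sketched as follows. This is a minimal illustration using a single hold-out split, not the paper's evaluation code: `split_indices`, `select_and_assess`, the split fractions, and the toy threshold "model" are hypothetical names and values chosen for the example.

```python
import random


def split_indices(n, test_frac=0.2, val_frac=0.2, seed=0):
    """Shuffle range(n) and cut it into disjoint train/validation/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test


def select_and_assess(data, candidates, fit, score, seed=0):
    """Pick a hyperparameter on the validation set, then score once on the test set."""
    train, val, test = split_indices(len(data), seed=seed)
    train_set = [data[i] for i in train]
    val_set = [data[i] for i in val]
    test_set = [data[i] for i in test]
    # Model selection: the test set is never touched in this step.
    best = max(candidates, key=lambda hp: score(fit(hp, train_set), val_set))
    # Model assessment: the test set is used exactly once, for the selected model.
    return best, score(fit(best, train_set), test_set)


# Toy problem (assumption for illustration): label is 1 when x >= 50,
# and the only "hyperparameter" is the decision threshold itself.
data = [(x, int(x >= 50)) for x in range(100)]


def fit(threshold, train_set):
    return threshold  # nothing is learned in this toy example


def score(threshold, subset):
    return sum(int(x >= threshold) == y for x, y in subset) / len(subset)


best, test_score = select_and_assess(data, [50, 10, 90], fit, score)
```

Choosing `best` by looking at `test_set` instead of `val_set` would make the reported score part of the selection process, which is the source of the over-optimistic estimates described above.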

Risk Assessment and Model Selection

Risk Assessment