Difference between revisions of "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

Line 14:

===Inter-sentence coherence loss===
− [[File:ConvexSmooth.PNG|frame|Relationship between convexity and smoothness.]]
+ [[File:Classification performance.JPG | center]]

===Removing dropout===

Revision as of 19:06, 2 November 2020

Presented by

Maziar Dadbin

Introduction

Motivation

Model details

Factorized embedding parameterization

Cross-layer parameter sharing

Inter-sentence coherence loss

Classification performance.JPG

Removing dropout