ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
From statwiki
Revision as of 19:08, 2 November 2020 by Mdadbin
Contents
1 Presented by
2 Introduction
3 Motivation
4 Model details
4.1 Factorized embedding parameterization
4.2 Cross-layer parameter sharing
4.3 Inter-sentence coherence loss
4.4 Removing dropout
Presented by
Maziar Dadbin
Introduction
Motivation
Model details
Factorized embedding parameterization
Cross-layer parameter sharing
Inter-sentence coherence loss
Removing dropout