
A Primer in BERTology: What we know about how BERT works

Comments

Anna Rogers: Have you been drowning in BERT papers? We have. So... the first ever... Primer in BERTology is out! http://arxiv.org/abs/2002.12327 We survey over 40 papers on BERT's linguistic knowledge, compression, architecture tweaks, multilinguality, and more! With OlgaKovaleva & @arumshisky.

6 replies, 1100 likes


(((ل()(ل() 'yoav)))): superb BERT survey by @annargrs, Kovaleva and @arumshisky. Terrific summary of the many analyses and modification papers, very useful ref and/or starting point. https://arxiv.org/abs/2002.12327

3 replies, 546 likes


Anna Rumshisky: Our much-anticipated BERTology primer is out on arxiv: http://arxiv.org/abs/2002.12327. Why and how does BERT work? What does it learn, and where is it stored? We review 40+ recent papers in search of answers and give our view on future directions (with #OlgaKovaleva & @annargrs).

1 reply, 217 likes


elvis: BERT works well on many NLP tasks. This primer on BERTology aims to answer what knowledge BERT learns and where it is represented, how that knowledge is learned, and what methods researchers are using to improve it. Rogers et al. 2020 paper: https://arxiv.org/abs/2002.12327 https://t.co/eXjF947Mhz

0 replies, 186 likes


Denny Britz: Everyone writing summary and analysis papers like these instead of chasing after SOTA is a true hero 👏💪 With the flood of new models and papers, studies like these are invaluable. They save researchers thousands of hours of time.

1 reply, 130 likes


Shane Gu 顾世翔: The crux of self-supervised (unsupervised) learning lies in the art of defining auxiliary prediction tasks. BERT, SimCLR, etc. revolutionized state-of-the-art results by solving diverse predictive tasks generated automatically. https://arxiv.org/abs/2002.05709

0 replies, 27 likes


Sam Bowman: This is a very clear/useful survey! (And it goes way beyond what I usually think of as 'BERTology' -- there's a comparison of the efficiency of different distillation methods, for example.)

1 replies, 24 likes


مجموعة إيوان البحثية (Iwan Research Group): A paper titled "A Primer in BERTology: What we know about how BERT works". Useful for understanding the different BERT models. https://arxiv.org/abs/2002.12327 https://t.co/vh60mU8C3Q

0 replies, 18 likes


Abhilasha Ravichander: @rctatman Non-exhaustive list of studies I liked: 1) BERT Rediscovers the Classical NLP Pipeline by @iftenney et al., 2) Still a Pain in the Neck by @VeredShwartz & Dagan, 3) How Contextual are Contextualized Word Representations? by @ethayarajh, and a nice survey: https://arxiv.org/abs/2002.12327

0 replies, 16 likes


deepset: Wow! Amazing, condensed overview of the current research landscape on BERT. It covers many interesting findings on improving pretraining and what the model is good/bad at. Very inspiring!

0 replies, 15 likes


arXiv CS-CL: A Primer in BERTology: What we know about how BERT works http://arxiv.org/abs/2002.12327

0 replies, 14 likes


Spiros Denaxas: I really liked this paper, "A Primer in BERTology: What we know about how BERT works" by @annargrs & coauthors; it gives a really useful overview of BERT accessible to non-NLP researchers like myself :) Paper: https://arxiv.org/pdf/2002.12327.pdf, original BERT paper: https://arxiv.org/abs/1810.04805 https://t.co/e96Sbwp1B4

0 replies, 10 likes


Sara Tonelli: Kudos to the all-female authors 👏

0 replies, 7 likes


Arkaitz Zubiaga: The fact that there is material for a 10-page review with nearly 100 references on BERT, a model first published in 2019, says a lot about the speed at which #NLProc is evolving! https://arxiv.org/abs/2002.12327

0 replies, 5 likes


AUEB NLP Group: AUEB NLP meeting, Tue April 7, *18:15-20:00*: "A Primer in BERTology: What we know about how BERT works", Rogers et al. (https://arxiv.org/abs/2002.12327). Study the paper before the meeting. All welcome (but max capacity 100). AUEB NLP & NCSR/DICE members expected to participate.

1 reply, 4 likes


Thiago Guerrera: The following paper provides an overview of the inner workings of BERT-type models. It claims that BERT and other Transformer-based models work remarkably well and points out that we need to better understand why. https://arxiv.org/abs/2002.12327

1 reply, 3 likes


Sai Prasanna: My highlights from "A Primer in BERTology: What we know about how BERT works" by @annargrs, OlgaKovaleva & @arumshisky. https://arxiv.org/abs/2002.12327

1 reply, 2 likes


Anselmo Peñas: Findings around BERT in all its dimensions and applications. Great paper! https://arxiv.org/pdf/2002.12327 It concludes that we need new benchmarks that require verbal reasoning. We are on the way @nlpmaster_uned @UNEDNLP @ETSIIUNED

0 replies, 1 likes


Content

Found on Feb 28 2020 at https://arxiv.org/pdf/2002.12327.pdf

PDF content of a computer science paper: A Primer in BERTology: What we know about how BERT works