Papers of the day

Probing Neural Network Comprehension of Natural Language Arguments


Jul 22 2019 hardmaru

Contrary to popular belief, training a gigantic model on a humongous dataset of human text will not lead to AGI. 🙈🧠 Probing Neural Network Comprehension of Natural Language Arguments:
18 replies, 559 likes

Jul 18 2019 Benjamin Heinzerling

BERT is very good at being right for the wrong reasons. Great analysis showing that BERT learns to exploit annotation artifacts better than other models: performance drops from 77% to random chance level when these artifacts are removed.
0 replies, 424 likes

Jul 21 2019 Melanie Mitchell

I'm fascinated by transformer architectures in NLP & curious about what they actually learn. I just read this paper, which shows that, for one dataset on which a transformer is close to "human performance", spurious statistics completely account for the network's success.
7 replies, 181 likes

Jul 23 2019 John Regehr

hell of an abstract
2 replies, 167 likes

Jul 21 2019 /MachineLearning

BERT's success in some benchmark tests may be simply due to the exploitation of spurious statistical cues in the dataset. Without them it is no better than random.
1 replies, 135 likes

Jul 23 2019 Emily M. Bender

Niven & Kao's upcoming #acl2019nlp paper "Probing Neural Network Comprehension of Natural Language Arguments" asks exactly the right question of unreasonable performance: "what has BERT learned about argument comprehension?" Preprint: /1
1 replies, 73 likes

Aug 16 2019 Leon Derczynski

Clever Hans: the horse we thought could do arithmetic, but really was relying on other signals. Really enjoyed this blog post on NLP, Clever Hans, and what to do instead of leaderboards. #nlproc
1 replies, 60 likes

Jul 22 2019 Emmanuel Ameisen

If you look at the dataset to understand how your model performs, you'll often see that your model is actually struggling. Here, BERT's accuracy on the test set drops from 77% to 50% (random) after researchers identify and correct data leakage.
1 replies, 47 likes

Jul 19 2019 Leon Derczynski

Huge if true - this work indicates that BERT exploits dataset artifacts that distort and inflate its results, and when these are cleaned up, the results are markedly less impressive
7 replies, 42 likes

Jul 23 2019 Jens Lehmann

This is an interesting paper, which shows that the way neural networks solve tasks differs substantially from how humans do it. They often defeat benchmarks in ways they were not meant to be defeated, which can lead to an overestimation of their abilities.
0 replies, 30 likes

Jul 22 2019 Skynet Today 🤖

wow, AGI is maybe not around the corner, can you believe it.
0 replies, 25 likes

Jul 23 2019 always @ ( * )

More derp learning -
1 replies, 22 likes

Jul 23 2019 fastml extra

"BERT's peak performance of 77% on the Argument Reasoning Comprehension Task is entirely accounted for by exploitation of spurious statistical cues in the dataset." On a fixed dataset, "all models achieve random accuracy".
1 replies, 18 likes

Jul 21 2019 Dean P

Probing Neural Network Comprehension of Natural Language Arguments by Timothy Niven and Hung-Yu Kao claims that #BERT’s performance on the argument reasoning tasks is due to a problem in the dataset. @GoogleAI
1 replies, 14 likes

Jul 22 2019 Daniel Situnayake

I mean maybe that's how half of us reason, too 🤔
0 replies, 5 likes

Sep 16 2019 Michał Chromiak

This paper claims BERT’s impressive performance might be attributed to “exploitation of spurious statistical cues in the dataset” and that without them, BERT may be no better than random models. #DeepLearning #MachineLearning #ML #DL
0 replies, 4 likes

Jul 19 2019 Max Little

Simple dataset confounding strikes state of the art deep learning once again; the NLP edition.
1 replies, 4 likes

Jul 20 2019 Hugues de Mazancourt

@mlpowered A good illustration in this paper: researchers found why BERT reaches just three points below the average untrained human baseline. It's simply due to spurious statistical cues in the dataset. Always look at the data.
0 replies, 3 likes

Sep 06 2019 cathal horan

Great paper to show the importance of #datasets in #DeepLearning #NLP. It shows that BERT performance in certain tasks is due to "exploiting" #statistical cues, eg negation. Remove negation from data and results are close to random. #MachineLearning
0 replies, 3 likes
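The cue analysis these threads keep referring to can be sketched in a few lines. The `cue_stats` helper and the toy warrant pairs below are invented for illustration; they loosely follow the productivity/coverage style of measurement discussed in the paper, where a cue like "not" is suspicious if it appears disproportionately in correct warrants.

```python
def cue_stats(examples, cue):
    """Measure how exploitable a unigram cue is over labeled warrant pairs.

    Each example is a (correct_warrant, incorrect_warrant) pair of token lists.
    - applicability: the cue appears in exactly one warrant of the pair
    - productivity: of the applicable pairs, the fraction where the cue
      sits in the *correct* warrant (a model could exploit values far from 0.5)
    - coverage: fraction of all pairs where the cue is applicable
    """
    applicable = 0
    productive = 0
    for correct, incorrect in examples:
        in_correct = cue in correct
        in_incorrect = cue in incorrect
        if in_correct != in_incorrect:  # cue distinguishes the two warrants
            applicable += 1
            if in_correct:
                productive += 1
    n = len(examples)
    productivity = productive / applicable if applicable else 0.0
    coverage = applicable / n if n else 0.0
    return productivity, coverage

# Toy data (invented): "not" lands in the correct warrant 2 times out of 3
examples = [
    (["taxes", "are", "not", "fair"], ["taxes", "are", "fair"]),
    (["it", "is", "not", "safe"], ["it", "is", "safe"]),
    (["we", "should", "wait"], ["we", "should", "not", "wait"]),
]
prod, cov = cue_stats(examples, "not")
```

A cue with high coverage and productivity well away from 0.5 lets a classifier beat chance without doing any argument comprehension, which is why balancing the dataset (adding the negated counterparts) pushes models back to random accuracy.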

Nov 07 2019 Jonathan Peck

@mtutek @GaryMarcus @tdietterich It is debatable whether GPT-2 represents actual progress. On some benchmarks at least, its high performance is an illusion.
0 replies, 2 likes

Jul 19 2019 Ramakanth Kavuluru

We didn’t have much luck with BERT in a recent project. It is really troublesome if performance depends on weird annotation artifacts. Need to watch out for things like this in future.
0 replies, 2 likes

Jul 21 2019 Kristen Allen

@alexdaviscmu "We show that this result is entirely accounted for by exploitation of spurious statistical cues in the dataset. We analyze the nature of these cues and demonstrate that a range of models all exploit them."
1 replies, 2 likes

Jul 21 2019 Rohit Pgarg

Probing Neural Network Comprehension of Natural Language Arguments Found via @benbenhh PAPER THREAD 1/
1 replies, 1 likes