Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets


Aug 24 2019 Jonathan Berant

New #emnlp2019 paper with @megamor2 and @yoavgo about annotator bias. We check whether models capture properties of the annotators rather than the task when annotators create language utterances at scale. You'll never guess what we found out! :)
12 replies, 305 likes

Sep 25 2019 Ben Hamner

Nice demonstration of annotator bias in human-labeled ML datasets and models: in three recent NLP datasets, adding the annotator ID as a feature improved model accuracy by 1.6% to 4.2%
3 replies, 180 likes
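The probe described in the tweet above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the function and field names (`make_features`, `annotator_id`) are invented for the example, and a real setup would feed these features to a classifier and compare accuracy with and without the ID.

```python
# Hypothetical sketch: augment each example's features with the annotator's
# ID, so a model can be compared against a baseline that lacks it.

def make_features(example, use_annotator_id=False):
    """Build simple bag-of-words features, optionally adding the annotator ID."""
    feats = {f"word={w}": 1 for w in example["text"].lower().split()}
    if use_annotator_id:
        # The extra feature lets the model exploit annotator-specific style.
        feats[f"annotator={example['annotator_id']}"] = 1
    return feats

examples = [
    {"text": "the cat sat", "annotator_id": "A17", "label": 1},
    {"text": "a dog ran", "annotator_id": "B03", "label": 0},
]

baseline = [make_features(ex) for ex in examples]
with_id = [make_features(ex, use_annotator_id=True) for ex in examples]
```

If accuracy improves when the ID feature is available, the model is partly fitting the annotator rather than the task, which is the effect the tweet reports.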

Aug 25 2019 Robert Munro

"test set annotators should be disjoint from training set annotators." Really interesting conclusion!
3 replies, 102 likes
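The recommendation quoted above (disjoint annotator sets for train and test) can be sketched as a split over annotators rather than over examples. This is an illustrative sketch only; the field name `annotator_id` and helper `split_by_annotator` are assumptions, not from the paper.

```python
# Hypothetical sketch: assign whole annotators, not individual examples,
# to either the train or the test split, so no annotator appears in both.
import random

def split_by_annotator(examples, test_frac=0.2, seed=0):
    """Split examples so train and test annotator sets are disjoint."""
    annotators = sorted({ex["annotator_id"] for ex in examples})
    rng = random.Random(seed)
    rng.shuffle(annotators)
    n_test = max(1, int(len(annotators) * test_frac))
    test_ids = set(annotators[:n_test])
    train = [ex for ex in examples if ex["annotator_id"] not in test_ids]
    test = [ex for ex in examples if ex["annotator_id"] in test_ids]
    return train, test

data = [{"annotator_id": a, "text": f"utterance {i}"}
        for i, a in enumerate(["A", "A", "B", "C", "C", "D"])]
train, test = split_by_annotator(data)
```

With an example-level split, a model could score well on the test set simply by recognizing the writing habits of annotators it saw during training; the annotator-level split removes that shortcut.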

Nov 05 2019 Jonathan Berant

"Are we modeling the task or the annotator?" #emnlp2019 today (Tue) at 15:30: @megamor2 will present our work on annotator bias in Session 3A (Machine Learning II). Come check it out! @yoavgo
0 replies, 61 likes

Aug 25 2019 (((ل()(ل() 'yoav))))

when it comes to language and its diversity, average humans are really bad at recall, and are not very creative. beware of this.
3 replies, 52 likes

Aug 25 2019 Melanie Mitchell

Good articles on some of the subtle associations and biases that exist in standard ML benchmarks:
1 replies, 31 likes

Sep 26 2019 Ian Soboroff

No matter what the task, annotators will disagree with one another, often quite reasonably. They also make mistakes, sometimes systematically.
0 replies, 8 likes

Aug 25 2019 Hugh Harvey

Another form of dataset bias to watch out for. Very interesting - has anyone done similar with segmentation annotation?
1 replies, 7 likes