
Efficient Transformers: A Survey

Comments

Yi Tay: Inspired by the dizzying number of efficient Transformer ("x-former") models coming out lately, we wrote a survey paper to organize all this information. Check it out at https://arxiv.org/abs/2009.06732. Joint work with @m__dehghani @dara_bahri and @metzlerd. @GoogleAI 😀😃 https://t.co/0M7a0oCqdj

16 replies, 911 likes


hardmaru: Finally, a paper that summarizes the recent improvements that make the Transformer much more efficient! https://arxiv.org/abs/2009.06732 https://t.co/2kwW6O7uIQ

2 replies, 630 likes


Oriol Vinyals: Very nice to see this done! Also, this is clearly a busy research area, so try to think "outside of the box" -- or "outside of the circles" : ) #xformers https://t.co/sclZmVGSkG

0 replies, 168 likes


elvis: This year we have been seeing a variety of works introducing techniques for improving Transformers and making them more computationally and memory efficient. This recent survey paper highlights some notable ideas like Reformer and Longformer, among others. https://arxiv.org/abs/2009.06732 https://t.co/mfe4cwAzYl

1 reply, 128 likes


Grady Booch: Someday, somewhere, somebody is going to write the classic book "Design Patterns for Neural Networks". Will it be you?

6 replies, 97 likes


Mostafa Dehghani: If you're looking into applying transformers on large inputs (e.g. long documents, images, videos, etc), or if you are working on a new variant of efficient transformers, this should give you a nice overview of the existing works. https://twitter.com/ytay017/status/1306264630377889792?s=20

2 replies, 69 likes


Thang Luong: Great survey given the influx of papers these days! The works listed seem to focus mostly on "efficient self-attention". One could add another level of categorization for works that improve the Transformer architecture as a whole, e.g., Funnel-Transformer (not my work though!) https://arxiv.org/abs/2006.03236

0 replies, 29 likes


AmsterdamNLP: Super-useful review of efficient transformer models, by a team from GoogleAI and GoogleBrain including Amsterdam's @m__dehghani

0 replies, 16 likes


MT Group at FBK: Our pick of the week: Yi Tay et al. paper "Efficient Transformers: A Survey". By @sarapapi https://arxiv.org/abs/2009.06732 #nlproc #machinelearning @ytay017 @m__dehghani @dara_bahri @metzlerd @GoogleAI

0 replies, 4 likes


Tsun-Yi Yang 楊存毅 🇹🇼🏳️‍🌈: https://arxiv.org/pdf/2009.06732.pdf Efficient Transformer survey: a good paper covering the existing models https://t.co/CbjJPkdUEk

0 replies, 4 likes


Computólogo: This is gold.

0 replies, 3 likes


Sara Papi: An interesting survey on current trends in Transformers, including the so-called "X-Formers" like the Longformer. The paper defines a taxonomy based on computation and memory efficiency @fbk_mt https://arxiv.org/pdf/2009.06732.pdf

0 replies, 3 likes


Rodrigo Rivera-Castro: The popularity of Transformer architectures is through the roof! Almost weekly, new literature appears. Surveys such as the one below bring more clarity. I wish the authors had included benchmarks to better compare these architectures beyond their complexity and class.

0 replies, 2 likes


Marek Bardoński: If you are a confused researcher, here is a great paper that explains and characterises a large and thoughtful selection of recent efficiency-flavored “X-former” models. https://arxiv.org/pdf/2009.06732.pdf

0 replies, 2 likes


Robert Oschler: @Merzmensch

0 replies, 1 like


Content

Found on Sep 16, 2020 at https://arxiv.org/pdf/2009.06732.pdf

PDF content of a computer science paper: Efficient Transformers: A Survey