Efficient Transformers: A Survey


Yi Tay: Inspired by the dizzying number of efficient Transformer ("x-former") models that are coming out lately, we wrote a survey paper to organize all this information. Check it out at Joint work with @m__dehghani @dara_bahri and @metzlerd. @GoogleAI 😀😃

hardmaru: Finally, a paper that summarizes the recent improvements that make the Transformer much more efficient!

Oriol Vinyals: Very nice to see this done! Also, this is clearly a busy research area, so try to think "outside of the box" -- or "outside of the circles" : ) #xformers

elvis: This year we have been seeing a variety of works introducing techniques for improving Transformers and making them more computationally and memory efficient. This recent survey paper highlights some notable ideas like Reformer and Longformer, among others.

Grady Booch: Someday, somewhere, somebody is going to write the classic book "Design Patterns for Neural Networks". Will it be you?

Mostafa Dehghani: If you're looking into applying transformers on large inputs (e.g. long documents, images, videos, etc), or if you are working on a new variant of efficient transformers, this should give you a nice overview of the existing works.

Thang Luong: Great survey given an influx of papers these days! Works listed seem to be more about "efficient self-attention". One could include another level of category that improves Transformer architecture as a whole, e.g., Funnel-Transformer (not my work though!)

AmsterdamNLP: Super-useful review of efficient transformer models, by a team from GoogleAI and GoogleBrain including Amsterdam's @m__dehghani

MT Group at FBK: Our pick of the week: Yi Tay et al. paper "Efficient Transformers: A Survey". By @sarapapi #nlproc #machinelearning @ytay017 @m__dehghani @dara_bahri @metzlerd @GoogleAI

Tsun-Yi Yang 楊存毅 🇹🇼🏳️‍🌈: Efficient transformer survey: a good paper for covering the existing models

Computólogo: This is gold..

Sara Papi: An interesting survey of current trends in transformers, including the so-called "X-Formers" like the Longformer. The paper defines a taxonomy based on computation and memory efficiency @fbk_mt

Rodrigo Rivera-Castro: The popularity of Transformer architectures is through the roof! Almost weekly, new literature appears. Surveys such as the one below bring more clarity. I wish the authors had included benchmarks to better compare these architectures beyond their complexity and class.

Marek Bardoński: If you are a confused researcher, here is a great paper that will explain and characterise a large and thoughtful selection of recent efficiency-flavored “X-former” models.

PDF content of a computer science paper: Efficient Transformers: A Survey