Papers of the day

Generating Diverse High-Fidelity Images with VQ-VAE-2

Comments

Aäron van den Oord: VQVAE-2 finally out! Powerful autoregressive models in a hierarchical compressed latent space. No modes were collapsed in the creation of these samples ;) arXiv: http://arxiv.org/abs/1906.00446 With @catamorphist and @vinyals More samples and details 👇 [thread] https://t.co/aIg6sk6aZt

12 replies, 906 likes


Oriol Vinyals: Surprising how simple ideas can yield such a good generative model! -Mean Squared Error loss on pixels -Non-autoregressive image decoder -Discrete latents w/ straight through estimator w/ @catamorphist & @avdnoord VQ-VAE-2: http://arxiv.org/abs/1906.00446 Code: https://github.com/deepmind/sonnet/blob/master/sonnet/examples/vqvae_example.ipynb https://t.co/xhqB2v7Hk7

7 replies, 662 likes
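The three ingredients Vinyals lists hinge on the vector-quantization step: encoder outputs are snapped to their nearest entry in a learned codebook, and gradients are copied straight through the non-differentiable lookup. A minimal NumPy sketch of that lookup (the function name and shapes here are illustrative, not from the linked notebook; the paper's codebook size of 512 is used for the example):

```python
import numpy as np

def vector_quantize(z_e, codebook):
    """Nearest-neighbour lookup of encoder outputs into a learned codebook.

    z_e:      (N, D) continuous encoder outputs.
    codebook: (K, D) embedding vectors -- the discrete latent "vocabulary".
    Returns code indices and quantized vectors z_q. During training, gradients
    are copied from z_q straight back to z_e (the straight-through estimator);
    in PyTorch this is typically written z_q = z_e + (z_q - z_e).detach().
    """
    # Squared Euclidean distance from every encoder vector to every code.
    d = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)   # discrete latents: one code index per vector
    z_q = codebook[idx]      # the vectors actually fed to the decoder
    return idx, z_q

rng = np.random.default_rng(0)
codebook = rng.normal(size=(512, 64))   # K=512 codes, as in the paper
z_e = rng.normal(size=(16, 64))         # a batch of encoder output vectors
idx, z_q = vector_quantize(z_e, codebook)
```

With the lookup made transparent to gradients, the rest is an ordinary autoencoder trained with MSE on pixels, which is why the decoder itself can stay non-autoregressive.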


Oriol Vinyals: Great post by Prof. David McAllester on why discrete representations matter, based on our findings in VQ-VAE2. "Vector quantization seems to be a minimal-bias way for symbols to enter into deep models." https://arxiv.org/abs/1906.00446 https://machinethoughts.wordpress.com/2019/06/25/the-inevitability-of-vector-quantization-in-deep-architectures https://t.co/GbuLBAkcDN

4 replies, 406 likes


Ben Poole: Big hierarchical VQ-VAEs with autoregressive priors do amazing things. Awesome work from @catamorphist @avdnoord @OriolVinyalsML: https://arxiv.org/abs/1906.00446 https://t.co/JpEbEJnXk4

2 replies, 324 likes


roadrunner01: Generating Diverse High-Fidelity Images with VQ-VAE-2 pdf: https://arxiv.org/pdf/1906.00446.pdf abs: https://arxiv.org/abs/1906.00446 https://t.co/LvxiOLpqlL

2 replies, 127 likes


Gene Kogan: For most of the creatives/non-scientists out there, this may seem like just another BigGAN/StyleGAN, but this has important advantages: It's likelihood-based (can be evaluated formally), samples much faster, and should be superior in generator diversity. Really good stuff

3 replies, 114 likes


Xander Steenbrugge: Generative Modelling space on fire! After Google's #BigGan and Nvidia's #StyleGAN we now finally have autoencoder based models that generate samples of equal/better? quality! The sample diversity is especially striking given that mode collapse has always been an issue for GANs.

0 replies, 51 likes


Kyle McDonald: VAE-style networks have surpassed the quality of BigGAN and StyleGAN. i always knew they had it in them 🎉

1 reply, 43 likes


Max Jaderberg: Insanely good samples from the latest incarnation of the VQVAE generative model

1 reply, 40 likes


François Fleuret: Besides the quantitative evidence that they are more robust to mode collapse than GANs, their roots in "classical" density estimation make VAEs more promising as a generic tool. We have had "good enough" classifiers since 2015; maybe we are now also there for density models...

0 replies, 27 likes


Danilo J. Rezende: Great results on generative modelling from @catamorphist, @avdnoord and @OriolVinyalsML !

0 replies, 21 likes


Simon Kornblith: @carlesgelada @timnitGebru @ylecun Here's my favorite example of this, from https://arxiv.org/abs/1906.00446. The left are samples from VQ-VAE2; the right are from BigGAN; both are trained on ImageNet. Isn't it kind of obvious what's going to happen if these algorithms are trained on a face dataset? https://t.co/cvvnG7laDN

3 replies, 18 likes


Daisuke Okanohara: VQ-VAE-2 improves VQ-VAE by using 1) hierarchical latent variables and 2) a prior distribution that matches the marginal posterior, using an auto-regressive model with self-attention, achieving diverse and high-fidelity image generation. https://arxiv.org/abs/1906.00446 https://drive.google.com/file/d/1H2nr_Cu7OK18tRemsWn_6o5DGMNYentM

0 replies, 15 likes
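The two ingredients above meet at sampling time: codes are drawn ancestrally, first from a coarse top-level prior and then from a finer bottom-level prior, before a single decoder pass maps them to pixels. A toy NumPy sketch of that loop (the helper names are illustrative and the uniform logits stand in for the trained priors; in the paper these are PixelCNN-style models with self-attention over 32×32 top and 64×64 bottom code grids, with the bottom prior conditioned on the top codes):

```python
import numpy as np

rng = np.random.default_rng(0)
K = 512  # codebook size

def sample_codes(shape, logits_fn):
    """Ancestral sampling of a grid of discrete codes, one cell at a time.
    logits_fn(codes, i, j) returns K logits conditioned on the codes
    sampled so far (a trained autoregressive prior in the real model)."""
    codes = np.zeros(shape, dtype=np.int64)
    for i in range(shape[0]):
        for j in range(shape[1]):
            p = np.exp(logits_fn(codes, i, j))
            codes[i, j] = rng.choice(K, p=p / p.sum())
    return codes

# Illustrative stand-in for the trained priors: uniform logits.
uniform = lambda codes, i, j: np.zeros(K)

top = sample_codes((8, 8), uniform)       # coarse codes: global structure
bottom = sample_codes((16, 16), uniform)  # fine codes; conditioned on `top`
# in the real model. The decoder then maps (top, bottom) codes to pixels.
```

Because the expensive autoregressive loop runs over small code grids rather than pixels, sampling is much faster than a pixel-level autoregressive model of the same resolution.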


d00d: VAE-based image generation with quality comparable to GAN-generated images, but more variety and faster sampling...

1 reply, 12 likes


Alex Nichol: The VQ-VAE-2 paper is hilariously vague. E.g. "It consists of a few residual blocks followed by a number of strided transposed convolutions". (Paper: https://arxiv.org/pdf/1906.00446.pdf)

1 reply, 6 likes


René Schulte: Impressive new step for generated images. The below photos are all synthesized by an AI 👌 Instead of a GAN they use a Vector Quantized Variational AutoEncoder (VQ-VAE) which makes it easier to handle and much faster. 🚀 https://arxiv.org/abs/1906.00446 #AI #DeepLearning #ML #DNN https://t.co/PZbzIzrj9N

0 replies, 6 likes


Kaixhin: I love this mix between very general (autoregressively-decoded discrete sequences), general (hierarchical structure) and specific (local spatial structure) priors to model complex distributions in the real world 🌏

0 replies, 6 likes


Alex J. Champandard: 2/ At this stage, we know it's possible to generate HD images with many/most techniques. NVIDIA built StyleGAN, OpenAI developed GLOW, DeepMind created VQVAE, etc. Everyone has their favorite! 🐕 https://openai.com/blog/glow/ https://github.com/NVlabs/stylegan https://arxiv.org/abs/1906.00446

1 reply, 4 likes


Alex J. Champandard: 6/ The idea of working in a smaller and coarser space is not new. It's what made GANs scale to 1024x1024 in the first place (progressive growing) and it's the idea that helped VQVAE catch up. https://arxiv.org/abs/1710.10196 https://arxiv.org/abs/1906.00446 https://t.co/RltlVfZvqc

1 reply, 3 likes


Kyle Kastner: @selimonder This paper (along with a few others recently, such as https://arxiv.org/abs/1812.01608 and https://arxiv.org/abs/1906.00446) is exploiting the multi-scale structure inherent in audio and images. That kind of structure is much harder to exploit *easily* in language, where dependencies may not be local.

1 reply, 2 likes


Seth Stafford: THIS: "The ... shift from symbolic logic ... to distributed vector representations is ... viewed as [a] cornerstone of ... deep learning ... I ... believe ... logical symbolic reasoning is necessary for AGI. Vector quantization seems ... a minimal-bias way for symbols to enter ... deep models."

0 replies, 1 like


JFPuget Wash Hands Social Distancing Wear Mask: A clear example of two different ML algorithms. The one on the left has much more diverse outputs (and better-quality ones). Not sure, though, that it is less biased.

0 replies, 1 like


Content

Found on Jun 04 2019 at https://arxiv.org/pdf/1906.00446.pdf

PDF content of a computer science paper: Generating Diverse High-Fidelity Images with VQ-VAE-2