Stable Diffusion inpainting 1.5 uses a KL autoencoder, but the paper reports its best metric with a VQ-VAE


The Stable Diffusion paper reports its best results (Supplemental material, Section D, Table 8) with a VQ-VAE that has a codebook size of 8192 and a dimension of 3. However, the released 1.5 weights use a KL autoencoder. Does anybody know the reason for this change?

