Text generation quality is low with diffusers

#6
by perk11 - opened

Running the test diffusers code, I'm getting this:
[image: output_t2i]

Even simpler examples seem to have errors in the text:

[image]

Because the model generates images autoregressively, its outputs vary more from run to run than those of diffusion models. You can therefore try generating several images for the same prompt; occasional random errors in individual letters are within normal expectations.
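As a rough sketch of the "generate several images" suggestion, the helper below runs one prompt over a list of seeds so the results can be compared side by side. `generate_candidates` and `generate_fn` are hypothetical names for illustration, not part of this model's code, and whether this model's pipeline accepts a seeded `generator` argument is an assumption.

```python
from typing import Any, Callable, Dict, Iterable


def generate_candidates(
    generate_fn: Callable[[str, int], Any],
    prompt: str,
    seeds: Iterable[int],
) -> Dict[int, Any]:
    """Call generate_fn once per seed and return the results keyed by seed.

    generate_fn is a hypothetical wrapper around whatever pipeline you use.
    With many diffusers pipelines it could look like this (assumption, not
    verified for this model):

        lambda prompt, seed: pipe(
            prompt,
            generator=torch.Generator("cuda").manual_seed(seed),
        ).images[0]
    """
    return {seed: generate_fn(prompt, seed) for seed in seeds}
```

Keying the results by seed makes it possible to regenerate the image with the cleanest text later, provided the pipeline is deterministic for a fixed seed.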

I expected better text quality based on the benchmark results, but that is completely understandable. Thank you for the explanation.

How can we train LoRAs for this model? And when will the optimisations be released?