
> "DALL-E 2's works very simply: ... a model called the prior maps the text encoding to a corresponding image encoding that captures the semantic information of the prompt contained in the text encoding. Finally, an image decoding model stochastically generates an image which is a visual manifestation of this semantic information."

> "The fundamental principles of training CLIP are quite simple: First, all images and their associated captions are passed through their respective encoders, mapping all objects into an m-dimensional space."

Not scared to admit I don't find this simple at all, and I'm probably not in the target audience. I'd love a description that doesn't assume machine learning basics. Is there one?



https://ml.berkeley.edu/blog/posts/dalle2/

it's "simple" because how it works is "just" brute-fucking-force. of course coming up with the architecture and making it fast (so it scales up well) is the challenge.

And scaling works... because... well, no one knows why (but likely because it's just a nice architecture for learning; evolution also converged on something similar without knowing why).

See also: https://www.gwern.net/Scaling-hypothesis



