
Can I train DALL-E 2 on my personal computer with a fairly decent GPU, or is it out of the question?


Nope, and you'll still need a pretty beefy computer just to run the trained model. Currently GPT-NeoX-20B, the "open source GPT-3," requires 42 GB of VRAM, so you're looking at minimum a $5-6k graphics card (though a Quadro RTX 8000 is actually in stock, so there's that). Or use a service like GooseAI.
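A quick sanity check on that 42 GB figure (a back-of-envelope sketch, not an official spec): just storing the weights of a 20B-parameter model at 16 bits per parameter takes about 40 GB, before any activations or caches.

```python
# Weight storage for a 20B-parameter model at fp16 (2 bytes/param).
# This ignores activations, KV caches, and framework overhead, which
# is why the real requirement (42 GB) comes out slightly higher.
params = 20e9
bytes_per_param = 2  # fp16
weights_gb = params * bytes_per_param / 1e9
print(f"{weights_gb:.0f} GB just for weights")  # prints: 40 GB just for weights
```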

Eleuther.ai or some other open source / open research developers will likely try to reproduce DALL-E 2, but it'll take some time and a lot of donated hardware and cycles.


I'm pretty confident that part of OpenAI's competitive edge is that they can train these models on GIANT clusters of machines.

This article predicts that GPT-3 cost $10-$20m to train. I imagine DALL-E could cost even more: https://lastweekin.ai/p/gpt-3-is-no-longer-the-only-game?s=r


Maybe possible with a fabulous GPU, but still likely not, and if it did work it would take a horrendously long time. The real blocker is gonna be GPU memory. With an RTX 3090 you have 24 GB of GPU RAM and _might_ be able to try it, but I'm still not sure it would fit. The key model has 3.5 billion parameters, which at 16 bits per parameter requires 7 GB of GPU memory for each copy. Training requires 3 or 4 copies of the model, depending on the algorithm you use. And then you need memory for the data and activations, which you can reduce with a small batch size. But even if it did fit on a single GPU with a small batch size, you're probably looking at years of training time.

Even an RTX 3080 is a complete non-starter.
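The memory math above can be sketched as a rough lower bound: each 16-bit copy of a 3.5B-parameter model costs params × 2 bytes, and training needs roughly 3-4 such copies (weights, gradients, optimizer state), before counting activations or data.

```python
# Back-of-envelope training memory estimate, following the reasoning
# above. Copies = weights + gradients + optimizer state; activations
# and batch data are deliberately excluded, so this is a lower bound.
def training_memory_gb(params_billions, bytes_per_param=2, copies=4):
    """Rough lower bound in GB for training a model of the given size."""
    return params_billions * 1e9 * bytes_per_param * copies / 1e9

per_copy = training_memory_gb(3.5, copies=1)  # one fp16 copy
training = training_memory_gb(3.5, copies=4)  # 4 copies during training
print(f"{per_copy:.1f} GB per copy, ~{training:.1f} GB for training")
# prints: 7.0 GB per copy, ~28.0 GB for training
```

Even this lower bound already exceeds the 24 GB on an RTX 3090, which is why the fit is doubtful.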


Something like the Quadro RTX 8000 may theoretically work; it does have 48 GB of RAM [1].

[1] https://www.nvidia.com/content/dam/en-zz/Solutions/design-vi...


Sure, with a $5k card like that it would be physically capable, but still unreasonably slow. FWIW the RTX 8000 is previous-generation; the current gen is the A6000, which is also $5k and similarly spec'd but faster. If you're looking for a deep learning card, they're actually a great value - I've got a bunch of them.


I wonder whether it could be adapted to run on Apple M1 Ultra hardware, which can have 128 GB of on-chip "unified" memory. I can't find right away how much of that is available to the GPU or the neural cores, but if it's even half, perhaps Apple will have the AI market cornered.


This is a cute question. Not today! I hope someone comes back to read this question in 10-15 years time, when we will all have the ability to train Dall-E quality models on our AR glasses.


Unfortunately, it is out of the question. OpenAI trains on hundreds of thousands of dollars' worth of GPUs, and even then a training run takes two weeks. Also, as far as I know, their training data (400M image/caption pairs) is not available to the public!


It is more like $10M+ for a single run for the latest-generation models [1]. This is a key reason why there aren't many such models out there.

Few groups have that kind of money to commit, and the viability is not yet very clear, i.e. whether a commercialized model would make enough to recoup the investment.

There is also the cost of running the model on each API call, and that's before factoring in employee and other costs for sales, marketing, etc.

[1] https://venturebeat.com/2020/06/01/ai-machine-learning-opena...
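The recoup-the-investment point is easy to make concrete. A purely illustrative break-even sketch, where every number is an assumption (only the $10M+ training figure comes from the thread; the per-call margin is made up):

```python
# Illustrative break-even on a one-time training cost. The margin per
# API call is an assumed placeholder, not a real price from any vendor.
train_cost = 10e6        # assumed one-time training cost, ~$10M+
margin_per_call = 0.01   # assumed profit per API call after compute costs
calls_to_break_even = train_cost / margin_per_call
print(f"~{calls_to_break_even:.0e} calls to recoup training")
```

At a penny of margin per call, that's on the order of a billion calls before the training run pays for itself, which is why viability is such an open question.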


Fortunately, there are even larger public datasets, like LAION-5B.


Never gonna happen ha.



