
Short story: CPUs can do calculations, but they generally do them one at a time. Think of something like 1 + 1 = 2. If you had a million equations like that, a CPU would broadly work through them one at a time: the first one, then the second, and so on.
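
In code, the one-at-a-time picture looks roughly like this (a toy Python sketch; real CPUs also have SIMD units, so it is a simplification):

    equations = [(1, 1)] * 1_000_000   # a million tiny "1 + 1" problems

    results = []
    for a, b in equations:             # each addition is handled one after another
        results.append(a + b)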

GPUs were optimised to draw graphics, so they can do thousands of these calculations at once. That means they can be used for AI/ML in both training (gradient descent) and inference (forward passes): because you can do many operations at a time, in parallel, they speed things up dramatically. Geoff Hinton experimented with GPUs to exploit this ability, even though they aren't actually optimised for it; it just turned out to be the best way available at the time, and it still is.
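
A minimal sketch of the same million additions done as one parallel operation, assuming PyTorch and (optionally) a CUDA GPU are available:

    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"

    a = torch.ones(1_000_000, device=device)
    b = torch.ones(1_000_000, device=device)
    c = a + b   # one batched operation instead of a million loop iterations

The same batched style is what makes both forward passes and gradient computations fast on a GPU.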

AI chips are optimised for either inference or gradient descent (or both). They are not good at drawing the way GPUs are; they are optimised for machine learning and for joining up with other AI chips, so you can build massive networks of chips that compute in parallel.
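
A rough illustration of that "many chips in parallel" idea in PyTorch (a hypothetical sketch: the tiny model and device list are made up, and real multi-chip systems use dedicated interconnects and collective operations rather than a Python loop):

    import copy
    import torch

    devices = [f"cuda:{i}" for i in range(torch.cuda.device_count())] or ["cpu"]

    model = torch.nn.Linear(128, 10)    # stand-in for a real network
    batch = torch.randn(64, 128)
    chunks = batch.chunk(len(devices))  # split the batch across the chips

    outputs = []
    for dev, chunk in zip(devices, chunks):
        replica = copy.deepcopy(model).to(dev)   # each chip gets its own copy
        outputs.append(replica(chunk.to(dev)).to("cpu"))

    combined = torch.cat(outputs)       # gather the partial results back together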

One other class of chips that has not shown up yet is ASICs that hard-wire the transformer architecture itself for even more speed, though the architecture still changes too quickly at the moment for that to be worthwhile.
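
For context, the core operation such an ASIC would hard-wire is scaled dot-product attention, roughly this (toy shapes, PyTorch used only for illustration):

    import torch

    q = torch.randn(8, 64)    # queries
    k = torch.randn(8, 64)    # keys
    v = torch.randn(8, 64)    # values

    scores = q @ k.T / 64 ** 0.5             # how well each query matches each key
    weights = torch.softmax(scores, dim=-1)  # normalise into attention weights
    out = weights @ v                        # weighted sum of the values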

There is also the economics of manufacturing at scale: GPUs are currently cheaper per FLOP of compute because their production volume is shared with graphics uses. With time, if there is enough scale, dedicated AI chips should end up cheaper.



Do you have any sources for this information? I find it really hard to find material on what you describe. Also, do you know the details of how those ASICs are produced? Are they CMOS or flash (in-memory compute)?


All current AI accelerators (that aren't research projects) are ordinary CMOS. Google has published papers about TPUv3; you should read them if you want to know more about the architecture of these kinds of chips.



