>Internal storage, regardless of how much, will run out quickly. This only appli...

yumraj · on Sept 7, 2023

We’re talking about 192GB of GPU accessible memory here.

Or are you comparing with CPU inference? In which case apples-oranges.

How much do GPUs with 192GB of RAM cost?

Edit: also I think (unverified) very very few systems have multiple PCI 3/4 NVME slots. There are companies with PCI cards that can take NVMEs but that’ll in itself cost, without NVMEs, more than your $900 system.

superkuh · on Sept 7, 2023

Yes, CPU inference. For llama.cpp with Apple M1/M2 the GPU inference (via metal) is about 5x faster than CPU for text generation and about the same speed for prompt processing. Not insignificant but not giant either.

You generally can't hook up large storage drives to nvme. Those are all tiny flash storage. I'm not sure why you brought it up.

yumraj · on Sept 7, 2023

> You generally can't hook up large storage drives to nvme. Those are all tiny flash storage.

What’s your definition of large?

2TB and 4TB NVME are not tiny. You can even buy 8TB NVMEs, though those are more expensive and IMHO not worth it for this use case.

2TB NVMEs are $60-$100 right now.

You can attach several of those via Thunderbolt/USB4 enclosures providing 2500-3000 MB/s

LTL_FTC · on Sept 7, 2023

“external USB3 SSD... slowly” so which is it? Sata ports aren’t exactly faster than usb3. If you want speed you need pcie drives. Not sata. Thunderbolt is a great solution. Plus, my network storage sustains 10Gb networking. There are other avenues

GeekyBear · on Sept 7, 2023

> Actual desktop PCs have many SATA ports

How many of those PCs have 10 Gigabit Ethernet by default? You can set up fast networked storage in any size you like and share it with many computers, not just one.

andromeduck · on Sept 7, 2023

Who TF is still using SATA with SSDs?!