Yeah, we just have the 100-gig link; at the moment that's about all the GPU clusters can pull, but we'll probably expand bandwidth and storage as we scale.
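Rough back-of-envelope of why the link is about right for now (node count and per-GPU read rate below are illustrative assumptions, not our real numbers):

    # Sketch: how much of a 100 Gb/s storage link a GPU cluster can actually use.
    link_gbps = 100                      # colo <-> GPU cluster link, gigabits per second
    link_gBps = link_gbps / 8            # ~12.5 GB/s of usable sequential bandwidth

    nodes = 16                           # hypothetical number of training nodes
    gpus_per_node = 8
    read_per_gpu_MBps = 80               # assumed per-GPU dataloader read rate

    demand_gBps = nodes * gpus_per_node * read_per_gpu_MBps / 1000
    print(f"link: {link_gBps:.1f} GB/s, demand: {demand_gBps:.1f} GB/s")
    # ~10 GB/s of demand vs ~12.5 GB/s of link: the 100-gig link is roughly saturated,
    # which is why bandwidth gets expanded alongside storage as the cluster grows.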
I guess it's worth noting that we do have a bunch of 4090s in the colo, and they've been super helpful for e.g. calculating embeddings and such for data splits.
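A minimal sketch of the kind of job those cards handle, assuming a sentence-transformers model (the model name, batch size, and dedup check are illustrative, not the actual pipeline):

    # Embed documents on GPU so they can be clustered/deduplicated into data splits.
    from sentence_transformers import SentenceTransformer
    import numpy as np

    model = SentenceTransformer("all-MiniLM-L6-v2", device="cuda")   # fits easily on a 4090

    docs = ["first document ...", "second document ..."]             # placeholder corpus
    emb = model.encode(docs, batch_size=256, convert_to_numpy=True,
                       normalize_embeddings=True)

    # e.g. a crude near-duplicate check via cosine similarity before assigning splits
    sims = emb @ emb.T
    np.fill_diagonal(sims, 0.0)
    print("max cross-doc similarity:", sims.max())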
How did you arrive at the decision not to put the GPU machines in the colo? Were the power costs going to be too high? Or do you just expect to need more physical access to the GPU machines than to the storage ones?
When I was working at sfcompute prior to this, we saw multiple datacenters literally catch fire because the industry wasn't experienced with the power density of H100s. Our training chips just aren't a standard package the way JBODs are.
My info may be dated, but power density has gone up a ton over time. I'd expect a lot of datacenters to have plenty of space, but not much power. You can only retrofit so much additional power distribution and cooling into a building designed for much less power density.
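For a sense of scale, here's the ballpark rack math (approximate public figures, not measurements from any specific facility):

    # Rack-power math behind the "space but not power" problem.
    h100_sxm_tdp_kw = 0.7            # ~700 W per H100 SXM GPU
    server_overhead_kw = 4.0         # rough CPU/NIC/fans/PSU overhead for an 8-GPU HGX box
    per_server_kw = 8 * h100_sxm_tdp_kw + server_overhead_kw    # ~9.6 kW per server

    servers_per_rack = 4
    rack_kw = servers_per_rack * per_server_kw                   # ~38 kW per rack

    legacy_colo_rack_kw = 8          # many older colo racks were provisioned for ~5-10 kW
    print(f"H100 rack: ~{rack_kw:.0f} kW vs legacy rack budget: ~{legacy_colo_rack_kw} kW")
    # A JBOD-heavy storage rack fits the old budget; a dense H100 rack needs several times
    # the power and cooling the building was designed to deliver.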