Yeah, the synchronous updates are a big deal. To get an idea of how much bandwidth this typically takes: Oracle has 1600 Gbps of interconnect per node, with very low latency (I'm guessing tens of microseconds). A really good home connection might have 1 Gbps of bandwidth, with latency in the tens of milliseconds. The big question is whether we really need all this interconnect. GPT-JT[1] is a very promising step toward needing less of it: the idea is that you just drop most of the gradient updates and it still works well. It's unclear whether this will take off generally, but if it does it would be a huge deal, because 1600 Gbps of interconnect is very expensive.
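To put rough numbers on that gap, here's a back-of-envelope sketch. The 1600 Gbps and 1 Gbps figures are the ones above; everything else is my assumption for illustration: a 6-billion-parameter model, fp16 gradients (2 bytes each), one full copy of the gradients sent per step, and no latency or protocol overhead. The `keep_fraction` parameter is a hypothetical knob for "send only this share of the gradient", not a number taken from GPT-JT.

```python
def transfer_seconds(num_params, bytes_per_param, link_gbps, keep_fraction=1.0):
    """Seconds to push one copy of the gradients over a link.

    Ignores latency, protocol overhead and all-reduce topology, so treat
    the result as a lower bound on communication time per step.
    """
    bits = num_params * bytes_per_param * 8 * keep_fraction
    return bits / (link_gbps * 1e9)


NUM_PARAMS = 6e9       # assumption: a 6B-parameter model
BYTES_PER_PARAM = 2    # assumption: fp16 gradients

# Link speeds quoted in the comment above.
datacenter = transfer_seconds(NUM_PARAMS, BYTES_PER_PARAM, link_gbps=1600)
home = transfer_seconds(NUM_PARAMS, BYTES_PER_PARAM, link_gbps=1)

# Hypothetical: only 1% of the gradient is actually sent over the home link.
home_sparse = transfer_seconds(
    NUM_PARAMS, BYTES_PER_PARAM, link_gbps=1, keep_fraction=0.01
)

print(f"datacenter link: {datacenter:.3f} s per full gradient exchange")
print(f"home link:       {home:.1f} s per full gradient exchange")
print(f"home link, 1%:   {home_sparse:.2f} s per exchange")
```

Under those assumptions the full exchange takes well under a second on the datacenter link and over a minute and a half on the home link, which is why dropping most of the updates matters so much.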