The dynamic range is there in a raw file. You just need to apply tonemapping the way cell phones do it.
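To make that concrete, here's a rough sketch in Python of what "tonemapping like a phone" means on linear raw data. The curve and parameters are made up for illustration; real phone pipelines tonemap locally (per region) and merge multiple frames first, but the global idea is the same:

    import numpy as np

    def tonemap(linear, shadow_boost=8.0, gamma=2.2):
        """Crude global tone map: lift shadows / compress highlights with a
        Reinhard-style curve, then apply display gamma."""
        x = np.clip(linear, 0.0, 1.0)
        x = (x * shadow_boost) / (1.0 + x * shadow_boost)
        x /= shadow_boost / (1.0 + shadow_boost)  # renormalize so white stays white
        return np.power(x, 1.0 / gamma)

    # linear_raw = demosaiced, black-subtracted raw data scaled to 0..1
    # out = tonemap(linear_raw)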
My 2008-era 1Ds3 still crushes a modern phone (I have a Pixel 2) in dynamic range when processed properly, and a D7500 is even better still, not to mention the latest full-frame models like the D850 and such.
Phones do, counterintuitively, have the advantage in extreme low light because they're able to stack multi-second exposures very reliably. But then again, I'm comparing my phone against a 2008 camera... the very best latest full frame cameras might be on par (at an extreme depth of field disadvantage though).
From what I can find, the 1DsIII has a dynamic range of 8.8 stops and the iPhone 11 has 10 stops. Have you found measurements otherwise? Single-exposure cameras are really no match for computational photography in this area, because it gives effectively infinite dynamic range. You can take the equivalent of a long-exposure photo for every shadow on a sunny day.
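Back-of-the-envelope on the stacking argument (assuming perfectly aligned frames with uncorrelated noise, i.e. the best case):

    import math

    # Averaging N equally exposed frames cuts random noise by sqrt(N),
    # which is worth 0.5 * log2(N) extra stops at the shadow end.
    for n in (4, 16, 64):
        print(f"{n:3d} frames -> +{0.5 * math.log2(n):.1f} stops of usable shadow DR")

So 16 aligned frames buy roughly two extra stops in the shadows, before you even get to bracketed exposures.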
This site uses user-submitted raw files to calculate the dynamic range. I'm not sure whether this is only a single iPhone shot or not, and it's an Xs, not an 11. (Incidentally, that's data from my personal 1Ds3.)
If you're looking at DxOMark reviews, I'm fairly certain that they do not review ILCs and smartphones on the same scale, and I take every number they give with a huge grain of salt.
Re: computational techniques:
You certainly could get infinite dynamic range, and perhaps the iPhone 11 does better than my Pixel (I hear it uses multiple exposure durations instead of repeating one exposure duration), but at least with my Pixel there's tons of shot noise everywhere, even in the midtones and highlights, because the full well capacity on the dinky sensors is accordingly tiny.
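To put rough numbers on the full-well point (the electron counts and pixel sizes below are assumed ballpark figures, not measured specs):

    import math

    # Shot noise is sqrt(signal), so per-pixel SNR at any given tone scales with
    # how many electrons the pixel can actually hold before clipping.
    full_well = {"phone pixel (~1.4 um)": 5_000, "full-frame pixel (~6 um)": 90_000}

    for name, fw in full_well.items():
        midtone = 0.10 * fw      # a midtone at ~10% of clipping
        print(f"{name}: shot-noise-limited midtone SNR ~ {math.sqrt(midtone):.0f}:1")

Same framing, same tone curve, and the big pixel is sitting on roughly four times the SNR per pixel.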
You just rarely scrutinize a cell phone photo side by side with a large-sensor camera photo, so you gloss over where the heavy sharpening plus noise reduction has smeared the fine detail into paste.
The processing has a lot to overcome, after all: the small lens-sensor combinations are running at the limits of diffraction and thus are extremely soft if viewed at the raw data level. Sharpening them heavily just to level the playing field with a large sensor (which you can safely do, because a lot of the blur is just diffraction) amplifies the noise a ton, beyond what the dynamic range numbers derived from the raws would suggest.
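For a sense of scale on the diffraction point (the apertures, pixel pitches, and sensor widths below are illustrative assumptions, not measurements):

    # Airy disk diameter ~ 2.44 * wavelength * f-number (green light, ~0.55 um).
    wavelength_um = 0.55

    cameras = {
        "phone, f/1.8":    {"f": 1.8, "pixel_um": 1.4, "sensor_width_mm": 5.6},
        "full frame, f/4": {"f": 4.0, "pixel_um": 6.0, "sensor_width_mm": 36.0},
    }

    for name, c in cameras.items():
        airy_um = 2.44 * wavelength_um * c["f"]
        print(f"{name}: Airy disk ~{airy_um:.1f} um, "
              f"~{airy_um / c['pixel_um']:.1f} pixels wide, "
              f"~{airy_um / (c['sensor_width_mm'] * 1000) * 100:.3f}% of frame width")

The phone's blur spot covers more pixels and a bigger fraction of the frame even wide open, and with a fixed aperture it has no way to trade anything for sharpness.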
At 960fps, all 10 shots were taken in about 1/100th of a second, so it's the same as one shot at 1/100th. Probably good enough for most shots except sports/action, and my guess is computational photography can deal with that too in various ways.
No, you can't do a full sensor readout at 960fps. You can only do 960fps by shooting at 2MP and keeping only 8 bits of data per pixel (or even less). Full sensor readout rates on phones are closer to 20-30 fps IIRC, and that's at the low resolutions phones have. Even that comes with caveats (no exposure adjustment, shutter speed must be faster than a certain amount, etc.). And once you start hitting the limits of readout speed, you start increasing rolling shutter issues and so on.
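For comparison, here's what those two readout rates mean for a 10-frame burst (using the rough figures above):

    frames = 10

    print(f"10 frames @ 960 fps span ~{frames / 960 * 1000:.0f} ms")
    print(f"10 full-res frames @ 30 fps span ~{frames / 30 * 1000:.0f} ms (~1/3 s of motion to align)")

So a full-resolution burst isn't a frozen 1/100th-of-a-second slice; it's about a third of a second of subject and hand motion that the alignment step has to deal with.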