I'm curious about your data set. When I ran it on covidtracking.com data for the US over the past year, the correlation between cases and deaths came out at 0.28-0.30. I slid the series against each other by two, three, and four weeks.
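For anyone who wants to try the same thing, here is a rough sketch of a lagged correlation in pandas. The helper name `lagged_correlation` is made up for illustration, and the numbers are synthetic (a single wave with deaths trailing cases by exactly 21 days), not covidtracking.com data, so the output says nothing about the real figures.

```python
import numpy as np
import pandas as pd

def lagged_correlation(cases, deaths, lag_days):
    """Pearson correlation between cases on day t and deaths on day t + lag_days."""
    # Shifting deaths backwards lines up deaths[t + lag_days] with cases[t].
    # Series.corr skips the NaN rows the shift leaves at the end.
    return cases.corr(deaths.shift(-lag_days))

# Synthetic data for illustration only (assumption, not real counts).
days = pd.date_range("2020-03-01", periods=200, freq="D")
t = np.arange(200)
cases = pd.Series(1000 * np.exp(-((t - 80) / 25.0) ** 2), index=days)
deaths = 0.02 * cases.shift(21)  # deaths follow cases by 21 days

for lag in (14, 21, 28):  # two, three, and four weeks
    print(lag, round(lagged_correlation(cases, deaths, lag), 3))
```

On this toy series the correlation is highest at the lag that was built in, which is the behavior you would hope to see if one fixed lag really explained the data.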
I'm not sure that a simple sliding correlation really captures how treatments, protocols, and behaviors have changed over time. Leaving aside the winter holidays case peak (which is much more multi-modal than the others), I see two peaks:
* A peak of cases around April 11, followed by a peak of hospitalizations on April 22, with a peak of deaths also around April 22.
* A peak of cases around July 22, followed by a peak of hospitalizations around July 26, followed by a peak of deaths around August 4.
If I were going to do a more detailed analysis, I would want to try breaking out individual states or counties (subject to some reasonable population minimum), so that multiple distinct trends within the national data don't interfere with each other.
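As a minimal sketch of what that breakdown might look like, assuming a long-format table with hypothetical columns `state`, `date`, `cases`, `deaths`, and `population` (none of these names come from a real data source), one could pick the best-correlating lag per state and skip states below a population floor. The data here is synthetic and only there to make the example runnable.

```python
import numpy as np
import pandas as pd

def best_lag_by_state(df, lags, min_population):
    """For each state above min_population, return the lag (in days) with the highest cases-to-deaths correlation."""
    results = {}
    eligible = df[df["population"] >= min_population]
    for state, group in eligible.groupby("state"):
        group = group.sort_values("date")
        cases = group["cases"].reset_index(drop=True)
        deaths = group["deaths"].reset_index(drop=True)
        # Correlate cases[t] with deaths[t + lag] for every candidate lag.
        corr_by_lag = {lag: cases.corr(deaths.shift(-lag)) for lag in lags}
        results[state] = max(corr_by_lag, key=corr_by_lag.get)
    return results

# Synthetic waves (assumption): each state peaks at a different time
# and has a different built-in delay between cases and deaths.
t = np.arange(200)

def wave(center):
    return 1000 * np.exp(-((t - center) / 25.0) ** 2)

frames = []
for state, pop, center, lag in [("A", 9_000_000, 60, 14),
                                ("B", 5_000_000, 120, 21),
                                ("C", 50_000, 90, 28)]:
    frames.append(pd.DataFrame({
        "state": state,
        "date": pd.date_range("2020-03-01", periods=200, freq="D"),
        "cases": wave(center),
        "deaths": 0.02 * wave(center + lag),
        "population": pop,
    }))
df = pd.concat(frames, ignore_index=True)

result = best_lag_by_state(df, lags=(14, 21, 28), min_population=1_000_000)
print(result)
```

Keeping the states separate lets each one have its own lag, which a single national correlation would smear together. The population cutoff is arbitrary here; state "C" is only included to show the filter dropping a small region.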
Totally agree. I ran it for New Jersey with similar results, but it is a brute-force approach. Scratching the surface quickly leads to many more variables. For example, more testing would lead to more cases. Then, of course, we'd need to look at how the testing was done (e.g., random or at hospital entry) and which test was used.
I really wish that stochastic testing were discussed more seriously.