Would Statistical Rethinking help me interpret web app metrics? E.g. if I have a canary out and the response times are longer after x requests, is that significant?
I've found that statistics is one of those topics that changes your worldview about everything. You can consider pretty much any issue statistically, and that will enrich your perspective significantly. In that sense, Statistical Rethinking will help. However, it's a book on Bayesian stats, it's quite dense, and its examples are coded in R. It may be overkill for interpreting web app metrics. For that, you may be better served by basic stats and inference: frequencies, descriptive statistics, percentiles, basic distributions, data visualization (e.g., trend lines, scatter plots, boxplots, histograms), etc.
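To make that concrete, here's a minimal sketch in Python (with made-up latency numbers, since I don't know your stack) of the kind of descriptive summary I mean: means and percentiles for two sets of response-time samples.

    import numpy as np

    # Hypothetical response times in milliseconds; real numbers would come
    # from your metrics pipeline.
    baseline_ms = np.random.lognormal(mean=3.9, sigma=0.5, size=10_000)
    canary_ms = np.random.lognormal(mean=4.0, sigma=0.5, size=10_000)

    for name, sample in [("baseline", baseline_ms), ("canary", canary_ms)]:
        p50, p95, p99 = np.percentile(sample, [50, 95, 99])
        print(f"{name}: mean={sample.mean():.1f}ms  "
              f"p50={p50:.1f}ms  p95={p95:.1f}ms  p99={p99:.1f}ms")

Looking at p95/p99 side by side like that often tells you more about a canary than any single test statistic.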
To be clear though, Statistical Rethinking is a beautiful piece of work. You can check out the author's lectures[0] and see how well they suit your needs.
The book is a bit more foundational than that. It teaches you about Bayesian statistics, and discusses (among other things) why the concept of binary yes/no statistical significance is usually not the best way of evaluating a hypothesis with data.
However, for your question specifically, the choice of prior matters less when you have lots of data, and a web app seeing hundreds or thousands of requests per second can presumably gather enough data within a few seconds to tell whether the canary has a different latency profile than the deployed version. Also, you would presumably use an uninformative prior for a test like that. If I were trying to prevent latency regressions in an automated deployment pipeline, I would just compare latency samples after 1 minute with a t-test or something simple like that.
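For what it's worth, here's a rough sketch of that last idea in Python (the sample data is invented; in practice you'd pull the two latency samples from your metrics store after the canary has been serving traffic for a minute). I've used Welch's t-test (equal_var=False) since the two groups may have different variances; a Mann-Whitney U test is a common alternative given how skewed latency distributions usually are.

    import numpy as np
    from scipy import stats

    # Hypothetical latency samples (ms) collected over ~1 minute.
    baseline_ms = np.random.lognormal(mean=3.9, sigma=0.5, size=5_000)
    canary_ms = np.random.lognormal(mean=4.0, sigma=0.5, size=5_000)

    # Welch's t-test: does the canary's mean latency differ from baseline?
    t_stat, p_value = stats.ttest_ind(canary_ms, baseline_ms, equal_var=False)
    print(f"t={t_stat:.2f}, p={p_value:.4f}")

    if p_value < 0.01 and canary_ms.mean() > baseline_ms.mean():
        print("Canary looks slower than baseline; consider rolling back.")

The 0.01 threshold is just a placeholder; you'd tune it to your tolerance for false alarms.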