As promised some more on this. The first thing I thought, on seeing this paper – a feeling that others apparently shared – was, why had no-one else already thought of this? Had we all just behaved like the fabled economist who, when their companion points out a £10 note lying on the pavement, ignores it, saying “If there really was a £10 note, someone would have picked it up already”?

Certainly the Schwartz fiasco will have put people off pursuing this approach, as many of us had shown, via a variety of arguments, that the theoretical relationship in the simple one-box climate model, which directly links the autocorrelation of internal variability to the equilibrium response, cannot be used to diagnose the latter from the former in more complex climate models. Of course, this is not quite what Cox et al do; rather, they show a strong correlation between their measure of variability and the sensitivity across the ensemble of CMIP5 models. One complication in their analysis is that they measure variability from the 20th century simulations. Most of the variation in temperature seen in the 20th century is actually the response to external forcing, and this forcing is far from the white noise assumed in Cox et al's analysis (even after detrending, the variation about the trend is not white noise either). This would seem to undermine the theoretical basis for their relationship.
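The point about detrending can be illustrated with a toy calculation (my own sketch with made-up numbers, not anything from Cox et al): white-noise internal variability plus a smooth stand-in for the forced response, with a linear trend then removed in the usual way. The residuals are far from white:

```python
import numpy as np

def lag1(x):
    """Empirical lag-1 autocorrelation."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

rng = np.random.default_rng(0)
n, reps = 100, 500                 # ~20th-century record length, many realisations
t = np.arange(n)
# Smooth hypothetical stand-in for the non-linear forced response
forced = 1.0 * np.sin(2 * np.pi * t / 60)

r_noise, r_resid = [], []
for _ in range(reps):
    noise = rng.standard_normal(n)           # genuinely white internal variability
    series = noise + forced
    # Remove a linear trend, as is commonly done
    resid = series - np.polyval(np.polyfit(t, series, 1), t)
    r_noise.append(lag1(noise))
    r_resid.append(lag1(resid))

print(round(float(np.mean(r_noise)), 2))   # near zero: the noise really is white
print(round(float(np.mean(r_resid)), 2))   # clearly positive: residuals are not white
```

The detrended residuals inherit strong year-to-year persistence from the smooth forced signal, even though the underlying internal variability has none.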

So, rather than using the 20th century simulations, I've had a quick look at the pre-industrial control simulations, in which models are run for lengthy periods of time with no changes in external forcing. In all the following analyses I have restricted my attention to the models for which I had at least 500y of pre-industrial control simulation, so that the behaviour of each model would be well characterised (it is well known that the empirical estimate of the lag-1 autocorrelation tends to be substantially biased low for short time series). This restricted my set to 13 models. In this set I included both of the MIROC models (MIROC5 and MIROC-ESM), which Cox et al used as alternates, as I happen to know that the changes between these two generations are substantial and were specifically made to affect the climate sensitivity-relevant processes, as can be seen in their widely differing equilibrium sensitivities. It may however be that my results are themselves somewhat sensitive to the choice of models.
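The small-sample bias in the lag-1 autocorrelation estimate is easy to demonstrate with synthetic AR(1) data (a sketch with made-up parameters, nothing to do with any particular model):

```python
import numpy as np

def lag1(x):
    """Empirical lag-1 autocorrelation."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

def ar1(alpha, n, rng):
    """Simulate a unit-variance AR(1) process with lag-1 autocorrelation alpha."""
    x = np.empty(n)
    x[0] = rng.standard_normal()
    for i in range(1, n):
        x[i] = alpha * x[i - 1] + np.sqrt(1 - alpha**2) * rng.standard_normal()
    return x

rng = np.random.default_rng(1)
alpha = 0.6   # hypothetical true value
short = [lag1(ar1(alpha, 50, rng)) for _ in range(500)]    # 50y segments
long_ = [lag1(ar1(alpha, 500, rng)) for _ in range(500)]   # 500y control runs

print(round(float(np.mean(short)), 2))   # noticeably below the true 0.6
print(round(float(np.mean(long_)), 2))   # much closer to 0.6
```

With 50y segments the estimate is biased low by several hundredths; with 500y runs the bias is an order of magnitude smaller, which is why I insisted on long control runs.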

So, firstly, here’s a quick look at whether the lag-1 autocorrelation of annual mean temperature is related to the equilibrium sensitivity across this set of models:

Nope. The regression line is nearly flat and nowhere near significant.

However, this isn’t quite what Cox et al presented. They actually calculated a function psi which depends on the magnitude of interannual variability as well as its persistence. In fact their psi is defined as sd/sqrt(-log(alpha)), where sd is the standard deviation of interannual variability and alpha is the lag-1 autocorrelation coefficient. They argue that this is the most relevant diagnostic as it is linearly related to sensitivity in their theoretical case. Sure enough, when we calculate psi for the control simulations and correlate this with sensitivity we see:

There is a significant correlation at the 5% level! Just to be clear, the values of psi here are not the same ones that Cox et al calculate; instead I’ve applied their formula to the model data from the control simulations in order to eliminate the effect of external forcing. So why does this work when the lag-1 autocorrelation alone does not?
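For concreteness, applying their formula to a control run amounts to something like the following (a sketch assuming annual means in a plain array; the check at the end uses synthetic AR(1) data with known properties, not CMIP5 output):

```python
import numpy as np

def psi(temps):
    """Cox et al's variability metric: sd / sqrt(-log(lag-1 autocorrelation)).

    temps: 1-D array of annual-mean temperatures from a control run
    (no external forcing, so no detrending should be needed).
    """
    x = np.asarray(temps, dtype=float)
    x = x - x.mean()
    sd = x.std(ddof=1)
    alpha = np.dot(x[:-1], x[1:]) / np.dot(x, x)   # lag-1 autocorrelation
    return sd / np.sqrt(-np.log(alpha))

# Synthetic check on a long AR(1) series with alpha = 0.5 and unit variance,
# for which psi should be close to 1/sqrt(-log(0.5)) ~ 1.2:
rng = np.random.default_rng(2)
a, n = 0.5, 5000
x = np.empty(n)
x[0] = rng.standard_normal()
for i in range(1, n):
    x[i] = a * x[i - 1] + np.sqrt(1 - a**2) * rng.standard_normal()
print(round(psi(x), 2))
```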

Well, the answer is found by checking the relationship between the standard deviation (the numerator in their psi function) and sensitivity, and here it is:

This is actually a much stronger correlation than the previous one, now significant at the 1% level. Of course we have no direct measure of the magnitude of internal variability of the real climate system, but this could be reasonably estimated by subtracting the forced response from the observations (by some combination of statistical and/or model-based calculation). So this relationship could in principle also be used as an emergent constraint (without prejudice as to its credibility).

In terms of the simple one-box climate model, the differing magnitudes of interannual variability across the ensemble could be due to variation in the (internally-generated) radiative imbalance on the interannual time scale, in the effective heat capacity of the thin layer that reacts on this time scale, or in the radiative feedback lambda, which is inversely proportional to sensitivity. I suppose more detailed examination of model data might reveal which factor is most important here. I would be very surprised if people haven’t already looked into this in some detail, and I don’t propose to do so myself at this point. Certainly many people have looked at variability on various space and time scales and tried to relate it to equilibrium sensitivity. Anyway, at this point I think I should call a halt and “reach out to” (don’t you hate that phrase) Andy Dessler and perhaps one or two others to ask if this strong correlation makes sense to them. I can’t help but think it would have been noticed previously if it’s actually robust (eg if it exists across CMIP3 as well as CMIP5). And if not, maybe it’s just luck.
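The one-box reasoning can be checked numerically. For the stochastic one-box model C dT/dt = -lambda*T + white forcing, a little algebra gives sd = sigma_Q/sqrt(2*lambda*C) and -log(alpha) = lambda*dt/C, so psi = sigma_Q/(lambda*sqrt(2*dt)): the heat capacity cancels and psi scales as 1/lambda, i.e. linearly with sensitivity. A sketch with made-up values for C, lambda and the forcing amplitude:

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_one_box(lam, C=8.0, sigma_Q=1.0, n=20000, dt=1.0):
    """Exact annual discretisation of C dT/dt = -lam*T + white forcing."""
    alpha = np.exp(-lam * dt / C)
    innov_sd = sigma_Q * np.sqrt((1 - alpha**2) / (2 * lam * C))
    T = np.empty(n)
    T[0] = 0.0
    for i in range(1, n):
        T[i] = alpha * T[i - 1] + innov_sd * rng.standard_normal()
    return T

def psi(x):
    """Cox et al's metric: sd / sqrt(-log(lag-1 autocorrelation))."""
    x = x - x.mean()
    alpha = np.dot(x[:-1], x[1:]) / np.dot(x, x)
    return x.std(ddof=1) / np.sqrt(-np.log(alpha))

# lambda in W/m2/K, spanning a plausible feedback range; the product
# lambda*psi should be roughly constant (theory: sigma_Q/sqrt(2) ~ 0.71)
products = {lam: lam * psi(simulate_one_box(lam)) for lam in (0.8, 1.2, 1.6)}
for lam, p in products.items():
    print(lam, round(p, 2))
```

In this idealised setting lambda*psi is indeed roughly constant across the "models", which is the basis for treating psi as linearly related to sensitivity.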

I’m enjoying reading this even if I’m kinda skimming over the details.

In your middle figure, for psi, where you get 5% sig corr, it looks by eye as though that rather depends on the two outlying points (CS ~2 and ~4.5). If you drop those out, how does it look? And is it at all fair to ask that?

Entirely fair question though to some extent the significance is an indication of how robust the relationship is and cutting out the prominent points could be considered a sort of reverse cherry-picking. The answer: without those two points the significance disappears as you’d suspected (though remains in the last plot).

Hi James

I dgitised your ECS vs psi plot and bootstrapped the correlation (1000 samples). Bootstrapping gives a 1% chance that the correlation is below zero. If the lowest ECS model (inmcm4, I presume) were not present then there would be a 4% chance of r < 0. If neither the lowest no highest ECS models were present then there would be about a 9% chance of r <0.