Wednesday, January 31, 2018

Trip to the 21st(?) Century

Went to Reading University last week, to do a lecture as part of a course for PhD students. The course was aimed at teaching paleodata students about models and how they might be useful for their research. A very good idea! Here are some students, and Joy (at the end of the bench) who was in charge, doing a lot of the teaching and most of the lecturing (a lot of work!) all while living up to her name!

Luckily, frothy coffee (a necessity for this travelling lecturer!) is as plentiful outside of Yorkshire as it is inside.

The main feature of Reading University campus is the pond which, at this time of year, is chock full of birds. I managed a couple of trips to inspect it, but I only had my iPhone so no point posting the bird pics. Here instead is some pond and trees.

Railway Britain does not seem to have advanced much since the 1980s and 1990s. Biggest difference is that the announcements at the stations are now audible. People also look at phones a lot more, and thankfully they shout into them a lot less. There are also places on the trains to plug your phone in to charge up its failing battery.

Birmingham New Street station. Dangerously narrow platforms, with crowds standing well over the yellow line! ("Abunai desu kara kiiroi sen made osagari kudasai" - standard train announcement at all Japanese railway stations - because it is dangerous, stay back from the yellow line)

Reading station, however, was redeveloped THIS CENTURY! It's a sort of curvier, blue, dirty version of a late 20th century Japanese station. But at least it has nice wide platforms.

Settle-Reading is about 240 miles, taking about five and a half hours. I suppose this is an improvement over the seemingly endless period in the mid-1990s when all journeys took at least 10 hours due to every inch of track having to be inspected with a fine tooth comb. But it seemed like a lot of travelling for my 1.5 days in Reading. Trains back to the safety of the 19th century are rare and there was a long wait on an empty platform at Leeds station on Friday, but one can gain a little encouragement from the flavour of Victoriana emerging on the painted pillars.

Eventually... Ahhh, Settle station - back in the 19th century at last!

Monday, January 29, 2018

Cox et al part 3

As promised some more on this. The first thing I thought, on seeing this paper – a feeling that others apparently shared – was, why had no-one else already thought of this? Had we all just behaved like the fabled economist who, when their companion points out a £10 note lying on the pavement, ignores it, saying "If there really was a £10 note, someone would have picked it up already"?

Certainly the Schwartz fiasco will have put people off from pursuing this approach, as many of us had shown via a variety of arguments that the theoretical relationship in the simple 1-box climate model that directly links the autocorrelation of internal variability to equilibrium response cannot be directly used for diagnosing the latter from the former in more complex climate models. Of course, this is not quite what Cox et al do; rather, they show a strong correlation between their measure of variability and the sensitivity, across the ensemble of CMIP5 models. One complication in their analysis is that they measure variability via the 20th century simulations. Most of the variation in temperature seen in the 20th century is actually the response to external forcing and this forcing is far from the white noise assumed by Cox et al’s analysis (even after detrending, the variation about the trend is not white noise either). This would seem to undermine the theoretical basis for their relationship.

So, rather than using the 20th century simulations, I’ve had a quick look at the pre-industrial control simulations in which models are run for lengthy periods of time with no changes in external forcing. In all the following analyses I have restricted my attention to the models for which I had at least 500 years of P-I control simulation, in order that the behaviour of each model would be well characterised (it is well known that the empirical estimate of the lag-1 autocorrelation tends to be biased low to a substantial degree for short time series). This restricted my set to 13 models. In this set of 13 models I included both the MIROC models (5 and ESM) which Cox et al used as alternates, as I happen to know that the changes between the two generations here are substantial and were specifically made to affect the climate sensitivity-relevant processes, as can be seen in their widely differing equilibrium sensitivities. It may however be that my results are themselves somewhat sensitive to the choice of models.
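The low bias of the lag-1 autocorrelation estimate for short series is easy to see in a toy experiment. What follows is a minimal sketch and not the analysis used in this post: it assumes an AR(1) process with an arbitrarily chosen true coefficient, and compares the average sample estimate from short and long series. All function names, series lengths and the coefficient are illustrative assumptions.

```python
import random
import statistics


def ar1_series(alpha, n, rng, spinup=100):
    """Return n steps of x[t] = alpha * x[t-1] + white noise, after a spin-up."""
    x = 0.0
    out = []
    for t in range(n + spinup):
        x = alpha * x + rng.gauss(0.0, 1.0)
        if t >= spinup:
            out.append(x)
    return out


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation using the sample mean of the series."""
    m = statistics.fmean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x[:-1], x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


def mean_estimate(alpha, n, trials, seed=0):
    """Average the sample estimate over many independent series of length n."""
    rng = random.Random(seed)
    return statistics.fmean(
        lag1_autocorr(ar1_series(alpha, n, rng)) for _ in range(trials)
    )


if __name__ == "__main__":
    true_alpha = 0.6  # assumed value, for illustration only
    print("30-step series :", mean_estimate(true_alpha, 30, 400))
    print("500-step series:", mean_estimate(true_alpha, 500, 100))
```

In this toy setup the short-series average sits noticeably below the true coefficient, while the long-series average is close to it, which is the motivation for requiring long control runs.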

So, firstly, here’s a quick look at whether the lag-1 autocorrelation of annual mean temperature is related to the equilibrium sensitivity across this set of models:

[Figure: lag-1 autocorrelation of annual mean temperature against equilibrium sensitivity, with regression line]
Nope. The regression line is nearly flat and nowhere near significant.

However, this isn’t quite what Cox et al presented. They actually calculated a function psi which depends on the magnitude of interannual variability as well as its persistence. In fact their psi is defined as sd/sqrt(-log(alpha)) where sd is the standard deviation of interannual variability and alpha is the lag-1 correlation coefficient. They argue that this is the most relevant diagnostic as it is linearly related to sensitivity in their theoretical case. Sure enough when we calculate psi for the control simulations and correlate this with sensitivity we see:
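For concreteness, the psi formula quoted above can be written as a few lines of code. This is only a sketch of the formula as stated in the text, applied to a series that is assumed to need no detrending (as for a control simulation); it does not reproduce any windowing or detrending choices made by Cox et al, and the function names are made up for illustration.

```python
import math
import statistics


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation using the sample mean of the series."""
    m = statistics.fmean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x[:-1], x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


def psi_from_stats(sd, alpha):
    """psi = sd / sqrt(-log(alpha)); only defined for 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("psi needs a lag-1 autocorrelation strictly between 0 and 1")
    return sd / math.sqrt(-math.log(alpha))


def psi(x):
    """Apply the formula to a time series of annual mean temperatures."""
    return psi_from_stats(statistics.pstdev(x), lag1_autocorr(x))
```

Written this way, it is clear that psi scales linearly with the standard deviation at fixed alpha, so a relationship between psi and sensitivity can arise from the numerator alone.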

[Figure: psi calculated from the control simulations against equilibrium sensitivity]

There is a significant correlation at the 5% level! Just to be clear, the values of psi here are not the same ones that Cox et al calculate, instead I’ve applied their formula to the model data from the control simulations in order to eliminate the effect of external forcing. So why does this work whereas the lag-1 autocorrelation is not useful?

Well the answer is found by checking the relationship between standard deviation (the numerator in their psi function) and sensitivity, and here it is:

[Figure: standard deviation of interannual variability against equilibrium sensitivity]

This is actually a much stronger correlation than the previous one, now significant at the 1% level. Of course we have no direct measure of the magnitude of internal variability of the real climate system, but this could be reasonably estimated by subtracting the forced response from the observations (by some combination of statistical and/or model-based calculation). So this relationship could in principle also be used as an emergent constraint (without prejudice as to its credibility).

In terms of the simple one-box climate model, the differing magnitudes of interannual variability across the ensemble could be due to the variation in (internally-generated) radiative imbalance on the interannual time scale, or the effective heat capacity of the thin layer that reacts on this time scale, or the radiative feedback lambda = 1/sensitivity. I suppose more detailed examination of model data might reveal which factor is most important here. I would be very surprised if people haven’t already looked into this in some detail, and don’t propose to do so myself at this point. Certainly many people have looked at variability on various space and time scales and tried to relate this to equilibrium sensitivity. Anyway, at this point I think I should call a halt and "reach out to" (don’t you hate that phrase) Andy Dessler and perhaps one or two others to ask if this strong correlation makes sense to them. I can’t help but think it would have been noticed previously if it’s actually robust (eg if it exists across CMIP3 as well as CMIP5). And if not, maybe it’s just luck.

Friday, January 26, 2018

More about Cox et al.

Time to move this discussion onto the BlueSkiesResearch blog as it is, after all, directly related to my work. Previous post here but I might copy that over here too.

Conversations about the Cox et al paper have continued on twitter and blogs. Firstly, Rasmus Benestad posted an article on RealClimate that I thought missed the mark rather badly. His main complaint seems to be that the simple model discussed by Cox et al doesn’t adequately describe the behaviour of the climate system over short and long time scales. Of course that’s well known but Cox et al explicitly acknowledge this and don’t actually use the simple model to directly diagnose the climate sensitivity. Rather, they use it as motivation for searching for a relationship between variability and sensitivity, and for diagnosing what functional form this relationship might take. Since a major criticism of the emergent constraint approach is that it risks data mining and p-hacking to generate relationships out of random noise, it’s clearly a good thing to have some theoretical basis for them, as jules and I have frequently mentioned in the context of our own paleoclimate research.

And more recently, Tapio Schneider has posted an article arguing that Cox et al underestimated their uncertainties. Unfortunately, he does this via an analysis of his own work that certainly does underestimate uncertainties, but which does not (I believe) accurately represent the Cox et al work. Here’s the Cox et al figure again, and below it another regression analysis of different data from Schneider’s blog.
[Figure: the Cox et al regression]
[Figure: the regression analysis from Schneider’s blog]
It’s clear at a glance that the uncertainty bounds on the Cox et al regression basically include most of the models whereas the uncertainty bounds of Schneider exclude the vast majority of his (I’m talking about the black dashed lines in both plots). I think the simple error here is that Schneider is considering only the uncertainty on the regression line itself whereas Cox is considering the predictive uncertainty of the regression relationship. The theoretical basis for most of the emergent constraint work is that reality can be considered to be "like" one of the models in the sense of satisfying the regression relationship that the models exhibit, ie it follows on naturally from the statistically indistinguishable paradigm for ensemble interpretation (I don’t preclude the possibility that there may be other ways to justify it). The intuitive idea is that reality is just like another model for which we can observe the variable on the x-axis (albeit typically with some non-negligible uncertainty) and want to predict the corresponding variable on the y-axis. So the location of reality along the x-axis is constrained by our observations of the climate system, and it is likely to be a similar distance from the regression line as the models themselves are.
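The distinction between the two kinds of uncertainty can be made concrete with ordinary least squares. The sketch below uses textbook formulae and entirely made-up numbers (not data from either paper): it computes the standard error of the fitted line at a point, and the standard error for predicting a new "model-like" point there.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit; returns the quantities needed for error bars."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    resid_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(resid_ss / (n - 2))  # residual standard error
    return slope, intercept, s, xbar, sxx, n


def standard_errors(x0, s, xbar, sxx, n):
    """Standard error of the line at x0, and of a new point predicted at x0."""
    leverage = 1.0 / n + (x0 - xbar) ** 2 / sxx
    se_line = s * math.sqrt(leverage)        # uncertainty in the regression line
    se_pred = s * math.sqrt(1.0 + leverage)  # adds the scatter of points about the line
    return se_line, se_pred


if __name__ == "__main__":
    # Hypothetical (x, y) pairs standing in for a small model ensemble.
    xs = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.20, 0.22]
    ys = [2.1, 2.9, 2.6, 3.4, 3.1, 4.0, 3.7, 4.6]
    slope, intercept, s, xbar, sxx, n = fit_line(xs, ys)
    print(standard_errors(0.13, s, xbar, sxx, n))
```

Both standard errors would be multiplied by the appropriate t quantile to give intervals. The predictive one never shrinks below the residual scatter however many models are added, which is why bounds drawn from it contain most of the models while bounds on the line alone do not.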

Schneider then compares his interpretation of the emergent constraint method with model weighting, this being a fairly standard Bayesian approach. We also did this in our LGM paper, though we did the regression method properly so the differences were less marked. I always meant to go back and explore the ideas underlying the two approaches in more detail, but I believe that the main practical difference is that the Bayesian weighting approach is using the models themselves as a prior whereas the regression is implicitly using a uniform prior on the unknown. The regression has the ability to extrapolate beyond the model range and also can be used more readily when there is a very small number of models, as is typically the case in paleo research.

Here’s our own example from the paper which attempts to use tropical temperature at the Last Glacial Maximum as a constraint on the equilibrium sensitivity.
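[Figure: tropical temperature at the Last Glacial Maximum against equilibrium sensitivity, showing the models (blue dots) and random samples (red dots)]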
The models are the big blue dots (yes, only 7 of them, hence the large uncertainty in the regression). I used the random sampling (red dots) to generate the pdf for sensitivity, by first sampling from the pdf for tropical temperature and then for each dot sampling from the regression prediction. The broad scatter of the red dots is due to using t-distributions which I think is necessary due to the small number of models involved (eg even the uncertainty on the tropical temp constraint is a t-distribution as it was estimated by a leave-one-out cross validation process). But this is perhaps a bit of a fine detail on the overall picture. It is often not clear exactly how other authors have approached this and to be fair it probably matters less when considering modern constraints when data are generally more precise and ensemble sizes are rather larger.

We also did the Bayesian model weighting in this paper, but with only 7 models the result is a bit unsatisfactory. However the main reason we didn’t like it for that work is that by using the models as a prior, it already constrains the sensitivity substantially! Whereas if the observations of LGM cooling had been outside the model range, the regression would have been able to extrapolate as necessary.
[Figure: Bayesian model weighting applied to the same LGM constraint]
Here’s the weighting approach applied to the same question, with the blue dots marking the models, the green curve is the prior pdf (equal weighting on the models) and the thick red is the posterior which is the weighted sum of the thinner red curves. Each model has to be dressed up in a rather fat gaussian kernel (standard techniques exist to choose an appropriate width) to make an acceptably smooth shape. It’s different from the regression-based answer, but not radically so, and the difference can for the most part be attributed to the different prior.
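A rough sketch of the weighting calculation may help here. It is not the code from the paper: the model values, the observation, its uncertainty and the kernel width are all invented for illustration, and a plain Gaussian likelihood is assumed. Each model gets a weight proportional to the likelihood of the observation given that model's value of the observed quantity, and the posterior is the weighted sum of Gaussian kernels centred on the models' sensitivities.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a Gaussian with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def model_weights(model_x, obs, obs_sd):
    """Posterior weights, starting from equal prior weight on every model."""
    likelihoods = [normal_pdf(obs, x, obs_sd) for x in model_x]
    total = sum(likelihoods)
    return [like / total for like in likelihoods]


def mixture_pdf(s, model_s, weights, kernel_sd):
    """Weighted sum of Gaussian kernels centred on each model's sensitivity."""
    return sum(w * normal_pdf(s, m, kernel_sd) for w, m in zip(weights, model_s))


if __name__ == "__main__":
    # Hypothetical ensemble: tropical LGM temperature change and sensitivity.
    model_x = [-1.5, -1.8, -2.0, -2.3, -2.5, -2.8, -3.0]
    model_s = [2.1, 2.7, 3.0, 3.2, 3.6, 4.0, 4.3]
    weights = model_weights(model_x, obs=-2.2, obs_sd=0.4)
    prior = [1.0 / len(model_s)] * len(model_s)
    for s in (2.0, 3.0, 4.0):
        print(s, mixture_pdf(s, model_s, prior, 0.5), mixture_pdf(s, model_s, weights, 0.5))
```

Because every kernel sits on a model, the posterior can never put much mass far outside the model range, which is the sense in which the models act as the prior.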

Having said all that, I’m not uncritically a fan of the Cox et al work and result, a point that I’ll address in a subsequent post. But I thought I should point out that at least these two criticisms from Schneider and Benestad seem basically unfounded and unfair.

Thursday, January 18, 2018

More sensitivity stuff

After what feels like a very long hiatus, it seems that people are writing interesting stuff about climate sensitivity again. Just last week on Twitter I saw Andrew Dessler tweeting about his most recent manuscript which is up on ACP(D) for comment. My eyebrow was slightly raised at the range of values he found when analysing outputs of the MPI ensemble, 2.1 to 3.9K, until I realised that these were the outliers from their 100-member ensemble and eyeballing the histogram suggests the standard error on individual estimates (which I didn't see quoted) is around 0.5C or lower. Worth considering, but not a show-stopper in the context of other uncertainties we have to deal with. It would, I think, be interesting to consider whether more precise estimates can be calculated with a more comprehensive use of the data, such as by fitting a simple model to the time series rather than just using the difference between two snapshots. Which, coincidentally (or not) is something I might have more to talk about in the not too distant future.

Then just today, a new paper using interannual variability as an emergent constraint. By chance I bumped into one of the authors last week in Leeds so had a good idea what was coming but have not had time to consider it in much detail. (The Nature paper is paywalled but has a copy already.) Here's a screenshot of the main analysis for those who can't be bothered downloading it. The x-axis is a measure of interannual variability over the observational period, and the letters are CMIP models.

Using interannual variability to diagnose the equilibrium response has a somewhat chequered history, eg here and here for my previous posts though the links to the underlying papers are dead now so I've put the new ones here:

The central problem with the Schwartz approach is the strong (and wrong) assumption that the climate system has a single dominant time scale. It is easy to show (I may return to this in a future post) that the short time scale response simply cannot in principle directly constrain the equilibrium response of a two-time scale system. So this may be why the idea has not been followed up all that much (though in fact Andrew Dessler has done some work on this, such as this paper for example).

The latest paper gets round this by essentially using climate models to provide the link between interannual variability and equilibrium response. It remains possible that the models all get this wrong in a similar manner and thus the real climate system lies outside of their prediction, but this “unknown unknown” issue intrinsically applies to just about everything we ever do and isn't a specific criticism of this paper. My instinct is their result is probably over-optimistic and future work will find more uncertainties than they have presented, but that could just be a reflexive bias on my part. For example, it is not clear from what is written that they have accounted for observational uncertainty in their constraint, which (if they have not done) will probably bias the estimate low as uncorrelated errors will reduce their estimate of the real system's autocorrelation relative to the models where obs are perfect. There is also a hint of p-hacking in the analysis but they have done some quite careful investigation and justification of their choices. It will certainly provide an interesting avenue for more research.
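The direction of the observational-error bias can be seen in the simplest possible case. As a sketch under assumptions made purely for illustration (not taken from the paper), let the true anomalies x_t be stationary with variance \sigma_x^2 and lag-1 autocorrelation \alpha, and let the observations be y_t = x_t + e_t, where the errors e_t are white, independent of x_t, and have variance \sigma_e^2. Then:

```latex
\begin{aligned}
\operatorname{cov}(y_t, y_{t+1}) &= \operatorname{cov}(x_t, x_{t+1}) = \alpha\,\sigma_x^2, \\
\operatorname{var}(y_t) &= \sigma_x^2 + \sigma_e^2, \\
\alpha_{\mathrm{obs}} &= \frac{\operatorname{cov}(y_t, y_{t+1})}{\operatorname{var}(y_t)}
  = \alpha\,\frac{\sigma_x^2}{\sigma_x^2 + \sigma_e^2} \;<\; \alpha
  \quad \text{for } \alpha > 0,\ \sigma_e^2 > 0 .
\end{aligned}
```

So uncorrelated errors shrink the apparent autocorrelation of the observed series, while a model diagnosed from its own perfect output suffers no such shrinkage.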

Thursday, November 30, 2017

Implicit priors and the energy balance of the earth system

So, this old chestnut seems to keep on coming back....

Back in 2002, Gregory et al proposed that we could generate “An observationally based estimate of the climate sensitivity” via the energy balance equation S = F2x dT/Q where S is the equilibrium sensitivity to 2xCO2, F2x = 3.7 is the (known constant) forcing of 2xCO2, dT is the observed surface air temperature change and Q is the net radiative imbalance at the surface which takes account of both radiative forcing and the deep ocean heat uptake. (Their notation is marginally different, I'm simplifying a bit.)

Observational values for both dT and Q can be calculated/observed, albeit with uncertainties (reasonably taken to be Gaussian). Repeatedly sampling from these observationally-derived distributions and taking the ratio generates an ensemble of values for S which can be used as a probability distribution. Or can it? Is there a valid Bayesian interpretation of this, and if so, what was the prior for S? Because we know that it is not possible to generate a Bayesian posterior pdf from observations alone. And yet, it seems that one was generated.
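The sampling procedure described above is short enough to write down. This is a minimal sketch of the ratio method only, with placeholder Gaussian values for dT and Q that are not taken from Gregory et al or any other paper; the function names are likewise hypothetical.

```python
import random

F2X = 3.7  # forcing from doubled CO2 in W/m^2, as quoted in the text


def sample_sensitivity(dt_mean, dt_sd, q_mean, q_sd, n, seed=0):
    """Draw dT and Q from independent Gaussians and return S = F2X * dT / Q."""
    rng = random.Random(seed)
    return [
        F2X * rng.gauss(dt_mean, dt_sd) / rng.gauss(q_mean, q_sd)
        for _ in range(n)
    ]


def percentile(values, p):
    """Simple empirical percentile (p in 0..100) by sorting."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(p / 100.0 * len(ordered)))
    return ordered[index]


if __name__ == "__main__":
    # Placeholder observational estimates, for illustration only.
    samples = sample_sensitivity(dt_mean=0.8, dt_sd=0.1, q_mean=1.6, q_sd=0.4, n=20000)
    print([percentile(samples, p) for p in (5, 50, 95)])
```

Nothing in this recipe mentions a prior for S, which is exactly the puzzle: the ensemble is being read as a posterior, so a prior must be hiding somewhere in the treatment of dT and Q.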

This method may date back to before Gregory et al, and is still used quite regularly. For example, Thorsten Mauritsen (who we were visiting in Hamburg recently) and Robert Pincus did it in their recent “Committed warming” paper. Using historical observations, they generated a rather tight estimate for S as 1.1-4.4C, though this wasn't really the main focus of their paper. It seems a bit optimistic compared to much of the literature (which indicates the 20th century to provide a rather weaker constraint than that) so what's the explanation for this?

The key is in the use of the observationally-derived distributions for the quantities dT and Q. It seems quite common among scientists to interpret a measurement xo of an unknown x, with some known (or perhaps assumed) uncertainty σ, as implying the probability distribution N(xo,σ) for x. However, this is not justifiable in general. In Bayesian terms, it may be considered equivalent to starting with a uniform prior for x and updating with the likelihood arising from the observation. In many cases, this may be a reasonable enough thing to do, but it's not automatically correct. For instance, if x is known to be positive definite, then the posterior distribution must be truncated at 0, making it no longer Gaussian (even if only to a negligible degree). (Note however that it is perfectly permissible to do things like use (xo - 2σ, xo + 2σ) as a 95% frequentist confidence interval for x, even when it is not a reasonable 95% Bayesian credible interval. Most scientists don't really understand the distinction between confidence intervals and credible intervals, which may help to explain why the error is so prevalent.)

So by using the observational estimates for dT and Q in this way, the researcher is implicitly making the assumption of independent uniform priors for these quantities. This implies, via the energy balance equation, that their prior on S is the quotient of two uniform priors. Which has a funny shape in general, with a flat region near 0 and then a quadratically-decaying tail. Moreover, this prior on S is not independent of the prior for either dT or Q. Although it looks like there are three unknown quantities, the energy balance equation tying them together means there are only two degrees of freedom here.
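The shape of the quotient can be worked out explicitly in a bounded version of the problem. As an illustration only, assume independent priors dT ~ U[0, a] and Q ~ U[0, b] (the bounds a and b are hypothetical, introduced just to make the priors proper), and write R = dT/Q, so that S = F2x R. The standard ratio formula gives:

```latex
p_R(r) = \int_0^{\infty} y \, p_{dT}(r y)\, p_Q(y)\, dy
       = \frac{1}{ab} \int_0^{\min(b,\, a/r)} y \, dy
       =
\begin{cases}
\dfrac{b}{2a}, & 0 \le r \le a/b, \\[2ex]
\dfrac{a}{2 b\, r^{2}}, & r > a/b .
\end{cases}
```

Each branch integrates to 1/2, so the density is properly normalised, and rescaling by the constant F2x only stretches the axis: the implied prior on S is flat up to F2x a/b and decays as 1/S^2 beyond it.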

At the time of the IPCC AR4, this rather unconventional implicit prior for S was noticed by Nic Lewis who engaged in some correspondence with IPCC authors about the description and presentation of the Gregory et al results in that IPCC report. His interpretation and analysis are very slightly different to mine, in that he took the uncertainty in dT to be so (relatively) small that one could ignore it and consider the uniform prior on Q alone, which implies an inverse quadratic prior on S. However the principle of his analysis is similar enough.

In my opinion, a much more straightforward and natural way to approach the problem is instead to define the priors over Q and S directly. These can be whatever we want and are prepared to defend publicly. I've previously advocated a Cauchy prior for S which avoids the unreasonableness and arbitrariness of a uniform prior for this constant. In contrast, a uniform prior over Q (independent of S) is probably fairly harmless in this instance, and this does allow for directly using the observational estimate of Q as a pdf. Sampling from these priors to generate an ensemble of (S,Q) pairs allows us to calculate the resulting dT and weight the ensemble members according to how well the simulated values match the observed temperature rise. This is standard Monte Carlo integration using Bayes Theorem to update a prior with a likelihood. Applying this approach to Thorsten's data set (and using my preferred Cauchy prior), we obtain a slightly higher range for S of 1.2 - 4.8C. Here's a picture of the results (oops, ECS = S there, an inconsistent labelling that I can't be bothered fixing).

The median and 5-95% ranges for prior and posterior are also given. As you can see, the Cauchy prior doesn't really cut off the high tail that aggressively. In fact it's a lot higher than a U[0,10] or even U[0,20] prior would imply.  
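For anyone wanting to try the approach, here is a minimal sketch of the Monte Carlo calculation. It is not the code behind the figure: the Cauchy location and scale, the truncation to positive S, and the observational values for dT and Q are all placeholders chosen for illustration, and a Gaussian likelihood for the observed warming is assumed.

```python
import math
import random

F2X = 3.7  # forcing from doubled CO2 in W/m^2, as quoted in the text


def sample_cauchy_positive(loc, scale, rng):
    """Draw from a Cauchy(loc, scale) prior, rejecting non-positive values."""
    while True:
        s = loc + scale * math.tan(math.pi * (rng.random() - 0.5))
        if s > 0:
            return s


def likelihood(dt_sim, dt_obs, dt_sd):
    """Unnormalised Gaussian likelihood of the observed warming."""
    return math.exp(-0.5 * ((dt_sim - dt_obs) / dt_sd) ** 2)


def posterior_sample(n, dt_obs, dt_sd, q_mean, q_sd, loc, scale, seed=0):
    """Sample (S, Q) from the priors and weight each pair by the likelihood."""
    rng = random.Random(seed)
    s_values, weights = [], []
    for _ in range(n):
        s = sample_cauchy_positive(loc, scale, rng)
        q = rng.gauss(q_mean, q_sd)  # observational pdf for Q used as its prior
        dt_sim = s * q / F2X         # energy balance equation rearranged for dT
        s_values.append(s)
        weights.append(likelihood(dt_sim, dt_obs, dt_sd))
    return s_values, weights


def weighted_quantile(values, weights, q):
    """Quantile (q in 0..1) of a weighted sample."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= q * total:
            return value
    return pairs[-1][0]


if __name__ == "__main__":
    s, w = posterior_sample(50000, dt_obs=0.8, dt_sd=0.1, q_mean=1.6, q_sd=0.4,
                            loc=2.5, scale=3.0)
    print([weighted_quantile(s, w, q) for q in (0.05, 0.5, 0.95)])
```

Setting all weights equal in `weighted_quantile` gives the prior quantiles from the same sample, so the prior and posterior can be compared directly.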

Wednesday, November 15, 2017

Watt's up with Pat Frank?

And now for your scheduled return to the climate blogosphere wars. I haven't missed it at all. Pat Frank has posted a rather tedious pile of blether on WTFUWT which mentions me, albeit tangentially. Well, maybe a bit more than tangentially. The story, such as it is, is that he submitted a paper (which apparently has been rejected 6 times already by different journals) to GMD where I'm an editor. I took on responsibility for dealing with it, which was a fairly simple task as the glaring error in the manuscript is not really that well hidden. A bit of googling confirmed that several others had already seen this and dealt with it appropriately, so rather than waste the time, effort and good-will of hard-pressed reviewers I summarily rejected it. There followed the inevitable appeal which was sent to jules (dealing with appeals happens to be one of her specific roles as an Exec Ed) who passed it on to fellow Exec Ed Didier Roche due to the obvious conflict of interest. He has upheld the appeal but not before several more rambling screeds appeared, and the blog post, and several hundred comments.

I'd suggest the comment thread for general entertainment purposes, but I defy anyone to wade through it all (never was it more true that comment threads on blogs are a write-only medium). A couple of sane voices did their best to uphold my honour but the vast majority is just boring vacuous idiocy. Sigh. Are there not any interesting sceptics around these days?

Tuesday, October 10, 2017

We have corporate sponsorship

I've been waiting a long time to use this clip!

Finally signed our first contract for some work which is due to start shortly. It's not a huge project but should be interesting and generate some worthwhile results. We didn't really have to punch ourselves in the face or threaten to reveal the dirty secrets of climate modelling (jules already has a journal dedicated to that cause).

If anyone else wants to jump on the bandwagon and pay jules or me to do something interesting, leave a comment :-)

Sunday, October 01, 2017

Stockholm art

Stockholm has a modern art museum and we all know how important it is to open one's mind to surrealist thoughts before a science conference...

We've never had a cargo disaster like this bicycle case, despite shipping 3 tandems across the oceans to Japan and back!

I soon discovered one of the escaped bicycle wheels spinning in a corner:

Wonder what beautiful piccies will be added to these frames, presently labelled "Plingeling" and "Pling":

Perhaps I should have looked behind this sheet to see the exhibit behind, but I was too shy:

But there was also some good stuff:

Can't beat Klein bloooooo! The handy information board informed me that he spent time in Japan learning Zen. That must partly explain why it is just so good. Ahhhh...

And then there was the extensive MODEL GRID SECTION of which this is a small part!!!!! 
Woo Hooo! 
If GMD had any money it could sponsor this!

[jules' pics] Stockholm

#PMIP2017 was held in Stockholm. Maybe it was the unusual warmth and sunshine, but Stockholm seemed like a very happy kind of place.


Nowhere else have I seen children swinging joyfully from the street signs.

Construction is always a good sign of prosperity...?

Then there is the river

Private yachts.

Public life saving.

Posted By Blogger to jules' pics at 10/01/2017 02:17:00 PM

Sunday, September 24, 2017

Running hot...or not?

The question has been asked (repeatedly): are the CMIP models “running hot”? By which it is not meant, are the models too warm - they have a wide range of temperature biases which are normally subtracted off by the use of anomalies (which is a separate debate) - but whether they are warming up too much relative to observations.

But I don't care about that, because I've been running too! It's been a bit warm in Hamburg and humid too, so I was a bit apprehensive about this morning's half marathon up and down a bit of river bank at the north edge of the city. 

However I didn't need to worry about that, it was grey and chilly this morning. What I should have been more concerned about is the lack of recent training and surfeit of pastries (not to mention currywurst).

It's a funny affair with another identical half marathon going off 20 mins ahead of us, that being the “Cup” event (part of a series of three races). (Fortunately I didn't find that web page until just now or I might have had to enter all of them.) But the cup runners are not all that quick, so I spent most of the race overtaking them. This wasn't really a problem as the small field of 500 runners was fairly well strung out by the time I caught them. The course was a riverside path, just hard-trodden earth which was mostly dry but a little slippery in parts.

It wasn't all as flat and smooth as this!

Plenty of sharp turns and short rises too. Despite being about 500m too short, it was still a personal worst, slower even than my very first half marathon when I'd never run that far before! 9th finisher in my race in 1:29:14, 2nd MV45 and also well beaten by one woman who was 2nd overall.