I wasn’t intending to write a post tonight, and I’ve had a bit to drink so maybe I’ve missed something. If not, Tamino has done some incredibly stupid cherry-picking.
I just stumbled across a post from about a week ago because of a response I saw on Watts Up With That? Tamino’s post is made in reference to the notion:
Most who call themselves “skeptics” of global warming would probably say “No global warming since 1998!” Under the name “hiatus” or “pause,” it features prominently in public discussion and even in senate testimony (e.g. from Judith Curry).
To rebut this notion, he creates a series of graphs for various temperature data sets in the form of:
He describes the image:
We’ll start with the HadCRUT4 data set from the Hadley Centre/Climate Research Unit in the U.K. Taking the data from 1979 through 1997, we’ll compute a linear regression line, then extrapolate that line through to 2013 to construct our “still-warming” prediction. We’ll also compute the standard deviation of the residuals from our linear regression so we can add two lines to the graph, one of which is two standard deviations above our forecast, the other two standard deviations below, in order to delineate the range in which we would expect most of the future data to be.
We’ll also take the final value of the linear regression line (not the slope) as our estimate of what we would expect if we had been given certain knowledge of no statistically significant warming from 1998 through 2013, and we’ll add extra lines, two standard deviations above and below, to mark out the expected range.
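The procedure quoted above can be sketched in a few lines. This is only a rough illustration of the method as described, not Tamino's code and not the code behind any graph in this post. The function name `split_forecasts`, the dictionary keys, and the choice of `ddof=2` for the residual standard deviation are all assumptions, and the example series at the bottom is synthetic, not HadCRUT4.

```python
import numpy as np

def split_forecasts(years, temps, last_fit_year):
    """Fit a line up to last_fit_year, then build both forecasts for later years.

    Hypothetical helper: names and the ddof choice are assumptions.
    """
    years = np.asarray(years, dtype=float)
    temps = np.asarray(temps, dtype=float)
    fit = years <= last_fit_year

    # Ordinary least-squares line through the fit period only.
    slope, intercept = np.polyfit(years[fit], temps[fit], 1)

    # Standard deviation of the residuals from that line
    # (ddof=2 because two parameters were estimated; an assumption).
    resid = temps[fit] - (slope * years[fit] + intercept)
    sd = resid.std(ddof=2)

    test_years = years[~fit]
    # "Still-warming": extrapolate the fitted line.
    still_warming = slope * test_years + intercept
    # "No-warming": hold the line's final fitted value flat.
    no_warming = np.full(test_years.shape, slope * last_fit_year + intercept)

    return {
        "slope": slope,
        "sd": sd,
        "test_years": test_years,
        "test_temps": temps[~fit],
        "still_warming": still_warming,
        "no_warming": no_warming,
    }

# Synthetic stand-in series (NOT real temperature data).
years = np.arange(1979, 2014)
temps = 0.02 * (years - 1979) + 0.05 * np.sin(years)

result = split_forecasts(years, temps, last_fit_year=1997)
upper = result["still_warming"] + 2 * result["sd"]
lower = result["still_warming"] - 2 * result["sd"]
```

The only input that matters for the argument here is `last_fit_year`: it decides which side of the break 1998 lands on.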
I don’t think this test is appropriate for the conclusions Tamino draws, but that’s not what I want to discuss today. What I want to discuss is a far simpler point. Tamino claims he is going to discuss the claim there has been “no global warming since 1998.” To do this, he sets his breakpoint at 1997.
Why? Why would you include data from 1998 if people say there hasn’t been warming since 1998? The phrase “since 1998” is clearly intended to mean, “We saw warming up through 1998 but not after 1998.” The entire point of people picking the year 1998 was that it was anomalously warm. Obviously, people don’t want to include it in the period they say has had no warming.
I can’t think of a legitimate reason for Tamino’s decision. As far as I can tell, it’s simply idiotic cherry-picking – claiming to test one period by randomly using data from a different period. Maybe I’m missing something though.
Regardless, one might be curious how much of a difference this issue makes. As such, I’ve taken the liberty of making a (very) crude replication of the key elements of Tamino’s graph:
There are some minor differences, but it clearly shows the same thing Tamino describes:
What actually happened is that, according to the HadCRUT4 data, most of the data are above both forecasts. Twelve of sixteen were hotter than expected even according to the still-warming prediction, and all sixteen were above the no-warming prediction.
But what if we didn’t include data from 1998 when “testing” to see if there had been warming “since 1998”? Would we get the same results? No. Using the same process as the image above, I get:
In this image, a couple of years are below the blue line, and about half the years are below the red line. It’s a dramatic change. It shows Tamino exaggerated his results by randomly choosing to include the hot 1998 in the period he tested to see if there had been no warming “since 1998.”
As far as I can tell, that’s just stupid cherry-picking. It’s stupid because he flat-out said what he was doing. Anyone reading his post can tell he isn’t testing the argument he claimed to be examining. It seems he basically said, “I’m going to examine argument X by testing argument Y.”
Maybe I’m missing something though. Tamino often says he knows a great deal about analyzing data, and he’s published scientific papers heavily involving linear regressions. Why should anyone care if I think Tamino used stupid cherry-picking to exaggerate the results of an already inappropriate test? I’m just a nobody.
Of course, I’m a nobody who understands simple phrases like “since 1998.” That’s one point in my favor.
February 7th, 7:17 AM Update: A post I wrote a few days ago is highly relevant to Tamino’s. As I pointed out there, you can get all sorts of graphs if you’re willing to cherry-pick things like endpoints. To demonstrate, I’ve created a third graph following Tamino’s method:
The difference between this and his image is enormous. Why? Because I decided to make an arbitrary choice of my own. I began by making the non-arbitrary choice to test the claim there has been no warming since 1998, instead of Tamino’s “since 1997.” I then decided to make both sides of the break point equal in length. With those two changes, the results are dramatically different.
Are they any better than Tamino’s? No. Are they any worse? No. It’s true I used less data than Tamino did, but he could have used more data if he wanted. The HadCRUT4 data set goes farther back than 1979. The fact other data sets might not begin until 1979 doesn’t mean he can’t use the data we do have before 1979. And if we can arbitrarily exclude 50+ years of data, I can arbitrarily exclude five years of data.
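To make the endpoint sensitivity concrete, here is a rough sketch of how one might compare different choices of fit window. It is not the code used for the graphs above. The helper names `equal_length_start` and `count_above` are hypothetical, and the series is synthetic, so the counts it produces say nothing about HadCRUT4. With data ending in 2013 and a break after 1998, an equal-length fit window starts in 1984, which is where the five excluded years (1979 to 1983) come from.

```python
import numpy as np

def equal_length_start(last_fit_year, last_year):
    """First fit year that makes the fit window as long as the test window."""
    n_test = last_year - last_fit_year  # number of years after the break
    return last_fit_year - n_test + 1

def count_above(years, temps, first_fit_year, last_fit_year):
    """Count test years above the extrapolated trend and above the flat line."""
    years = np.asarray(years)
    temps = np.asarray(temps, dtype=float)
    fit = (years >= first_fit_year) & (years <= last_fit_year)
    test = years > last_fit_year

    slope, intercept = np.polyfit(years[fit], temps[fit], 1)
    trend = slope * years[test] + intercept    # still-warming forecast
    flat = slope * last_fit_year + intercept   # no-warming forecast

    above_trend = int((temps[test] > trend).sum())
    above_flat = int((temps[test] > flat).sum())
    return above_trend, above_flat

# Synthetic stand-in series (NOT real temperature data).
years = np.arange(1979, 2014)
temps = 0.02 * (years - 1979) + 0.05 * np.sin(years)

# Three windowing choices: break after 1997, break after 1998,
# and break after 1998 with equal-length windows.
choices = [
    (1979, 1997),
    (1979, 1998),
    (equal_length_start(1998, 2013), 1998),
]
for first, last in choices:
    print(first, last, count_above(years, temps, first, last))
```

Each row is a defensible-sounding choice, and nothing in the method itself says which one to prefer.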
What this shows is Tamino’s method is inappropriate, and it practically begs for cherry-picking.