Laying the Points Out

There has been a lot of hoopla about supposed problems in the surface temperature record recently. Unfortunately, a lot of people don’t seem to understand what the hoopla is actually about. I think the problem is people have focused too much on rhetoric and too little on actual information. I’d like to try to reverse that.

This all started when claims by blogger Steven Goddard gained traction in the media. His claims were of the sort:

Right after the year 2000, NASA and NOAA dramatically altered US climate history, making the past much colder and the present much warmer. The animation below shows how NASA cooled 1934 and warmed 1998, to make 1998 the hottest year in US history instead of 1934. This alteration turned a long term cooling trend since 1930 into a warming trend.

This is a very serious claim, but it is also a very uninformative one. It tells us very little about what was done, much less how or why it was done. This lack of information has helped cause a lot of confusion. When fact checkers at PolitiFact examined Goddard’s argument, they found it lacking. Unfortunately, they didn’t give a clear rebuttal. This led to Anthony Watts, proprietor of the most popular blog on global warming (Watts Up With That?), writing a post in which he said PolitiFact was wrong but only further confused things.

The problem is there are several different issues, and people tend to conflate them all. I’ll list each in turn:

1) Temperature data is adjusted for a variety of factors known to cause biases in the data.

2) The USHCN record has data that is recorded, but not used in its calculations.

3) The USHCN record is based upon 1,218 records, but there are fewer than 1,218 stations in it.

4) Steven Goddard uses a wonky methodology which introduces biases into his calculations.

Point 1 has been getting the least attention. People are aware the data has problems and adjustments are made to try to address them. There are disputes over the particulars of these adjustments, but they haven’t come up much in the recent discussions.

Point 2 is a more commonly discussed point. According to an NCDC statement recently publicized, the data not used is data which fails quality control tests. That claim hasn’t been examined enough to determine whether it’s valid, but it’s obviously understandable that bad data may get discarded.

Point 3 is the big one. It’s the key to the confused post Watts wrote, and it’s the most confusing issue. USHCN had, at one point, 1,218 stations recording temperatures. It no longer does. Stations have stopped recording data for a variety of reasons. This causes missing data. Another source of missing data is the data filtered out for quality control purposes.

The confusing part is USHCN doesn’t simply perform its calculations without the missing data. Instead, it tries to estimate what the missing data would have been, then performs its calculations over the measured and estimated data. This means stations can be used in the calculations long after they’re closed.

It seems really weird. You have to wonder why they estimate data then perform calculations rather than just performing the calculations without that data. The answer is USHCN’s methodology doesn’t handle missing data well. Suppose you had five station records:

1	1	1	1	1
2	2	2	2	2
3	3	3	3	3
4	4	4	4	4
5	5	5	5	5

You could average the five lines together, and you’d get a final result of 3, 3, 3, 3, 3. No problem, right? Right. The problem comes when you have missing data, like:

1	1	1	1	1
2	2	2	NA	NA
3	3	3	3	3
4	4	4	4	4
5	5	5	5	5

If you averaged these five lines together, you’d get 3, 3, 3, 3.25, 3.25. If the data had been missing from the fourth line instead of the second, you’d get 3, 3, 3, 2.75, 2.75. Obviously, there’s a problem. We don’t want our results to have significant changes because of small amounts of missing data.
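
If you want to check that arithmetic yourself, here’s a minimal Python sketch of the toy example. It is nothing more than the table above written out in code; it has nothing to do with anyone’s actual software.

import statistics

# The five toy "stations" from above, one per row. None marks missing data.
stations = [
    [1, 1, 1, 1, 1],
    [2, 2, 2, None, None],
    [3, 3, 3, 3, 3],
    [4, 4, 4, 4, 4],
    [5, 5, 5, 5, 5],
]

# Average each column over whatever values happen to be present.
naive = [statistics.mean(row[c] for row in stations if row[c] is not None)
         for c in range(5)]
print(naive)  # [3, 3, 3, 3.25, 3.25]

Move the gap from the second row to the fourth, and the same code prints 2.75 for the last two columns instead.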

To address that, USHCN attempts to fill in the missing data before calculating its averages. It does this by looking at the neighbors of stations with missing data. In the example above, we can use the first and third lines to try to estimate the missing data. There’s a 1 on one side of the missing data and a 3 on the other. Based on that, we estimate the missing values are 2 and get:

1	1	1	1	1
2	2	2	2	2
3	3	3	3	3
4	4	4	4	4
5	5	5	5	5

This gives us the right data, and when we average it together, we get the right results (3, 3, 3, 3, 3). Clearly, the methodology can work. It won’t be as precise or as accurate in the real world as in these simple examples, but it clearly can improve the results. It’s certainly better than simply averaging things together without doing anything to address the missing data.
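
Here’s the same sketch with a crude version of the infilling step added. To be clear, this is only a cartoon of the idea, not the actual USHCN procedure, which is more involved.

import statistics

stations = [
    [1, 1, 1, 1, 1],
    [2, 2, 2, None, None],
    [3, 3, 3, 3, 3],
    [4, 4, 4, 4, 4],
    [5, 5, 5, 5, 5],
]

# Crude infill: replace each gap with the average of the row above and the
# row below it. (Good enough for this toy case, where those neighbors are
# never missing themselves.)
for r in range(1, len(stations) - 1):
    for c in range(5):
        if stations[r][c] is None:
            stations[r][c] = (stations[r - 1][c] + stations[r + 1][c]) / 2

averages = [statistics.mean(row[c] for row in stations) for c in range(5)]
print(averages)  # [3, 3, 3, 3.0, 3.0]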

But that’s what Steven Goddard does. That’s the wonky method referred to in Point 4. Goddard simply averages all the series together. If he saw line four was missing two data points, he’d just shrug and average things anyway. He’d find his results were 3, 3, 3, 3.25, 3.25. He’d then accuse the people of fraud if they said the right answer was 3, 3, 3, 3, 3.

(Goddard also ignores the fact stations aren’t located equal distances apart. I won’t delve into that point, but obviously, if you have 9 stations in one area and one station in another, you won’t get a good answer if you just average all 10 together.)
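
To make that last point concrete without delving into it, here’s a tiny sketch with made-up numbers. It only illustrates why spatial weighting matters, nothing more.

import statistics

# Nine stations crowded into a warm region, one station in a cool region.
warm_region = [20.0] * 9
cool_region = [10.0]

# Averaging all ten stations together lets the crowded region dominate.
naive = statistics.mean(warm_region + cool_region)           # 19.0

# Averaging each region first, then averaging the regions, weights the two
# areas equally. It is a crude stand-in for gridding.
weighted = statistics.mean([statistics.mean(warm_region),
                            statistics.mean(cool_region)])   # 15.0
print(naive, weighted)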


Now that we’ve established what all the issues are, it’s much easier to understand what people are saying. With that, let’s go back to Steven Goddard’s argument. He said alterations to the data “turned a long term cooling trend since 1930 into a warming trend.” This graph, which I’m stealing from Zeke Hausfather, shows how Goddard justifies that argument:

[Figure: Goddard’s and NCDC’s methods, raw and adjusted data]

The red line is what you get if you use Goddard’s methodology. The blue line is what you get if you anomalize and grid the data. As you can see, Goddard can only claim the data shows a cooling trend if he uses his wonky methodology. Given Goddard’s argument is based upon a faulty methodology, PolitiFact was right to criticize it.

And when Anthony Watts said the PolitiFact story was wrong, announcing:

I was so used to Goddard being wrong, I expected it again, but this time Steve Goddard was right and my confirmation bias prevented me from seeing that there was in fact a real issue in the data and that NCDC has dead stations that are reporting data that isn’t real

He muddied the waters. The argument PolitiFact checked was indeed wrong. Point 3 is true, as Watts says. It just doesn’t create that red line. You only get that red line if you accept Goddard’s wonky methodology (Point 4). If you don’t accept it, Goddard is wrong.


I’m sure you noticed that graph had a third line, a green one. That line is what you get if you make the adjustments of Point 1. Those adjustments are clearly far more significant. With them, it wouldn’t matter if you used Goddard’s wonky methodology. You’d still get a warming trend.

That’s why Goddard likes to make a big deal out of them. They are central to his argument: they are a primary cause of the differences Goddard highlighted in the post which triggered all this attention. He made this image, which PolitiFact reposted:

[Figure: 1998 changes, annotated]

The differences between those two graphs are caused by different data being available and adjustments for known problems being handled differently. That means for Goddard to make his argument, he needs to argue all four points discussed in this post. For him to be right, he needs to be right about all four points.

Other people can take more nuanced approaches. Anthony Watts only talked about Points 2 and 3. Zeke, while highlighting all four points, only discussed Point 4. Other people might only care about Point 1 because it has the biggest effect.

That’s all fine. We can talk about whichever points we want to talk about. It’s just important we try to understand which points each other are talking about.

July 5th Edit: I’ve corrected a mistake in this post where I described the blue line in Zeke’s graph incorrectly. The line has nothing to do with infilling data. It is what you get when you use anomalies and gridding to remove biases due to coverage issues.

Comments

  1. ‘We don’t want our results to have significant changes because of small amounts of missing data’

    Nasty and inconvenient and tiresome it may be, but if you ain’t got the data, you ain’t got it. The correct response is not to estimate/make up/fabricate/guess/work back from the desired answer, but to use better methods, more attuned to the real world and its (in)ability to make measurements.

    The current methods have a strong whiff of ‘cheating’ over them.

  2. @Brandon

    I deliberately left a choice of possibilities of motivations. But whyever it is done, data that didn’t previously exist (null) is brought into existence without making an observation of Nature.

  3. Well said, Brandon!

    Some time during the thread “The scientific method is at work (!!) on the USHCN temperature data set” my responses started going in to moderation, so I could no longer engage in dialogue. You too?

    I tried to say much the same about infilling here.

    I’ve noted that there’s nothing new in this fussing about adjustments increasing the trend. In 2002, skeptics Balling and Idso published in GRL a complaint about USHCN adjustments increasing the trend, saying:

    “It is noteworthy that while the various time series are highly correlated, the adjustments to the RAW record result in a significant warming signal in the record that approximates the widely-publicized 0.50°C increase in global temperatures over the past century.”

  4. Brandon,
    Thanks for this post. I haven’t been looking into this to-do, beyond reading the posts at Lucia’s, which cover (as you say) only point 4. This is very helpful to me to put things in context. [Not that I’m likely to look further into it, to be honest. But it helps to know what all the hubbub is about.]

  5. Nick Stokes
    July 2, 2014 at 3:35 am
    ———————————–
    And about Tom Karl’s pet rat TOBy?

    Every move you make Nicky. Every move you make ;-)

  6. Too few “data” points to make your claim. If I take your matrix and repeat it until I have several hundred lines, missing data make only very small differences. If I extend the repetition to 1220 lines the average is 3.0000 (I can assume any number of significant figures and decimal places) with 100% of the data. Deleting 100 lines (~8% of the data) I still get an average of 3.0000 on 1220 total lines. If I delete 200 lines the average is 2.9996685 or 99.9% of the all data value. If I replace the missing data by 0’s then the average is 2.5, but that was not your example. On a large data set a few missing values make little or no difference.

  7. Sorry but if you don’t have data for stations you can’t just make it up by averaging other stations that may be hundreds of miles away and pretend that you are getting a valid result. The simple fact that both Anthony and Goddard agree on is that the hockey stick shape is WRONG. Whether it is because of incentives to give the government data that supports its narrative, which is my belief based on simple reason that takes into account human nature, the choice of an inappropriate method (or just incompetence), which is Anthony’s claim, or outright fabrication for ideological reasons, which seems to be the Goddard claim, is irrelevant. The trend is WRONG and any analysis based on that trend is not scientific. Had I done to the data what these scientists have done, I would have been dismissed from my engineering and science courses for misconduct. I do not see why ‘scientists’ get a free pass when they do the same.

  8. Latimer Alder, data is “brought into existence without making an observation of Nature” all the time. It’s called data processing. There’s nothing mathematically inappropriate about this. It’s just interpolating data. People do it all the time. It is a little weird to interpolate data prior to smoothing (averaging), but there are plenty of legitimate examples of it being done.

    Nick Stokes, thanks! And yeah, my comments started going into moderation. I’m still annoyed about that. I wrote a post because Anthony Watts resorted to what, as far as I can see, was blatant censorship because I disagreed with him. Interestingly, that was the first time I’ve (publicly) disagreed with him. How did I wind up getting censored the one time I disagreed?

    On the issue of “fussing about adjustments,” you’re right the general point is old, but that’s because the general point hasn’t been fully resolved. There are still open questions, so of course people will keep talking about the same topic. The trick is those open questions are nowhere near as large as some people make them out to be. As they’ve been partially resolved, people should have toned down their positions. They often haven’t.

    HaroldW, glad to. And yeah, I didn’t write this for people interested in examining this issue in detail. They shouldn’t need a post like this. This post is just for people who are seeing all the rhetoric and not having a way to know what the rhetoric is all about.

  9. Jonathon Abott, mwgrant, I’m glad to know it turned out well!

    Bob Greene, I don’t agree with your claim. While it’s true deleting a relatively small subset of your data doesn’t have to bias your results, that doesn’t mean doing so cannot bias your results. It depends on a variety of factors. One of the biggest is the size of your values. Using the numbers 1-5 makes it harder to bias your results because they’re so small. Use numbers like 40-90, and the effect manifests much more easily. (Also, the effect tends to require a non-random subset be removed. Removing a random one will generally not bias the results.)
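
    If anyone wants to see the difference for themselves, here’s a quick sketch with made-up numbers (purely illustrative):

    import random
    import statistics

    random.seed(0)
    readings = [random.uniform(40, 90) for _ in range(1000)]  # made-up values
    print(statistics.mean(readings))                  # the "true" average

    # Drop 200 values at random: the average barely moves.
    print(statistics.mean(random.sample(readings, 800)))

    # Drop the 200 coldest values instead: the average jumps noticeably.
    print(statistics.mean(sorted(readings)[200:]))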

    bit chilly, I wrote this post at about two in the morning because I was tired of the rhetoric and I wanted to make things clearer as soon as possible. I stayed up for a while after it to respond to people, but I went to bed before Bob Greene commented. I think I can be forgiven for that :P

    Plus, you asked that question only 24 minutes after Greene commented. That’s not exactly a lot of time to respond!

  10. adjustments should be trend neutral, because the errors are randomly distributed. the one exception to this would be urbanization and land use. however, this is not what we see. the adjustments themselves have introduced a trend, which indicates they are mathematically incorrect. the urbanization trend in the adjustments should be to cool the present (or warm the past). the opposite of what the adjustments are actually doing.

  11. The point made by Bob Greene is also made by Goddard in one of his most recent posts. Goddard writes
    “With a database that large, the distribution of error will be uniformly distributed between “too low” and “too high” and average out to zero.”
    If the dataset is very large and the missing data occurs randomly, that would probably be true.

    But he then blows up his own argument by saying “..when station loss is biased towards loss of colder rural stations.” If the gaps in the data occur more for colder stations (or warmer) then there would be a net effect.

  12. “If you averaged these five lines together, you’d get 3, 3, 3, 3.25, 3.25. If the data had been missing from the fourth line instead of the second, you’d get 3, 3, 3, 2.75, 2.75. Obviously, there’s a problem.”

    why is it an ‘instead of’ situation? Isn’t there so much data, missing from many places, they should cancel out?

    That’s how it seems to me,

    And suppose a station existed 1900-1995 and consistently was slightly hotter than 2 nearby stations. Meanwhile the entire area warms during this period. The problem with your recommended method is there are scenarios where the station was hotter because of a multi-decadal but not a permanent trend – in those scenarios the station, if it still existed, would *no longer* be reading hotter than the two nearby stations.

    Anyway, IMO, the whole argument is moot, reading the earth’s temperature from surface stations is just dumb. way, way, way too problematic. Would you want a doctor taking your temperature by sticking thermometer in each ear, plus some in between your toes and then applying a sampling technique? .. of course not.

  13. See Steirou and Koutsoyiannis from the EGU meeting 2012. Available at itia.ntua.gr/1212.
    They show conclusively two things concerning (1) which is called ‘homogenization’. First, that the past is being cooled unjustifiably, far more than TOBS requires, and the opposite of what UHI correction requires (NCDC USHCN v1 published a TOBS average of +0.3F for a 2.5×3.5 grid). Second, based on a global sample of 163 stations, that the net is definitely warm biased. They called for rethinking homogenization.

    The present practices are questionable. For an example of ‘automatic’ data quality control that produces a warming trend where none exists in the ‘raw’ data, see BEST station 166900 on their website. 26 monthly lows were rejected based on the BEST ‘regional expectation’. 166900 is the scientific research base Amundsen-Scott at the South Pole! If any station anywhere does not have quality problems, it is this one. It is only an example. But it only takes one example to falsify their entire process. Done.

  14. I just love this simplified explanation.
    It is so simplified it has lost the whole essence of what Goddard, Watts and many others have all found. They are not only talking about a few stations or a few values or just One adjustment, and You cannot take the first 3 points in isolation because they are all happening at the same time to the same data.
    There are whole swathes of Estimated values, where the “local” stations are also Estimated and the values change on a Daily basis.
    There are Estimated values where there is no missing data.
    Brandon, I challenge you to look at the data yourself before making such “simplified”, “Sweeping” and derogatory statements.
    Look at the data and then justify what you find with what you have just written.

  15. The good old climate scientists’ standby, making stuff up, producing all sorts of sophistry to justify it, circling the wagons, then shouting down and destroying the reputation of anyone who has the temerity to question it.

    And then they complain that the public don’t take them as seriously as they would like…

    You couldn’t make it up!

  16. Dear lord. I keep forgetting how creepy that emoticon is so I use it again and scare myself. I swear that thing is trying to consume my soul.

    Anyway… ferdberple, you are wrong when you say:

    adjustments should be trend neutral, because the errors are randomly distributed. the one exception to this would be urbanization and land use.

    I can think of at least half a dozen other types of systematic errors in the data which are not randomly distributed. For example, in the USHCN data set, it is trivially easy to see station drop outs are not randomly distributed. That means their errors will not be trend neutral.

    Paul Matthews, I saw that too. It’s pretty funny. I think it’s worse than you say though. The full quote is:

    Infilling is exactly the wrong thing to do, when station loss is biased towards loss of colder rural stations.

    That first clause is important. According to it, infilling is wrong “when station loss is biased towards loss of colder rural stations.” Apparently it wouldn’t be wrong if station loss were biased toward loss of warmer, urban stations?

    Shub Niggurath, Steven Mosher, thanks!

  17. Rud Istvan, there are many different topics that can be discussed. This post isn’t trying to examine them all. You may be right about issues regarding Point 1, but there’s too much other stuff being discussed for me to want to delve into those issues right now.

    A C Osborn, if you think I got something wrong, I’d suggest you quote what I said and explain how it is wrong. You just look silly when you tell people you disagree with to “look at the data yourself.” A lot of people have looked at the data for themselves. They don’t all agree with you.

    catweazle666, your comment amuses me. I wouldn’t have written this post had I not been censored at WUWT when I tried to clarify what the different issues were. After I was censored, Anthony Watts said untrue things about me to paint me in a negative light, to the point where he simply fabricated claims (even about our personal communication). If the good old climate scientists’ standby is:

    making stuff up, producing all sorts of sophistry to justify it, circling the wagons, then shouting down and destroying the reputation of anyone who has the temerity to question it.

    You should probably go bother Watts for turning traitor and becoming one of “them.”

  18. Brandon, my appreciation was in sarcasm. Your post utilizes circular reasoning and is probably ok for little data sets whose properties are known (’cause you made them up yourself).

  19. Oh, sorry for not realizing that, Shub Niggurath.

    If I may suggest something, you won’t convince people by being sarcastic and vague. A far better way to convince people is to quote what I said and explain why you think it’s wrong.

  20. Brandon,

    To quote Edward Everett after Lincoln’s Gettysburg Address – Everett said, “I should be glad, if I could flatter myself that I came as near to the central idea of the occasion, in two hours, as you did in two minutes.”

    You have essentially done the same with your 4 points. I am greatly amused by the comments on Climate Etc. that are wildly “off topic” and the lack of “homework” that any of the bloggers have done with respect to QC methods. All of them are published.

    Please continue your excellent work.

    Philbert

  21. Look at the problem in front of you and work toward a solution. Your example is afflicted by a case of ‘engineer brain’ – i.e., it works nicely for the example in front of you which incidentally you happened to make up.

    If a network of stations are this similar to one another in a climate field such that values from neighbouring stations represent each other well, they are already completely redundant. They might as well be represented by a single station. If adjacent stations are far enough from each other to require all their data to calculate the climate field, then you cannot be filling in values the way you have done.

  22. If you “create” data by interpolation then that’s not data – it’s a result of calculation from other data. So presenting it as data is dishonest. More importantly, the observed data has some error margin associated with it. Numbers derived from calculations, such as interpolation, have even greater error margins associated with them. I was trained to never provide data without the associated error and to then provide proper error analysis on any results I calculate from that data. I suspect if that was properly done in climate analysis, everyone would have to admit that the observed data and required calculations result in too much error to make a definitive statement. Unfortunately the CAGW supporters have put themselves so far out on the limb with their positions, they can never admit that.

  23. Shub, there are other techniques you can use when the stations are more widely separated. For the USHCN, your concern about infilling isn’t an issue. It seems that most people who read Brandon’s summary did think it was pretty clear.

  24. Shub Niggurath, your complaint is off base. You suggest if this infilling is appropriate, many “stations are already completely redundant.” The reality is we know you wouldn’t need 1218 temperature stations to get a reasonably accurate record of temperatures in the United States. A fraction of that number would be sufficient. As such, many stations are redundant to a large extent. They exist merely to increase the precision, not the accuracy, of the results.

    Additionally, nobody has suggested infilling should be done “the way [I] have done.” I merely gave a toy example. The process used in creating the USHCN record is more complicated. It’s also more appropriate for the problem at hand. Quite frankly, if you’d take your own advice:

    Look at the problem in front of you and work toward a solution.

    You’d find infilling data has little effect on the results, as a matter of definition. Mathematically, it’s nothing more than interpolating data as a way of spatially weighting the data set. Nobody would have cared if they used a different process to accomplish this, even if the processes were structurally identical.

  25. Patrick B:

    If you “create” data by interpolation then that’s not data – it’s a result of calculation from other data.

    No, it’s data.

    It’s computational data rather than observational data, but still data.

  26. Patrick B July 2, 2014 at 2:59 pm
    “If you “create” data by interpolation then that’s not data – it’s a result of calculation from other data. So presenting it as data is dishonest.”

    They aren’t presenting it as data. They are barely presenting it at all. Folks like Goddard have chosen to dig into large numerical files that they find on the internet that are clearly labelled “adjusted”. And they dig out numbers that were clearly labelled with an E – estimated. Nobody seems interested in the other file that is also there, labelled unadjusted.

    Data is adjusted for a clear purpose – to calculate an average with reduced bias. If you aren’t doing that, leave it alone.

  27. Carrick, not only that, but I believe the USHCN flags all of its infilled data as estimated. It would certainly be worth examining how the USHCN’s approach affects the uncertainty in their results, but there’s no reason to expect this infilling to matter much on the scales people are talking about.

    What I’d like to know is how this approach affects their ability to estimate regional temperatures. As you know, one of the problems I have with BEST is it appears to smear signals out a great distance from their origin. I think that’d make it interesting to compare USHCN and BEST. It’d be pretty funny if the effect of this infilling was smaller than a similar effect in BEST that nobody paid attention to.

  28. Armando, I don’t find examining extremes like that very interesting. By its very nature, calculating empirical breakpoints by comparing stations to nearby stations will be less effective in areas with less data. There’s obviously a problem in that example, but it just reeks of cherry-picking.

    I prefer examining more middle of the road cases, and looking at them in more detail. I started down that road in some posts a couple months ago, but I got sidetracked by other topics. You can see a bit in these two links:

    https://hiizuru.wordpress.com/2014/04/30/a-small-challenge/
    https://hiizuru.wordpress.com/2014/05/03/a-small-answer/

    The comments on the first of those are worth reading as well. Also, to avoid cherry-picking, it’s important to try to see how prevalent the problems are. Problems in .1% of the data aren’t too significant, but if you see them in 10% of the data, they’re probably fairly serious. As such, I recommend looking at data on larger scales. I started by looking at the stations of Illinois (where I happen to live) some time back:

    https://hiizuru.wordpress.com/2013/12/19/illinois-sucks-at-measuring-temperature/

    There’s a lot more I intend to say about BEST’s breakpoint system, but I haven’t gotten around to that yet. Still, I think those should provide a better starting point than a link to a single station.

  29. At once, infilling temperatures has little impact and is a necessity to reduce bias.

    Carrick, you say my concern about infilling is ‘not an issue’. What have you taken to be my concern? Brandon’s artificial example itself proves infilling is unnecessary. Station 4 and 5 have data points missing? Throw out 4, 5, and even 1 and 2 and obtain an average of station 3. It gives the same answer as Brandon’s. Why is this not an alternative?

    For every field with missing (temperature) values, the data can be tested to find out if infilling does not change the summary statistic significantly, and if so, the incomplete data can be eliminated.

  30. Shub Niggurath, you should try to understand the people you criticize. You make it obvious you’re not when you say things like:

    At once, infilling temperatures has little impact and is a necessity to reduce bias.

    Infilling temperatures is not necessary to reduce bias. The bias in question only happens if you use a stupid methodology. If you use any appropriate methodology, that bias doesn’t happen. That’s why I said:

    Mathematically, it’s nothing more than interpolating data as a way of spatially weighting the data set.

    I obviously didn’t mean the stupid thing you portray me as meaning. What I meant is using this methodology instead of using some other methodology to weight the data has little effect. They’ll both have the same effect.

    To put it another way, this methodology has little effect other than to remove the biases introduced by stupid methodologies like Steven Goddard’s.
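
    For instance, go back to the toy example in the post and convert each station to anomalies from its own average. The bias from the missing data disappears without any infilling at all. A quick sketch with the same made-up numbers:

    import statistics

    stations = [
        [1, 1, 1, 1, 1],
        [2, 2, 2, None, None],
        [3, 3, 3, 3, 3],
        [4, 4, 4, 4, 4],
        [5, 5, 5, 5, 5],
    ]

    # Convert each station to anomalies from its own mean, using only the
    # values it actually reported.
    anoms = []
    for row in stations:
        base = statistics.mean(v for v in row if v is not None)
        anoms.append([None if v is None else v - base for v in row])

    # The column averages are 0 everywhere, with or without the missing
    # values, so the station dropout no longer biases the result.
    print([statistics.mean(row[c] for row in anoms if row[c] is not None)
           for c in range(5)])  # [0, 0, 0, 0, 0]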

  31. “Why has the E number climbed close to 40% of total data?”

    More reckless exaggeration. Back in early May, I listed the numbers that had reported so far in 2014. They were:
    891 for Jan, 883 for Feb, 883 for Mar, and 645 for Apr.
    They trickle in. 60% of 1218 is 731.

    But the thing is, if averaging absolute temperatures, you can’t easily introduce new stations. And volunteers wear out. The network was set up in 1987.

  32. The interesting thing to me in this is that the methodology used puts current temperatures above the temps in the 1930s. We know in the 30s the west went through a dust bowl. There was mass loss of vegetation and loss of animal life. All around the country temperatures were so blistering that to this day the records and heat waves are memorable and in the record books. Now we are told that for numerous years it’s been as warm as that time. But we don’t have a dust bowl. We don’t have long periods of 100+ degree weather all over the country breaking records. The 1930s records still stand in many cases. By adjusting the 1930s down maybe these records are broken but it seems unreal. It was clearly really hot back then to the point people were dying and animals were dying and vegetation was dying. There is no such thing happening today. How can it be as hot?

    It does seem as if AGW’ers are trying to rewrite history. Like they tried to do with the MWP and LIA. I talked to the head of the Lawrence Livermore climate modeling project when he gave a lecture in my global warming class at Stanford and he said the LIA and MWP didn’t happen. He said that to my face. We have proof that for hundreds of years large portions of the world were hotter than at any time, maybe even today’s temperatures, and similarly for hundreds of years temperatures were lower. We have proof of this. We have drawings artists made of the Thames frozen over in London and cows being barbequed in parks. We have evidence of the Vikings conquering large parts of Europe. We know they were not covered in ice. Yet today Greenland is covered in ice but we are told today is warmer than then. We are told that for hundreds of years the northern hemisphere or maybe just part of Europe was hotter than the rest of the world at the time. How is that possible? Are there any other examples of regions of the world diverging from world temperatures for hundreds of years? Where is the explanation for such a bizarre thing to happen? Why would a climate “scientist” believe that without any reason to say why he believed it?

    I don’t believe it is hotter today than it was in 1936. I believe if you asked someone who lived at that time in America in those places and experienced those heat waves they would DISAGREE that today is hotter. The adjustments to the data are as great as the entire magnitude of the global warming signal. The certainty of those estimates, while they may be sensible and even somewhat accurate, cannot be nearly as accurate as the actual data; therefore the lack of data introduces uncertainty. The fact you can average in a value does NOT mean that you’ve added any information into the system. The point you neglected to mention in your analysis above is that the 3.25 numbers are AS ACCURATE as the 3 numbers because the uncertainty in the result from the lacking data makes any better number impossible. With the data missing you don’t know what it was. If the number in that slot was 4 instead of 2 as you “guessed” then the 3.25 number would have BEEN MORE CORRECT. You knew in advance the number was 2 so you make it seem like your 3 guess is better than Steve’s 3.25 number but if the data is missing we don’t know if Steve’s number is not the more accurate number.

    You guys are just as deceptive as Steve was being. The whole thing is a political mess with both sides using deception and neither being completely truthful or scientific. It’s a sad state of affairs.

  33. logiclogiclogic, I was going to write a lengthy comment disagreeing with your depiction of things related to temperatures, but then I got to this part:

    You guys are just as deceptive as Steve was being. The whole thing is a political mess with both sides using deception and neither being completely truthful or scientific.

    Clearly, nothing I say could possibly change your mind.

  34. If point 3 – the infilling for stations now closed – is potentially the biggest source of divergence between the raw data, then why not check this out?
    There are various simple audit checks that can be made that would indicate bias, but might not measure the full extent. For instance, why not separate the 1218 stations between those that are closed since the database was established and those that are open? Then, for each station, create a temperature anomaly, with the average for a particular year equal to zero. After indexing, average the anomalies for the total, the currently open stations, and the now closed stations. Then calculate the OLS trends for a period up to when all 1218 stations were active for each of the 3 groups (all, still open, now closed) then OLS trends for the period since.
    As an illustration, assume all 1218 stations were open in 2005, and had been reporting since 1976 – the start of the late twentieth century global warming period. The sets of OLS trends for the three groups of stations (total, still reporting in 2014, not reporting in 2014) should differ from each other in much the same way pre-2005 as post-2005. If they do not, the allegations of bias might have some foundation.
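
    Something like this rough sketch is what I have in mind (hypothetical column names, and using each station’s own mean as the anomaly baseline, which should not change the trends):

    import numpy as np
    import pandas as pd

    def audit(df, split_year=2005):
        # df is assumed to have one row per station-year with columns
        # 'station', 'year', 'temp' and a boolean 'still_open'.
        df = df.copy()
        df['anom'] = df['temp'] - df.groupby('station')['temp'].transform('mean')

        def trend(sub):
            yearly = sub.groupby('year')['anom'].mean()
            return np.polyfit(yearly.index, yearly.values, 1)[0]  # degrees per year

        for label, sub in [('all', df),
                           ('still open', df[df['still_open']]),
                           ('now closed', df[~df['still_open']])]:
            print(label,
                  trend(sub[sub['year'] <= split_year]),
                  trend(sub[sub['year'] > split_year]))
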
    Further, one could break down the data into states, and do something a bit more sophisticated than Paul Homewood did for Alabama and Kansas.
    See http://notalotofpeopleknowthat.wordpress.com/2014/07/01/temperature-adjustments-in-alabama-2/
    and http://notalotofpeopleknowthat.wordpress.com/2014/06/30/kansas-temperature-trends/

    As an aside, does anybody know how to convert a .txt.gz file into a .txt file (or even a .xls or .xlsx file)? I am interested in trying a few of these checks for myself.

  35. manicbeancounter:

    If point 3 – the infilling for stations now closed – is potentially the biggest source of divergence between the raw data, then why not check this out?

    It was never potentially that, but it has been checked. This post by Zeke from a month ago shows a number of ways of checking the effect. The last graph is particularly good, showing using infilled stations gives results practically indistinguishable from using only non-infilled ones.

    As an aside, does anybody know how to convert a .txt.gz file into a .txt file (or even a .xls or .xlsx file)? I am interested in trying a few of these checks for myself.

    .gz files are just compressed files. You can find any number of programs which will decompress them online (I think I use 7-Zip at the moment). The software should come with instructions, but it should basically just have you select the compressed file and tell it where you want the decompressed version stored.
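
    If you’d rather not install anything, a few lines of Python will also do it (adjust the file names to whatever you downloaded):

    import gzip
    import shutil

    # Decompress a .txt.gz file (hypothetical file name) into a plain .txt file.
    with gzip.open('ushcn_data.txt.gz', 'rb') as f_in:
        with open('ushcn_data.txt', 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)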

    By the way, it’s worth pointing out something:

    Further, one could break down the data into states, and do something a bit more sophisticated than Paul Homewood did for Alabama and Kansas.

    Those links don’t show any test of infilled data. What they show are the effects of the adjustments made in Point 1. Those adjustments do matter quite a bit for the results. The infilling doesn’t.

  36. Brandon wrote – The answer is USHCN’s methodology doesn’t handle missing data well.

    This was the point I really took from the post because I think that’s accurate. The example makes it clear that the output from NCDC depends quite a bit on which stations get closed and how the data is infilled. If it’s the colder stations (urban) that are disappearing and the estimated numbers are being pulled from comparatively warmer stations then that’s obviously a problem, particularly when Anthony’s surface stations project shows that most sites run hot due to micro site issues. I’m not a stats wiz but it would seem that Goddard’s averaging method seems to show that it is in fact the colder stations that are disappearing and until someone does a detailed analysis posts like this are simply conjecture not based on a lot of facts.

    I’ll have to admit I don’t trust government numbers on anything so I’m probably biased but I think quite a few people go the other way and are too trusting.

  37. Brandon, I am sorry you are having a fight with Anthony due to a disagreement about the importance of the data alterations going on.
    1218 stations set up as an ongoing permanent base.
    Stations have been lost, closed and replaced.
    Zeke has suggested to S G that there may be only 650 original stations. 12/5/2014.
    Nick’s figures suggest there may be data from up to 833 real, not all original, stations.
    No one has said how many real, how many replaced and how many made up stations there are though this figure must be known.
    Could you find out with your contacts? If there are only 833, and some of these are also having days of infilling when they break down, the actual amount of dummy stations per month would be over 40 percent.
    No matter how you slice it this is fudge territory.
    Furthermore, Zeke went on to say that the current data is adjusted upwards if some poor station has the temerity to be out of line too much (a breakpoint).
    And is always being adjusted.
    He further said that these measurements are then taken as correct and all past records are adjusted downwards back to 1900.
    Nick cheats semantically by saying the data is unchanged when what you and I and SG et al are talking about are the adjusted records from this data.
    He knows very well that the adjustments are presented in the records as what people think of as real data.
    The effect of lowering the past means that the very things we all argue about, such as the highest year on record, change in data sets and then sometimes change back.
    How ridiculous.

  38. Bob Johnston, you kind of have things backwards. The results of Steven Goddard’s methodology do “depend[] quite a bit on which stations get closed,” but the results of the USHCN’s methodology do not. That’s why, as weird and unintuitive as their methodology is, it’s still fine. It’s more inefficient than anything else.

    Angech, sadly, the dispute Anthony Watts and I have had has pretty much nothing to do with “the importance of the data alterations going on.” The primary problem is his behavior was inexcusably bad (misrepresenting our private communication, misrepresenting the PolitiFact article, using petty behavior and censorship to shut down discussions, etc.). I criticized him for saying untrue and/or incoherent things, not for holding a position which is wrong.

    The reality is Watts knows, or at least should know, infilling station data does not introduce any notable biases or errors. Similarly, he knows (or should know) the USHCN warming trend is due primarily to adjustments unrelated to the infilling of data. There’s no technical dispute on either of those points. Everyone should be able to agree to them.

    The only legitimate disagreement I know of is on one of two points: 1) Are the adjustments I referenced in Point 1, which are the primary cause of the warming trend in the USHCN data set, done correctly? 2) Is changing estimates of past temperatures on a regular basis bad? I believe Watts says no to the former and yes to the latter. I say I don’t know to the former, and sort of to the latter.

    (Watts’s rhetoric regarding the latter is greatly exaggerated. The fact past temperatures change is undesirable, but it is not inherently wrong. What is wrong is the lack of documentation tracking these changes for verification purposes. Put simply, you shouldn’t overwrite old results with new ones. You should just store the new results in a new file.)

  39. Just interjecting a comment here… what you really want to do to compute the mean temperature of the CONUS is perform an integral over the temperature field (a continuous variable) and divide it by the area of the CONUS.

    Because there is a high degree of correlation that diminishes with distance (say averaged over five days), the Nyquist Sampling Theorem guarantees that you only need to sample the US with a fairly coarse spatial sampling in order to recover all of the information that is present (modulo microsite variability).

    That said, the way NCDC is computing the average is a bit goofy, because if you don’t have uniformly spaced stations, you’ll weight more heavily regions that have a high density of stations and underweight regions with a low density of stations. In other words, doing what NCDC is doing will result in a net bias in your estimate of the mean temperature of the CONUS.

    So when Brandon says you don’t need to use infilling to reduce bias, he is correct. If you do a correctly area-weighted average, by whatever method, it won’t matter whether you have stations dropping out over time, as long as you’ve met the minimum spatial resolution needed to fully sample the averaged temperature field.

    But if what you want to do is keep the same spatial weighting as used in the original average of the 1218 stations, you will either need to infill or something equivalent to infilling. Otherwise you will produce a geographic bias in your data over time, as the number of stations decreases over time.

  40. Shub Niggurath, if you want to know how the missing values in my examples were handled, I suggest reading how the post says they were handled. Then, if what it says isn’t clear to you, quote the portion that is unclear.

    Carrick, aye. You can easily create a two dimensional analogy of this. All you have to do is take a time series where data is missing. The NCDC methodology would interpolate that data so you had data for each point (analogous to infilling data), then it’d smooth the data (analogous to averaging series). There’s nothing inherently wrong with that.

    It’s true you could solve the same problem by using a smoothing algorithm which can work despite missing values. It’s even true such an algorithm would likely be preferable. That doesn’t mean it is necessary though.

    (If you want to extend the analogy to include spatial weighting, allow the measurement periods to be uneven. Lots of smoothing functions can’t account for that, and I’ve seen plenty of cases where people interpolated data before smoothing because of that.)
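
    Here is a tiny sketch of that analogy, with a made-up series (it has nothing to do with the actual NCDC code):

    import numpy as np

    # A made-up series with a few gaps, standing in for missing station data.
    t = np.arange(20.0)
    y = np.sin(t / 3.0)
    y[[5, 6, 12]] = np.nan

    # Step 1: interpolate across the gaps (analogous to infilling).
    good = ~np.isnan(y)
    y_filled = np.interp(t, t[good], y[good])

    # Step 2: smooth the filled series (analogous to averaging the stations).
    smoothed = np.convolve(y_filled, np.ones(5) / 5, mode='valid')
    print(smoothed)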

  41. “That said, the way NCDC is computing the average is a bit goofy, because if you don’t have uniformly spaced stations, you’ll weight more heavily regions that have a high density of stations and underweight regions with a low density of stations.”
    I’m pretty sure USHCN uses gridding. Menne uses a 1/4° grid in one of his papers.

  42. When in doubt, use PRISM to determine the long-term spatial averages of temperature AND precipitation. That group took some very innovative approaches to work with “imperfect data.”

  43. I didn’t write this for people interested in examining this issue in detail. They shouldn’t need a post like this. This post is just for people who are seeing all the rhetoric and not having a way to know what the rhetoric is all about.

    Job done in this case. Thanks Brandon.

  44. Nick Stokes:

    I’m pretty sure USHCN uses gridding. Menne uses a 1/4° grid in one of his papers

    That’s even more curious.

    If you grid, when you have 1218 points to start with, I wonder what’s the point of infilling?

    Brandon, you can perform a DFT on irregularly spaced data. The only reason you’d ever interpolate is so that you can use e.g. an FFT-like algorithm. However, if you use polynomial interpolation, that smears the high-frequency content. It’s generally inferior to the non-uniform DFT in terms of SNR but is faster.

  45. The results of Steven Goddard’s methodology do “depend quite a bit on which stations get closed,” but the results of the USHCN’s methodology do not.

    I see, NCDC has devised a reliable system where they can take flawed data that doesn’t show what their boss wants to see, eliminate a bunch of actual data and put it through their black box, and the output, which shows exactly what their boss wants it to show, is now golden and above reproach.

    Thanks, got it.

  46. Brandon,
    You say in your post: “To address that, USHCN attempts to fill in the missing data before calculating its averages.”

    then you say: “Infilling temperatures is not necessary to reduce bias”

    then you say: “I just showed my post alludes to options other than infilling temperatures.”

    then you say: “if you want to know how the missing values in my examples were handled, I suggest reading how the post says they were handled.”

    Not interested in playing games. I take your position to mean filling in data is a necessary and valid step required during calculating averages from a field. This position is untenable for your own example because the assumption enabling you to put in the ‘2’s in place of the NAs is sufficient to discard the two stations and leave the outcome unaffected.

    In other words, this step is not universally applicable and where applicable, unnecessary.

  47. Richard Drake, I’m glad to hear it!

    Carrick, agreed. Most problems have many solutions. This is a case where the USHCN methodology may not be the best solution, but it is good enough that changing the methodology wouldn’t improve the results in a significant way. And of course, changing the methodology to one like Steven Goddard’s would just make the results worse.

    Armando, indeed. Also, thank you for using that emoticon rather than the one I’m pretty sure is trying to consume my soul (see example).

    Bob Johnston, that is nothing like what I said. It’s also not even remotely close to true. Leaving aside everything else, the indisputable reality is infilling data doesn’t affect the results anywhere near as much as other aspects of the USHCN methodology. The adjustments I referred to in my Point 1 have a far greater impact. A person could argue the infilling is perfectly correct yet those adjustments are fraudulent. But then, nuance is a matter for people who actually want to understand the subject they complain about.

    Shub Niggurath, since I’m long-past tired of dealing with the games you play, I’ll keep this simple:

    Not interested in playing games. I take your position to mean filling in data is a necessary and valid step required during calculating averages from a field. This position is untenable for your own example because the assumption enabling you to put in the ’2′s in place of the NAs is sufficient to discard the two stations and leave the outcome unaffected.

    You don’t understand my example. The station data in my example is listed horizontally. There is no way to read my description of the example as you’ve done. Your interpretation is so wrong as to be silly, and that is the entire basis of your claim. You’re clearly not trying to understand what you’re criticizing.

  48. Brandon, I do use spline interpolation sometimes simply because it’s easier to implement than Fourier interpolation. There are applications where it matters, but I am pretty sure that long-term trend estimation isn’t one of them.

  49. Carrick, that’s actually one of the things missing from discussions of temperature records which bugs me. People often say things “don’t matter” because they don’t have a significant effect on some OLS trend. I think that’s a strangely naive argument. If global warming is as serious a problem as many people claim, OLS trends aren’t all we should care about.

    I actually brought this up when Steven Mosher and I considered collaborating to examine UHI effects. His initial ideas for testing involved only looking at how UHI might affect long-term trends (or absolute differences between endpoints, he wasn’t consistent). I responded by saying we should look closer. I thought (and think) it’d be useful to check the effect of UHI affecting some months and/or years more than others. We know UHI’s impact depends upon things like season and rainfall. A year with severe droughts will not have the same UHI signal as one with record rainfalls. You can’t look for effects like that in linear trends.

    As a demonstration of why that could matter, suppose our temperature records show three years had identical temperatures, but rainfall amounts (and/or patterns) were different for each year. It’s possible those temperatures would have been different from one another if not for UHI. Similarly, it’s possible an anomalously warm year gets at least part of its anomalous warmth from anomalously high UHI. What if 1998’s spike was really 20% because of UHI?

    I’d like to know UHI doesn’t introduce systematic biases into the data. I’d also like to know such biases don’t get smeared around so much as to be unidentifiable (by the “corrections” done for UHI). I don’t think we can answer that issue by looking at impacts on some OLS regressions. Those OLS regressions may be enough for crude answers, but if global warming is a serious problem, crude answers aren’t enough.

    Plus, those OLS regressions are almost inevitably for large areas. They tell us next to nothing about effects at regional scales, scales which I think are the most important.

    So yeah, I’d like to know more about how USHCN’s infilling affects its ability to estimate temperatures on finer scales.

  50. Brandon Shollenberger
    Your response to me was exactly as I expected.
    Your Problem Solving skills are worthy of a Climate Scientist.

    Houston, we have a problem.

    Don’t worry, we have analysed what you have said and broken it down in to sections.
    Most of it is not important.
    We have made up some test numbers and run a simulation of the important part.
    The simulation with the made up numbers show that there is no problem other than the way that you have analysed the data.
    Apollo 13 carry on with the mission.

    Classic and priceless.
    And all without looking at a single actual data point.

    Stay with your adoring audience, you obviously have no intention of actually UNDERSTANDING the real problems with the data.

  51. A C Osborn, I can’t imagine what makes people post responses like yours. Even if you were right, you’d never convince anyone by behaving like you do. Posting comments solely to insult people while avoiding any sort of actual discussion just makes you look like a close-minded fool.

    And seriously, “adoring audience”? Half the people who’ve praised this post have had violent disagreements with me in the past. I’m sure a number of them hold not-too-kind views of me. Heck, at least one of them has long engaged in a smear campaign against me, going so far as to publicly say he believes I’d lie about anything to make myself appear correct. Never mind the fact I’ve publicly accused a large majority of the people who have praised this post of significant bias and worse, often specifically by name.

    If I have an “adoring audience,” I have no idea who is in it.

  52. Brandon:

    Carrick, that’s actually one of the things missing from discussions of temperature records which bugs me. People often say things “don’t matter” because they don’t have a significant effect on some OLS trend. I think that’s a strangely naive argument. If global warming is as serious a problem as many people claim, OLS trends aren’t all we should care about.

    To be clear, when one says something “doesn’t matter”, he/she/it/they must be specific about what it doesn’t matter with respect to. Saying it doesn’t matter for e.g. long term trends says nothing about the importance of the effect on any other problem.

    The problems I study are usually dominated by the low-frequency portion of the signal. It turns out that most of the error associated with cubic-splining is high-frequency, where you’re just staring at noise-floor in any case. For such problems the choice between cubic splining and Fourier interpolation is not going to have an appreciable effect on the measured SNR of your measurement. Like with trend estimates (which deal mostly with the low-frequency portion of the signal), improved sampling methods will just cost you additional processing time, with little benefit. (Since we have several GB of data per day of measurement, and as many as 45 days of measurements, how fast the processing is “does matter”.)

    You yourself suggested

    This is a case where the USHCN methodology may not be the best solution, but it is good enough that changing the methodology wouldn’t improve the results in a significant way.

    My point was “yes, this is probably true for OLS trend estimates”. If you are interested in the high-frequency portion of the signal, then a better interpolation scheme than rectangular gridding is likely going to yield improvements.

    So yeah, I’d like to know more about how USHCN’s infilling affects its ability to estimate temperatures on finer scales.

    Agreed. Just because Menne used gridding in certain applications tells me nothing about how the average CONUS temperature was derived as it pertains to the published NCDC data product.

    I would like to understand this better too.

  53. Carrick, popular arguments about global warming just focused on the low frequency content prior to the “pause.” People are discussing higher frequency components more now, but the mainstream position is basically that they don’t matter because only the lower frequency components do. I don’t agree with that, so I don’t agree when they say things “don’t matter” based upon it. So I agree with what you say:

    My point was “yes, this is probably true for OLS trend estimates”. If you are interested in the high-frequency portion of the signal, then a better interpolation scheme than rectangular gridding is likely going to yield improvements.

    I just think arguments should be clearer in their context. We have a ton of people arguing about the modern temperature record. They almost exclusively talk about low-frequency components, and they often don’t qualify their conclusions as such.

    (Since we have several GB of data per day of measurement, and as many as 45-days of measurements, how fast the processing is “does matter”.)

    I’m curious. What is it you guys are studying that involves collecting so much data? I’ve seen you make references to your work a number of times, but I don’t think I’ve ever really known what it is you’re working on.

  54. There are other points, call them 2.5 and 4.5.

    There is data that has been collected, but the process treats it as “missing” and in-fills. Such data may in fact be bad data (same station name, but a different station place, altitude, observation device), but the process seems to make the replacements wholesale. It is not clear to me that the infill estimate is “better” than the discarded data.

    The input of the originally collected (perhaps bad) data into Goddard’s process (wonky as it may be) generates a visually significant and distinct chart from input of the replaced, in-filled data. This suggests, but does not prove, that using the originally collected data in a CORRECTLY developed process may also generate a different chart than that produced by the estimated, in-filled data. It is not clear to me that a more correct, or less wonky, process is comparably sensitive.

  55. If I have an “adoring audience,” I have no idea who is in it.

    I don’t know about adoring, but this is one of the best skeptic blogs on my regular reading list and your analyses have me, a regular SkS reader, fairly convinced that Cook’s study is unsystematic mush (whereas I found Tol’s attempt at a takedown equally so). It helps that you’re clearly not a tribalist, while Watts unfortunately has those tendencies. I think it’s clear that he realized he was “losing the room” by criticizing Goddard’s claims and wanted to un-distance himself.

  56. Pouncer, averaging methods which aren’t terrible have been used without infilling data. Simply anomalizing the records prior to averaging them is enough to make the discrepancy between infilled and non-infilled much smaller. Do any sort of spatial weighting, and the difference is practically non-existent. One of my comments linked to a post by Zeke showing some comparisons like that. I’d repost it/go into more detail, but I’m at a BBQ and commenting from a phone is more difficult.

    QQ, I find that both delightful and disappointing. I’m happy about how you feel about this blog, but it worries me that it is basically a skeptical blog. I hadn’t intended for that to be the focus of it >.<

    (For what it's worth, I think you're probably near the mark on Anthony Watts. I get the impression skeptics banding together to push a single story was important to him. Ironically, he promoted this article on Twitter as being helpful. I'm not sure how he squares that praise with the fact it says he was wrong in his post about this topic.)

  57. Brandon, atmospheric physics related. E.g., we’ve put 130 digital (LINUX based) sensors into the field that were recording at 1000 sps. That works out to closer to 45 GB of data per day (which we archive), but we typically process about 10% of that. The relatively high sampling rate is because we’re measuring pressure signals from impulsive events (and you get nonlinear regeneration of high frequency components… “wave steepening” …) In practice our band of measurement is about 0.05 Hz-500 Hz.
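
    For scale, a quick back-of-the-envelope check (the roughly 4 bytes per archived sample is an assumption for illustration, not a number given above):

    sensors = 130            # field sensors
    rate = 1000              # samples per second per sensor
    bytes_per_sample = 4     # assumed storage size, e.g. 32-bit samples
    seconds_per_day = 86400

    gb_per_day = sensors * rate * seconds_per_day * bytes_per_sample / 1e9
    print(round(gb_per_day))  # about 45 GB/day, consistent with the figure above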

  58. Unfortunately your 5×5 matrix as an example of how missing data is filled in is missing the fact pointed out by Goddard that now over 40% of the data is being estimated. Using this 40% makes for a whole different story… Namely, you only show 2 out of 25 stations missing (8%). Also, you write “If you averaged these five lines together, you’d get 3, 3, 3, 3.25, 3.25. If the data had been missing from the second line instead of the fourth, you’d get 3, 3, 3, 2.75, 2.75.” This is incorrect too, since your raw, original, data doesn’t have that many decimals. It’s single digit… so you have to write 3, 3, 3, 3, 3, and 3, 3, 3, 3, 3…. Basic summary statistics…

    Your original matrix is like this (8% estimated data)

    1 1 1 1 1
    2 2 2 na na
    3 3 3 3 3
    4 4 4 4 4
    5 5 5 5 5

    using 40% missing stations, it is actually more realistic, like this

    1 na na 1 1 1 1 1 1 1
    2 2 2 na na 2 2 2 2 2
    na na 3 3 3 OR how about 3 3 3 3 3
    4 na 4 na 4 na na na na na
    na 5 na 5 5 na na na na na

    if we keep the na’s blank then the averages are 2, 4, 3, 3, 3, the first 2 values are off by 33%!, and 2,2,2,2,2 in the other case…. In addition, trying to estimate the na’s from neighboring stations becomes in some cases much harder since sometimes these neighbors are also missing…

    Hence, your example is over simplified and doesn’t fully explain the detail and depth of the enormous issue at hand.

  59. WordPress messed up my 2 matrices… so here they are:

    1 na na 1 1
    2 2 2 na na
    na na 3 3 3
    4 na 4 na 4
    na 5 na 5 5

    OR how about
    1 1 1 1 1
    2 2 2 2 2
    3 3 3 3 3
    na na na na na
    na na na na na

  60. Brandon Shollenberger: “That first clause is important. According to it, infilling is wrong “when station loss is biased towards loss of colder rural stations.” Apparently it wouldn’t be wrong if station loss were biased toward loss of warmer, urban stations?”

    Don’t be a dense jerk. Your entire post above skipped over this issue. At least from my readings, it is of central importance to Goddard’s claims. If there is going to be loss of stations, the loss better be a random sample. If the loss of stations is biased toward either cooling or warming, then the dataset is giving a false signal.

    A couple thoughts with questions:

    1) Goddard claims that the loss of rural station data is giving a false warming signal. Is that correct or not correct?

    2) If the people at NASA and NOAA are aware that the loss of rural stations gives a false warming signal, and they proceed with it anyway, why is that not rightfully considered “fabricating” data as Goddard claims?

    3) If the people at NASA and NOAA are NOT aware that the loss of rural stations gives a false warming signal, are they even competent enough for their position of employment?

    The above shouldn’t be difficult to understand or even controversial. Why you choose to ignore and downplay it raises suspicions in my mind. Makes me think you’ve got a dog in this race.

  61. Soulsurfer, you are doing exactly what Brandon is cautioning against, by asserting that since 40% dropped data distorts the averages, it must similarly distort the interpolations. Here is an example matrix with 2 SFs, multiple columns with double-missing neighbors, and a bias for “low” dropped data to generate false “high” averages.

    n/a n/a 1.0 1.0 1.0
    n/a 2.0 n/a n/a n/a
    3.0 n/a 3.0 n/a 3.0
    4.0 n/a 4.0 4.0 4.0
    5.0 5.0 n/a 5.0 5.0

    Raw averages: 4.0, 3.5, 2.7, 3.3, and 3.2.

    3.0 2.0 1.0 1.0 1.0
    3.0 2.0 2.0 2.5 2.0
    3.0 3.5 3.0 2.5 3.0
    4.0 3.5 4.0 4.0 4.0
    5.0 5.0 4.0 5.0 5.0

    Interp averages: 3.6, 3.2, 2.8, 3, and 3. Where data isn’t fixed entirely, it’s at least significantly closer to the “true” signal of 3.0, using the most lazy, bare-assed form of interpolation. The lingering problems are from edge values, where I just shrugged and duplicated. Ironically, this is actually due to the simplicity of the example; the impact of edge duplication in a 500×500 matrix would be minimal, and that’s assuming NOAA doesn’t have anything more sophisticated for coastal interpolation. Nevertheless, it’s still an improvement over raw data. Even with losses on a scale of 40%, interpolating is a better idea than raw averaging, especially in cases of non-random data loss.

    Obviously, this doesn’t scale to infinity; at some point station loss may reach a point where coverage becomes too thin to accurately interpolate. But the USHCN is a huge network with a lot of slack left in it, and we can verify its integrity through correlation with other station databases, such as the much larger BEST database.
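
    Here’s a minimal sketch of the same comparison (a simpler vertical nearest-neighbor rule than my hand-filled matrix above, so the numbers differ slightly, and obviously nothing like NOAA’s actual routine). The dropout is biased toward the low-value rows, yet the infilled averages land much closer to the true value of 3.0 than the raw averages do:

    import numpy as np

    # True field: station i always reads i+1, so the true average at every
    # time step is 3.0.
    truth = np.tile(np.arange(1.0, 6.0)[:, None], (1, 5))

    # Bias the dropout toward the low-value stations.
    data = truth.copy()
    data[0, :2] = np.nan
    data[1, 2:] = np.nan
    data[2, 1] = np.nan

    def raw_average(x):
        # Goddard-style: average whatever happens to be present.
        return np.nanmean(x, axis=0)

    def infill_vertical(x):
        # Lazy infill: fill each gap with the mean of the nearest reporting
        # stations above and below it in the same column. (This toy assumes
        # every column has at least one reporting station.)
        filled = x.copy()
        rows, cols = x.shape
        for j in range(cols):
            for i in range(rows):
                if np.isnan(x[i, j]):
                    above = [x[k, j] for k in range(i - 1, -1, -1) if not np.isnan(x[k, j])]
                    below = [x[k, j] for k in range(i + 1, rows) if not np.isnan(x[k, j])]
                    nearest = ([above[0]] if above else []) + ([below[0]] if below else [])
                    filled[i, j] = np.mean(nearest)
        return filled

    print(raw_average(data))                   # about [3.5, 3.67, 3.25, 3.25, 3.25]: all biased warm
    print(infill_vertical(data).mean(axis=0))  # [3.2, 3.2, 3.0, 3.0, 3.0]: much closer to 3.0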

  62. SDB, Shub Niggurath, you guys are funny. SDB says:

    Don’t be a dense jerk. Your entire post above skipped over this issue.

    Even though I’ve answered this point in comments multiple times. For example:

    The reality is Watts knows, or at least should know, infilling station data does not introduce any notable biases or errors. Similarly, he knows (or should know) the USHCN warming trend is due primarily to adjustments unrelated to the infilling of data. There’s no technical dispute on either of those points. Everyone should be able to agree to them.

    I could have been more explicit in my post in stating this, but I’ve clearly made no effort to hide from the issue. I’ve discussed it multiple times in response to people talking about it. As such, it’s silly to say things like:

    1) Goddard claims that the loss of rural station data is giving a false warming signal. Is that correct or not correct?

    The above shouldn’t be difficult to understand or even controversial. Why you choose to ignore and downplay it raises suspicions in my mind. Makes me think you’ve got an dog in this race.

    And:

    Hey, some good questions. Good luck getting an answer.

    But to be clear, I’ll repeat the same point I’ve made all along: The loss of rural station data gives a false warming signal only if you use Steven Goddard’s wonky methodology. Under the USHCN’s methodology, or any other sensible methodology, it does not.

  63. I should point out when soulsurfer says:

    Hence, your example is over simplified and doesn’t fully explain the detail and depth of the enormous issue at hand.

    He’s sort of right. I intentionally created an example as simple as I could. I never set out to “fully explain the detail and depth” of the issue. All I was doing was explaining, in simple terms, why a biasing effect in Steven Goddard’s methodology can exist. The point of toy examples like these is to help you understand a concept, not explain every detail.

  64. Brandon Shollenberger: “The loss of rural station data gives a false warming signal only if you use Steven Goddard’s wonky methodology. Under the USHCN’s methodology, or any other sensible methodology, it does not.”

    Please explain more clearly how that could be true. It seems like you are saying that if we were to take both a random sample of rural stations and a random sample of urban stations, there would be no statistically significant temperature difference between them. If that is true, ok, I concede the point. But I thought the urban heat island effect was not controversial?

    If we were to take a random sample of both rural and urban stations and the urban stations were significantly warmer, then how can you conclude “The loss of rural station data gives a false warming signal only if you use Steven Goddard’s wonky methodology. Under the USHCN’s methodology, or any other sensible methodology, it does not.” (given the loss of rural stations, being infilled using urban stations).

    I don’t understand your (lack of?) logic here. Please explain more clearly. Thank you.

  65. From my understanding, temperature readings are taken at midnight (?). Is there a particular reason for doing that, and does the influence of atmospheric thermoclines near the ground have any effect on these stations? Not being flippant but looking for more information. As someone who rides a motorcycle in the country, I can be enjoying a temperate 80°F breeze and then hit a thermocline where the temperature drops about 15°F for about five miles and then I exit out of it and back into the heat of the day. Same surroundings, same road, same amount of daylight hitting the surface, same topography, but for that five-mile stretch it feels like I’ve biked into Fall weather.

  66. QQ, nice example, but why stop there?

    Look at this matrix:

    NA NA NA NA 1
    2 NA NA NA NA
    NA NA 3 NA NA
    NA NA NA 4 NA
    NA 5 NA NA NA

    80% missing data and still gives the ‘right’ answer without infilling!

  67. SDB, Here’s a mathematical proof that Goddard’s method produces an artifact. I wouldn’t pay any attention to Shub here because he’s just being a contrarian and I think he’s smart enough to realize what he is arguing is crap.

  68. Carrick, this is a bit …tricky. You (and others) are pulling an old trick which was applied to McIntyre. As if McIntyre was supposed to come up with a valid alternate reconstruction.

    My position is clear: infilling is what it is. (1) where is the justification for its necessity outside of concocted examples, and (2) how is infilling justifiable given the purpose of measurement is hypothesis testing.

    Mind you, (1) asks about necessity given the examples; it does not question validity. That is a separate issue.

  69. QQ:
    “Soulsurfer, you are doing exactly what Brandon is cautioning against, by asserting that since 40% dropped data distorts the averages, it must similarly distort the interpolations.”

    The problem is that Brandon is cautioning against dropping data by claiming that interpolations have no systematic error while Goddard argues that they do. The simple cherry-picked examples in this blog post (and your comment as well) are very special ones – ones that work out perfectly (blog post) or really well (your comment) with interpolation to give the right answer. That’s a bit of confirmation bias, and Soulsurfer has come up with an example of data that would cause a systematic error if you were to use interpolation. The crux of the issue is that people like Goddard think these corrections are introducing a systematic error – one that causes the temperature data to be cooled more the farther back you go. He argues that the data processing is dropping cold temperatures and interpolating them as warmer ones. If you notice, your data shows the same trends as the OP – the numbers tend to be in increasing order and close to nearby ones. What if you’re dropping a rural station and interpolating between two cities near it? You get this:

    5 (n/a) 5

    That turns into an average of 5. What if the station that was dropped was experiencing a 3? You get artificial warming, and that’s the type of systematic error that is the problem with interpolation, and also a problem with removing rural reporting stations. The actual average there should be 4.3333 but both averaging and interpolation give 5.

  70. Oh come on.

    There’s absolutely nothing at all tricky about this. As I’ve pointed out, there is an error in Goddard’s method, it can be fully identified by analytic methods. This has absolutely nothing remotely in common with demanding that McIntyre come up with a valid alternative reconstruction.

    This whole argument boils down to the question of “how to take the average of a continuous field over a surface”. To me this seems like a really trivial point. There are some technical niceties about better and worse ways of computing that average, but you have shown yourself intelligent enough to be able to sit down and reason this out on your own.

    So you should do it.

    Infilling is one approach. As was discussed above there are other, superior methods.

    But treating data that have flaws (e.g., missing data points) as if they are error free and expecting the results to be meaningful is just foolish.

    Unlike with McIntyre, nobody’s asking Goddard to come up with a better method. Many of us don’t think he’s even capable of it.

    But we already understand the nature of the flaw and know how to correct for it. We don’t need Goddard’s help. We just need people to be more intellectually honest about it when laymen make processing errors, as Goddard has done.

  71. “Many of us don’t think he’s even capable of it.”

    So….why talk about it?!

    This is incredible.

    All that has bothered me are: (a) which sane data processing method requires 40% data to be synthetic? (b) How can infilling not be circular when the objective is estimation itself?

    And, if missing data is such a simple thing, why does each iteration keep reducing past temperatures little by little? It should be a one-time thing?

    Infilling isn’t any more circular than averaging. The only difference between the two is that averaging, functionally speaking, replaces any missing values with the mean value for the entire day’s record, while infilling replaces them with the mean of only the nearest available neighbors. Yes, if the record looked like this:

    5 5 X 5 5

    and the station that was dropped was 3, then infilling would be inaccurate. Similarly, soulsurfer’s second exemplar, where the 40% is dropped evenly from the bottom two rows of the matrix, is impossible to cure with infilling. However, not only are those two examples more contrived than the one I put forth (one relies on outliers, the other on edge cases whose effects are muted in a larger matrix), averaging doesn’t do any better. You’re suggesting scientists throw up their hands and go “well, no point in using this clearly superior method because it’s not perfect, better use the inferior one” for no good reason.

    This is why Goddard’s conflation of the separate points is so ridiculously self-contradictory and serves no purpose except to maximize tribal shit-flinging at “warmists.” If the rural station loss is as bad as Goddard claims (Issue 3), averaging is clearly a worse method. If that data is gone, you want it to be interpolated from other rural stations, not replaced by a mean comprising an ever-increasing portion of urban stations. And in fact, the recent rural station dropout is why Brandon’s graph in the Irony post shows that post-2012 adjustments have largely been in the direction of cooling. That graph Goddard spams in every post has nothing to do with station dropout; it is a function of the TOBS, MMTS, etc., adjustments (Issue 1).
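
    To make the “functional” equivalence I described above concrete, here’s a two-line check (a toy sketch, nothing to do with USHCN’s actual code):

    import numpy as np

    col = np.array([5.0, 5.0, np.nan, 5.0, 5.0])

    # Averaging that simply skips the gap ...
    skipped = np.nanmean(col)

    # ... gives exactly the same answer as filling the gap with the mean of
    # the values that are left and then averaging everything.
    filled = np.where(np.isnan(col), np.nanmean(col), col)

    print(skipped, filled.mean())   # 5.0 5.0 -- identical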

  73. qq, you forget the alternative – use only good stations.

    And it is a one-time thing. Why should infilling change past temperature every 5 years?

    Plus, I appreciate people who do the hard work. Anyone can come up with contrived examples that prove their own points.

  74. Shub:

    So….why talk about it?!

    This is incredible.

    Look, if Goddard makes a huge gaffe in his analysis, as he’s done, of course we should “talk” about it. We can talk about what the error is, about improved methodologies, philosophy of measurement etc.

    All without expecting a word of it to make a dent on his overly thick skull. Or anybody else who’s decided to try on a thick skull of their own for size.

    All that has bothered me are: (a) which sane data processing method requires 40% data to be synthetic? (b) How can infilling not be circular when the objective is estimation itself?

    Simple enough. If the data are heavily oversampled, you can drop some of the samples without it affecting the result, as long as you account for the effect of the dropped samples on the altered geographic weighting of the measurements.

    Infilling works (though it’s not my preference) because it preserves the original geographical weighting; anomalization helps, gridding is better, and Fourier interpolation schemes are probably “best”. The only approach that is demonstrably “just wrong” is ignoring the fact that your geographical weighting changes when you lose data.

    And, if missing data is such a simple thing, why does each iteration keep reducing past temperatures little by little? It should be a one-time thing?

    I think that is because infilling is non-optimal, and you end up getting “leaking” from regions that have large trends into regions with small trends. This happens for the same reason that using absolute temperature causes a larger error than using anomalized temperatures.

    The point that Brandon and others have been trying to make is that infilling tends to reduce the magnitude of the error associated with missing data. It doesn’t eliminate it; it’s just a “poor man’s” way to analyze the data. Other methods will certainly perform better than what the USHCN processing does.

    I’m not defending USHCN, but I am saying what they are doing is not outright wrong.

    I am however saying what Goddard is doing is outright wrong.
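
    To illustrate why anomalization helps, here’s a minimal toy sketch (made-up numbers, not the USHCN algorithm): a cool station that closes mid-record produces a spurious jump in the absolute-temperature average, but not in the anomaly average.

    import numpy as np

    years = np.arange(1980, 2020)
    trend = 0.02 * (years - years[0])      # the same underlying warming everywhere
    rural = 8.0 + trend                    # cool station
    urban = 14.0 + trend                   # warm station

    # The rural station closes in 2005; only the urban record continues.
    rural_obs = np.where(years < 2005, rural, np.nan)

    # Absolute-temperature averaging: a ~3 C jump appears the moment the
    # cool station disappears -- purely an artifact of the changed mix.
    absolute = np.nanmean(np.vstack([rural_obs, urban]), axis=0)
    print(absolute[24], absolute[25])        # roughly 11.5 -> 14.5 at the dropout year

    # Anomalize first (subtract each station's own mean over a common
    # baseline), then average: the artifact essentially vanishes.
    base = years < 2005
    anoms = np.vstack([rural_obs - np.nanmean(rural_obs[base]),
                       urban - urban[base].mean()])
    anomaly_avg = np.nanmean(anoms, axis=0)
    print(anomaly_avg[24], anomaly_avg[25])  # roughly 0.24 -> 0.26, smooth across the dropout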

  75. Shub:

    Plus, I appreciate people who do the hard work. Anyone can come up with contrived examples that prove their own points.

    You are missing a key point here. It doesn’t matter that the examples are contrived, only that they are representative of values that could happen in the real world.

    If the claim is that you can ignore missing data as Goddard seems to believe, you need but a single counter example to disprove it.

    In mathematics this goes under various names, but one of them is proof by counterexample.

  76. “The point that Brandon and others have been trying to make is that infilling tends to reduce the magnitude of the error associated with missing data”

    I’ve only seen you guys just state this. No evidence. And if 40% synthesis is required, for whatever reason, the whole thing is broken anyway and cannot be ‘fixed’, and should not be fixed.

  77. We are getting “lost in the forest” with this discussion. After reviewing all the published documents, if you have a clear path forward to the QC and processing of all these datasets, then get it out for review and it then has a chance of being adopted. These random messages and incomplete matrices do not stay on topic.

  78. qq, you forget the alternative – use only good stations.

    Which addresses the issue of rural station loss how, exactly?

  79. Shub:

    I’ve only seen you guys just state this. No evidence. And if 40% synthesis is required, for whatever reason, the whole thing is broken anyway and cannot be ‘fixed’, and should not be fixed.

    There is only so much hand holding I am willing to do. You are either interested in understanding this, in which case you will, or you won’t, in which case hand holding is useless in any case. But the point is obvious and so you can check it. If you find people are wrong, show it, but don’t expect any sympathy or even respect if you choose to be simultaneously bellicose and too lazy to do your own homework.

    Again, you are approaching this mathematically. This is fundamentally a data integrity issue and if anything a statistical problem.

    Math is the language of physical sciences. That you find a mathematical proof inferior to rhetoric tells us a lot.

  80. Brandon, you haven’t presented an explanation or argument for when, why, and how missing data can be imputed. Just asserting that it’s no big deal is not going to be convincing for most people, especially the general public.

    For people unfamiliar with the practice, data imputation seems like a scam on its face. That’s totally understandable, and they should be skeptical.

    There are different kinds of missing data, in different conditions, and imputation is allowed, or even desirable, in only some of those cases. You haven’t gone into any of that, or cited any sources. It’s definitely not the case that imputation is just something scientists do automatically and it’s no big deal.

  81. Carrick, thanks for your comments. I’ve been caught up in holiday stuff today so I haven’t gotten around to responding to people (and my phone spent most of the day dead). It’s good someone did.

    Everyone else, I should be able to get back to you sometime tomorrow.

    ScienceNow, I showed why not infilling data can have negative effects. I’ve shown why infilling the data can be better than not infilling it. That’s all I’ve set out to do. If people want to argue infilling is better than Steven Goddard’s approach but worse than other approaches, they can. It could be interesting. What it couldn’t be is a refutation of the outline I’ve provided in this post.

    In other words, I’ve laid out the issues in a clear manner. I’ve shown one methodology is unquestionably bad. That’s about all I’ve claimed (or sought) to do.

  82. “….too lazy to do your own homework.”

    ? The grid with 5 points of data from above is homework. Misses lots of data and gives the right answer with ‘Goddard’s method’.

    More on why the original example is unsuited? Who averages stations recording temperatures with differences five times in magnitude from one another? Such a pathetic network cannot afford to miss data, can it?

  83. Just curious about two points related to the blue and red lines in your borrowed chart above.

    If the difference between it and the top red line is only because of randomly missing data, then:

    – why is the blue line consistently cooler than the red line? Wouldn’t the “missing” data – when estimated based on surrounding stations – be random as well, and on both sides of actual temperatures?

    – why does the blue line converge with the red line in recent times? Is it because we are missing less data now than in the past?

  84. Carrick: “SDB, Here’s a mathematical proof that Goddard’s method produces an artifact…”

    Thanks. But nowhere did I defend Goddard’s method. What I have been defending is his criticism that infilling is producing a false warming signal. To me this just seems logically simple. If cooler rural stations are being dropped, and they are being infilled by warmer urban stations, then the final average will produce a false warming signal.

    Is that incorrect?

  85. SDB:

    What I have been defending is his criticism that infilling is producing a false warming signal.

    I think it’s probable it does add a spurious signal (your example is correct).

    I think it’s confirmation bias to assume that it has to be a warming signal though. Station loss isn’t just because of e.g. cold. Most station losses are due to retirement, and that can be for any reason including damage to equipment and no money to replace. This preferentially happens in the SE US due to the increased frequency of lightning storms. Also regions that were heavily oversampled are getting edited over time.

    Brandon and I have even discussed this on a separate thread on this blog.

    I think the argument is that infilling introduces less bias than not infilling does.

    Also, if you combine infilling with anomalization (deseasonalization), then this:

    To me this just seems logically simple. If cooler rural stations are being dropped, and they are being infilled by warmer urban stations, then the final average will produce a false warming signal.

    Is not as big of an issue.

    (It is still an issue because deseasonalizing data only removes the annual variability. If you have different trends for different regions, you’ll get a “spatial smearing” of regions with larger anomalized temperatures into regions with smaller ones. Detrending the data before processing may help with that. Using a more optimal method, such as a 2D non-uniform discrete Fourier transform to interpolate missing points, could help too.

    That’s more work than I have time to spend on this, unfortunately, without grant money to pay for salary.)

  86. Alright, time to try to catch up. If I don’t respond to something and you’d like me to, just leave a comment saying so. Until then, Mike Fayette asks a couple questions:

    If the difference between it and the top red line is only because of randomly missing data, then:

    – why is the blue line consistently cooler than the red line? Wouldn’t the “missing” data – when estimated based on surrounding stations – be random as well, and on both sides of actual temperatures?

    The red and blue lines are not different because of randomly missing data. They both use the same data. Their differences arise from how they combine the data. The red line simply averages the data as is. The blue line anomalizes the data first and grids it to handle spatial weighting.

    My comment about the blue line being for infilled data was a mistake I failed to correct while writing the post. I’ll update the post to reflect this. Thanks for drawing my attention to it! (You’ll note the sentence following my description of the blue line doesn’t make sense with the description I gave.)

    As for the differences between the lines, that’s because if you don’t anomalize or spatially weight the data, stations with higher temperature are given undue weight.

    – why does the blue line converge with the red line in recent times? Is it because we are missing less data now than in the past?

    Generally speaking, how you align lines on a graph when using anomalies is arbitrary. You can shift lines up and down however you’d like. Zeke chose to align the three lines by setting them equal to one another in recent times for a visual purpose – doing so makes it easy to see the differences in the three lines.

    Zeke could have aligned them on any period. If he had chosen to align them at the beginning, all the lines would match up at the start of the graph and diverge by the same amount at the end.

    In other words, you can move those lines up and down as much as you’d like. The idea is just to pick the alignment which best allows you to make the comparison you’re interested in.
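
    If it helps, here’s a tiny sketch of that point (made-up series, nothing from Zeke’s actual data): re-baselining shifts each curve up or down by a constant, so the gap that opens up between two curves is the same no matter which period you align on.

    import numpy as np

    years = np.arange(1900, 2015)
    red = 0.006 * (years - 1900)       # toy series with slightly different trends
    blue = 0.007 * (years - 1900)

    def align(series, start, end):
        # Re-baseline: subtract the mean over the chosen reference period.
        window = (years >= start) & (years <= end)
        return series - series[window].mean()

    # Align at the end (as Zeke did) or at the beginning: the curves move
    # vertically, but the difference between them changes only by a constant.
    end_gap = align(red, 2000, 2014) - align(blue, 2000, 2014)
    start_gap = align(red, 1900, 1914) - align(blue, 1900, 1914)
    print(np.allclose(end_gap - start_gap, (end_gap - start_gap)[0]))  # True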

  87. SDB:

    Thanks. But nowhere did I defend Goddard’s method. What I have been defending is his criticism that infilling is producing a false warming signal. To me this just seems logically simple. If cooler rural stations are being dropped, and they are being infilled by warmer urban stations, then the final average will produce a false warming signal.

    Is that incorrect?

    That is incorrect. Consider the examples given in this post. In them, a non-random subset of the data was removed. When infilling was used, the results came out perfectly. There was no bias despite that non-randomness.

    Of course, we could easily create examples where the simple infilling I used doesn’t remove the biases. That doesn’t mean much because the USHCN uses a more complicated (and more appropriate) methodology. A key difference is they use information from the non-missing portions of a station’s record. That means they can use how stations relate to one another when data for all of them exists to help estimate missing data.

    In other words, if one (rural) station is always cooler than another (urban) station, they’ll know that. They won’t take the urban station’s 80 degrees as representing the rural station’s missing data. They’ll do something like take that 80 minus the normal difference between the two station records as representing it.

    (That said, this does not mean infilling introduces no biases. It just means different infilling techniques are susceptible to different biases. I believe the USHCN’s technique can allow for some biases if station drop out is dependent upon the rate of temperature change. The effect would be quite small though, given the magnitude of the rates of change.)

  88. ScienceNow:

    Brandon, you haven’t presented an explanation or argument for when, why, and how missing data can be imputed. Just asserting that it’s no big deal is not going to be convincing for most people, especially general public.

    I haven’t said infilling data is never a big deal. I haven’t even talked about infilling data in a general sense. There was no reason for me to. All this post tries to show on that topic is infilling data can be legitimate, which it shows just fine.

    You’re basically complaining I didn’t go into detail on points I wasn’t covering.

    Shub Niggurath:

    And it is a one-time thing. Why should infilling change past temperature every 5 years?

    The infilling is done based upon the relations of stations to one another. The more you know about how the data for two stations relate, the more information you have to try to estimate missing data in one of them.

    If you gain more information about how stations relate to one another, why wouldn’t you update your analyses which are dependent upon how those stations relate to one another?

    QQ:

    Infilling isn’t any more circular than averaging. The only difference between averaging and infilling is averaging, functionally speaking, replaces any missing values with the mean value for the entire day’s record, while infilling replaces them with only the mean from the nearest available neighbors.

    Remember, this is only true if one uses the simplistic method I used in my toy examples. The USHCN method is more complicated. In your example of 5, 5, 3, 5, 5, suppose that pattern repeated five times. If a 3 was missing in a couple of entries, we wouldn’t have to simply average 5s together when infilling. We could use the relation between records to get a better estimate.

    Since entries where data was present for all stations show a pattern of (+2, +2, 0, +2, +2) we wouldn’t need to average 5, 5, NA, 5, 5. We could average 5-2, 5-2, NA, 5-2, 5-2. Doing so would give us 3, the right answer.

    Differences in absolute temperatures are rather easy to address when infilling data.
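
    Here’s a minimal sketch of that calculation (toy code, not the USHCN’s actual pairwise routine):

    import numpy as np

    # Five stations over five time steps; station 3 (index 2) always runs
    # two degrees cooler than its neighbors.
    obs = np.array([[5.0] * 5,
                    [5.0] * 5,
                    [3.0] * 5,
                    [5.0] * 5,
                    [5.0] * 5])
    obs[2, 3] = np.nan                       # station 3 misses one reading

    complete = ~np.isnan(obs).any(axis=0)    # time steps where everyone reported

    def infill(obs, i, t):
        # Estimate station i at time t from each station j that did report,
        # after removing the climatological offset between j and i.
        reporting = [j for j in range(obs.shape[0])
                     if j != i and not np.isnan(obs[j, t])]
        estimates = [obs[j, t] - (obs[j, complete] - obs[i, complete]).mean()
                     for j in reporting]
        return float(np.mean(estimates))

    print(infill(obs, 2, 3))        # 3.0 -- the right answer
    print(np.nanmean(obs[:, 3]))    # 5.0 -- what a naive neighbor average gives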

  89. Brandon Shollenberger: “That doesn’t mean much because the USHCN uses a more complicated (and more appropriate) methodology.”

    Can you link me to more information on USHCN’s methodology and statistical techniques?

  90. Brandon, you cannot just wave your arms about if you haven’t done the work. The biases you claim exist, must be shown to exist, the lack of correction should be quantified and then justified. Whatever the valid method, it ought to not require synthesis of data anywhere approaching 40%. It is a disgrace.

    I trust nothing from USHCN, GISS etc. It wasn’t so until quite recently. NCDC said after Goddard’s news items that they were ‘intending’ to fix some of the problems. How long does it take to figure out how a handful of stations ‘relate to one another’? If I could get a correction of 0.4 C lined up and apply it 0.1 C every ten years, I could knock out almost half a century’s worth of global warming/cooling for a station this way?

    I found this in a statistics book:

    Everyone believes in the [normal] law of errors, the experimenters because they think it is a mathematical theorem, the mathematicians because they think it is an experimental fact.
    – Henri Poincaré

  91. Even if you can’t convince me, the point of posts is frequently to convince others. My point is simple. A climate “scientist” responsible for the models at Lawrence Livermore Berkeley told me during a Stanford class on global warming that the LIA and MWP were not real. They didn’t occur. This was just a couple years ago. Numerous studies have come out now which validate their existence. The IPCC AR5 acknowledges the MWP and LIA. To me this was stupid. We had direct physical evidence of the MWP and LIA. We had written anecdotes describing these things. The idea that for hundreds of years the climate of northern Europe was different than the rest of the world seemed preposterous. I don’t know of any reason for such a phenomenon and I have never heard an explanation of how this could happen. It seemed so obvious climate scientists PREFER to think it is non-existent because obviously if temperatures varied so much for so long then it is possible that current temperature variations could be tied to something other than CO2 and since they have no explanation for that it confuses them. Therefore, it’s easier to ignore it. Pretend it doesn’t exist. Something all scientists experiencing bias feel. It’s normal and expected. Something others should have been on the lookout for.

    When Steve does an average and ignores the time of observation adjustments, it is deceptive. When he doesn’t do averaging over areas, it is deceptive. But the problem is that when one does these adjustments, when one fills in data for missing/delayed data, one is NOT producing data that is just as accurate. The precision of the data declines with the missing content. In some cases this may cause large variations in the accuracy of the data, because some datapoints may be one of only a couple datapoints for a large region, leaving a lot of uncertainty, or there may be more uncertainty about the time of observation adjustments than is generally recognized.

    I also find it extremely disturbing that climate “scientists” don’t acknowledge the great unproven nature of much of what they are saying. There are almost no proofs that any of the things they say are true. The computer models include dozens of formulas that express relationships between physical entities all of which have not been proven. There is almost zero experimental evidence to back any of the assumptions and conjectures in these climate models. The sign of forcing in many cases may be wrong. Some of the largest effects may be missing simply because as is acknowledged by nearly everyone there is still massive lack of knowledge of the ways in which many processes and things interact in the atmosphere and oceans. The NAO, PDO were discovered after AR1 and the initial predictions. There is still no good theory or understanding of how the sun, the deep ocean currents interact to produce oscillations we now know happen. It is clear that at least some if not most or all of the variations in temperature could possibly be explained by these phenomena. By accepting the LIA and MWP we also must accept that since there is no explanation for these phenomena yet that there is some underlying thing going on that we don’t understand possibly to do with interaction with the sun, deep ocean currents again. The high variability means that the assurance that they spoke of in AR3, AR4… was premature. However, any real scientist would have known this. It is absolutely 100% clear to any real scientist that this is politicized beyond all recognition as a science. I don’t understand what makes climate “science” a science. I don’t understand what “laws” the science has. After taking a class at Stanford no laws were put forward. Broad ideas were put forward. General conjectures were postulated. Nothing was said to have been shown. No examples were demonstrated to show these things were laws or had reproducibility. No experiments were shown that these things worked in the labs. I know this is a hard thing to do but it doesn’t change the facts. Without proof it isn’t proven. Without data, without something to point to to say that something works in what scenarios you can’t say you’ve demonstrated anything scientific. Therefore what we have is politics masquerading as science.

  92. logiclogiclogic, I agree the point isn’t to convince you. I just know from experience engaging with a person who makes comments like yours is almost always unproductive, if not worse. I don’t care to engage you on points tangential to the topic of this post when I have no reason to expect anything to come from that engagement. I certainly don’t care to when experience indicates most people will just ignore it. Once you claim 3.25 is just as accurate as 3 for my example, most people will just tune you out.

    Shub Niggurath:

    Brandon, you cannot just wave your arms about if you haven’t done the work.

    I haven’t done this. You have a tendency to make baseless allegations. Oddly, it seems quite often you’re guilty of what you accuse others of.

    The biases you claim exist, must be shown to exist, the lack of correction should be quantified and then justified.

    The only biases I have claimed exist which lack a correction are those caused by Steven Goddard’s stupid methodology.

    Whatever the valid method, it ought to not require synthesis of data anywhere approaching 40%. It is a disgrace.

    Yes, and people shouldn’t make things up about methodologies, but what can you do? It’s not like I can make anyone do the work to see infilling data over the spatial dimension(s) is functionally equivalent to simply weighting data by the spatial dimension(s). That that is true, by definition, is apparently irrelevant. People will just ignore whatever work I may do, accuse me of hand-waving and being lazy while contributing nothing of their own.

    I hope that wasn’t too subtle.

  93. My understanding of the CAGW hypothesis is that accelerating warming should manifest as higher maximum and minimum temperatures. We should be obliterating every record high temperature on the books, with regularity.

    In contrast, the Urban Heat Island effect would manifest primarily as elevated minimum (nighttime) temperatures, without as much of an impact on maximum temps.

    As Goddard has pointed out multiple times, there is zero evidence that maximum temperatures are hotter now than they were in the past. IIRC, 80+ percent of all state max temperature readings were set prior to 1950, and thus prior to the vast majority of CO2 changes.

    He has gone back to find stations that didn’t have any change in their time of observation between the 1930’s and today, and found the same trend. None of the ones he has dug up show accelerating high temperatures.

    Goddard has described this as a ‘nail in the coffin’ of those claiming that CAGW is driving temperatures higher. The only way to end up with higher average temps along with lower max temps is to have a marked increase in minimum temps as you would predict if the effect is primarily UHI rather than any sort of CO2-based mechanism.

  94. KTM, I’m too tired to write a response right now, but you were lucky. Your comment had enough links to land in the moderation queue. I got an alert right as I was getting ready to go to sleep, so I was able to fish it out right away.

  95. Q: Brandon, you cannot just wave your arms about if you haven’t done the work.

    A: I haven’t done this.

    That’s exactly what you have ‘done’. I have asked, several times, what quality assurance procedure requires creation of 40% synthetic data. You have given no answer. This has nothing to do with ‘laziness’. Work in this instance implies quantitative analysis using actual data.

  96. Shub:

    I have asked, several times, what quality assurance procedure requires creation of 40% synthetic data.

    As Brandon pointed out just above:

    infilling data over the spatial dimension(s) is functionally equivalent to simply weighting data by the spatial dimension(s).

    The answer: It’s not necessary to create 40% synthetic data.

    Nor is there anything wrong with infilling and marking the infilled values as “estimated”, because then the data values can be ignored.

    There are issues with the USHCN data set, but this criticism that you keep parroting while failing to understand the answer is not a valid one.

  97. Shub Niggurath, as Carrick just pointed out, you’re just making things up. I’ve said, on multiple occasions, it is not necessary to create 40% synthetic data. That is one way of accomplishing the same goal any number of other methodologies could accomplish.

    For a general comment, there’s been a new post discussing adjustments in the USHCN network. I think it, or one of the next posts in its series, should cover everything I might need to say:

    http://judithcurry.com/2014/07/07/understanding-adjustments-to-temperature-data/

  98. “The answer: It’s not necessary to create 40% synthetic data.”

    “I’ve said, on multiple occasions, it is not necessary to create 40% synthetic data.”

    Thanks, gentlemen. I don’t recall Brandon stating this several times, but if so, I apologize for missing it.

    We could therefore move on to why the NCDC actually carries this out and whether it does so as Goddard claims and I will get out of your hair.

  99. Shub Niggurath, I don’t think we can move on to that. Carrick and I (and others) have been discussing that for some time now. You’re more than welcome to catch up to us though.

  100. I am reading the Zeke thread. Again, a lukewarmer discussion thread-bombed by Mosher responding to *every* comment.

    Raw data has no built in defense against wrong reasoning. Examine the rationale behind the pair-wise homogenization algorithm: it starts with the assumption that there are non-climatic trend differences between stations. It proceeds to use station data to identify these differences. It adjusts the very stations to ‘correct’ these differences. This is circular reasoning. That does not mean the changes are unwarranted or invalid. But if they are correct, it would be by accident.

    For every adjustment made, the product of the adjustment can be used to justify and quantify the magnitude of bias corrected – as being equal to the amount of the adjustment itself!

  101. Shub Niggurath, if you think it is contradictory to say something is not necessary and that it gets done, I don’t think anyone can help you.

    Carrick, you should get a kick out of this comment by Steven Mosher over in Zeke’s post about this topic:

    When Muller and Berkeley started to look at this matter their incentive was to build a better method and correct any mistakes they found. Koch and others found this goal laudable and funded them. With this incentive what did Berkeley find? Well, the better method extended the record, gave you a higher spatial resolution and showed that the NOAA folks basically get the adjustments correct.

    According to Mosher, BEST got a “higher spatial resolution.”

    Um.

  102. I wrote: “…discussed why it is not necessary to create 40% synthetic data *and* why the NCDC have done it…”

    You write: “…it is contradictory to say something is not necessary and that it gets done…”

    You see the problem?

    I’m out of here. Thanks for the discussion.

  103. Brandon:

    According to Mosher, BEST got a “higher spatial resolution.”

    Yes, I did get a chuckle out of that. Mosh is confusing spatial sampling frequency with spatial resolution:

    Spatial sampling frequency has to do with how closely you measure the temperature field and spatial resolution has to do with how finely you can resolve spatial variations in the temperature field.

  104. Shub, NCDC doesn’t have to infill any more than I have to eat a steak instead of a burger. No contradictions there.

    NCDC does infill, probably as a convenience for people who want to look at specific sites without having to worry about futzing around with infilling themselves. I suggest writing to one of the NCDC USHCN caretakers if you really want to get their stated motivation for infilling. I don’t think it’s explained anywhere.

  105. Carrick, if I may, I’d like to extend your analogy a bit further. You don’t have to eat a steak. You don’t have to eat a burger. You just have to eat something.

    Similarly, the NCDC doesn’t have to use infilling. It doesn’t have to use gridding. It just has to do something to address biases introduced by changes in the climatological distribution of its stations.

  106. Brandon, I’m going through Zeke’s thread now. One claim that people make that has bothered me is that the blip circa 1940 is due to a “bucket effect”. That may explain a bit of the data, but you shouldn’t expect a blip in land data then. On that thread one person even claims there isn’t a blip on land (I guess not looking at data makes it easier to make claims).

    In fact land shows a very similar blip, in temporal location, duration and magnitude:

    BEST, CRUTEM3, HadSST2 from WoodForTrees.

  107. Carrick, I saw the same thing and thought it was strange. I didn’t bother commenting though because I figure, if pressed, they’ll just say the land records were adjusted to match the blip which was fraudulently introduced into the ocean data, or something.

    By the way, according to WordPress, the reason your link didn’t show up the first time is you submitted a closing tag for the link, but no opening tag or link itself.

  108. Brandon, I used to be much more in favor of an open comment policy, until I saw on Judith’s blog just how badly ignoramuses can swamp a thread with their nonsensical chatter. Maybe what Lucia does would work better there… limit the length and frequency of posts from low S/N commenters.

    Thanks for the explanation on my HTML error. I figured I messed up the tag somehow, but wasn’t sure if it was that or a character in the URL that was causing wordpress to incorrectly flag the URL as invalid.

  109. Carrick, I am now, and will always be, a big proponent of an open comment policy. That doesn’t mean I think people shouldn’t take action to address users like you describe. I think they should. I just don’t think it should be done via moderation. Social pressure is the most powerful tool a blogger has. Simply talking and saying what isn’t good/appreciated/etc goes a long way.

    To me, moderation is primarily for dealing with spam and inappropriate content. If a person talks a lot without saying anything of value, that may be spam, but it could also just be an obnoxious person. I’m sure you’ve met plenty of those in real life. I doubt you taped their mouths shut :P

    That said, if one has a different purpose for their blog than I do, different policies could be appropriate. I’d use strict moderation if I did something like set up a website for hosting debates.

    And no problem about the HTML error. WordPress is so screwy with HTML I’m always worried about bugs. For example, I decided to test out migrating my blog to self-hosting to see what it’d be like. The first import failed so I restarted it. Unbeknownst to me, part of the failed import was saved, so I had a bunch of duplicate comments and posts. I deleted the duplicate posts, and WordPress started putting all the comments from the deleted duplicates into the other copies (but not in any sort of correct order). It was stupid.

    (To make matters worse, I decided to uninstall WordPress then reinstall it. I deleted all the files, but my host won’t let me use its service to reinstall WordPress because it doesn’t recognize that WordPress is gone. All it knows is it installed WordPress once. I don’t mind manually reinstalling it, but it seems kind of silly.)

  110. Brandon, actually I was thinking “technical posts” where you do want to foment discussion. People who are confrontational but are completely devoid of any meaningful understanding of a problem have little to contribute to such threads.

    When we have evening debates at conferences over various issues, usually it is moderated to prevent one person from hogging the podium.

    If a person has more to say, I have no problem letting them talk more. But some people seem to just repeat the same poorly conceived ideas over and over, with little evidence that they can adjust their thinking to new input. That’s a judgement call, but usually when a person is being overly repetitive or obdurate it is obvious. This runs into a snag when the moderator himself is the one being obdurate, but having multiple blogs does help that to some degree.

  111. Carrick, I get that, but I think it can largely be addressed without moderation. The host can explain why he/she isn’t going to engage and encourage others to do the same. If they do, then the person will have nobody to talk to. Either they’ll leave, or their comments will become spam (repeating points over and over without response).

    Though I guess if people won’t listen to the host’s requests, that won’t work. You see that at Climate Etc., but I think that’s mostly because Judith Curry has such a light presence there.

  112. I do really like this comment of Nick’s that got quoted on Judith’s blog. I think it is a very nice summary statement of the issues:

    […] there is a bias, and it’s a scientific duty to estimate and allow for its effect. The objectors want to say it is zero. That’s an estimate, baseless and bad. We can do much better.

    Making no adjustments is the same as assuming the data are correct without adjustments. But we know from various internal tests that the data do have biases that need correcting. I thought this comment of mine was somewhat apropos (in response to Paul Matthews inexplicably arguing against making adjustments):

    Of course there are problems with the (raw) temperature record. Given the manner in which the data were collected, the issue isn’t whether the data should be adjusted to correct for the errors, but whether sufficiently good adjustments could ever be made, and whether we could know that they had been [correctly] made.

    I think it turns out that we can know they have been correctly made, but that’s the non-obvious part.

    Yes, you should make the TOBS adjustment. Does it make more things better than it makes worse? That’s the question that needs addressing. And “how do we know it makes more things better than it makes worse?” is every bit as critical a question.

    I admit I got a bit exercised, more than I generally do when viewing willful ignorance from fellow members of my species:

    It turns out it takes no particular effort to be totally ignorant, just a belief system that overrides one’s intellectual abilities.

    I think this is a good principle to describe a large percentage of people on this planet. In the Frank Herbert Dune universe, they wouldn’t even be considered humans.

  113. If you only had one station, what Stockwell says is actually true.

    If you have multiple stations in the same regional network, which have homogenization corrections that occur randomly at different times, then it certainly is possible to separate the homogenization adjustments from the regional climatology.

    I suspect there are specific conditions where this isn’t possible though. This is where constructing a Monte Carlo temperature field, including the homogenization errors you are modeling, and verifying that your software can correctly extract the homogenization errors and the climatology, comes in.

    Simply writing down an equation that holds for just one station then following it by a lot of hand waving doesn’t cut it for me.
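
    As a minimal sketch of what I have in mind (a toy signal and a crude break detector, nothing like the actual pairwise homogenization software):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 240                                   # 20 years of monthly data
    months = np.arange(n)
    regional = 10 + 8 * np.sin(2 * np.pi * months / 12) + 0.002 * months

    # Two stations share the regional climatology; station B also picks up an
    # artificial +0.8 C step (say, an undocumented instrument change) at a
    # random month.
    true_break = int(rng.integers(24, n - 24))
    a = regional + rng.normal(0, 0.3, n)
    b = regional + rng.normal(0, 0.3, n) + 0.8 * (months >= true_break)

    # In either station alone the step hides under the seasonal cycle, but in
    # the pairwise difference the shared climatology cancels and the step can
    # be found with even a crude before/after-mean statistic.
    diff = b - a
    scores = [abs(diff[t:].mean() - diff[:t].mean()) for t in range(12, n - 12)]
    found = 12 + int(np.argmax(scores))
    print(true_break, found)   # the detected break usually lands within a few months of the true one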

  114. I have seen some people comment that USHCN was never designed to detect global warming. The obsession with adjusting, infilling, and homogenizing makes that pretty clear to me.

    If your only goal is to track changes in the temperature record over time, it seems that people would take a very different approach. If a station has a continuous record for 100+ years measuring at 5 PM, perhaps there is a bias associated with that, but it is constant. Likewise if they measure at 7 AM every day. Why try to take a nice round peg of data and cram it into a square hole to make it “match” readings taken at midnight?

    Since the USHCN has migrated their time of observation from late afternoon to morning over the years, it has biased the data. Old afternoon Tmax readings might be biased a bit high, but old Tmin should be perfectly fine. Why try to average it? Just use the good Tmin data and throw out the biased Tmax data. And for modern observations, use the good Tmax data and throw out the biased Tmin data. Either way you will get the TREND you desire, without trying to rehabilitate bad data.

  115. KTM, the problem I see with that approach is I’m not sure the biases would cooperate in regard to timing. If you had both the min and max temperatures biased at the same time, you’d have to throw the data out all together. That seems bad.

    By the way guys, if you want to find examples of troubling adjustments to data, you may want to look at my latest post:

    https://hiizuru.wordpress.com/2014/07/10/is-best-really-the-best/

    It’s fascinating how wildly different BEST’s results can be from reality.
