“Dishonest”

There is little stranger than watching an argument where neither side knows what they are talking about. Today I’m going to interject myself into such a discussion and try to clarify some things. The subject is Michael Mann’s infamous hockey stick graph, and the Wegman Report which criticized it.

I decided to write this after reading two comments by the user Kevin O’Neill at Judith Curry’s blog. The comments had a lot of incorrect, and even stupid, stuff in them, but the parts that stood out the most to me were:

Wegman deceptively displayed only upward-pointing ‘hockey sticks’ – though half of them would have had to be downward pointing.

And while Wegman in the text acknowledges that half the results will be downward sloping all of his results show upward sloping hockey sticks. Why? Pretty obvious that a downward sloping hockey stick wouldn’t look like MBH.

These remarks originate from a blog post written by Nick Stokes. Stokes discusses criticisms of Mann’s hockey stick which said his implementation of Principal Component Analysis (PCA) was biased toward selecting hockey stick shapes. The Wegman Report included an image like this, illustrating that bias:

[Image: 9-14-Wegman_Version]

Stokes claims this is wrong, and the graph ought to look like this:

[Image: 9-14-Stokes_Version]

As you can see, the two are very different. The most glaring difference is some of the hockey sticks in Stokes’s version are upside down. Stokes and O’Neill claim it is dishonest to not show these. However, Stokes also says this in his post:

Now we see that there is still some tendency to HS shape, but much less. It can go either way, as expected. In the PCA analysis, sign doesn’t matter,

If the sign doesn’t matter in PCA, why would we need to show upside down hockey sticks? As Stokes acknowledges, upside down and right-side up hockey sticks are equivalent in PCA. Why should we need to add a large visual discrepancy to our images just to show something which “doesn’t matter”?

Neither Stokes nor O’Neill has ever answered that question. Neither has explained why the image above is “honest” while this version would be “dishonest”:

[Image: 9-14-Stokes_Version_Flipped]

The “honest” version favors Stokes and O’Neill’s views because it gives them ammunition to attack the Wegman Report, but the “dishonest” version is functionally equivalent. Aside from looking for a reason to attack the Wegman Report, there is no reason to prefer the “honest” version which adds huge visual discrepancies for no purpose.
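
To see why the sign genuinely carries no information in PCA, here is a minimal sketch (my own toy example in Python, nothing from MBH98 or the Wegman Report): flip a principal component and its loadings together and you reconstruct exactly the same data.

    import numpy as np

    # Toy data: 50 "years" of 8 centered series (nothing to do with any real proxy network)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 8))
    X -= X.mean(axis=0)

    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    pc1, load1 = U[:, 0] * s[0], Vt[0]      # first PC (scores) and its loadings

    upright = np.outer(pc1, load1)          # rank-1 reconstruction using the PC as-is
    flipped = np.outer(-pc1, -load1)        # same PC with its sign reversed

    print(np.allclose(upright, flipped))    # True: orientation carries no information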


For those who want to better understand why the orientation of these graphs doesn’t matter, the key is in how Michael Mann combined his proxies (which include those output by his PCA implementation). Series created by PCA are in arbitrary units, so to plot them as temperatures, Mann had to convert them. He did this by scaling his proxies to match the instrumental record.

You can think of it like multiplying each series by a number. If you multiply a series by 2, the figure will look the same but the scale will change. If you multiply that series by -2 instead, the figure will flip upside down. The absolute magnitudes in the figure will be identical, but the orientation will be reversed.

Since the instrumental record goes up, all rescaled proxies would have to go up as well. Any proxies which originally had negatively oriented hockey sticks would be multiplied by some negative value, flipping them right-side up. That means every proxy could have an upside down hockey stick in it, and the resulting reconstruction would still be a right-side up hockey stick.
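
Here is a rough illustration of that rescaling step (a toy sketch of the idea in Python, not Mann’s actual procedure or data). A series whose blade points down gets a negative scaling coefficient when fit to a rising instrumental record, which turns it right-side up:

    import numpy as np

    rng = np.random.default_rng(1)
    years = np.arange(1400, 1981)
    calib = years >= 1902                                 # hypothetical calibration window
    instrumental = np.linspace(0.0, 1.0, calib.sum())     # rising 20th-century record

    # An upside down "hockey stick" in arbitrary units: flat noise, then a decline
    proxy = 0.1 * rng.normal(size=years.size)
    proxy[calib] -= np.linspace(0.0, 1.0, calib.sum())

    # Scale the proxy against the instrumental record over the calibration period
    slope = np.polyfit(proxy[calib], instrumental, 1)[0]
    rescaled = slope * proxy

    print(slope < 0)                                          # True: the fit assigns a negative weight...
    print(rescaled[calib].mean() > rescaled[~calib].mean())   # ...so the rescaled blade points up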

I apologize if this post is unclear. I’m writing it at 3AM in a hotel room after a sizable dart tournament. If you need anything clarified, feel free to ask.

167 comments

  1. Brandon,
    You are completely missing the point here. It isn’t about the orientation of the plots. It’s about their very HS-ness. Wegman said in his caption to Fig 4.4:
    “One of the most compelling illustrations that McIntyre and McKitrick have produced is created by feeding red noise [AR(1) with parameter = 0.2] into the MBH algorithm. The AR(1) process is a stationary process meaning that it should not exhibit any long-term trend. The MBH98 algorithm found ‘hockey stick’ trend in each of the independent replications.”

    But it wasn’t in each. They selected the 100 which had the best HS trend. That did include sorting by orientation, but that was a minor part. They rejected thousands which had the right orientation, but not enough HS-ness. Then they selected 12 from that top 100. Then claimed “The MBH98 algorithm found ‘hockey stick’ trend in each of the independent replications”

  3. Nick

    Getting back to basics, what makes you think that such proxies as tree rings have any value other than as a vague indicator of moisture during their growing season?

    What makes you think they make any sort of reliable thermometer?

    tonyb

  4. Nick Stokes, that is BS, and you know it. Kevin O’Neill made an explicit claim then repeated it. I responded to that claim. There is no way you could have failed to understand that. It’s dishonest to pretend a different issue is the only issue people should discuss.

    Moreover, you know fully well I can provide links to you making the exact same argument O’Neill made. If it was an important enough point for you to bring up in the past, even leveling accusations of dishonesty over, it is certainly a point people are free to discuss.

    Don’t try to change the subject by pretending you and others haven’t advanced this stupid position. You know you have. You know others have. You know it is dishonest to pretend you guys haven’t.

  5. Brandon,
    “Kevin O’Neill made an explicit claim then repeated it.”

    Kevin made the explicit claim:
    “5) Are you claiming McIntyre’s code did not sort and select the 100 most extreme cases out of 10,000?”

    and you dodged that one. Kevin is exactly right. He’s also right that Wegman’s code only produced results with upward HS’s. That’s a strong objection, but his other is stronger. And obvious to anyone.

    Wegman said “The MBH98 algorithm found ‘hockey stick’ trend in each of the independent replications” and showed 12 artificially selected, upright ones.

  6. Nick Stokes, even if you like one argument more than another, it is stupid to claim there is only one issue people are allowed to discuss. It is blatantly dishonest to claim someone has completely missed the point simply because they chose to discuss a different argument than you want to discuss.

    If you don’t want people to discuss a stupid argument you’ve made, you shouldn’t have made it. Once you have made it though, it is absurd to fault people for responding to it.

  7. Nick

    You are a reasonable guy but I posed a reasonable question which is surely directly relevant

    If the proxy isn’t worth a brass farthing in the first place applying any amount of clever mathematics to it doesn’t suddenly make it worthwhile does it?

    By all means ignore the question, but it’s one that needs answering somewhere.

    tonyb.

  8. Choosing ‘n’ graphs from a set of ‘N’ to — literally — illustrate a claim about the general properties of N is always going to be a problem. If ‘n’ = 1 then you’ve crowned or anointed an unrepresentative individual to represent the population. If ‘n’ = 1% or 5% of ‘N’ there will be arguments about sample size. No win.

    It would be helpful to define some numerical parameter of “hockey-stick-NESS”; count the number (‘n1’) of samples from the test population that exhibit such a property, and report the fraction. So of every N=100 series of red noise inputs some 5% or 25% or 75% come out with “hockey-stick-ness”. Even better if the HSN parameter is somewhat sliding. So 25% of samples have HSN greater than .5; 50% have HSN greater than .7 and 10% have HSN > .95. (or whatever.)

    The problem of comparing a visual graph to the ideal “hockey-stick” and calculating the HSN is left as an exercise for the student.

  9. Nick,

    I am truly interested in seeing the answer to Tony’s question. Please, if you can, provide that answer.

    Regards,

    Richard

  10. Nick Stokes, your alternative chart clearly shows that your main point was the orientation, not the level of hockey stickishness. Unless those are average graphs, in which case choosing the top 12 out of 100 or 10000 really doesn’t matter.

  11. Looks like it is Mann’s defenders who are unaware that PCA algorithms are blind to the sign of the indicator.

  12. climatereason September 14, 2014 at 10:53 am
    “If the proxy isn’t worth a brass farthing in the first place applying any amount of clever mathematics to it doesn’t suddenly make it worthwhile does it?”

    Tony, there are no proxies in this analysis. It’s just what a particular calculation process does to random noise.

    One could wish for better indicators than the proxies we have. But they are what we have, and they should be analyzed as well as possible.

  13. Brandon Shollenberger September 14, 2014 at 10:21 am
    “Nick Stokes, even if you like one argument more than another, it is stupid to claim there is only one issue people are allowed to discuss. It is blatantly dishonest to claim someone has completely missed the point simply because they chose to discuss a different argument than you want to discuss.

    If you don’t want people to discuss a stupid argument you’ve made, you shouldn’t have made it.”

    Well, you said I made it, which isn’t true. But it’s not a stupid argument. Interpolating an unstated selection procedure to select only the ones with the orientation you like, and then claiming they are all like that, is illegitimate, even if it doesn’t affect the maths. I can just imagine what you’d be saying if Mann did such a thing.

    But selecting for HS index does affect the maths. It’s not the 100 or so near mirror images of the ones he selected, but the other 9800 that have been excluded.

  14. Brandon – I never said it affected the math. You’ve built a strawman.

    In my response to Don at Climate Etc I listed nearly a dozen points.

    1) Are you claiming the Wegman report does not contain obvious plagiarisms?
    2) Are you claiming the Wegman Report does not reference dozens of sources not cited in the text (many of which are dubious or irrelevant)?
    3) Are you claiming the Wegman report does not mischaracterize the results of scientific papers?
    4) Are you claiming that Wegman independently verified McIntyre’s results?
    5) Are you claiming McIntyre’s code did not sort and select the 100 most extreme cases out of 10,000?
    6) Are you claiming that McIntyre’s code only produced upward-sloping ‘hockey sticks’?
    7) Are you claiming that McIntyre’s ‘trendless red noise’ was not created from proxy data that contained the climate signal?
    8) Are you claiming Wegman was not reprimanded for plagiarism by his university employer?
    9) Are you claiming Wegman was not forced to retract a paper due to plagiarism?
    10) Are you claiming I misquoted Professor Curry when she called the people who exposed Wegman “reprehensible”?
    11) Are you claiming that plagiarism isn’t fraud?

    Nowhere in this list or anywhere else have I claimed the upward-slope/downward-slope changes the math, but it’s obvious from this that the selection wasn’t random (and it visually gives the appearance that the noise is equivalent to the MBH graph – which is entirely misleading).

    Of course any investigation of the Wegman Report is hampered by the fact Wegman has never released his supporting materials. The screams of outrage from the pseudoskeptic community are deafening.

  15. Pouncer, I agree. I don’t think cherry-picking examples then presenting them as a representative sample is acceptable behavior. I can, however, believe that and still hold that the position of Nick Stokes and Kevin O’Neill regarding upside down hockey sticks is ridiculous.

    If the orientation of the hockey stick in a graph is irrelevant, upside down hockey sticks are equivalent to right-side up hockey sticks. As such, there is no benefit to showing both types. Doing so increases the visual disparity for no purpose.

    Once you flip over the upside down hockey sticks in Stokes’s graph, there is little difference between the cherry-picked sample and the representative sample. The only way Stokes can argue a significant effect caused by the cherry-picking is to rely upon a difference he himself notes is irrelevant.

    It’s silly to base an accusation of dishonesty primarily upon a point you yourself admit is irrelevant. The cherry-picking was wrong, but it did not make the significant difference Stokes told people it did.

  16. Nick Stokes:

    Well, you said I made it, which isn’t true.

    Yes it is. Do you think people will just believe I lied about when I said I can provide quotes showing you made this argument? I don’t think so. I think most people would believe me. Even if they were skeptical though, providing such quotes is easy. Here is one it took me less than 60 seconds to find:

    I see no reason why I should butcher the actual PC1 calcs to perpetuate this subterfuge.

    You called showing only positively oriented graphs subterfuge. That is the same argument Kevin O’Neill made, the one you claimed you never made.

    But it’s not a stupid argument. Interpolating an unstated selection procedure to select only the ones with the orientation you like, and then claiming they are all like that, is illegitimate, even if it doesn’t affect the maths. I can just imagine what you’d be saying if Mann did such a thing.

    At most, I would have said he should have been a bit more clear in his discussion of the figure. I don’t know that I’d have done even that though. He made it abundantly clear there were upside down hockey sticks with his Figure 4.2 and his discussion of the figure in question:

    Because the red noise time series have a correlation of 0.2, some of these time series will turn upwards [or downwards] during the ‘calibration’ period and the MBH98 methodology will selectively emphasize these upturning [or downturning] time series.

    At most, I might have said he should have added a line explaining why he refers to upside down hockey sticks but doesn’t show any.

    If you want to suggest hypocritical responses, let’s consider yours for a moment. You accuse Edward Wegman of dishonesty for flipping his example PCs, but you don’t say a word when the IPCC did the same thing with the actual NOAMER PC1. That PC1, plotted here, was an upside down hockey stick. As I pointed out to you in our previous discussion, the IPCC published it (as W USA) with a positive orientation. The same is true for many other publications. You’ve never complained about any of them.

    The truth is The Wegman Report forced every PC to have a positive orientation when displayed, including Michael Mann’s NOAMER PC1. Many other people have done the same thing. You’ve simply ignored it whenever someone you didn’t want to criticize did it. That is, you cherry-picked which examples of flipping* PCs to criticize.

    *Flipping PCs has the same effect as selecting only positively oriented graphs. Such a selection is implicitly flipping the graphs.

  17. BS – now *you’re* being obtuse or dishonest. Who could conceivably imagine that 1 point out of 11 (there are more points that can be made BTW) is the primary basis for the charge? Especially one labeled #6?

    So far you haven’t shown that any of them are incorrect – you merely claim that one of them (#6) is insignificant – but the correctness of #6 stands. You also fail to acknowledge that by showing only upward sloping results Wegman demonstrates his sample was biased.

    Please answer the other 10 questions I posed to Don. Then we can delve into the significance of each.

  18. Kevin O’Neill:

    Brandon – I never said it affected the math. You’ve built a strawman.

    Don’t accuse a person of making a straw man based upon something they’ve never said. It’s just silly. Anyone who reads this post can see I never claimed you said this affected the math.

    In my response to Don at Climate Etc I listed nearly a dozen points.

    Which you are now wasting our time and space with by repeating even though they aren’t the topic of discussion. In fact, nothing in your comment responds to the topic being discussed in this post.

    Nowhere in this list or anywhere else have I claimed the upward-slope/downward-slope changes the math, but it’s obvious from this that the selection wasn’t random (and it visually gives the appearance that the noise is equivalent to the MBH graph – which is entirely misleading).

    Yes, yes. You never said something nobody claimed you said. All you’re saying is something anyone can see is wrong by looking at the first and third images in this post. The non-randomness of the selection in the Wegman Report had no significant effect on its results. A randomly chosen selection would have had almost the exact same impact as a cherry-picked selection.

    The only caveat is negatively oriented graphs needed to be flipped over. The Wegman Report makes it clear such graphs happened. It says so in its discussion of the figure. A reader should naturally assume such graphs were flipped over (either implicitly or explicitly) for convenience since orientation of the graphs is irrelevant.

    There is no possible benefit to showing graphs with a negative orientation. The reader would not have been better informed of anything, nor would their understanding of any points been better. There is nothing deceptive about failing to show an irrelevant piece of information.

  19. Kevin O’Neill, in addition to falsely accusing me of creating a straw man as discussed in my previous comment, you’re now claiming I’m either obtuse or dishonest because I suggested one point “is the primary basis for [your] charge.” I’ve never said anything of the sort. You’re just making things up in order to level one accusation against me after another.

    People are free to discuss individual points other people make. I chose to discuss a point which shows you and Nick Stokes happily use completely irrelevant points to accuse other people of being dishonest. I do not need to discuss some laundry list of other points in order to discuss that single point.

    You expecting me to respond to everything you say in order to respond to a single thing you said is ridiculous. It serves no point other than to divert a legitimate discussion. I’m not going to fall for such a blatant attempt at diversion. I don’t see how anyone would. Anyone who wants to have an actual discussion would know better than to randomly bring up 10 new points rather than discuss the one the post was written about.

  20. Brandon – This post is titled “Dishonest.” It’s about *my* comments, but you only address point #6. And as you wrote to Nick – “It’s silly to base an accusation of dishonesty primarily upon a point you yourself admit is irrelevant. ” What does that say about *your* argument where you only address 1 of 11 points – and those 11 were never said to be the *only* arguments that can be made.

    The point in #6 is correct. You claim it’s insignificant, I claim it underscores Wegman’s lack of due diligence. If you want to argue significance, then what significance does the whole noise charade have on Mann’s results? Would the magnitude of error have even increased the uncertainty limits? Not bloody likely.

    Wegman had to retract his CSDA paper. He used the same material in the Wegman Report. That alone tells us enough to view everything else in the report with skepticism. But *you* ignore all this.

    You won’t answer the points I raised because you know they’re all correct.

  21. Kevin O’Neill:

    Brandon – This post is titled “Dishonest.” It’s about *my* comments, but you only address point #6. And as you wrote to Nick – “It’s silly to base an accusation of dishonesty primarily upon a point you yourself admit is irrelevant. ” What does that say about *your* argument where you only address 1 of 11 points – and those 11 were never said to be the *only* arguments that can be made.

    You try to portray me as a hypocrite, but I’ve never admitted the point I’m discussing is irrelevant. I’ve specifically stated how important I believe it to be, something you could not have missed. Not only is your implied accusation false, no reasonable reader could believe it.

    Additionally, you misrepresent my post by saying it is about your comments. It is not. It is about a specific subset of your comments. I chose not to discuss the other parts of your comments because it is best to focus discussions on specific points. Nobody ever resolves anything by discussing 11 different points at once.

    The point in #6 is correct. You claim it’s insignificant, I claim it underscores Wegman’s lack of due diligence. If you want to argue significance, then what significance does the whole noise charade have on Mann’s results? Would the magnitude of error have even increased the uncertainty limits? Not bloody likely.

    There are two key differences between you and me. I claim this point is insignificant, giving a detailed explanation as to why. You claim this point “underscores Wegman’s lack of due diligence,” but you do nothing to support this argument. Instead, you refuse to discuss the point, demanding everyone discuss other points instead.

    You then use your refusal to stay on topic to argue:

    But *you* ignore all this.

    You won’t answer the points I raised because you know they’re all correct.

    This isn’t true at all. I’ve discussed each of the points you’ve raised in the past. My views on them are hardly unknown. I simply won’t discuss 10 irrelevant points in order to discuss one simple point. If you or Nick Stokes were actually willing to try to resolve this one point, we could all then move on to other points.

    But the fact you two respond to any discussion of one argument you make by insisting people only discuss other arguments you’ve made ensures no progress can ever be made. You aren’t trying to have an actual discussion. You’re trying to push your viewpoints on the world with a complete disregard for anyone or anything else.

    That sort of behavior is a form of trolling. Either you guys are behaving as trolls on purpose, or you just don’t know how to have an actual discussion.

  22. You try to portray me as a hypocrite, but I’ve never admitted the point I’m discussing is irrelevant. I’ve specifically stated how important I believe it to be, something you could not have missed. Not only is your implied accusation false, no reasonable reader could believe it.

    I’ve never admitted the point is irrelevant either. So claiming I’m dishonest because *YOU* think the point is irrelevant is mere projection. I have specifically said the point *IS* relevant. Now, I said earlier you were either being obtuse or dishonest. I lean towards dishonest because you might well know the details better than I, but you refuse to acknowledge that it underscores Wegman’s lack of due diligence.

    I’ll play your charade and spell it out for you:

    M&M contaminated their “persistent red noise” by not detrending the proxy data. Wegman did not catch this.
    M&M relied upon a sorted group of the 100 most hockey stickish results which composed their “sample” – all of these had upward-slopes. Wegman *knew* that there should be downward sloping HS as well, and one “reviewer pointed out his figures should incorporate them, but he ignored them. Due diligence requires he find out *why* there were no downward slopes – he knew they existed. The probability of getting only upward slopes is astronomical.
    Wegman in figure 4.4 said the noise was AR(1) – it wasn’t. Again, due diligence lacking.
    Wegman in figure 4.4 said he found HS in *each* of the independent replications. Shall we beat this dead horse some more?

    If Wegman had stopped and wondered why all his results had positive slopes he might have understood what was going on. It doesn’t matter they are mathematically equivalent, the fact ≈ 50% of them weren’t the opposite sign was a clue – a clue he ignored.

    If Wegman had produced a Figure with both upward and downward sloping results, then we’d know he’d actually worked with and understood (to at least some extent) the code he was using. Instead his results show he failed at due diligence. That’s the significance. Wegman made many mistakes, he had plenty of clues and ignored them.

  23. Probably it makes more sense to discuss this with the authors than third persons, but I doubt the climate signal contamination is much of an issue.

    The signal to noise is very bad in these proxies, otherwise we’d use individual proxies to measure local temperature instead of relying on complex reconstruction algorithms.

    Because the sign of the proxy index is arbitrary, it could very easily have been a convention on the part of the researcher who generated the proxies to show them with a positive slope. Obviously many proxies are going to show “up-ticks” (when properly aligned) in the 20th century even if they aren’t temperature proxies.

    I’d recommend that you spend your time more constructively, perhaps by critically considering Mann and other researchers whose work is peer reviewed and included in IPCC reports, rather than non-peer reviewed reports like this.

  24. Kevin O’Neill:

    I’ve never admitted the point is irrelevant either.

    What are you smoking? My comment about relevance was directed at Nick Stokes who specifically said orientation is irrelevant in PCA. I have never extended that comment to any further points. There is no reason to say you’ve never admitted a point I’ve never suggested you admitted.

    So claiming I’m dishonest because *YOU* think the point is irrelevant is mere projection. I have specifically said the point *IS* relevant.

    Actually, I had never called you dishonest. You just made that up.

    Moreover, you have done nothing to support that point, and you have adamantly refused to address the points raised in response to it. I think people could reasonably view that sort of behavior as intellectually dishonest.

    M&M relied upon a sorted group of the 100 most hockey stickish results which composed their “sample”

    Actually, they didn’t. Steve McIntyre and Ross McKitrick never used the selected 100 graphs as a “sample” for anything. You’re just making this up in order to smear them for something they had no responsibility for. I’ve even discussed that exact point on this blog several times.

    Wegman *knew* that there should be downward sloping HS as well, and one “reviewer pointed out his figures should incorporate them, but he ignored them. Due diligence requires he find out *why* there were no downward slopes – he knew they existed.

    I don’t know why you’d reference someone talking about negatively oriented graphs to say Edward Wegman knew about them when he specifically mentioned them in the text. The only reason I can see to do that is to bring in yet another issue you can try to smear him with.

    And naturally, you don’t even provide a link, reference or quote to justify your claim. You do, for some reason, provide an opening quotation mark you never follow up on, but that’s clearly no help. Since you chose not to do it, I’ll provide that link. It can be found here.

    Any fair-minded individual who reads the e-mail will see there is no reason to assume Wegman ignored the reviewer on this point. The reviewer said showing instances with both orientations was a matter of “fairness.” Wegman may have considered the point and disagreed. Disagreeing with someone is not ignoring them. The only reason to assume one rather than the other is bias/close-mindedness.

    If Wegman had stopped and wondered why all his results had positive slopes he might have understood what was going on. It doesn’t matter they are mathematically equivalent, the fact ≈ 50% of them weren’t the opposite sign was a clue – a clue he ignored.

    You have absolutely no basis for saying Wegman didn’t know what he was showing. You simply ignore the perfectly plausible explanation, that he chose to show a non-random sample because he thought it’d look clearer. That’d be wrong to do without specifying it was done, but as my post shows, it wouldn’t make a significant difference.

    If Wegman had produced a Figure with both upward and downward sloping results, then we’d know he’d actually worked with and understood (to at least some extent) the code he was using.

    Actually, we know he understood the code he used by virtue of other things he did with it. You’ve simply chosen to ignore parts of the Wegman Report which show such. This is remarkable as it requires not even looking at the figures on the pages immediately after the one you criticized.

    But then, you’ve consistently ignored the fact Wegman specifically mentioned upside down hockey sticks in the text discussing the figure you criticize.

  25. BS writes: “Actually, we know he understood the code he used by virtue of other things he did with it. You’ve simply chosen to ignore parts of the Wegman Report which show such. This is remarkable as it requires not even looking at the figures on the pages immediately after the one you criticized.”

    Then please explain the caption to Figure 4.4 if we know Wegman understood the code.

  26. Carrick:

    Probably it makes more sense to discuss this with the authors than third persons, but I doubt the climate signal contamination is much of an issue.

    If the people criticizing the Wegman Report were genuinely interested in understanding the issue, they’d know the issue of whether or not to detrend in this sort of thing has been discussed before. They’d know Michael Mann criticized Steve McIntyre and Ross McKitrick for (supposedly) detrending when doing similar Monte Carlo experiments. They’d know they’re arguing the opposite position of the person they’re trying to defend.

    But they don’t. They don’t know anything about the discussions which have been held during the hockey stick debate. They don’t understand how the hockey stick was created. They don’t know any of this because they don’t care. All they want are talking points they can shout out whenever someone says something they dislike.

    Because the sign of the proxy index is arbitrary, it could very easily have been a convention on the part of the researcher who generated the proxies to show them with a positive slope. Obviously many proxies are going to show “up-ticks” (when properly aligned) in the 20th century even if they aren’t temperature proxies.

    As I pointed out above, NOAMER PC1 was an upside down hockey stick. It was flipped over in multiple publications, including the IPCC Report. It was even flipped over before being included in some data files. Michael Mann and his co-authors had no problem with this. Their fellow climate scientists had no problem with this.

    The only people who have a problem with it are the climate warriors like Nick Stokes and Kevin O’Neill who will accept any talking point so long as it favors their world views.

  27. Kevin O’Neill, I ask people not to abbreviate my name that way due to the unfortunate association.

    Then please explain the caption to Figure 4.4 if we know Wegman understood the code.

    Understanding the code you use does not mean you understand every single nuance of it. It is often easy to slip up about what a parameter is set to. A person can even slip up when writing a description of code despite having known the correct answer when working with the code. That is especially true since people sometimes write/edit such descriptions weeks, or even months, after having worked with the code.

    Now that I’ve explained the caption, please explain to me what difference it would have made to the results published in the Wegman Report had they used what the caption said they used. I’ll point out the work to answer that question has already been done by other people (and even linked to on this page).

  28. Pouncer: “It would be helpful to define some numerical parameter of “hockey-stick-NESS”…”

    Already done. McIntyre & McKitrick (GRL, 2005) defined a “hockey stick index” as “the difference between the 1902–1980 mean and the 1400–1980 mean, divided by the 1400–1980 standard deviation.” The histogram of this metric on random data is shown in figure 2 here.
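
    For anyone who wants to play with it, here is a minimal sketch of that definition in Python (my own reading of the MM05 wording; the function name and toy series are mine, not anything from the archived code):

        import numpy as np

        def hockey_stick_index(series, years):
            # MM05 definition: (1902-1980 mean minus 1400-1980 mean) / (1400-1980 standard deviation)
            full = (years >= 1400) & (years <= 1980)
            blade = (years >= 1902) & (years <= 1980)
            return (series[blade].mean() - series[full].mean()) / series[full].std()

        years = np.arange(1400, 1981)
        stick = np.where(years >= 1902, (years - 1902) / 78.0, 0.0)   # toy upright hockey stick
        print(hockey_stick_index(stick, years))    # positive
        print(hockey_stick_index(-stick, years))   # same magnitude, negative sign

    A ranking on an index like this is what “most hockey-stick-like” means in the selection being argued about in this thread.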

  29. Brandon writes: “Understanding the code you use does not mean you understand every single nuance of it. It is often easy to slip up about what a parameter is set to. A person can even slip up when writing a description of code despite having known the correct answer when working with the code. That is especially true since people sometimes write/edit such descriptions weeks, or even months, after having worked with the code.”

    “Now that I’ve explained the caption ….”

    No, that is not a satisfactory explanation. Wegman wrote in the caption to Figure 4.4:
    “Figure 4.4: One of the most compelling illustrations that McIntyre and McKitrick have produced is created by feeding red noise [AR(1) with parameter = 0.2] into the MBH algorithm. The AR(1) process is a stationary process meaning that it should not exhibit any long-term trend. The MBH98 algorithm found ‘hockey stick’ trend in each of the independent replications.”

    It’s the wrong noise model. The persistence stated is completely different to that which M&M (and hence Wegman) used. And all 12 of the results in Figure 4.4 are actually taken from M&Ms archived data (100 most HSish). So it isn’t “each”, it’s the top 1%. It isn’t independent verification, it’s using M&Ms stored archive. How many times are you allowed to “slip up” in one caption? And he never questioned how *all* 12 had upward slopes.

    You already *know* all this. It isn’t news to you. The real question is why you simply won’t admit Wegman didn’t do due diligence. Trying to claim otherwise is *dishonest*.

  30. For the record, I don’t think due diligence was done in the Wegman study either. Kevin is likely exaggerating the extent of the problems, but I think the issues he is raising are legitimate ones to be concerned about (if one feels the need to critically examine “grey literature” such as this, which I really don’t).

    The issue with plagiarism is a real one, even if what Wegman was actually reprimanded over was lack of oversight of a student, rather than plagiarism, as Kevin is claiming. That is certainly a statement of poor scholarship.

    That said, I think the issue that Brandon is raising is a valid one. The sign of the proxies is a fake problem.

    It’s very interesting that Nick Stokes would have harped over this while strongly defending the right of Keith Briffa to redact without comment a portion of his own results.

    The hypocrisy in focusing on a gray literature document that isn’t part of any official climate science while not only ignoring but outright defending blatant acts of research misconduct is a bit appalling.

  31. Kevin:

    And he never questioned how *all* 12 had upward slopes.

    And you again have failed to address that the slope doesn’t matter. Because it doesn’t.

    Wegman failed to do due diligence on a number of important issues. I don’t claim otherwise. In fact I claim I don’t take his report very seriously because of the lack of due diligence.

    But if you’re going to criticize something, you do need to get the arguments half-right first. And that involves due diligence on your part.

    Which does not mean duplicating the complaints of people like DeepClimate, who is hardly an accurate source in any case.

  32. Kevin O’Neill, I’ll note you are again refusing to stay on-topic. Normally I wouldn’t stand for that, but I give a lot of latitude to my critics:

    It’s the wrong noise model. The persistence stated is completely different to that which M&M (and hence Wegman) used.

    This is obviously false to anyone who has any idea what they’re talking about. The two noise models are highly related to one another. There is no way anyone with any understanding of this topic would say they are “completely different.”

    That is made obvious if one even looks at the acronyms involved. The claimed model was an AR model. The actual model was an ARFIMA model. ARFIMA is a generalized version of the AR model. The two are highly related. Not only that, but mixing up AR and ARFIMA isn’t the huge sin you make it out to be.

    It isn’t independent verification, it’s using M&Ms stored archive. How many times are you allowed to “slip up” in one caption?

    The archive shown has a very similar variable name to the one generated freshly by the code. It would be trivially easy for someone to plot the wrong one, and the results would be unaffected. A person looking at just the two options could never tell which was fresh and which wasn’t.

    And he never questioned how *all* 12 had upward slopes.

    You have absolutely no basis for this claim. You did nothing to contradict the explanation I provided, but you now happily ignore the alternative interpretation it provides.

    You already *know* all this. It isn’t news to you. The real question is why you simply won’t admit Wegman didn’t do due diligence. Trying to claim otherwise is *dishonest*.

    This is a cheeky comment given you’ve flat-out made things up about what I’ve said then refused to correct yourself, and now are flat-out ignoring alternative interpretations simply because they are inconvenient for you. Not only that, you claim I know things even though your list includes statements anyone with the most basic understanding of this issue would know are wrong.

    You’re free to think the “real question” is whatever you want, but your approach to discussions shows there are plenty of questions people should be asking of you. Personally, I think the “real question” ought to be why I don’t moderate you for breaking my blog’s rules.

  33. “Do you think people will just believe I lied about when I said I can provide quotes showing you made this argument? “

    Yes, I think you should. Orientation of the plots has never been my argument. My argument has always been on the selection. If you take a bunch of random curves and select for HS shape, you’ll get HS shapes. I showed that for centered PCA profiles.

    I don’t think that they should, especially without telling, manipulate the plots to get the preferred orientation, and when prompted I’ll say so. But that is not the argument I’ve advanced, and none of your quotes so far say otherwise.

  34. Let me help Kevin on the due diligence.

    1) Are you claiming the Wegman report does not contain obvious plagiarisms?

    Let’s look at this. From the Wiki summary:

    George Mason University provost Peter Stearns announced on 22 February 2012 that charges of scientific misconduct had been investigated by two separate faculty committees: the one investigating the Wegman Report gave a unanimous finding that “no misconduct was involved” in the 2006 report to Congress. Stearns stated that “Extensive paraphrasing of another work did occur, in a background section, but the work was repeatedly referenced and the committee found that the paraphrasing did not constitute misconduct”.

    So it does not.

    2) Are you claiming the Wegman Report does not reference dozens of sources not cited in the text (many of which are dubious or irrelevant)?

    Why is this relevant?

    3) Are you claiming the Wegman report does not mischaracterize the results of scientific papers?

    Which papers?

    4) Are you claiming that Wegman independently verified McIntyre’s results?

    Why is this relevant?

    5) Are you claiming McIntyre’s code did not sort and select the 100 most extreme cases out of 10,000?

    For the purpose of a figure.

    6) Are you claiming that McIntyre’s code only produced upward-sloping ‘hockey sticks’?

    Again sign doesn’t matter. Plotting figures with both signs is a distraction. Criticizing not plotting both signs shows a lack of technical understanding.

    If your claim is that McIntyre ignored hockey sticks with either sign, then you have failed in due diligence. You need to read his papers and acquaint yourself with them, instead of regurgitating DeepClimate’s goofy nonsense.

    7) Are you claiming that McIntyre’s ‘trendless red noise’ was not created from proxy data that contained the climate signal?

    Signal to noise ratio matters. If the signal to noise ratio is poor, even if there is a temperature related climate signal, you can still treat individual proxies as if they were not temperature proxies.

    Still a replication with the correct noise model (which is NOT AR(1)) would be useful.

    8) Are you claiming Wegman was not reprimanded for plagiarism by his university employer?

    He wasn’t. He was reprimanded for poor judgement in connection with a co-author’s plagiarism, not for plagiarism itself (see the next point).

    9) Are you claiming Wegman was not forced to retract a paper due to plagiarism?

    Due to the plagiarism of a co-author. The co-author was guilty of this infraction, not Wegman. Wegman failed to properly supervise the student and oversee the writing of the 2008 paper which did contain plagiarism in the background section. If your goal is to be honest, you need to accurately spell this out.

    This is apropos:

    The 2008 social network analysis paper was investigated by a separate committee which unanimously found “that plagiarism occurred in contextual sections of the (CSDA) article, as a result of poor judgment for which Professor Wegman, as team leader, must bear responsibility.” Stearns announced that Wegman was to receive an “official letter of reprimand”, and in response to telephoned questions said the university was going to send the investigation reports to federal authorities. A university spokesman said the reports would not be made public. Bradley said the university had failed to notify him of its decision, and described the split result as “an absurd decision” which would encourage GMU students to think it acceptable to copy work without attribution. Stearns said that “instead of allowing the university process to be completed”, which had taken over two years, Bradley had openly discussed the plagiarism. The university was going to consider ways to make investigations more streamlined, and it was not investigating any other complaints about Wegman.[16]

    Given the ethical quagmire that is MBH 98/99, that this criticism is coming from Bradley is deeply ironical.

    10) Are you claiming I misquoted Professor Curry when she called the people who exposed Wegman “reprehensible”?

    Nobody said anything besides you about it, but here is the quote:

    So even if Wegman did copy his definition from the wikipedia (which is extremely unlikely, since the meaning of his definition is slightly different), this is not regarded as plaigarism and as per the wikipedia’s own entry on plaigarism, such commonly held knowledge (i think 18M definitions qualifies as common knowledge) is not something that can be plaigarized.

    Let me say that this is one of the most reprehensible attacks on a reputable scientist that I have seen, and the so-called tsunami of accusations made in regards to climategate are nothing in compared to the attack on Wegman.

    Sounds like she is saying the attacks are reprehensible, a perfectly valid opinion.

    What’s your point?

    11) Are you claiming that plagiarism isn’t fraud?

    I define scientific fraud fairly narrowly as ” the intentional misrepresentation of one’s methods or results.”

    If the plagiarism doesn’t involve methods or results, IMO it is not scientific fraud, even though it is still misconduct.

    Using a paper that you know has errors, without reporting those errors, as part of a follow-on study… that smells more like scientific fraud.

    You could probably call it “literary fraud”, but that is not a commonly used or useful term, nor very relevant to considerations of the validity of a research paper.

    Before you comment, if you choose to, I hope you reflect on the concept that if you are using key words and phrases incorrectly, it is perfectly valid to point that out. Words do have meaning, and using them wrong is not helpful.

  35. Nick Stokes:

    I don’t think that they should, especially without telling, manipulate the plots to get the preferred orientation, and when prompted I’ll say so. But that is not the argument I’ve advanced, and none of your quotes so far say otherwise.

    It’s odd you have problems with showing only one orientation, when the orientation obviously doesn’t matter, but you have no problem with Briffa deciding to not show part of his results without comment.

    Sorry but this seems like subterfuge on your part.

    Regarding my response to point 8:

    Wegman was reprimanded for poor judgement in connection with plagiarism on the part of a co-author. I think this amounts to putting your name on a paper where plagiarism occurred, and not checking whether the other person could conceivably have written the text they claimed to have written. I agree that a reprimand was in order (it should have been obvious that the text was plagiarized), but it’s interesting that reprimands happen as infrequently as they do, given how often people are coauthors to papers that include outright scientific fraud, where there are absolutely no sanctions against them.

    No doubt politics played a role here. Again it’s deeply ironic that Bradley is involved in finger pointing, given how dirty his hands are.

  36. Carrick:

    Wegman failed to do due diligence on a number of important issues. I don’t claim otherwise. In fact I claim I don’t take his report very seriously because of the lack of due diligence.

    I have no problem with this statement. I’ve never cared much about the Wegman Report. I barely said a word about it when it came out, and pretty much every time I’ve talked about it I also referred to the coinciding NRC report to show the bias of MBH’s methodology is indisputable (to anyone with a shred of intellectual integrity).

    That said, the criticisms of the Wegman Report go overboard. Making a fuss of the orientation of the graphs when the orientation is completely irrelevant (and the same thing is done by people on both sides)? Making a fuss of a sample being cherry-picked while ignoring the fact a representative sample would have been incredibly similar? Claiming Edward Wegman merely duplicated work when the three figures after the one being discussed clearly show new work (and the report’s Appendix A gives a mathematical proof of the methodology’s bias)?

    Pretty much the only reason the discussion of The Wegman Report interests me is it shows a fascinating difference in standards. Above, you can see Nick Stokes suggesting my remarks on this page have been hypocritical, that I’d have treated the subject very differently if Michael Mann had been involved. I believe this was projection on his part. I think it’d be incredibly easy to show him jumping through hoops to defend Mann’s decisions to do far worse things than these.

    Another part of the difference in standards comes from people blindly accepting things Deep Climate and John Mashey say, no matter how wrong they are. Plus, I still get a chuckle over the fact nobody cared when USA Today said Mashey’s 250 page document analyzed 35 pages of plagiarism when it actually only covered 30. Five of those were copied from Deep Climate, and another five weren’t even covered by his report. How do you discuss charges of plagiarism while giving credit to the accusing document for work it doesn’t contain?

  37. Nick Stokes, this remark makes no sense:

    Yes, I think you should. Orientation of the plots has never been my argument.

    You quoted me having asked:

    “Do you think people will just believe I lied about when I said I can provide quotes showing you made this argument? “

    There is nothing you could be referring to here. I asked if you think people will believe I lied. You say you “think [you] should.” I should what? Think I lied? Why are you switching the topic from what other people will believe to what I should… do? Believe? Syntax alone makes your response unintelligible.

    Beyond that, how in the world do you claim you’ve never advanced the argument that failing to show graphs with a negative orientation was wrong? You say no quote I’ve offered shows otherwise, but that’s purely hand-waving. It is obvious on its face you’ve called it wrong given you called it subterfuge. How could anyone interpret you calling something subterfuge as you not saying it’s wrong?

  38. Carrick September 15, 2014 at 4:39 pm
    “It’s odd you have problems with showing only one orientation, when the orientation obviously doesn’t matter”

    Do you think that he should show only one orientation? Which orientation should it be? How should he achieve it?

  39. Brandon,
    You say you “think [you] should.” I should what?
    Quote me actually making the argument that you attribute to me. Yes, it’s true that I don’t think they should, without telling, invert the plots. And even with telling, I can’t think of a really satisfactory way to do it. I think that. I think people should wear seatbelts too. But neither of these is the argument that I advanced.

  40. I would orient them against the 20th century rise in temperature using the sign of correlation over the period of overlap.

    But I would say why I did it (for the purpose of visually judging the quality of the proxy) and I would explain how I prepared the figure. This is a legitimate criticism of Wegman’s report, but so is the criticism of Briffa for redacting a portion of his results and not saying he had done so, let alone giving a motivation why. Or the criticism of Mann for not showing the R2-verification scores where his proxies failed to validate.

    The Briffa and MBH examples are to me more significant because they relate to methods and results. Figure 4.4 of Wegman is labeled as an “illustration” of problems; as such it is not a “result”.

  41. Carrick,
    “I would orient them against the 20th century rise in temperature using the sign of correlation over the period of overlap.”

    They are red noise samples, not proxies. Some will have very little correlation at all. Are you going to invert them because of an epsilon?

  42. Carrick – If you ran a program that supposedly gave you a random sample of 12 results out of 10000 and the expected distribution was half negative values and half positive values, would you not be a bit concerned if *all* the results came out negative or positive? As Aereon said in the Chronicles of Riddick, “Now, what would be the odds of that?”
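
    For concreteness, under that hypothetical of an unsorted random draw in which each orientation is equally likely, the arithmetic is:

        P(all 12 share one sign) = 2 × (1/2)^12 = 1/2048 ≈ 0.05%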

    This is why the results are significant – not because the sign is irrelevant to the math, but because the fact they were all one sign defies probability and is a sure sign (sic) that there’s a problem. Ignoring this is a complete lack of due diligence. I don’t believe you would have ignored it and I know I wouldn’t have. If Wegman had taken 6 positive values and 6 negative values and simply flipped the negative so they were all consistent you would have a case. That is not what happened. We know it didn’t happen because they all came from the archive. So Wegman *only* saw positive cases.

    Plagiarism: GMU’s investigation of the Wegman Report never looked at the Social Network Analysis section. It was this section that was *also* in the CSDA paper that was retracted. It is a rather odd contention then that the same plagiarism doesn’t extend to the Wegman Report.

    Mischaracterization: Wegman wrote: “Both Esper et al. (2002) and Moberg et al. (2005) indicate that current global temperatures are not warmer that the medieval warm period.”
    Yet Esper actually said: “… annual temperatures up to AD 2000 over extra-tropical NH land areas have probably exceeded by about 0.3 C the warmest previous interval over the past 1162 years. ”
    and Moberg said: “We find no evidence for any earlier periods in the last two millennia with warmer conditions than the post-1990 period—in agreement with previous similar studies”
    Of note, the word ‘global’ doesn’t even appear in Esper (2002) – so he not only changed Esper’s temperature results – he changed their scope as well.

    Independent results: Why is this relevant? Oh, c’mon – the Figure 4.4 caption claims he independently verified the results.

  43. Nick, whether red noise or proxy data, the algorithm is treating them as proxies.

    It would be appropriate to graph them for the purpose of visualization with the same orientation as used by the algorithm. If the algorithm regresses them against temperature over the 20th century, the orientation it uses is the orientation you should pick.

    So generally, pick the sign of the correlation between the proxy (or red noise sample) and the temperature signal.

    Whether proxy or red noise sample, if they have little correlation, of course, the orientation you use won’t matter very much. So flipping signs won’t do much visually either. A bad proxy still looks like a bad proxy when you multiply it by –1.
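
    A minimal sketch of that plotting convention (my own illustration in Python; the function name and inputs are hypothetical, not anyone’s published code):

        import numpy as np

        def orient_for_plotting(series, temp, overlap):
            # Flip the series, if needed, so it correlates positively with the
            # instrumental temperature over the overlap (calibration) period.
            r = np.corrcoef(series[overlap], temp[overlap])[0, 1]
            return series if r >= 0 else -series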

  44. Brandon – For the purposes of reproduction/replication – yes, the two noise models are completely different. More important, an eminent statistician would not confuse the two, especially when the persistence of the two is considered. AR1 with a coefficient of 0.2 gives an average auto-correlation period for the noise of 1.5 years. M&M used ARFIMA, which is akin to using AR1 with a coefficient of 0.9, which gives an average period of 19 years. I consider that in the context of climate to be completely different.
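
    For reference, those two figures match the integrated autocorrelation time of an AR(1) process, tau = (1 + phi) / (1 - phi); a quick check (my own aside, in Python):

        # Integrated autocorrelation time of AR(1): tau = 1 + 2*(phi + phi**2 + ...) = (1 + phi) / (1 - phi)
        for phi in (0.2, 0.9):
            print(phi, round((1 + phi) / (1 - phi), 2))   # 0.2 -> 1.5 years, 0.9 -> 19.0 years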

    Your explanation simply doesn’t hold water. All 12 of the results in Fig. 4.4 were upward sloping. Wegman never saw a negative one. We know this because the archived 100 are all positive and that’s where he drew them from. As I wrote in the reply to Carrick – “Now, what would be the odds of that?”

    The title of this post is at least appropriate – considering who wrote it.

  45. Kevin O’Neill’s latest comment amuses me greatly, so I’m going to respond to it out of order. He says:

    Mischaracterization: Wegman wrote: “Both Esper et al. (2002) and Moberg et al. (2005) indicate that current global temperatures are not warmer that the medieval warm period.”
    Yet Esper actually said: “… annual temperatures up to AD 2000 over extra-tropical NH land areas have probably exceeded by about 0.3 C the warmest previous interval over the past 1162 years. ”
    and Moberg said: “We find no evidence for any earlier periods in the last two millennia with warmer conditions than the post-1990 period—in agreement with previous similar studies”
    Of note, the word ‘global’ doesn’t even appear in Esper (2002) – so he not only changed Esper’s temperature results – he changed their scope as well.

    In doing so, he implies the Wegman Report cited the text of Esper (2002) and Moberg (2005) then mischaracterized it. This isn’t true. The text he quotes is from a discussion of Figure 5.9, the caption for which says:

    Figure 5.9. A comparison of several different reconstructions. From D’Arrigo et al. (2006)

    And if one looks at the figure, the description is accurate (aside from conflating hemispheric/global temperatures, a common problem since Mann’s hockey stick came out). The reason this amuses me is that figure has a multitude of problems, but by misrepresenting what the text he quoted referred to, O’Neill sabotages his own argument.

    There is nothing inherently wrong with re-interpreting a paper’s results. People are free to plot Moberg and Esper’s data and draw their own conclusions from it so long as they correctly attribute their discussion to the papers’ data rather than the papers’ text, which the Wegman Report did.

    Had O’Neill bothered to take the time to understand the issues he discusses, he could have explained how Figure 5.9 is wrong and misleading. He could have explained how it relies upon a graphical trick to reach a particular conclusion. He could have explained how the text discussing the figure actually misrepresents what the figure shows.

    But instead, he set up an argument easy to knock down. All anyone has to do to rebut what he said is respond, “The Wegman Report actually plotted and described Moberg and Esper’s data. They reached different conclusions because Moberg and Esper were wrong.”

  46. biased source. Regarding the mischaracterizations, thank you. I’ve never read the report (other than in vetting your claims).

    Yes, it is undoubtedly a mischaracterization to say: “Both Esper et al. (2002) and Moberg et al. (2005) indicate that current global temperatures are not warmer that the medieval warm period.”

    Plagiarism: GMU’s Wegman report never looked at the Social Network Analysis section

    I think you need to substantiate this claim. I’ve seen the theory that this was so, but nothing from GMU that acknowledges this.

    In either case, it is a mistake to claim that “Wegman was reprimanded for plagiarism by his university employer”. He was not. Even with your interpretation of the GMU report, it did not happen. He was rebuked for poor judgement on a paper that he co-authored, not for plagiarism.

    It is a fact he was found to not be responsible for any acts of plagiarism either in the report or the withdrawn paper.

    Figure 4.4 caption claims he independently verified the results

    Actually what is stated in Figure 4.4 is:

    Figure 4.4: One of the most compelling illustrations that McIntyre and McKitrick have produced is created by feeding red noise [AR(1) with parameter = 0.2] into the MBH algorithm. The AR(1) process is a stationary process meaning that it should not exhibit any long-term trend. The MBH98 algorithm found ‘hockey stick’ trend in each of the independent replications.

    Discussion: Because the red noise time series have a correlation of 0.2, some of these time series will turn upwards [or downwards] during the ‘calibration’ period and the MBH98 methodology will selectively emphasize these upturning [or downturning] time series.

    I can’t even see an overlap between your criticism and the actual text of the caption. In fact, the more I look at it, the less rational basis I see for criticizing it.

    It appears he’s showing examples of proxies that have large hockey stick indices. And he’s claiming that MBH “will selectively emphasize these upturning [or downturning] time series”.

    This is true.

    By the way, other than the short discussion following it (which I’ve included), there is no other mention made of this figure. It appears to be an illustration of proxy series with large positive hockey stick indices. There’s absolutely no suggestion that this is a representative sample of proxies (or red noise series); indeed, it appears to me these were the result of a sorting of hockey sticks based on their hockey stick index.

    I’m starting to wonder if you’ve even read the report now, or just stuck with DeepClimate’s hatchet jobs.

  47. Nick Stokes:

    You say you “think [you] should.” I should what?
    Quote me actually making the argument that you attribute to me.

    Nice job ignoring the fact you posted a nonsensical response. You even quoted me pointing out the nonsensical nature of your response, agreed to a nonsensical interpretation I suggested for your response, and still somehow failed to do the simple thing of saying, “Oops, I misunderstood what you said.”

    Yes, it’s true that I don’t think they should, without telling, invert the plots. And even with telling, I can’t think of a really satisfactory way to do it. I think that. I think people should wear seatbelts too. But neither of these are the argument that I advanced.

    This is complete and utter BS. You did not advance this particular argument in one particular blog post. You did, however, advance the argument. I’ve quoted you doing exactly that. You’ve repeatedly insisted you haven’t advanced an argument despite being presented indisputable proof you made the argument.

    You can’t say something isn’t your main point then demand everyone refrain from pointing out you said it just because it wasn’t your main point. Once you state an argument, everyone is free to discuss that argument and your involvement with it.

    Otherwise I’m just going to end every blog post with, “Nick Stokes is a dishonest idiot.” When anyone says that’s inappropriate, I’ll say, “I never said Nick Stokes is a dishonest idiot.” If they quote me calling you a dishonest idiot, I’ll say, “I’ll say Nick Stokes is a dishonest idiot if it comes up, but that wasn’t the argument I was making.”

  48. Kevin O’Neill just made the absurd remark:

    Brandon – For the purposes of reproduction/replication – yes, the two noise models are completely different.

    And with it, he’s used up the last of my patience. This comment is pure hand-waving. One of the few rules of this blog is when challenged or questioned on a factual matter, a user must fully respond to the challenge or question. I’m instituting that now.

    Kevin O’Neill, as a moderation statement, you must now demonstrate AR and ARFIMA noise models are “completely different.” If you wish to offer caveats to weaken or clarify your intention, you may, but you must provide clear documentation or evidence to support whatever position you choose to adopt. You may, of course, retract the statement instead.

    I believe this is a simple requirement. If you can’t meet it, you will be placed in moderation.

  49. Carrick:

    By the way, other than the short discussion following it (which I’ve included), there is no other mention made of this figure. It appears to be an illustration of proxy series with large positive hockey stick indices. There’s absolutely no suggestion that this is a representative sample of proxies (or red noise series); indeed, it appears to me these were the result of a sorting of hockey sticks based on their hockey stick index.

    I’ve made a similar point in the past. The Wegman Report never claimed a non-representative sample was representative. It simply failed to highlight the non-representativeness of its sample. I think that’s still inappropriate, but it is not the cardinal sin some people try to make it out to be.

  50. Brandon — thanks. I should have vetted that text too. In looking at Figure 5.9 I see some issues. To prove to Kevin your relative neutrality, perhaps you should post your observations. ; – ) [spaces to prevent icky emoticonizations]

  51. Carrick, the biggest problem I have with that figure is just tracking it down. It claims to be from a paper, but in reality, it is buried in that paper’s SI, or something like that. I don’t remember exactly. I tracked it down once, and I don’t care to do it again.

    The next biggest problem I have is it ends well before the modern times. If I remember right, you have graphs in it ending in ~1980. You can’t compare modern temperatures to past temperatures by comparing past temperatures to temperatures even further in the past. That, of course, brings up the problem of spatial/temporal resolution.

    There are at least several other problems I’d bring up in Kevin O’Neill’s position, but I don’t have the interest to delve into the matter at the moment. What the Wegman Report should have done is plotted the same graph then added a note like, “Because of varying uncertainty levels and spatial/temporal resolutions, no direct comparison can be made with these reconstructions. The shapes of the reconstructions shouldn’t be taken as anything more than a general idea of results. The primary point of interest, instead, should be the number of identical proxies used in each reconstruction as indicated by the numbers running along them.”

    By the way, I’d thank you for adding those spaces to avoid that emoticon, but I believe the one which creeps me out doesn’t have a hyphen in it. I’m not sure if the one with a hyphen in it even has an emoticon. I may have to check.

    Edit: I tested it, and with or without the hyphen, you get the same emoticon. It’s not the one that creeps me out though. It’s a much tamer one, one you can see here 😉

    The creepy one is : P without the space. I’m still seriously considering banning its use.

  52. Brandon,
    “You did, however, advance the argument. I’ve quoted you doing exactly that.”

    No you haven’t. The first quote was my response to your nutty demand that in graphing the unselected PC’s, I should turn half of them upside down in accordance with some unspecified rule. I said I wouldn’t. That’s not me stating my argument about why Wegman was wrong.

    The second is in the commentary to this very post, where you’ve already made your accusation and declared me to be dishonest (your favorite word). And I’m quite explicitly describing an argument that you say Kevin made. I said it’s a reasonable argument. That doesn’t make it my argument.

    Let me quote you from back there
    “Do you think people will just believe I lied about when I said I can provide quotes showing you made this argument? “
    You’re raising the stakes. I think you should provide those quotes.

  53. Carrick now: “I’ve never read the report (other than in vetting your claims).”

    Carrick 2011: “Mashey’s report is a jumbled mess, much worse than the Wegman report.”

    Carrick 2011: ”I agree with Jeff ID that the substance of Wegman’s report, as far as it went, was factual…. That said, I don’t think it was very original, creative and yes even very sloppy.”

    Carrick 2010: “I think we can all agree this is cut from a different cloth than the type of sloppy, inadvertent plagiarism found in Wegman’s report.”

    You sure hold a lot of opinions on the Wegman Report considering you’ve never read it except to vet my claims. Some might consider this a sign of bad faith. In fact, I consider it a sign of bad faith.

  54. Nick Stokes, you said you wouldn’t “perpetuate this subterfuge” in reference to plotting the graphs only with a positive orientation. If your plotting graphs only with a positive orientation would be perpetuating subterfuge, plotting graphs only with a positive orientation must be subterfuge. As such, you labeled what the Wegman Report did subterfuge. Subterfuge is wrong, ergo, you said the Wegman Report plotting graphs only with a positive orientation was wrong.

    I’ve already laid out this logic. If there is something wrong with it, you should say what it is. Until you do, no matter how many times you say I’m wrong, nobody will believe you.

    It’s simple. You called it subterfuge. Subterfuge is wrong. You called it wrong.

    How am I wrong?

  55. Brandon – I’ve already demonstrated it. The noise model quoted has a persistence of 1.5 years. The one used is ~19 years.

    Note – neither you (nor Carrick) can answer how all 12 upsloping proxies didn’t clue Wegman in to problems. Now what are the odds of that?

    Like Carrick, you are not arguing in good faith.

  56. Kevin O’Neill, the difference you cite isn’t even a meaningful description of the noise models, but leaving that aside, it does nothing to support your claim the two models are completely different. I’ll use your terminology for simplicity.

    You can create an ARFIMA with the same persistence length as an AR(.2) model. You can create an AR model with the same persistence as an ARFIMA(.2) model. Given both noise models can produce the same persistence length, a difference in persistence length cannot indicate the two are completely different.
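
    For anyone who wants to check the persistence numbers being thrown around, here is a minimal sketch. It assumes the “average auto-correlation period” means the usual integrated autocorrelation time of an AR(1) process, (1 + rho)/(1 - rho), which is what reproduces the 1.5 and 19 year figures:

        import numpy as np

        def ar1_timescale(rho):
            # Integrated autocorrelation time of an AR(1) process:
            # 1 + 2 * sum_{k>=1} rho**k = (1 + rho) / (1 - rho)
            return (1 + rho) / (1 - rho)

        print(ar1_timescale(0.2))   # 1.5, the AR1(.2) case quoted in the report
        print(ar1_timescale(0.9))   # 19.0, the claimed ARFIMA-equivalent persistence

        # Empirical check: simulate a long AR(1) series and sum its sample autocorrelations
        rng = np.random.default_rng(0)
        rho, n = 0.9, 200_000
        x = np.zeros(n)
        eps = rng.standard_normal(n)
        for t in range(1, n):
            x[t] = rho * x[t - 1] + eps[t]
        acf = [np.corrcoef(x[:-k], x[k:])[0, 1] for k in range(1, 200)]
        print(1 + 2 * sum(acf))     # should land near 19

    Either way, persistence is a knob you can turn in both the AR and ARFIMA families, so a difference in where the knob was set cannot make the two families “completely different.”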

    As such, I must reiterate my moderation statement to you. You claimed ARFIMA and AR models are completely different. Justify that claim or correct it.

  57. Brandon, wrong?
    ” As such, you labeled what the Wegman Report did subterfuge. Subterfuge is wrong, ergo, you said the Wegman Report plotting graphs only with a positive orientation was wrong.”

    There’s no ergo there. The subterfuge is Wegman’s selection process. It has the incidental effect of returning upright HS profiles, but the orientation is not why I am arguing that it is wrong. My argument is set out in the blog post, and as you conceded, the argument there is not based on orientation at all.

  58. Kevin, you’re simply being absurd now.

    You know full well one doesn’t have to perform a detailed cover-to-cover reading of a document before one can make a judgment.

    The only case where that would be true is when there was nothing further written about it, which is hardly the case here. Obviously, like you, most of my knowledge is from secondary sources. There’s nothing wrong with that unless one is trying to argue, as you appear to be, that you have a detailed knowledge of the document.

    Let’s look at these detailed enumerated points of yours again. How many of these are original to you and how many are from DeepClimate’s website?

    Inquiring minds want to know.

    But I think the correct answer is “very little”

    Inquiring minds want to know also “how many of those enumerated points did you personally vet”?

    I think the correct answer is “none”. You quite obviously credulously accepted every criticism as true because it fits your narrative.

    neither you (nor Carrick) can answer how all 12 upsloping proxies didn’t clue Wegman in to problems

    I’ve answered this already. Buy a clue.

  59. Nick Stokes, do you expect anyone to believe what you’re saying? Here’s the full quote:

    “The similarity in shapes is obvious”. Only if you select so they are always the same way up. Otherwise you’d have to explain why PC1 is not the same as the recon. And maybe they’d even ask – well, what does the recalculated recon (with centering, the real equivalent of MBH recon) actually look like? A very obvious qn, which Wegman avoided. The answer, per Wahl and Ammann, is, not very different at all.

    I see no reason why I should butcher the actual PC1 calcs to perpetuate this subterfuge.

    You specifically said the similarity the Wegman Report highlighted is only obvious if one shows only positively oriented graphs. You then said you wouldn’t “butcher the actual PC1 calcs to perpetuate this subterfuge.” The only aspect of the PC1 calculations you were discussing was orienting all the graphs the same way.

    Exactly what part of the calculations do you expect people to believe you were talking about?

  60. Nick, can you explain what is wrong with a figure that is showing 12 hockey stick shapes? The answer is there’s nothing wrong with it.

    I’ve concluded there’s nothing really wrong with the figure, but it doesn’t show very much. (McIntyre’s GRL paper has some nice histograms that are more informative.)

    I am assuming, like Kevin O’Neill, you’ve never actually looked at the primary document you’ve criticized.

    I do need to add a bit to what I said earlier: There’s nothing wrong with making judgments using secondary documents, as long as the author of them is reliable.

  61. Carrick, unless I’m mistaken, the Wegman Report’s Figure 4.2 shows the same thing you like seeing in the GRL paper. The two are mildly different due to being from different runs (repudiating the notion Wegman didn’t run the code for himself, if anyone had that thought), but they show the same thing.

  62. Brandon,
    “You specifically said the similarity the Wegman Report highlighted is only obvious if one shows only positively oriented graphs.”

    Yes, after you’ve made the selection for HS shape, as Wegman had. Orientation is then a further requirement. That’s an observation, not an argument. The context is my initial para:

    “No, you’ve criticized me for presenting randomly generated PC1 shapes as they are, rather than reorienting them to match Wegman’s illegitimate selection. But the question is, why should I reorient them in that artificial way.”

    I’m not saying that orientation is all that is wrong with Wegman’s selection. I’m saying I see no reason to do it.

  63. Nick Stokes, you did not just say:

    I’m not saying that orientation is all that is wrong with Wegman’s selection. I’m saying I see no reason to do it.

    You said this was necessary to mislead people (subterfuge). Misleading people is wrong. If something is necessary in order to mislead people, it is necessary in order to do something which is wrong. Ergo, it is wrong.

    You are currently in a position of claiming you said it was necessary in order to do something which was wrong, but you didn’t say it was wrong. Do you seriously expect anyone to accept that?

    Should I say you’re a dishonest idiot now?

  64. Brandon Shollenberger
    “Should I say you’re a dishonest idiot now?”

    I’m sure you will. You don’t have much else to say. You have no quotes of me making the argument you claim. You just make up stuff.

    “You said this was necessary to mislead people (subterfuge).”

    Where? I said the selection was a subterfuge. I simply made the observation:

    “The similarity in shapes is obvious”. Only if you select so they are always the same way up.

    Is that untrue? Are you not in fact saying the same? Does that mean you are making the same argument you attribute to me?

  65. Carrick – Your disingenuous “independent verification” comment probably set me off. I deal every day in verification, replication, calibration, performance tests, repeatability studies, and uncertainties. “Independent replications” and “independently verified” is a distinction without a difference. Perhaps you get off on semantic games; I don’t.

    Plus, I remember my only other substantive interaction with you – where you spewed a list of a dozen papers and *not one* of them supported your position.

    It appeared to me that you just quote mined a bunch of titles and abstracts because you’d surely never read them.

    I don’t like dealing with bad faith actors.

    The CDSA paper was retracted because of plagiarism. Wegman was the lead author. He is responsible. Wegman was reprimanded for being the lead author on a paper that had large portions of it plagiarized. Happy? These are the semantic games that fill your world.

    Many of those same plagiarized sections are in the Wegman Report.

    I’m still waiting on an answer to, What are the odds of that?

  66. Kevin, I’m still waiting for an answer.

    Have you read Wegman’s report?

    A simple yes or no will do.

    If we want to talk about what is bad faith after this, that’s fine with me. We can go back and discuss why you were quote mining me to try and prove bad faith where it was obvious your entire argument was fallacious.

    You can explain why that wasn’t an example of arguing in bad faith on your part.

    We can then go back and look at that list of papers of mine and see how accurate your memory is.

    Put another way, don’t make claims if you don’t want them to be tested. Not everybody is as intellectually lazy as you are.

  67. Nick –

    I think you’ve come closer to losing your cool than I’ve seen before.

    Keep in mind who you’re dealing with.

    This is an example of the kind of argument that Brandon makes:

    Did you know the universe began with a huge explosion? If not, you’re an idiot. So says Dan M. Kahan.

    (Kahan never said what Brandon said he said)

    Expect Brandon to fabricate arguments, assign them to others, insist that’s what they said, and double-down when challenged.

    It’s actually quite funny if you don’t get caught up in it.

  68. Kevin:

    The CDSA paper was retracted because of plagiarism. Wegman was the lead author. He is responsible. Wegman was reprimanded for being the lead author on a paper that had large portions of it plagiarized. Happy? These are the semantic games that fill your world.

    We’re back to that again?

    You made a false claim. You said Wegman was reprimanded for plagiarism.

    He was not.

    Pointing out that he was not isn’t playing semantic games, you were simply factually wrong.

    It’s that simple.

    But rather than admit you were wrong, apparently you have such a low self-image that you can’t admit to error, so you have to attack me personally.

    Because, climate change.

  69. Brandon – An AR(1) coefficient of 0.9 explains 81% of the variance. A coefficient of 0.2 explains 4% of the variance. Again, you probably already knew this, but must play games.

    No eminent statistician is going to mix these up.

    No statistician – eminent or otherwise – is going to flip a coin 12 times, have them all come up heads, and not ask to inspect the coin.

    You and Carrick can play all the word games you want – it doesn’t change the fact that Wegman’s Fig. 4.4 underscores the fact he did not do due diligence.

  70. Carrick – it wasn’t an attack – it was just a recollection. In fact, in that recollection *you* were the one who made the personal attack. So don’t play holier-than-thou with me.

  71. Kevin:

    You and Carrick can play all the word games you want – it doesn’t change the fact that Wegman’s Fig. 4.4 underscores the fact he did not do due diligence.

    Look, even McIntyre agrees he failed to do due diligence. I agree he failed to do due diligence. I imagine Brandon does too.

    So this is turning into an argument over minutiae: I think you are misinterpreting Fig. 4.4 based on something you must have read from a third party. I think there is something about the context of the figure you and Nick are missing; as in, I believe it claims less than you think it claims.

    The point is absurdly simple so I forced myself to go look at it again and make sure I wasn’t missing something.

    Your criticism would be as if I were to go into a forest and take pictures of birds, and have you wondering why I don’t have a representative sample of all of the animals in the forest. In my case the answer is because I was photographing birds. Wegman has a figure showing hockey sticks. Because NOT all red noise is hockey sticks, it isn’t representative of all red noise. I kind of doubt anybody with experience with red noise could really get confused on that.

    Wegman says “The MBH98 algorithm found ‘hockey stick’ trend in each of the independent replications.” He didn’t say these were representative of anything other than being hockey stick shaped. But it’s not an important figure. It’s not referenced anywhere in the literature nor anywhere else in the report, so minutiae.

    But that doesn’t mean the report was well done.

    I admit I have not read it cover to cover. I am sure you have not. Based on some misstatements, I’m pretty sure Nick hasn’t either. I can’t speak for Brandon.

    But that’s not an indictment. Secondary sources are fine to form judgements from. Like with reviews of restaurants or movies. It saves us from wasting our time on stupid things.

    I’d be really surprised given Wegman’s obvious novice status that there weren’t major gaffes.

    Anyway, there must be better examples of lack of due diligence than Figure 4.4.

    In Figure 5.9, the series don’t extend to 2005. Most of them stop much sooner. Concluding that “Both Esper et al. (2002) and Moberg et al. (2005) indicate that current global temperatures are not warmer that the medieval warm period” is dead wrong even viewed as the interpretation of data given in Figure 5.9. (But I’d still like to hear Brandon’s comments.)

    Wegman failed to supervise his co-author and somehow didn’t notice obvious plagiarisms.

    In short, the level of scholarship here is appalling.

    There is simply no need to exaggerate the extent of the problems with this report.

    I was objecting to what I see as mischaracterizations and exaggerations in your list of “indictments”, but not to the bottom line.

  72. Nick Stokes:

    I’m sure you will. You don’t have much else to say. You have no quotes of me making the argument you claim. You just make up stuff.

    You can say things like this, but it isn’t true. I’ve consistently offered a simple explanation for the point I’ve made. Anyone can see it. The only question is whether or not they believe your responses to it, including the multitude which do not actually respond to it, such as:

    Where? I said the selection was a subterfuge.

    This is not true. I explained why it was not true. You chose not to discuss what I said, instead cherry-picking which portion of my comment to quote in order to avoid acknowledging my explanation which shows why this statement is not true. It’s cheeky to claim I “just make up stuff” while flagrantly ignoring things I say when they’re inconvenient. I am sure it is easy to believe I “just make up stuff” if you choose to just pretend I don’t say the things which justify my claims. If you actually read my comments though, you’ll find it impossible to hold that position.

  73. Carrick,
    “I think there is something about the context of the figure you and Nick are missing; as in, I believe it claims less than you think it claims.”

    The fig shows 12 profiles, each with a hockey stick. It says:
    “One of the most compelling illustrations that McIntyre and McKitrick have produced is created by feeding red noise [AR(1) with parameter = 0.2] into the MBH algorithm. The AR(1) process is a stationary process meaning that it should not exhibit any long-term trend. The MBH98 algorithm found ‘hockey stick’ trend in each of the independent replications.”

    Do you see “each of the independent replications” as compatible with “in each of the best 1% we could find”?

  74. Kevin O’Neill, I informed you of a simple moderation decision, that you must explain how AR and ARFIMA models are completely different as you claimed, or otherwise correct your claim. Your first attempt was obviously wrong and you’ve since abandoned it. Your current response:

    Brandon – An AR(1) coefficient of 0.9 explains 81% of the variance. A coefficient of 0.2 explains 4% of the variance. Again, you probably already knew this, but must play games.

    Doesn’t even attempt to do what you were required to do. Stating the amount of variance an AR model explains does nothing to explain how AR and ARFIMA models are completely different. Given you are clearly not even trying to abide by my site’s rules, I am now placing you in moderation. As such, I direct you here.

    For people who haven’t heard me describe how my moderation works, I’ll quote the text from that post:

    I don’t like the idea of banning posters or deleting comments. I am loath to do either. As such, I am going to try a new moderation option. From here on in, users who cannot follow basic guidelines will be forbidden from writing in any topic save ones I explicitly mark as bins for them.

    If they wish to participate in other threads, they are allowed only to make comments with a single link to a comment they’ve made in such a moderation bin. This will allow them to still comment while not forcing readers to see what they say.

    Side note, this is the first time I’ve actually placed anyone in moderation here.

  75. I’m going to elaborate on the issue with Nick Stokes. He claims he never argued it was wrong for Edward Wegman to show only graphs which were oriented one way. I quoted a comment in which he said things like:

    No, you’ve criticized me for presenting randomly generated PC1 shapes as they are, rather than reorienting them to match Wegman’s illegitimate selection. But the question is, why should I reorient them in that artificial way.

    But despite saying “the question is” the orientation issue, he insists he was only talking about the selection issue. And though he then went on to say what the effect of the orientation issue is:

    “The similarity in shapes is obvious”. Only if you select so they are always the same way up. Otherwise you’d have to explain why PC1 is not the same as the recon.

    He insists he was still only claiming the selection issue was wrong, not the orientation issue. And even after he specifically referred to the orientation issue resulting in subterfuge:

    I see no reason why I should butcher the actual PC1 calcs to perpetuate this subterfuge.

    He insists he was still only claiming the selection issue was wrong.

    The reality is he didn’t even talk about the selection issue in that comment. The only issue he talked about, the issue he said “the question is” about, was the orientation issue. Despite labeling the orientation issue as “the question,” and despite specifically discussing it multiple times, he wants everyone to believe he only called the selection issue wrong.

    That is, the selection issue he didn’t even discuss.

  76. “The only issue he talked about, the issue he said “the question is” about, was the orientation issue.”
    There’s a simple reason for that. You’re demanding that I intervene and reorient the unselected PCs that I plotted. I’m explaining why I won’t.

    You still haven’t quoted anywhere that I make the argument you claim. All you have is places where you’re hollering at me about orientation, and so I say something about orientation.

  77. Nick Stokes:

    There’s a simple reason for that. You’re demanding that I intervene and reorient the unselected PCs that I plotted. I’m explaining why I won’t.

    You still haven’t quoted anywhere that I make the argument you claim. All you have is places where you’re hollering at me about orientation, and so I say something about orientation.

    You openly admit you said something about orientation. You don’t deny you didn’t say anything about selection. Given you said the thing you were talking about was wrong (subterfuge), and given we’ve established you were talking about orientation, it seems impossible to reconcile your position.

    You admit you talked about orientation, and you admit you said the topic you were discussing caused subterfuge. The only possible conclusion is you said the orientation caused subterfuge. Subterfuge is wrong, ergo, you said the orientation issue was wrong. That you said it in response to someone else bringing the issue up is completely irrelevant.

    “Side note, this is the first time I’ve actually placed anyone in moderation here.”
    OK, I’m outta here.

    So… you’re leaving because I enforced a perfectly reasonable moderation policy on someone else who repeatedly refused to stay on topic or address a simple, factual point, a policy which allows the user to still participate? I’d love to hear how you’d explain your reason for leaving to others.

    I suspect it will involve the victim card in one way or another. Of course, I think a more fair description would involve the “running away” card.

  78. I don’t get the issue with the caption. It starts with a reference to M&M. So that it is copying one of their figures is unremarkable.

  79. I was joking when I suggested that Nick Stokes had printed an average sample, and not from the top 100. Looking at that distribution from M&M figure 2, it appears that selecting the top 100 is not just a visual effect. Is it right that even a random sample would show hockey sticks?

  80. Kevin, the sign isn’t significant. Even Nick agrees it isn’t.

    Wegman’s obviously selected 12 hockey stick shapes. They weren’t picked at random.

    The fact Wegman selected 12 that were pointed up is irrelevant. Had he picked ones that pointed down, the reconstruction would have oriented them right-side up, so orientation is not relevant.

    The fact Wegman selected 12 hockey stick shapes, then didn’t explain how he selected 12 series that were hockey sticks is relevant.

    Wegman should have explained in more detail how he generated the figure. Without that, it’s difficult to interpret the figure as anything more than “12 hockey stick shapes with large positive hockey stick index values”.

    The hockey stick index for each series would be useful to look at and compare to figure 4.2. I wouldn’t be terribly surprised if these all came from the far right end of that distribution. The selection may have been “sort by hockey stick number then choose the 12 on the right hand side”.

    The issue about AR(1) versus archived is interesting, but you need to substantiate this. Nick didn’t seem to agree with you. I haven’t looked. Other sources that I’ve found have other objections, so you might be mistaken.

    A link would be useful (but if it’s not at your fingertips, please don’t feel obligated to produce it).

    You and Nick are showing an awful lot of energy about one throwaway figure. There must be some backstory that I’m not aware of that gives an otherwise forgettable figure such prominence.

  81. Okay, I understand the part about the proxies now (I found and read DeepClimate’s interesting post).

    These are indeed 12 series selected at random from the top 1% of the archived series.

    A line to that effect needed to have been included in the figure.

    My assumption is Wegman wasn’t aware of that, which would be evidence of a lack of due diligence if true. Otherwise, he’s being sloppy or even deceptive.

    The pseudo-proxies are ARFIMA, not AR(1) with a coefficient of 0.2. This would be equivalent to AR(1) with a coefficient of about 0.9. Wegman clearly didn’t understand this, which is evidence of a lack of due diligence and makes it more likely he didn’t know that these are “12 series selected at random from the top 1% of the archived series”.

    The question of ARFIMA versus AR(1) seems to me a bit of a red herring to harp on, though… the question should be “what is the persistence seen in real proxies and how do you properly model it?” Obviously real proxies contain climate signals, otherwise they wouldn’t be proxies.

    So you need proxies that exhibit “real world” persistence but are not good temperature proxies (I’m not sure there are many “good temperature proxies,” so this may be an easy fix). If you can model the proxy series with ARIMA as McIntyre has done, that may be sufficient.

  82. I should mention that Wegman is clearly using McIntyre’s archive code.

    It is a deceptive act, rather than a lack of due diligence, to say you’ve replicated something when all you’ve done is rerun a program.

    You can add that to your list of enumerated points if you want.

  83. The asymmetrical effort in trying to discredit Wegman (or McIntyre) without any commensurate effort to objectively critique MBH 98/99 is still noteworthy.

    That’s not due diligence either. Nor is it really an honest thing to do.

  84. I deleted a comment from Kevin O’Neill because he chose to simply disregard what I told him about him being in moderation.

    I gave him simple instructions on how he can still participate despite being in moderation. If he refuses to follow them, I feel no compunction about deleting his comment.

    Edit: On second thought, given this is the first time a comment was supposed to be made in one of the moderation bins, I’ll be generous and show O’Neill how to do it. I’ve copied his comment to the moderation bin. Here’s a link:

    https://hiizuru.wordpress.com/2014/02/12/moderation-bin/#comment-4358

  85. MikeN:

    I don’t get the issue with the caption. It starts with a reference to M&M. So that it is copying one of their figures is unremarkable.

    Just to be clear, since Anders accused McIntyre and McKitrick of dishonesty over this figure: they didn’t actually publish the figure in their paper. They wrote the code, but they only used it to pull out one proxy for demonstration purposes (and they clearly labeled it as non-representative).

    I was joking when I suggested that Nick Stokes had printed an average sample, and not from the top 100. Looking at that distribution from M&M figure 2, it appears that selecting the top 100 is not just a visual effect. Is it right that even a random sample would show hockey sticks?

    A random sample using MBH’s methodology will definitely show hockey sticks. A user commenting on Nick Stokes’s blog post discussed this and demonstrated it for multiple levels of persistence. It’s an obvious mathematical truth though. The level of autocorrelation just determines how strong the hockey sticks are.

    You can also see what a random sample would have looked like in the third figure of this post. That’s a random sample as given by Nick Stokes. I just oriented them all the same way so they could be visually compared. The bias in MBH’s methodology is evident.
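
    For anyone who wants to see the effect for themselves rather than take my word for it, here is a minimal sketch. The sizes and the AR(1) coefficient are my own toy choices rather than MBH’s actual network, and the hockey stick index here is just the calibration-period mean minus the pre-calibration mean in standard deviation units:

        import numpy as np

        rng = np.random.default_rng(0)
        n_years, n_proxies, n_cal, rho = 600, 70, 80, 0.2
        cal = slice(n_years - n_cal, None)

        def hockey_stick_index(series):
            # Calibration-period mean minus pre-calibration mean, in pre-calibration std units
            pre = series[:n_years - n_cal]
            return (series[cal].mean() - pre.mean()) / pre.std()

        def pc1(matrix):
            # First principal component (scores) of an already-centered data matrix
            _, _, vt = np.linalg.svd(matrix, full_matrices=False)
            return matrix @ vt[0]

        hsi_short, hsi_full = [], []
        for _ in range(100):
            noise = rng.standard_normal((n_years, n_proxies))
            proxies = np.zeros_like(noise)
            for t in range(1, n_years):
                proxies[t] = rho * proxies[t - 1] + noise[t]
            # MBH-style short centering (calibration-period mean) vs. conventional centering
            hsi_short.append(abs(hockey_stick_index(pc1(proxies - proxies[cal].mean(axis=0)))))
            hsi_full.append(abs(hockey_stick_index(pc1(proxies - proxies.mean(axis=0)))))

        print("mean |HSI|, short-centered PC1:", round(float(np.mean(hsi_short)), 2))
        print("mean |HSI|, conventional PC1:  ", round(float(np.mean(hsi_full)), 2))

    The short-centered number should come out clearly larger; how much larger depends on how much persistence you put into the noise.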

    Of course, some people insist it is wrong to show all the PCs with the same orientation, but since the orientation of a PC doesn’t matter, they’re wrong.

    Carrick:

    I should mention that Wegman is clearly using McIntyre’s archive code.

    It is a deceptive act, rather than a lack of due diligence, to say you’ve replicated something when all you’ve done is rerun a program.

    You can add that to your list of enumerated points if you want.

    Except you can’t, as Edward Wegman clearly modified the code as is seen by other figures in the report. Additionally, Wegman created a mathematical demonstration of the bias of MBH’s approach. While that may not be replication, it is certainly confirmation of McIntyre and McKitrick’s work which was uniquely done.

    The pseudo-proxies are ARFIMA, not AR(1) with a coefficient of 0.2. This would be equivalent to AR(1) with a coefficient of about 0.9. Wegman clearly didn’t understand this, which is evidence of a lack of due diligence and makes it more likely he didn’t know that these are “12 series selected at random from the top 1% of the archived series”.

    I see no reason to believe he didn’t understand this. The mistaken text could have easily been nothing but a typo. It is not hard to imagine someone writing the text could have written “AR” instead of “ARFIMA.” This is especially true if Wegman had one of his students write the text.

    I don’t see how one can interpret a typo as showing a person doesn’t understand something. I can just imagine if we applied that standard to blog posts.

  86. Brandon:

    Except you can’t, as Edward Wegman clearly modified the code as is seen by other figures in the report. Additionally, Wegman created a mathematical demonstration of the bias of MBH’s approach. While that may not be replication, it is certainly confirmation of McIntyre and McKitrick’s work which was uniquely done.

    Okay that’s a valid point.

    But as far as we can tell, because the code isn’t available, right?

    Anyway, I think it’s stretching it to call running code that was probably just slightly modified from M&M 2005 an independent replication. Especially if the changes were just to produce different figures.

    I’ll accept Wegman made other contributions besides rerunning the code. This isn’t technically a replication, is it? Maybe it’s “validation.”

    I see no reason to believe he didn’t understand this. The mistaken text could have easily been nothing but a typo. It is not hard to imagine someone writing the text could have written “AR” instead of “ARFIMA.” This is especially true if Wegman had one of his students write the text.

    Unless he’s issued a corrigendum, the standard is to accept he meant what he wrote: “AR”.

    That’s part of why the requirement of issuing corrections to substantive errors is a vital part of the scientific process. We need to be able to take a person at his word, and accept that what he wrote is an accurate picture of what he meant to say.

    Do you agree he should have added the sentence “12 series selected at random from the top 1% of the archived series” to his description of Figure 4.4?

    It’s interesting what a blogostorm that uninteresting figure has generated. It was obvious to me that Wegman had done some sort of pre-selection, so I am still shaking my head a bit on the amount of energy that has been spent on that one minor point.

    All the while there’s still MBH, which these guys are scrambling like ants to protect, smelling like ripened compost.

  87. I’ve been reading David Ritson’s comments, what DeepClimate says about it and McIntyre’s reply to Ritson’s original comment on his 2005 paper.

    It’s easy to see how people without a technical background could get really confused.

    I thought this to be an interesting claim on McIntyre’s part:

    Since the MBH98 algorithm subsequently re-scales the tree ring PC1 and applies the result in a regression step, the original scale is irrelevant anyway since remaining differences in scale merely change the regression coefficient.

    There’s some discussion about PC scaling factors in DeepClimate. It appears he’s gotten many of his talking points from Ritson, without looking to see if those talking points were valid.

  88. I think that quote of McIntyre is of the obviously true variety.

    Here’s another one:

    While we strongly criticized MBH98 short-centering, we emphasized the inter-relationship of the flawed methodology and the flawed bristlecone proxies.

    This is a point I’ve made repeatedly too. When you have multiple errors, you should not assume that they do not interact. It is invalid to add the effects of the corrections together linearly.

    This is also pretty good:

    In McIntyre and McKitrick [2005c], we show how even one or two hockey stick shaped bristlecone pine sites are sufficient to distort the decentered PCs from the North American network.

    Because of the skewed nature of the short-centered PC calculation, Mann’s algorithm is not robust against “hockey-stick-like” noise.

    Noise is “anything besides what you are trying to measure”. So it could be red noise-like variations. Or it could be the inclusion of a series that contains a climate signal, that just happens to not be temperature.

    Because the tree rings robustly respond to precipitation, soil moisture, solar exposure and soil nutrient level in addition to temperature, and because in general one doesn’t expect these quantities to always co-vary, it actually looks like the MBH algorithm was tailor-made to fixate on non-temperature climate signals and to produce an incoherent sum over proxies once you are outside of the instrumentation training period. That gives you a very flat handle.
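
    Here is a minimal sketch of that dominance effect, with synthetic series standing in for the bristlecones (nothing below is real proxy data, and the network size is made up):

        import numpy as np

        rng = np.random.default_rng(1)
        n_years, n_cal = 600, 80

        # 48 series with no common signal, plus 2 "bristlecone-like" series with a 20th-century blade
        background = rng.standard_normal((n_years, 48))
        blade = np.zeros(n_years)
        blade[-n_cal:] = 2.0
        sticks = blade[:, None] + 0.5 * rng.standard_normal((n_years, 2))
        network = np.hstack([background, sticks])

        # MBH-style short centering on the calibration period, then the PC1 loadings
        centered = network - network[-n_cal:].mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        loadings = np.abs(vt[0])
        print("mean |loading|, 48 background series:", loadings[:48].mean())
        print("mean |loading|, 2 blade series:      ", loadings[48:].mean())

    Swap the short centering for network.mean(axis=0) and you can see how much of that dominance comes from the centering choice rather than from the blades themselves.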

    Didn’t you say that Mann spells out in his book that he preferred the short-centered PCA method over the conventional one?

    This makes me wonder if he played around with the algorithm, and noticed that if he short-centered the PCA he got a flat handle.

    And therefore climate science.

  89. I’m going back to report writing, but I just wanted to comment that if you really want to simulate Mann’s algorithm, you need to use series that contain coherent signals that aren’t temperature related.

    It would be easy enough to do this by modeling, for each proxy site, e.g., precipitation and solar exposure as additional explanatory variables for tree ring growth (in addition to temperature). There are then various statistical methods you can use to produce Monte Carlo’d series over the reconstruction interval (I don’t prefer econometrics methods here).
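
    A minimal sketch of the kind of construction I have in mind, with every sensitivity made up purely for illustration:

        import numpy as np

        rng = np.random.default_rng(3)
        n_years, n_sites, n_cal = 600, 50, 80

        # Toy drivers: temperature flat then rising, plus a coherent non-temperature signal
        temp = np.zeros(n_years)
        temp[-n_cal:] = np.linspace(0.0, 1.0, n_cal)
        moisture = 0.05 * np.cumsum(rng.standard_normal(n_years))   # slow, shared across sites, unrelated to temperature

        # Each site responds to both drivers with its own (made-up) sensitivities, plus local noise
        beta_t = rng.uniform(0.0, 0.5, n_sites)
        beta_m = rng.uniform(0.5, 1.5, n_sites)
        pseudo = (temp[:, None] * beta_t
                  + moisture[:, None] * beta_m
                  + rng.standard_normal((n_years, n_sites)))

        # "pseudo" can then be fed through the short-centered PCA step to see what happens to the handle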

    If you can show that a flat handle is a robust feature of MBH’s algorithm, you’ve basically then shown that MBH’s algorithm is not robust against non-temperature climate signals.

    People could always argue that there might be a special case where he could have gotten the right answer, but of course there is plenty of evidence that the reconstruction is invalid (when compared to modern series) and beyond that “not robust is not robust”. It is not a good algorithm to use here, unless you like specious reconstructions.

  90. Carrick:

    But as far as we can tell, because the code isn’t available, right?

    I’ve never paid attention to what Edward Wegman has and has not released. I’d obviously prefer him release any material/code he used, but people don’t even care that Michael Mann repeatedly says all of his code for his original hockey stick was made available when it wasn’t. I’m not going to get worked up if Wegman abides by the same standards everyone else in the field abides by.

    Anyway, I think it’s stretching it to call running code that was probably just slightly modified from M&M 2005 an independent replication. Especially if the changes were just to produce different figures.

    I’ll accept Wegman made other contributions besides rerunning the code. This isn’t technically a replication, is it? Maybe it’s “validation.”

    I don’t agree that’s all the changes he made did, but that’s beside the point. What exactly counts as “replication”? If you work through someone’s code and see it does what they claim it does, have you replicated their work? What if you then build upon their code? What if you derive a mathematical proof of what the code demonstrates?

    One of the common responses from defenders of Michael Mann when it came to releasing code was replication doesn’t matter because it’s just duplicating a person’s work. Under that view, what Wegman did was an independent replication.* I don’t know that that view is right though.

    Unless he’s issued a corrigendum, the standard is to accept he meant what he wrote: “AR”.

    That’s part of why the requirement of issuing corrections to substantive errors is a vital part of the scientific process. We need to be able to take a person at his word, and accept that what he wrote is an accurate picture of what he meant to say.

    *snorts*

    Sorry. If science worked as it should, I’d agree with you. I just can’t imagine actually applying this standard to climate science. I could list examples showing half the papers discussed in the Wegman Report don’t apply it. I think it is a good standard, but it is definitely not the standard of the field.

    Do you agree he should have added the sentence “12 series selected at random from the top 1% of the archived series” to his description of Figure 4.4?

    I’d probably word it differently (as I recall, only 100 series were even archived), but yeah. I’ve said so a number of times. I’ve also said he should have included an explanation of why he did it. When McIntyre and McKitrick used a cherry-picked series, they specifically said it was non-representative and explained why they were showing it. That’s how it should be done.

    *According to Mann’s defenders, papers written by Mann’s co-authors, using 80% of the same data as him, count as “independent verification.” The word “independent” has long since lost all meaning in these discussions.

  91. Carrick:

    I’ve been reading David Ritson’s comments, what DeepClimate says about it and McIntyre’s reply to Ritson’s original comment on his 2005 paper.

    It’s easy to see how people without a technical background could get really confused.

    I definitely sympathize with people without a technical background who try to understand these subjects. I am such a person, but I’ve spent a lot of time following the hockey stick debate. I don’t think there’s a good way for most people to get “up to speed” on the various issues.

    That’s part of why I wrote my series of posts on Michael Mann. I think they are the simplest explanations I’ve ever seen of the issues they covered. Unfortunately, I didn’t cover everything I should have. For instance, I didn’t discuss how proxies were rescaled and weighted. Oh well.

    People like David Ritson have no excuse. I understood the subject better than he did when he criticized M&M, and I was 19. The extent of my formal training in math was AP Calculus. It’s disheartening to see people who ought to know far more than me say such stupid things.

    Because the tree rings robustly respond to precipitation, soil moisture, solar exposure and soil nutrient level in addition to temperature, and because in general one doesn’t expect these quantities to always co-vary, it actually looks like the MBH algorithm was tailor-made to fixate on non-temperature climate signals and to produce an incoherent sum over proxies once you are outside of the instrumentation training period. That gives you a very flat handle.

    There is strong reason to believe the bristlecone proxies, which were the source of the hockey stick PC, showed hockey sticks because they were recovering from damage. When the tree recovers from damage, there’s a huge growth spurt in the damaged portion. That causes tree rings far wider than seen anywhere else in the tree.

    Didn’t you say that Mann spells out in his book that he preferred the short-centered PCA method over the conventional one?

    I don’t recall him saying that. He defended the methodology he used, but he said in retrospect, he wouldn’t have used it because it created an avenue for people to attack him. I might be forgetting something though. It’s been a while since I read his book.

    This makes me wonder if he played around with the algorithm, and noticed that if he short-centered the PCA he got a flat handle.

    And therefore climate science.

    I doubt it. There is a lot of evidence to suggest Michael Mann doesn’t understand what he’s doing. Not only does it show up in his code which has tons of strange, incompetent problems, but it constantly shows up in his writing.

    I think the most plausible interpretation is Mann either miscoded or misunderstood PCA when writing his code. It got results he liked so he assumed it was right. That’s why he labeled it “conventional” PCA even though it obviously was not.

    There are too many examples where Mann said things that make him sound incompetent and unaware. If he had actually known what he was doing, I think he’d have done a better job of covering it up.

  92. Brandon:

    I’ve never paid attention to what Edward Wegman has and has not released. I’d obviously prefer him release any material/code he used, but people don’t even care that Michael Mann repeatedly says all of his code for his original hockey stick was made available when it wasn’t. I’m not going to get worked up if Wegman abides by the same standards everyone else in the field abides by.

    Wegman promised Barton that he would release his code. Discussed by Ritson here, and pertinent quote of Wegman:

    Our report called for disclosure of federally funded work. Material based on our report is being prepared for peer review journals at present. It is not clear to me that before the journal peer review process is complete that we have an academic obligation to disclose the details of our methods. Nonetheless, I assure you that as soon as we are functional again, I will create a website that fully discloses all supporting material related to our report to the extent possible.

    I admit I found this comment by Ritson funny enough to laugh out loud at:

    I resent your question as to which side I am on. I am a scientist not a debater and dishonesty is dishonesty whoever it may come from.

    A contributor to RealClimate who regularly has engaged in Team RC obstructionism “resenting” that somebody perceives him as having a side!?

    Heh! Indeed.

    I don’t agree that’s all the changes he made did, but that’s beside the point. What exactly counts as “replication”? If you work through someone’s code and see it does what they claim it does, have you replicated their work? What if you then build upon their code? What if you derive a mathematical proof of what the code demonstrates?

    At the least you update it to correct for errors.

    I think the point is, if you are trying to validate somebody’s work, and you base that validation on their code, it’s hard to judge the quality of the validation process without seeing the code. So in an odd way, in order to determine the degree of independence of Wegman to McIntyre, we needed to see what he had done differently.

    Not knowing almost makes whatever code-based validation analysis he did useless.

    (For really suspicious minds, we can establish that McIntyre really used this code to produce his results.)

    The point about reusing proxies is a good one. But in spite of that, they really couldn’t be viewed as valid replications. If they were, the series they found during the reconstruction period would agree.

    The fact they find the same trend in the instrumental period is virtually meaningless. Nobody would try and publish a reconstruction that didn’t agree with a significant subset of the instrumental data, so (if for no other reason) agreement with the instrumental data is the result of a screening process.

    Sorry. If science worked as it should, I’d agree with you. I just can’t imagine actually applying this standard to climate science. I could list examples showing half the papers discussed in the Wegman Report don’t apply it. I think it is a good standard, but it is definitely not the standard of the field.

    It’s definitely not the standard in the field, if the MBH corrigenda are any example. We need to have an objective standard here. But I understand your point.

    Requiring that people spell out in plain language what they mean, and produce clarifications when there are issues with interpretation, is not a harsh requirement.

  93. Carrick:

    Wegman promised Barton that he would release his code. Discussed by Ritson here, and pertinent quote of Wegman:

    I don’t get why he’d say that then not follow through with it. It’s pretty stupid. He should have been more like the Team and just refused to share data or code!

    A contributor to RealClimate who regularly has engaged in Team RC obstructionism “resenting” that somebody perceives him as having a side!?

    Aye. It’s even worse for people who know about his criticism of M&M’s work. I don’t see how anyone could consider him merely an honest scientist rather than a person fighting for one side.

    At the least you update it to correct for errors.

    Oh what a world it would be if this were true.

    I think the point is, if you are trying to validate somebody’s work, and you base that validation on their code, it’s hard to judge the quality of the validation process without seeing the code. So in an odd way, in order to determine the degree of independence of Wegman to McIntyre, we needed to see what he had done differently.

    There’s a huge difference between this stance and the stance which has been offered previously. A lot of the time, people say Edward Wegman didn’t validate/replicate/whatever M&M’s claims. What you’re saying is we can’t know (how well) he did it. I have little problem with the latter. It’s the former I don’t get.

    I don’t get how the same people can claim to know what Wegman did and did not do while complaining he didn’t release the material which would let us know what he did and did not do. They’ve already decided what the evidence would show even though their conclusion doesn’t make any sense given parts of the Wegman Report they just ignore.

    (Which is probably due, at least in part, to most of them having never read it.)

    It’s definitely not the standard in the field, if the MBH corrigenda are any example. We need to have an objective standard here. But I understand your point.

    Requiring that people spell out in plain language what they mean, and produce clarifications when there are issues with interpretation, is not a harsh requirement.

    Yup. It’s a standard I think everyone should be held to. I just can’t operate under it when I know it isn’t accepted by basically anyone in the field. If I have to work with deliberately misleading statements and blatant misrepresentations when trying to understand the papers in this field, I’m going to give some latitude to the Wegman Report.

    This is reminiscent of the Soon & Baliunas controversy. Those two were criticized for doing things the paleoclimate field happily accepts from people on the right “side.” Even the people criticizing S&B did the exact same things they condemned S&B for doing. There was a huge controversy, with people protesting, writing editorials and even resigning, all because S&B did the exact same thing everyone else had either been doing, or approving of, all along.

    I agree the Wegman Report deserves criticisms for many points. I just can’t get up in arms about it when I know the only reason people want me to is it came from the wrong “side.” I didn’t care much about the report when it came out, and I’m still quite apathetic about it.

  94. By the way, I’m curious if Kevin O’Neill and Nick Stokes are both going to stay away now. I think my moderation policy is perfectly reasonable, but both of them seem to take issue with it. I don’t get it.

  95. I don’t get it.

    Of course you don’t, Brandon. Their thinking “doesn’t make sense.” How could you get it, if it “doesn’t make sense?”

  96. Brandon –

    Just a warning – if you decide to read the following comment, then I will be forcing you to waste your time.

    I was just trying to agree with you. How could you “get it” if their behavior “doesn’t make sense”?

    Of course, then again, you might re-read what they wrote and consider that if it “doesn’t make sense” to you, you might be able to rethink it. Step outside your own paradigm for a minute, and try to consider it from the perspective of someone else. That way you might “get it.”

    Or you might want to ask for clarification (which may be too late now, as they might not be reading). That might also help you to “get it.”

    Just some unsolicited advice, Brandon. Take it for what it’s worth.

  97. Alright Joshua, I’m tired of tolerating your incessant spam. From here on, make comments relevant to the post you’re commenting on or don’t make any comments at all.

    Do so and you are welcome to post. Fail to do so and I’ll place you in moderation.

  98. Nick Stokes asks how we would react if Mann had done something like this, filtering out the top 12 of the top 100 out of 10000. I wonder how Nick Stokes would react to criticism of Mann that he imagines we would make. Surely he would argue it makes no difference, putting up the random selection he showed above, which is also full of hockey sticks.

    I can’t think of any examples where so much bluster is raised over absolutely nothing. I don’t see any evidence that I’ve withheld criticism from Wegman where there was obviously something wrong. Both Brandon and I made what I think is the legitimate criticism of this one (that it was not properly labeled).

    I’ve never seen a more obvious example of the herd mentality shown by Mann’s defenders than the outrage sparked over such a minor issue like this.

    Brandon, my final comment on Wegman: there need to be academic standards. Simply because some authors fail to meet those standards, and we have people on blogs actively enabling this errant behavior, doesn’t mean, of course, that the behavior is valid.

    When I see this behavior acted out by others, I think it’s incumbent on me to criticize it there too. I try to be a good spokesperson for what should be accepted norms, rather than just somebody looking for places to take pot-shots.

    So being able to criticize Wegman appropriately is about me, not about them, and certainly not about this pointless internet battle between true believers and true deniers.

  101. I agree Carrick. If nobody were criticizing the Wegman Report, I’d be much more inclined to. As it stands though, plenty of people are criticizing it. Pretty much nobody is defending it. If we can all agree about the problems of the report, why do we need to keep talking about it?

    I’ll respond to exaggerations of those criticisms when they come up, and if I ever see people say there are no legitimate criticisms of the report I’ll speak up, but otherwise, the subject should be laid to rest.

    Which is what should have happened with Michael Mann’s original hockey stick. If people had accepted its faults, nobody would be talking about it anymore. It’s only because people chose to defend the inexcusable that criticisms of it are still relevant.

  102. Brandon, I agree had the MBH paper been properly buried when it should have been, this issue would have been over a long time ago.

    As I see the problem, there are a group of people who like this statement “global mean temperatures are now higher than they had been in more than 1000 years.” And they see repudiation of a very bad study as repudiation of that statement.

    Of course there is much better evidence that the statement is true from newer papers, even if we accept that these still contain massive, easily correctable flaws. This is apparently not a field that accepts criticism very well, so progress is very slow.

    It’s apparently a bizarre notion to some of these people that you need not be skeptical of a result in order to criticize the method by which it was obtained.

  103. I agree with everything you said Carrick except that “there is much better evidence that the statement is true from newer papers.” I’d argue the evidence given by those newer papers is every bit as bad as the evidence given by MBH. I’ve tried to study each millennial hemispheric or global reconstruction as it was published. In doing so, I’ve concluded the same basic problems exist in all of them.

    Again, I guess one could argue they provide “better evidence” in that the problems with some of them are less obvious. That makes them better for a cause (as they’re harder to shoot down). That doesn’t make them better evidence in actuality though.

    I don’t have a problem believing modern temperatures (on a hemispheric scale) are higher than they’ve been in the last 1000 years. I just don’t believe we have the evidence to demonstrate such. And even if we did, none of the attempts at doing so have come remotely close to actually proving anything. The rampant cherry-picking alone is enough to discredit the attempts we’ve had thus far.

  104. I guess I’m a little less skeptical than you are about newer attempts, probably because my hunch is “they can’t all be that bad.” Perhaps they really are.

    Still, my perspective is that in order to truly understand research results, you must replicate them. So I’m open minded that I may not be skeptical enough here.

  105. I’m not sure how you can see both MBH and Mann 2008 be accepted, and even heavily promoted, within the field and think “they can’t all be that bad.” Maybe I’m just more cynical than you. When I first started following the hockey stick controversy, I fully expected the field to reject MBH once its problems became known. After ten years of it adamantly refusing to, sinking to every level of self-delusion imaginable to defend it, I’ve given up any faith I ever had in the field.

    By the way, I’ve long had an open offer to discuss any of the millennial temperature reconstructions people think might lend credence to Michael Mann’s work. It’s still open. If people want to examine more aspects of the field, I’d be happy to help.

    As a side note, this field is what led me to study science as a whole in more detail. Doing so has convinced me there may be plenty of good scientists doing legitimate work, but science as presented to the public is incredibly untrustworthy. The public face of science in society is no better than modern day witchcraft. I think anti-Vaxxers and the like are nutjobs, but I can’t say they’re worse than anyone else. Just look at CVS talking about how it will no longer sell tobacco products because it wants people to be healthy at the same time it has entire aisles of herbs and vitamin supplements which are completely useless to 99% of the people who buy them. Then take a look at science education programs which have people like Michio Kaku spout utter nonsense with the full weight of “science” behind them.

    I’d rather have witchcraft. At least then you can burn people at the stake when you find out their products are lies.

    I don’t put any weight in Mann 2008. Seeing how he behaved with his other work makes him an “unreliable witness” IMO. When I compared his results against the other curves, I was a bit surprised his EIV product agreed so well.

    The claim has been made if you remove all of the questionable reconstructions, you get the same curve with a slightly reduced effect during the mediaeval warming period. That claim needs to be vetted of course. Perhaps his work is best seen as a sensitivity test on how badly you can screw up and still get (more or less) the right answer.

    In noisy data with poorly understood models, it is probably easier to Replicate While Under the Influence of Stupid or RWUIS, than it is with more precise measurements with better understood models relating data and theory.

    Were I to start with any of these studies, it would be Loehle. It’s a simple concept and so it should be easy to replicate and improve on. Simple things done well can be very powerful. Including an empirically based model that connects regional scale temperature to global ones would help (as it is, regions where you have polar amplification are going to get weighted more heavily in his summation).

    Were I to replicate, I’d need to start with a survey of low-frequency proxies. Understanding the individual proxies likely involves reading the papers that accompany them, so that could be a lengthy process.

    There is an amusing level of black humor generated by watching the attacks on Loehle’s paper, when most of these could as easily be pointed towards other papers, a point Steve McIntyre made here.

    Totally agreed about supplemental nutrients in general, which I do think amounts to witchcraft. If a supplement is actually important, get it via a dietary source.

  107. A few clarifications/corrections:

    • This made no sense: “The claim has been made if you remove all of the questionable reconstructions […]”

    It should read:

    The claim has been made if you remove all of the questionable proxies, you get the same curve with a slightly reduced temperature during the mediaeval warming period.

    • When I say “right answer” I don’t mean an “accurate result”. I mean the result you should get starting with a set of proxies. Systematic biases will be present.

    • I should have added that fields where it is easy to RWUIS appear to attract more derps.

  108. Carrick:

    I don’t put any weight in Mann 2008. Seeing how he behaved with his other work makes him an “unreliable witness” IMO. When I compared his results against the other curves, I was a bit surprised his EIV product agreed so well.

    I’m not that surprised. There’s a tendency for answers to converge in a field like this. There are a variety of factors which cause it, but the most obvious is it is easier to get results published if they agree with people’s expectations.

    When we have documentation showing Michael Mann and others used the peer review process to try to manage what sort of results got published, I can’t take agreement between results which did get published as meaning much.

    Were I to start with any of these studies, it would be Loehle. It’s a simple concept and so it should be easy to replicate and improve on. Simple things done well can be very powerful. Including an empirically based model that connects regional scale temperature to global ones would help (as it is, regions where you have polar amplification are going to get weighted more heavily in his summation).

    My only problem with this choice is you picked pretty much the only paper for the “skeptic” side of the dispute. That doesn’t do much to advance the knowledge of the “consensus” position on temperature reconstructions. It’s a good choice for understanding the subject in general, but it won’t do much to explain why some people think we shouldn’t have faith in any of the results.

    Were I to replicate, I’d need to start with a survey of low-frequency proxies. Understanding the individual proxies likely involves reading the papers that accompany them, so that could be a lengthy process.

    Understanding individual proxies? What a shocking idea. I’m not sure anyone in the field has ever heard of such a thing. (insert emoticon here)

    There is an amusing level of black humor generated by watching the attacks on Loehle’s paper, when most of these could as easily be pointed towards other papers, a point Steve McIntyre made here.

    Yup. The same thing happened with the Soon and Baliunas paper. It seems any time someone creates a reconstruction these people don’t like, they immediately spot all the problems that exist in the field. It’s hard to understand how they manage that yet fail to figure out what they’re doing wrong.

    Totally agreed about supplemental nutrients in general, which I do think amounts to witchcraft. If a supplement is actually important, get it via a dietary source.

    I’m not opposed to supplements in their entirety. I have no problem believing there are people with vitamin (and other) deficiencies who can’t meet all their needs through dietary choices (or would have a difficult time doing so). I have no problem believing supplements could help them.

    But people who buy dietary supplements generally don’t have such deficiencies. They buy the products because they’ve been taught the products are good for everyone. Doctors and scientists have endorsed this idea, both implicitly and explicitly.

    As though that wasn’t bad enough, once people accepted supplements in general were good for everyone, it was easy to convince people of all sorts of other things about supplements. It’s crazy how many people genuinely believe science says things like, “You’re feeling tired? Take a B12 pill. It’ll give you energy!”

    Millions of people are being fleeced on what scientists/doctors have to know to be fraud, and practically none of them speak up. Some of them actually support it. And that’s just one case. I’ve seen the same pattern of behavior in many different cases. Michael Mann’s hockey stick, Stephan Lewandowsky’s psychological nonsense, Cook et al’s consensus paper. I’d say they’re pretty much standard fare for what the public sees of science.

    I’m not meaning to rant, but I find it so strange to hear people talk about the ideals of science when lay people are pretty much never presented any indication those standards are followed. Maybe they are in some, or even most, fields, but how is anyone supposed to know that?

  109. Brandon, Loehle’s paper is in decent agreement with Ljungqvist, Moberg and Mann 2008 EIV—as long as you remember that they aren’t on the same scale that is.

    As you know, different temperature regions have different amplification factors. Land has a larger trend than ocean, more polar land locations tend to have larger amplification than more equatorial ones. So even if there were scaling and offset biases (which there are), the overall magnitude of variability (and mean value of the series) is affected by the geographical distribution of proxy locations.

    If Mann were trying to replicate a result it would be MBH 99 and not Moberg, who he viscerally attacked when it first came out. So it was a bit of a surprise he validated their earlier work (in some weak sense). Ironically, the series that is most like Loehle 2007 … is Mann 2008 EIV.

    People who are intellectually honest (I would include Zeke in that category) will admit that Loehle agrees well with the other series—as long as you recognize that none of the series have the same scaling and offset factors.

    Tamino had a very poor effort trying to discredit Loehle, where he failed to recognize this issue. I doubt he tried very hard, because he was just looking to discredit Loehle, rather than arrive at an objective truth.

  110. Carrick:

    Brandon, Loehle’s paper is in decent agreement with Ljungqvist, Moberg and Mann 2008 EIV—as long as you remember that they aren’t on the same scale that is.

    Citing agreement between Loehle and Moberg is a bit silly. Moberg used 11 low frequency proxies, all but a couple of which were used by Loehle as well. I think Loehle used 9/11 of them, making up 9/18 of his proxies? It was something like that. Whatever the exact values, similarity when your data overlaps that much is practically meaningless.

    I point that out to make a point. One could argue there’s less reason to expect bias to affect the convergence of results given Loehle’s results. However, even if one feels that way, they’re still stuck with the lack of independence in results.

    If Mann were trying to replicate a result it would be MBH 99 and not Moberg, who he viscerally attacked when it first came out. So it was a bit of a surprise he validated their earlier work (in some weak sense). Ironically, the series that is most like Loehle 2007 … is Mann 2008 EIV.

    I don’t believe this is true. There is plenty of reason to believe Mann disliked people for one-upping him even if he agreed with their conclusions. MBH99 was written directly in response to other people publishing a reconstruction which extended further back in time than Mann’s first hockey stick (he’s said this himself).

    By 2005, it was abundantly obvious people wouldn’t accept the results of MBH as the final answer. The lack of variance in his reconstruction was not accepted within the community. Backing off to results more like Moberg’s was a compromise which allowed Mann to keep all his talking points while appeasing a lot of people. I don’t believe there is any evidence Mann was unhappy with Moberg’s results. If anything, his issue was Moberg published them first.

    (Well that, and Moberg didn’t do as much as he’d like in combining his reconstruction with the temperature record to create the talking points Mann likes.)

    People who are intellectually honest (I would include Zeke in that category) will admit that Loehle agrees well with the other series—as long as you recognize that none of the series have the same scaling and offset factors.

    I have no problem with this acknowledgment. My problem is I don’t think Loehle’s reconstruction is very informative. It was created with a tiny number of proxies whose quality, representativeness, and skill are not known. The result is a reconstruction we cannot possibly hope to compare to instrumental temperature records due to all sorts of issues (including resolution and distribution). I don’t think we can even hope to quantify the uncertainty levels of it.*

    When you break it all down, reconstructions basically agree there was a LIA and an MWP. What they don’t agree about is the timing or magnitude of either. It’s good we’ve moved past MBH’s practical denial of the MWP/LIA, but our current agreement doesn’t extend very far.

    As it stands, reconstructions basically just agree there’s a cubic function over the last ~1500 years. That’s nice, but when their inflection points can vary by 50% along both axes, we don’t have much information. And that’s without considering any issues of systematic biases.

    *We can, of course, hope to quantify the uncertainty given the sample set used. That just tells us little when we have no idea what uncertainties/biases may arise from the sample selection process.

  111. By the way, the issue of resolution reminds me of a point that’s always amused me. Temperature reconstructions are generally compared to general temperature indexes, not land-only temperature indexes. This strikes me as humorous because results are predominantly (sometimes even entirely) based upon land-only proxies. Authors could enhance the “strength” of their results by choosing different temperature indexes to use for comparison.

    But if they did, their results would often look unrealistic. People can accept humans have caused temperatures to reach unprecedented levels, but there’s a degree of “unprecedented” they won’t accept.

    It’s like how Cook et al were safe in finding a 97% consensus, but their results wouldn’t have worked if they found a 99.x% consensus. That’s why they needed to define “consensus” in such a strange way. Had they defined their “consensus” as simply excluding papers which reject the greenhouse theory, their result would have been too high. Had they defined their “consensus” as only including papers which say humans cause 50+% global warming, their result would have been too low. It was only by mixing the two (and creating a nonsensical definition for their “consensus”) that they were able to get the results they wanted. Doing so let them get high results which weren’t too high.

    (Based on the discussion we can see in their forum, John Cook knew the outcome of each definition and chose the one he chose knowing it would produce the desired results thanks to the fact he went through the abstracts prior to the ratings being performed.)

  112. Brandon:

    Citing agreement between Loehle and Moberg is a bit silly. Moberg used 11 low frequency proxy, all but a couple of which were used by Loehle as well. I think Loehle used 9/11 of them, making up 9/18 of his proxies? It was something like that. Whatever the exact values, similarlity when your data overlaps that much is practically meaningless.

    To be clear, I’m not arguing about the accuracy of the results or the independence of the proxies:

    Rather I am thinking about method verification (that is verifying the functional equivalency of methods) rather than results validation (which speaks to the accuracy of the reconstruction in this case).

    If you want to verify that different methods are equivalent, you run them on the same or similar data sets (I call this method verification). If you want to validate the results, you run the same method on different data sets.

    As you say, about half of Loehle’s proxies appear in Moberg’s low frequency reconstruction. So this isn’t technically a pure “methods verification” test. Interpretation becomes a problem if they don’t agree, because the difference could be either that the methods are different or the results aren’t robust across data sets.

    I’m being somewhat cautious here by saying their agreement should be viewed purely as a methods verification. However, I don’t know how much prescreening happens before a particular proxy series is stamped with the official “genuine temperature proxy” seal of approval. Even if the data sets are nominally different, I don’t know how much the data collector’s judgment, shaped by expecting an MWP and LIA followed by current warming, affected his decision making as to what constitutes a “true” temperature proxy.

    I think the important thing here is that these methods involve summing over a linear combination of the proxies. If you have a few proxies with large weights (one of the slams against Mann’s earlier work), it doesn’t matter if you have 10 proxies, 200 proxies or 10,000, if effectively, after applying the linear weighting, just a few proxies are driving the shape of the curve.

    What’s good about Loehle is he selected 18 proxies that had been independently calibrated and simply averaged them. The point though is he used equal weights.
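
    A minimal synthetic sketch of that point (illustrative numbers only, not any actual proxy network): with equal weights every series contributes the same, while a skewed weight vector lets a handful of proxies drive the curve no matter how many are nominally included.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_years, n_proxies = 1000, 18

    # Synthetic "calibrated" proxies: each column is one proxy, already in
    # temperature units via its own independent calibration (purely illustrative).
    proxies = 0.02 * rng.normal(size=(n_years, n_proxies)).cumsum(axis=0)

    # Loehle-style reconstruction: equal weights, i.e. a plain mean over proxies.
    recon_equal = proxies.mean(axis=1)

    # Any regression-style method is still a weighted sum; if the weights are
    # skewed, a few proxies dominate regardless of how many are included.
    weights = rng.dirichlet(0.2 * np.ones(n_proxies))   # deliberately skewed weights
    recon_weighted = proxies @ weights

    print("largest single-proxy weight:", weights.max())
    ```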

    There is also a decent agreement with the other methods, including Mann’s EIV. For Mann to agree, when we all know about Yamal and Tiljander being included, does mean that Mann’s method is robust against his own extreme sloppiness. (Where’s Kevin O’Neill to make the call for due diligence on this one?)

    Beyond that, the promise of this is … dirt simple methods seem to work as well for the low frequency portion of the signal. That suggests it’s eventually a tractable problem. IMO if it requires a space-program-like effort to get it to work, it will never work.

    I’ll note again the irony that what we have are two functionally equivalent methods, with Loehle being demonized and Moberg being canonized by the climate activist community.

    =========

    Regarding Moberg, there was a positive review of this on RealClimate, but conspicuously Mann was not an author. I think there is more floating around on his reaction at the time, but see this for example.

    My memory may be flawed, but I seem to remember early push back against the larger amount of variability seen in Moberg.

    I think you give the US climate group too much credit for their behavior at the time. As far as I can see, they were perfectly happy to tag along behind Mann on this one. It was mostly the Germans and Dutch, IMO, who were pushing for more realistic reconstructions than the pure averaged red-noise, obviously junk, reconstructions of MBH 98 and 99.

    ========

    It’s like how Cook et al were safe in finding a 97% consensus, but their results wouldn’t have worked if they found a 99.x% consensus.

    Yes this is a very good point. Since Cook was interested in propaganda value, he was interested in the best number to sell. Had he used the proper definition, I think he would have found 99 + a fraction percent.

    (Based on the discussion we can see in their forum, John Cook knew the outcome of each definition and chose the one he chose knowing it would produce the desired results thanks to the fact he went through the abstracts prior to the ratings being performed.)

    I admit I don’t remember that discussion. Interesting. So my guess that he did fish for the method that gave the “best” result might have some validity.

  113. Rather I am thinking about method verification (that is verifying the functional equivalency of methods) rather than results validation (which speaks to the accuracy of the reconstruction in this case).

    Ah. In that case, doesn’t Moberg’s methodology basically just average the low frequency signals? If I remember/understand correctly, wavelet transformations were used to split low and high frequency signals apart.* That results in low frequency signals which are effectively just smoothed versions of the proxies. Average those together, and you’re still basically doing the same thing Loehle did (in regard to the low frequency signals). It’s not surprising the results would “validate” each other. They’re the same thing in broad terms.
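
    A crude sketch of that “broad terms” reading, using a centered moving average as a stand-in for the wavelet band-splitting (an assumption for illustration, not Moberg’s actual procedure):

    ```python
    import numpy as np

    def lowfreq(series, window_years=80):
        # Crude low-pass via a centered moving average; a stand-in for the
        # wavelet decomposition described above, not the real method.
        kernel = np.ones(window_years) / window_years
        return np.convolve(series, kernel, mode="same")

    # Synthetic standardized proxies: rows are years, columns are proxies.
    rng = np.random.default_rng(6)
    proxies = rng.normal(size=(1000, 11)).cumsum(axis=0)
    proxies = (proxies - proxies.mean(axis=0)) / proxies.std(axis=0)

    # Averaging only the low-frequency components is, in broad terms, close to
    # simply averaging smoothed proxies, which is the point being made above.
    recon_low = np.mean([lowfreq(p) for p in proxies.T], axis=0)
    ```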

    But you can’t argue for methodological validation for Mann 2008 or the Ljungqvist reconstructions. They used notably different data sets than Loehle and Moberg. We could assume similar results validate their data and methodology, but that’s a hard assumption to justify.

    I’m being somewhat cautious here by saying their agreement should be viewed purely as a methods verification. However, I don’t know how much prescreening happens before a particular proxy series is stamped with the official “genuine temperature proxy” seal of approval. Even if the data sets are nominally different, I don’t know how much the data collector’s judgment, shaped by expecting an MWP and LIA followed by current warming, affected his decision making as to what constitutes a “true” temperature proxy.

    This is a huge problem, and it’s what bothers me about the Christiansen and Ljungqvist approach. I think they came up with a methodology better than what is seen in most reconstructions. They then used the proxies other people had decided are temperature proxies. They didn’t verify those proxies were appropriate, much less representative. This resulted in them using many cherry-picked proxies even though they didn’t actively cherry-pick them.

    But in a more general sense, there are at least half a dozen well-known cases of cherry-picking involving important proxies in temperature reconstructions. There are multiple cases where one version of a proxy is chosen over another purely because of a difference in what they show. Then there are “temperature proxies” we have no reason to believe actually proxy temperature but are said to because they give the “right” answers.

    And it’s not like any of this is even a secret. Gordon Jacoby, attributed as the source for the ever important Gaspe series, explained why he chose to only archive some data by saying, “I maintain that one should not add data without signal.” With this justification, he archived only the 10 most “temperature sensitive” sites of the 36 he collected. The other 26 were ignored and lost because they didn’t give the results he expected. He openly acknowledges this is what he did, promoting it as a good thing.

    Rosanne D’Arrigo, a co-author of Jacoby, once gave a presentation in which she said something to the effect of, having lots of data is great as you can cherry-pick what you need. She explained cherry-picking is what you have to do if you want to make cherry pie.

    These two are responsible for data which is commonly used in temperature reconstructions, some of which is important for published results (like the Gaspe series). They’ve also published several reconstructions of their own. They’re proud members of the Hockey Stick Team. And they not only admit to cherry-picking, but say it’s a good thing!

    There is also a decent agreement with the other methods, including Mann’s EIV. For Mann to agree, when we all know about Yamal and Tiljander being included, does mean that Mann’s method is robust against his own extreme sloppiness. (Where’s Kevin O’Neill to make the call for due diligence on this one?)

    I don’t agree. The fact someone who screwed up got the same results as other people who used different data and different methodology does not mean his screw up had little or no effect on his results. That’s one possible explanation, but there are many others.

    If Mann’s methodology were robust as you claim, his results wouldn’t change if we corrected his mistakes. We don’t know that to be true. In fact, there’s good reason to believe it isn’t. He himself has shown his results are significantly different if you remove the Tiljander series and tree ring data. I’m pretty sure the same is true if you replace “tree ring data” with “bristlecones.”

    If I’m right, Mann’s methodology isn’t robust. The only reason you find the agreement you cite is he used data which was inappropriate for his purposes. The fact using bad data lets him get a similar answer to Moberg, Loehle and others does not validate anything.

    I think you give the US climate group too much credit for their behavior at the time. As far as I can see, they were perfectly happy to tag along behind Mann on this one. It was mostly the Germans and Dutch, IMO, who were pushing for more realistic reconstructions than the pure averaged red-noise, obviously junk, reconstructions of MBH 98 and 99.

    Hrm? I didn’t think I gave them any credit for their behavior. All I said is people realized MBH’s hockey stick was too straight. There were a lot of people complaining because the MWP and LIA were practically non-existent in that reconstruction. By 2005, I don’t think many of the Team genuinely believed that was right. I think some even published results suggesting the variance was greater than MBH said.

    I think the backlash against Moberg wasn’t because of the results. I think it was largely because Moberg beat others to the punch. That said, that e-mail reminds me of another issue. Mann didn’t like the fact Moberg said tree rings were not reliable indicators of long-term trends as that directly challenged his work. I think those two points were a far larger reason for any backlash against Moberg than the fact he got results largely similar to what Mann would go on to publish a couple years later.

    I admit I don’t remember that discussion. Interesting. So my guess that he did fish for the method that gave the “best” result might have some validity.

    I’ll have to go find it again. What I found most interesting about it is it didn’t seem to cause him to select the definition he went with (which I still say wasn’t even a definition because it is internally contradictory). The final discussion about what definition to go with came later, and it has no reference to this point. Instead, Cook et al discuss other reasons to pick between different definitions.

    It’s possible John Cook had this point in mind and simply didn’t mention it, but it’s also possible he didn’t realize the import of what he was doing.

    *As I recall, Moberg et al broke each proxy into many different frequencies then separated the frequencies into two groups which they averaged together. Plus I think they split their proxies into two groups depending on which half of the hemisphere they were in. These points would have some effect on their final result, but I don’t believe they would cause their results to be meaningfully different from a simple average.

  114. Figures 2 and 5

    Brandon, as you have returned to the core of my original question, which Nick seemed to take offence at, I will post a link to an article I wrote last year.

    http://wattsupwiththat.com/2013/08/16/historic-variations-in-temperature-number-four-the-hockey-stick/

    Figure 2 shows CET, considered a good but by no means perfect proxy for global temperatures by many scientists plus the Met Office, and De Bilt, overlaid on various paleo proxies.

    It is evident that paleo proxies are a very coarse sieve through which the highly variable annual and decadal instrumental temperatures readily fall. Thus the highly variable nature of our climate is entirely lost.

    Figure 5 shows glacier advances and retreats over the last 3000 Years. The hockey stick sails serenely through them all with barely any deviation in temperature shown.

    Why should highly selective tree rings, with a short growing season and high susceptibility to their microclimate, show anything much more than a general indication of precipitation during the summer?

    Of course tree rings trump instrumental temperatures and contemporary observations from multiple sources, for reasons that still elude me.

    Tonyb

  115. Tonyb, I’ve only skimmed that post a bit so far, but you are wrong to say:

    Figure 2 shows CET, considered a good but by no means perfect proxy for global temperatures by many scientists plus the Met Office, and De Bilt, overlaid on various paleo proxies.

    It is evident that paleo proxies are a very coarse sieve through which the highly variable annual and decadal instrumental temperatures readily fall. Thus the highly variable nature of our climate is entirely lost.

    You did not show any paleoclimate proxies. You didn’t show proxies at all. You showed reconstructions. The two are radically different. Some of the “proxies” you refer to are actually made up of hundreds of proxies!

    The issue of temperature reconstruction resolution is an important one. The question of what sort of temporal scales they can resolve is something which needs to be addressed. It doesn’t say one word about the proxies themselves though. One could easily combine a hundred proxies with high frequency components and wind up with a reconstruction with no high frequency signal. The fact a reconstruction lacks a certain resolution does not justify criticizing the resolution of the proxies which went into it.

  116. Brandon:

    Ah. In that case, doesn’t Moberg’s methodology basically just average the low frequency signals? If I remember/understand correctly, wavelet transformations were used to split low and high frequency signals apart.* That results in low frequency signals which are effectively just smoothed versions of the proxies. Average those together, and you’re still basically doing the same thing Loehle did (in regard to the low frequency signals). It’s not surprising the results would “validate” each other. They’re the same thing in broad terms.

    Yes, in very broad terms. All of the methods, in my understanding, eventually just perform a weighted average. Loehle is the only one who uses the “on label” calibration values to weight the proxy values.

    There are numerous things that happen when you regress against a noisy time series to estimate a scaling factor. (I can go into the part of this that I understand.)

    Eventually, the simple average becomes a way to check the “more sophisticated models” preferred by the climate scientists. Even if they get slightly different weighting factors using their “sophisticated methods”, the values shouldn’t be far off from what is seen here.

    I don’t agree. The fact someone who screwed up got the same results as other people who used different data and different methodology does not mean his screw up had little or no effect on his results. That’s one possible explanation, but there are many others.

    It’s a fact it agrees (in a statistical sense) with the other reconstructions. The rest is open to interpretation. Remember though, I’m focusing on the low-frequency portion of the global signal. That’s just one metric. If you look at the pattern of warming/cooling, these kind of errors can have big effects on that pattern (see the issues with BEST), even if they end up not biasing the global mean (“very much”).

    These points would have some effect on their final result, but I don’t believe they would cause their results to be meaningfully different from a simple average.

    Again it’s related to what happens when you calibrate against series that have poor signal-to-noise ratios. If you do the method correctly, they should agree with the simple calibration weighted average.

    It’s interesting that one of the criticisms of Loehle is that it was “too simple”. I think, given the paucity of data, the problem is the others are “too complex”.

  117. Brandon

    In the first paragraph of the article you will note that I used the term paleo proxy reconstructions.

    That is the term that Dr Curry asked me to use when I wrote the original article referenced in the link I gave you. I was merely abbreviating what I wrote and referenced, as writing on an iPad is a bit of a pain and brevity is easier than being long winded.

    The point I was making is that ppr, because of the way they are constructed, do not reflect the actual variability the real world experiences on an annual and decadal basis and as such they give a misleading impression of the past.

    Tonyb

  118. Brandon:

    The issue of temperature reconstruction resolution is an important one. The question of what sort of temporal scales they can resolve is something which needs to be addressed. It doesn’t say one word about the proxies themselves though. One could easily combine a hundred proxies with high frequency components and wind up with a reconstruction with no high frequency signal. The fact a reconstruction lacks a certain resolution does not justify criticizing the resolution of the proxies which went into it.

    The way I would diagnose the frequency content of the reconstructions is to look at their spectral content:

    [Figure: power spectra of the reconstructions]

    What you expect from atmospheric variability is a spectrum that varies as the period to some power (≥ 1). This is equivalent to varying as one over the frequency to the same power of course.

    If you get a large period (low frequency) “roll off”, this is a sign that your actual resolution is worse than the sampling interval that you’ve chosen.

    This happens with Mann 2008 EIV, which you can see has an effective resolution of about a 100-year interval.

    Loehle & McCollough have this funny short-period (high-frequency) structure (a series of nulls). People who do signal processing will recognize this as the windowing effect associated with interpolation of his data. Probably these nulls can be avoided with a noiseless interpolation method (e.g., a Fourier method), but as published you have an effective resolution of about 50 years.

    Another thing that can go wrong is “noise amplification.” This is related to the inversion process for computing the OLS coefficients, and occurs whenever you have a poorly conditioned inversion matrix. Basically, the RMS amplitude of the noise in the reconstruction can be related to the RMS noise in the signals being combined times the square root of the ratio of maximum to minimum eigenvalues of the matrix used to perform the inversion. (This is where regularization schemes like singular-value decomposition come in.)

    What happens with data series where noise amplification is important is a short-period “plateau”. We see that neither Moberg nor Ljungqvist seem affected by this artifact.
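
    To make that amplification factor concrete, here is a minimal numpy sketch with a synthetic, nearly collinear design matrix; the square root of the max-to-min eigenvalue ratio of X^T X is just the condition number of X:

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(500, 10))
    X[:, 1] = X[:, 0] + 1e-3 * rng.normal(size=500)   # two nearly collinear predictors

    # Per the description above, the RMS noise amplification scales roughly as
    # sqrt(lambda_max / lambda_min) of X^T X.
    eigvals = np.linalg.eigvalsh(X.T @ X)
    print("amplification ~", np.sqrt(eigvals.max() / eigvals.min()))

    # Regularization (e.g. truncated SVD) caps this by discarding tiny singular values.
    s = np.linalg.svd(X, compute_uv=False)
    print("condition number of X:", s.max() / s.min())
    ```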

    The other thing that happens when you regress against a series with noise is scaling and offset biases. For the time domain signal, in order to compare the different series against each other, it’s necessary to transform them to a common baseline and scaling factor. If you don’t do this, Loehle ends up looking “odd,” but that’s because its scaling factor is larger than the others (probably this is related to the simple method used for proxies).

    When I compute the spectra, I usually do a Welch-periodogram with linear detrending. For these data, I used 400-year windows with a Welch taper function (for 1/f type signals, it works slightly better than the Hann taper, which is the more conventional function for spectral periodograms).
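
    A minimal sketch of that spectral estimate, assuming annual sampling and a synthetic red-noise series in place of the actual reconstructions (the parabolic Welch taper is built by hand, since it is not one of scipy’s named windows):

    ```python
    import numpy as np
    from scipy.signal import welch

    def welch_taper(n):
        # Parabolic (Welch) window, w[k] = 1 - ((k - (n-1)/2) / ((n-1)/2))**2.
        k = np.arange(n)
        return 1.0 - ((k - (n - 1) / 2) / ((n - 1) / 2)) ** 2

    rng = np.random.default_rng(1)
    recon = 0.01 * np.cumsum(rng.normal(size=2000))   # stand-in series, one value per "year"

    nper = 400                                        # 400-year segments, as described above
    freqs, psd = welch(recon, fs=1.0, window=welch_taper(nper),
                       nperseg=nper, detrend="linear")

    # For 1/f-type variability the PSD should follow a power law in frequency;
    # departures from it are the roll-offs and plateaus discussed above.
    ```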

    Obviously for frequency domain analysis, the offset doesn’t matter, but I did have to rescale them to the same “pseudo temperature” scale. Here are the scaling factors I used (these are for fluctuation amplitudes, they are squared when you scale the power-spectral-density estimates):

    Ljungqvist 1.56762
    Mann 2008 1.23029
    Moberg 2005 1.45768

    For clarification, those factors are what you have to multiply these other distributions by to get them to the same physical scale used in Loehle & McCollough. If L&M accurately portrays the real temperature scale (which it would, if the calibrations were done correctly), then one over these constants is the amount of “descaling” associated with the “sophisticated methods”. Mann 2008 EIV is slightly better in terms of preserving scale, which is expected, because that’s what EIV (errors-in-variables) is designed to do.

    I should repeat my note that even if you didn’t have descaling issues, because there is global variation in the scale (land is larger, more polar is larger), there are still spatial sampling issues that should cause the different estimates to deviate from each other, simply as a consequence of under-sampling the global temperature field. With this you’ll also get

    Another point:

    One of the big problems with these reconstructions is that they all do a linear sum over proxies. This is only a reliable method when your proxies scale with temperature, that is, there is no frequency variation in the amplitude response. (That is, the calibration constant is assumed to be independent of temperature.)

    As soon as you have a variation in the magnitude of the calibration constant with frequency, for causal and linearly stable processes, the log of the amplitude response can be related to the phase response through a Hilbert transform. This goes under the moniker “minimum phase systems,” but what it means is when you have variations in the amplitude of the response, the phases will vary also.
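
    For reference, one common statement of the minimum-phase (Bode gain-phase) relation being invoked here, up to sign convention, is

    $$\phi(\omega) = -\,\mathcal{H}\left\{\ln\lvert A(\omega)\rvert\right\}$$

    where $A(\omega)$ is the amplitude response, $\phi(\omega)$ the phase response, and $\mathcal{H}$ the Hilbert transform. The practical upshot is the one stated above: if the “calibration constant” varies with frequency, the phases cannot stay fixed.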

    For proxies like tree rings, which respond separately to quantities like precipitation, sunlight, soil moisture and soil nutrient level as well as temperature, it is important to recognize that there are often co-variations between these different quantities. If there weren’t, tree rings would probably be useless. The trouble is that there is no reason to expect that the co-variations seen at periods of e.g. 4 years (the ENSO peak) would have anything to do with co-variations over periods of centuries.

    So in general, if you calibrate the low-frequency portion, and you use plenty of tree rings in your reconstruction, you should see an incoherent sum (the components add out of phase and cancel as if they were noise). This would result in a roll-off at the frequencies where the assumption of constant amplitude, as computed from low frequencies, is violated. This is a possible explanation of the Mann 2008 EIV roll-off.

    Moberg gets away from this problem by not using the tree-rings in the low-frequency portion of his reconstruction.

    Loehle is combining proxies that have really poor temporal resolution with ones that have better temporal resolution in a simple average over proxies. Even if he didn’t have issues with interpolation creating high-frequency artifacts, this would still lead to a roll-off at high frequencies. [You can combine the series correctly, by converting them into truncated Fourier series before averaging. This is one obvious improvement on what he did.]
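
    A minimal sketch of that suggested improvement, with a hypothetical lowpass_fourier helper and synthetic annual proxies, truncating everything to the coarsest common resolution before averaging:

    ```python
    import numpy as np

    def lowpass_fourier(series, max_resolution_years):
        # Hypothetical helper: keep only Fourier components with periods longer
        # than max_resolution_years (annual sampling assumed), zero the rest.
        coeffs = np.fft.rfft(series)
        freqs = np.fft.rfftfreq(len(series), d=1.0)        # cycles per year
        coeffs[freqs > 1.0 / max_resolution_years] = 0.0
        return np.fft.irfft(coeffs, len(series))

    rng = np.random.default_rng(3)
    proxies = 0.01 * rng.normal(size=(2000, 18)).cumsum(axis=0)   # synthetic annual proxies

    # Truncate every proxy to the coarsest common resolution before averaging,
    # so the mean is not a mix of smooth low-resolution and noisy annual series.
    recon = np.mean([lowpass_fourier(p, 50) for p in proxies.T], axis=0)
    ```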

    Ljungqvist..I’ll have to go back and remind myself of what he’s done, but there’s no reason to favor Moberg’s method over his, based on what I’m looking at here.

  120. Note: I fixed the legend in the figure on spectra.

    Also, while Ljungqvist does better than Mann, it does appear there is some high-frequency descaling relative to Moberg.

    Final point:

    All of these methods are assuming a linear relationship. That’s almost certainly violated with processes like ENSO.

    As a simple example, you can compute the linear correlation between zonally averaged temperature and ENSO 3.4. When you do this, you see something like this:

    [Figure: linear correlation of zonally averaged temperature with the ENSO 3.4 index]

    This seems to suggest that correlation with temperature (the “ENSO teleconnection function”) is limited to ±30° of the equator. That’s only strictly true as long as you restrict yourself to linear processes.
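
    A minimal sketch of that calculation with synthetic stand-ins (real inputs would be monthly zonal-mean temperature anomalies and a NINO3.4 index):

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    n_months, n_bands = 600, 36           # 50 years of monthly data, 5-degree latitude bands
    nino34 = rng.normal(size=n_months)                    # stand-in for the ENSO 3.4 index
    zonal_temp = rng.normal(size=(n_bands, n_months))     # stand-in for zonal-mean anomalies

    # Linear correlation of each latitude band with the index; the point above is
    # that, plotted against latitude, it is mostly confined to about ±30°.
    corr = np.array([np.corrcoef(band, nino34)[0, 1] for band in zonal_temp])
    ```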

    What needs to be done is to develop a framework based e.g., on a Wiener series, that includes higher-order nonlinear interaction terms, to explore this further….

    But this issue of nonlinearity is what worries me so much about these so-called “sophisticated methods”…I would expect these to be more sensitive to the failure of the assumption of linearity than a straight sum over series. This is because these methods are trying to compute regional scale variations, and it’s precisely here where you start to see big issues with nonlinearity in the interaction of climate. Briefly the problem arises when the nonlinear component is large compared to the linear one in the region where you’re trying to relate the proxy to local temperature.

    When you average over the globe, there’s something called the EQ-NL theorem which basically states that as long as the nonlinear interaction remains perturbatively small compared to the linear one, there is an equivalent linear model that will describe the transfer function between the system input and output.

    Put another way, for global mean temperature, you’re more likely to be able to get away with the assumption of linearity than you are when you start partitioning the globe.

    I think I’ve hit all of the issues now that relate to my concern over the “sophisticated methods” not converging to the simple average method used by Loehle.

    Somewhere in the stuff I’ve been reading today, Steve McIntyre makes a similar point to me, that the simple methods are actually better, because they are more amenable to statistical modeling. Were I to try and build up a complex statistical model, I would certainly start with a simple one, and iterate on that, rather than just spring up some complex Rube Goldberg machine from whole cloth.

  121. Tonyb:

    In the first paragraph of the article you will note that I used the term paleo proxy reconstructions.

    That may be, but I’m responding primarily to what you wrote here. Links are there to support what one says, not replace it.

    That is the term that Dr Curry asked me to use when I wrote the original article referenced in the link I gave you. I was merely abbreviating what I wrote and referenced, as writing on an iPad is a bit of a pain and brevity is easier than being long winded.

    I don’t get how you were referring to reconstructions by saying “paleo proxies” yet then discussed glacier proxy data without transition. It’s even worse given the next paragraph said:

    Why should highly selective tree rings, with a short growing season and high susceptibility to their microclimate, show anything much more than a general indication of precipitation during the summer?

    Distinguishing between proxies and hemispheric reconstructions is an important thing. You may have done so correctly in your post, but a person is not going to be inclined to read a post if your comment describing it makes glaring mistakes.

    The point I was making is that ppr, because of the way they are constructed, do not reflect the actual variability the real world experiences on an annual and decadal basis and as such they give a misleading impression of the past.

    That’s a good point, and it’s one I’ve discussed a number of times. I may go back and read your post later to see what you have to say about it.

  122. Carrick:

    Yes, in very broad terms. All of the methods, in my understanding, eventually just perform a weighted average. Loehle is the only one who uses the “on label” calibration values to weight the proxy values.

    I wasn’t thinking of it “in very broad terms.” I meant it quite simply. There were a few processing steps which keep Moberg from being purely a simple average, but I thought they had a far smaller effect than in nearly any other reconstruction. In fact, I thought there was practically no weighting of proxies involved.

    It’s a fact it agrees (in a statistical sense) with the other reconstructions. The rest is open to interpretation. Remember though, I’m focusing on the low-frequency portion of the global signal. That’s just one metric. If you look at the pattern of warming/cooling, these kind of errors can have big effects on that pattern (see the issues with BEST), even if they end up not biasing the global mean (“very much”).

    While I’m fine with what you say, it doesn’t say anything about whether or not the agreement indicates validation, the topic I was discussing. That two papers reach the same conclusion does not inherently mean one validates the other.

    You can reach the wrong conclusion in many different ways. You can reach a right conclusion for the wrong reason. In either case, a wrong answer does not validate anything.

    For clarification, those factors are what you have to multiply these other distributions by to get them to the same physical scale used in Loehle & McCollough. If L&M accurately portrays the real temperature scale (which it would, if the calibrations were done correctly), then one over these constants is the amount of “descaling” associated with the “sophisticated methods”. Mann 2008 EIV is slightly better in terms of preserving scale, which is expected, because that’s what EIV (errors-in-variables) is designed to do.

    Wait, simple averaging preserves the temperature scale? That doesn’t seem right.

    Ljungqvist..I’ll have to go back and remind myself of what he’s done, but there’s no reason to favor Moberg’s method over his, based on what I’m looking at here.

    Honestly, I don’t know what was done in that one. There were something like four reconstructions he was an author or co-author on, and I don’t remember which ones did what. I think the last one is the one I’ve spent the most time with. I’d have liked it if not for a dozen or more of their proxies being inappropriate.

    Well that, and I was a bit annoyed about the authors deciding to stop responding to my e-mails after I showed their explanation for why they used one proxy contradicted the source they used to justify it. I don’t understand how you can e-mail a person a paper claiming it says one thing even though it clearly does not, then suddenly stop communicating when they point it out.

    *grumbles*

    I think I’ve hit all of the issues now that relate to my concern over the “sophisticated methods” not converging to the simple average method used by Loehle.

    Somewhere in the stuff I’ve been reading today, Steve McIntyre makes a similar point to me, that the simple methods are actually better, because they are more amenable to statistical modeling. Were I to try and build up a complex statistical model, I would certainly start with a simple one, and iterate on that, rather than just spring up some complex Rube Goldberg machine from whole cloth.

    While I agree with the things you’ve said in your comments, I can’t bring myself to care much about most of it because methodologies cannot correct for a bad data set. I think we need to be able to justify why we used each proxy we used before worrying about how we’ll use them. Otherwise we wind up with these huge messes that take forever to sort out only to boil down to, “The data responsible for your conclusion should never have been used.”

    In theory, it’s worth discussing all these points. In practice, the data just isn’t there for it to matter much. There are maybe fifty proxies that have actually mattered for temperature reconstructions. I’d say there are clear issues with half of them. I’m probably being generous on both counts.

  123. Brandon:

    There were a few processing steps which keep Moberg from being purely a simple average, but I thought they had a far smaller effect than in nearly any other reconstruction. In fact, I thought there was practically no weighting of proxies involved.

    They start with raw proxy scores which they have to convert into temperatures.

    Put simply, what you obtain for the calibration constants amounts to the weights applied to the raw proxy scores that you use to generate the reconstruction. So it’s weighted, and the process of selecting the weights matters.

    Because you are regressing against noisy data, the issues with bias in the regression come into play, as does the issue of spurious correlation, which can drastically affect the weights (again correlation constants) used in the averaging process.
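
    As one concrete illustration of how noise in a calibration regression can bias the implied weights, here is a synthetic sketch of classical regression attenuation; it is not any particular paper’s calibration procedure:

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    true_temp = rng.normal(size=150)                  # "true" local temperature
    proxy = 2.0 * true_temp                           # a perfect proxy with scale factor 2
    noisy_temp = true_temp + rng.normal(size=150)     # noisy instrumental calibration target

    # Regressing the proxy on the noisy series attenuates the estimated scale
    # factor toward zero, which in turn distorts the implied averaging weights.
    slope = np.polyfit(noisy_temp, proxy, 1)[0]
    print("estimated scale factor:", slope)           # noticeably below the true value of 2
    ```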

    While I’m fine with what you say, it doesn’t say anything about whether or not the agreement indicates validation, the topic I was discussing. That two papers reach the same conclusion does not inherently mean one validates the other.

    Not to pick nits, but I’m referring to verification of the methods used, which is a much softer conclusion than that they validate each other. I agree the fact that the series are in agreement, by itself, isn’t validation of the results. (I don’t mind repeating that point, because it’s an important one.)

    Validation would be more like using truly independent data sets on single or suite of methods, and showing that the results are robust (technically that they are in statistical agreement with each other). That hasn’t happened yet.

    I can’t bring myself to care much about most of it because methodologies cannot correct for a bad data set.

    Agreed, but recognize that is a different aspect of research.

    It’s actually better to tune your reconstruction methods on data you aren’t planning on using to arrive at your conclusions.

    I prefer to have the methods spelled out, including verification and validation testing, before I start analyzing the data I want to use for publication. That greatly reduces the chance of tuning your results to match your expectations.

    [To give an example, I just finished a talk where I reproduced a result for the talk using a new data set collected specifically for the talk, because I didn’t want to show data that potentially were cherry picked. The analysis software is run by scripts that automatically generate answers from the collected data without intervention. So this is a robust way to prevent me from fooling myself, or from showing “typical results” that are actually very non-typical.]

    The next phase, after you’ve established that a robust methodology exists and it has been selected, is to develop a framework for proxy selection. I can get into what I think is needed for that, but it’s a separate issue from what I’ve been discussing.

    A number of years ago, looking at the “Battlestar Galactica level of complexity” reconstruction algorithms, I kind of despaired that a workable methodology even existed. Having usable data (in some sense) isn’t helpful, if there’s no way to extract information from it in a meaningful way.

    I’d say there are clear issues with half of them. I’m probably being generous on both counts.

    We’d have to examine whether the clear issues make a difference. If you have a precipitation proxy, and you know that precipitation associated with ENSO is correlated to temperature for the region your proxy is located in, then at least for periods associated with ENSO, you also have a temperature proxy.

    I think this is why it’s important to study the proxies you want to use for the reconstruction, rather than put them through a mann-o-matic processor, for example.

  124. Carrick:

    They start with raw proxy scores which they have to convert into temperatures.

    Put simply, what you obtain for the calibration constants amounts to the weights applied to the raw proxy scores that you use to generate the reconstruction. So it’s weighted, and the process of selecting the weights matters.

    Because you are regressing against noisy data, the issues with bias in the regression come into play, as does the issue of spurious correlation, which can drastically affect the weights (again correlation constants) used in the averaging process.

    That’s not true. Moberg’s wavelet transformation would have transformed every proxy into a unitless series. I can see how the standardization might cause some proxies to be weighted differently due to a difference in noise levels, but I can’t see how weights would be affected by regression or correlation issues. The proxies weren’t regressed on anything.

    The final output of the methodology was scaled against modern temperatures, but that’s the only time temperature is ever used for anything in Moberg’s methodology.

    That is, unless I’m really off-base about Moberg’s methodology. I don’t think I am, but I guess it’s possible.

    Not to pick nits, but I’m referring to verification of the methods used, which is a much softer conclusion than that they validate each other. I agree that the fact the series are in agreement, by itself, isn’t validation of the results. (I don’t mind repeating that point, because it’s an important one.)

    But it doesn’t verify the methodology either. Because the data is different between Loehle/Moberg and Mann 2008, similar results don’t verify methodologies. You can easily have two methodologies that’d produce different results when applied to the same data set yet produce the same results when applied to (certain) different data sets.

    Suppose Method A gives the answer 2 when applied to Data Set X. Method B gives the answer 2 when applied to Data Set Y. Does one result verify the methodology of the other?

    Of course not. For all we know, Method B would give the answer -973 when applied to data set X. -973 is a far cry from the 2 Method A gives for Data Set X, meaning they give radically different answers for the same data set – the exact opposite of verification.

    I prefer to have the methods spelled out, including verification and validation testing, before I start analyzing the data I want to use for publication. That greatly reduces the chance of tuning your results to match your expectations.

    Certainly. I think it’s important to do. It’s just not a topic which interests me much. There’s enough work for me in just understanding all the various methodologies and data sets used.

    I’d love it if someone worked through all the stages of getting an ideal approach figured out. I just don’t want to be that person.

    A number of years ago, looking at the “Battlestar Galactica level of complexity” reconstruction algorithms, I kind of despaired that a workable methodology even existed. Having usable data (in some sense) isn’t helpful, if there’s no way to extract information from it in a meaningful way.

    My thing is if you have a good data set, you can just use a simple approach like plain averaging. It won’t be ideal, but it’ll work. You’ll likely be able to refine your results from there, but you won’t have to worry about unknown biases or anything like that.

    We’d have to examine whether the clear issues make a difference. If you have a precipitation proxy, and you know that precipitation associated with ENSO is correlated to temperature for the region your proxy is located in, then at least for periods associated with ENSO, you also have a temperature proxy.

    I actually brought the same point up with Steve McIntyre when he complained about one precipitation proxy being used as a temperature proxy. I later wound up talking to Christiansen and Ljungqvist about their use of the proxy. They made basically the same point, explaining how the people who collected the series thought precipitation and temperature were correlated in the region.

    I then read the paper they cited, and it turns out they were right that people thought there was such a correlation. Only, the correlation described in the paper was the opposite of what C&L said. In other nearby areas, the correlation was half what they said and half the opposite (depending on season).

    The point being, I agree with you about the idea, but my comments already account for it.

    I think this is why it’s important to study the proxies you want to use for the reconstruction, rather than put them through a mann-o-matic processor, for example.

    Definitely. One thing I’ve long hated is I lack journal access so tracking down and reading up on the references for some proxies has been difficult. It’s actually the main reason why I never tried a project I wanted to do. I thought it’d be cool if there was a web site with pages for the various proxies used in temperature reconstructions. It’d list location, period covered, sampling rate, authors’ descriptions and (ideally) a plot of the proxy.

    I think that’d be a far more admirable goal for scientists in this field than yet another paper using “novel” statistical methods on an arbitrarily chosen data set.

  125. Brandon

    Sorry for the imprecision and confusion. I have been very distracted by the Scottish referendum, which had huge implications for us in England, and have had little sleep the last couple of nights. Best not to get involved with complex terminology and mix up a number of different issues under those circumstances!

    The glacier proxy data is my own invention and brings together thousands of glacier references over the last 3000 years compiled by Ladurie in particular but also others such as Pfister. I have no means whatsoever to create my own tree ring or coral data but felt that translating glacier data into a graphic would be useful as it provides another means of examining the warm and cool periods we can observe. It however lacks precision and resolution which illustrates the problems with expecting ppr to act like a thermometer.

    Reading the comments on your post it is evident that ppr have enormous questions over them, but they have acquired, to those that decide policy, much credibility, and the Mann hockey stick is quoted to this day by our MPs.

    Anyway the referendum is decided and the Union stays together. Hopefully I can make up tonight for the one hour’s sleep I got last night and will not confuse others such as yourself by running together several concepts in what was intended to be a passing reference to the comment I made several days ago.

    All the best

    Tonyb

  126. No I don’t think your moderation policy is reasonable. Anyone who gets placed in moderation will just stop posting.

    Beyond that, you could be a little more polite. If memory serves, you were very surprised at the level of vitriol at ClimateAudit some years back. I think it was directed at you, and I agreed at the time. You seem to be becoming like those posters.

  127. MikeN:

    No I don’t think your moderation policy is reasonable. Anyone who gets placed in moderation will just stop posting.

    I don’t know how you can be so sure they’ll make that choice. People at RealClimate post despite having their comments thrown into the Borehole. There are plenty of people who post at sites knowing their comments will likely get deleted. Under my moderation policy, they can still participate. People will just have to choose to view their contributions. Why is that bad?

    And more importantly, why are you focusing on that portion of it? Sometimes deleting comments or banning users is the right thing to do. Even if what I did was equivalent to banning users (I don’t agree it is), you haven’t said anything to indicate why I’d be wrong to do so. I recently put Joshua on moderation in the same way. Are you going to tell me that was a bad decision? I think plenty of people would disagree.

    Incidentally, even if one feels what I did were equivalent to banning people, there’s a huge difference. People I place in moderation are free to talk about it here. If they feel my decision was inappropriate, they can tell people so. They can explain why they feel I was wrong or unfair. People can read their complaints, judge for themselves and even respond if they’d like. That’s more freedom than I’ve seen at any other blog.

    I can’t think of a single site that will allow users on moderation to talk to one another about why they were moderated. Can you?

    Beyond that, you could be a little more polite. If memory serves, you were very surprised at the level of vitriol at ClimateAudit some years back. I think it was directed at you, and I agreed at the time. You seem to be becoming like those posters.

    I’ve been the same way for the last ten years. My policy is simple: Insults always come with justifications. I’ll strive to never say, “You’re an idiot, therefore you’re wrong.” I will sometimes say, “You are wrong for reason X, you idiot.” The reason is simple. I don’t think people who behave well deserve to be treated with the same level of kindness as people who don’t. People who behave well should be treated nicely. People who don’t behave well should be treated less nicely.

    As for the Climate Audit thread you refer to, I don’t think you’ll find my behavior is close to comparable to how people behaved in it:

    http://climateaudit.org/2010/06/23/arthur-smiths-trick/

    And that’s even with the significant number of comments deleted from that thread, which is quite a handicap.

  128. Tonyb:

    Sorry for the imprecision and confusion. I have been very distracted by the Scottish referendum, which had huge implications for us in England, and have had little sleep the last couple of nights. Best not to get involved with complex terminology and mix up a number of different issues under those circumstances!

    That’s alright. Mistakes happen, especially ones of wording. Having read your article now I can see you did originally distinguish between reconstructions and proxies.

    When I saw your comment here and skimmed the post, it seemed like it was going to be more like what Tom Curtis posted here. Curtis wrote a lengthy comment criticizing MM05 which repeatedly conflated (pseudo) proxies and reconstructions. It’s hard to find motivation to read things which are so stupidly wrong. Of course, it turns out yours wasn’t what I expected.

    The glacier proxy data is my own invention and brings together thousands of glacier references over the last 3000 years compiled by Ladurie in particular but also others such as Pfister. I have no means whatsoever to create my own tree ring or coral data but felt that translating glacier data into a graphic would be useful as it provides another means of examining the warm and cool periods we can observe.

    Is there a writeup somewhere for it? I can’t recall reading about glaciers used to proxy temperature before. I have no idea how you’d calibrate them/combine their records.

    It however lacks precision and resolution which illustrates the problems with expecting ppr to act like a thermometer.

    Reading the comments on your post it is evident that ppr have enormous questions over them, but they have acquired, to those that decide policy, much credibility, and the Mann hockey stick is quoted to this day by our MPs.

    Yup. The lack of temporal resolution in reconstructions is a tragically understated point. Examining the reconstructions as time series shows their resolution is limited, and that assumes what higher frequency signals exist in them are genuine. There’s an argument to be made that much of the high frequency signal in (at least some of) these reconstructions is little more than noise.

    And that’s just a matter of methodology. Even with perfect data, we’d still have that problem. We don’t have perfect data though. Reconstructions routinely use bad data. It’s a disgrace they’ve been promoted as much as they have. If science was working properly, the conclusions of every millennial reconstruction would be incredibly cautious.

    Actually, the conclusions of most of them would be, “We don’t have a publishable result.”

  129. Brandon

    I suppose if you say ‘I don’t know’ or ‘we don’t have a publishable result’, the funding for your research might quickly dry up.

    This business of annual and decadal temperatures being highly variable and falling through the coarse sieve of the ppr is one that is not mentioned often enough and is an area of my active research as I try to push the CET historic record back further.

    I am writing up a paper about glaciers and have produced a better graphic. Glacier change is a very well documented area (e.g. see Le Roy Ladurie, ‘Times of Feast, Times of Famine’) but one which has been largely set aside over the last 20 years in favour of computer models.

    tonyb

  130. I should clarify, my statement about moderation policy was only in response to your statement that you wonder if they will stay. I am surprised to see Stokes in moderation, but haven’t followed the thread closely enough to see if I agree/disagree with a decision.

  131. TonyB, that’s pretty much it. People always respond, “Well if you think they did a bad job, publish your own work.” One problem with that is you can publish wrong results claiming to find an answer, but you generally can’t publish correct results which say it’s impossible to know the right answer. Science is biased that way, both in publication and in funding.

  132. MikeN, I think you may need to follow threads more closely before commenting.* I didn’t put Nick Stokes on moderation. I put Kevin O’Neill on moderation; Nick Stokes responded by saying he was leaving.

    As for wondering if they would stay, that seems a reasonable thing to ponder. If people thought O’Neill was contributing useful or informative insights, they would follow links to see what he had to say. I know if I was presented this option at any blog I’ve been banned from, I’d take it. And it’s certainly reasonable to wonder if Stokes will comment again. I didn’t place him in moderation. Of course I’m curious who I will see comment on my blog.

    *Please understand I say this with a lighthearted tone. I would normally insert a smiley to indicate this, but I’m convinced one of the emoticons is trying to eat my soul, and I keep forgetting which one it is. I don’t want to risk it.

  133. Brandon: Tony recommended this thread because of interest I’ve had in his work; he rightly thought the thread would be helpful to me.

    I have one question. I believe that the MM red noise test resulted in over 99% of the 10,000 runs being hockey sticks, which they defined as having a hockey stick index of 1 or greater. I read through the thread quickly tonight but didn’t see these numbers used. Are they not relevant?

    Thank you Brandon and Tony,

    Richard

  134. rls, those numbers aren’t particularly relevant. They’re largely arbitrary. A trivial reason is we could choose a different way to measure the “hockey stick index” (which would produce different results). The more important issue, however, is the numbers depend upon the parameter you select.

    For instance, the amount of autocorrelation in a series is determined by a parameter. The more autocorrelation in your series, the more hockey sticks you’ll get when you apply MBH’s methodology. If you choose a really high value for the autocorrelation parameter, you’ll get a lot. If you choose a really low value, you’ll get far fewer. The results will always be biased toward having higher HSI values than they should, but the exact amount of that bias depends upon the parameter.

    Similarly, MM05’s simulation was designed to emulate a particular set of 70 tree ring series. You don’t have to do that. You could use a smaller number of series. You could use a larger number of series. MM05 emulated the series it emulated because those were the most important series in MBH, but you can show a bias in MBH’s methodology with any set of series.

    For a shorter version, the bias is real, and we can demonstrate it in many different ways. The numbers you refer to are the results of just one of those ways.

  135. rls, no prob. I’ll give a basic rundown of how this works.

    Autocorrelation is how similar a series is to itself. That is, if it is already going up, autocorrelation means it is more likely to keep going up. If it is going down, autocorrelation means it is more likely to keep going down. If there is autocorrelation in a series, the series isn’t entirely random. It is more likely to have excursions up and down.

    The MBH methodology is biased to find such excursions, but only in the modern period. If a series has a peak in 1650, the MBH methodology doesn’t care very much. If a series has a peak in 1950, the MBH methodology will care a great deal. It will give far more weight to the peak in 1950 than the one in 1650. Because it cherry-picks peaks in the modern period like this, it cherry picks hockey stick shapes.

    The reason my previous comment discusses parameters is the more autocorrelation there is, the more peaks there will be, and the bigger those peaks will be. The more there are, the more the MBH methodology has to cherry pick from. The bigger they are, the bigger the effect of the cherry picking will be. (If you give 100x the weight to a series with a peak of 2, it’s like saying that peak is 200. If you do the same for a series with a peak of 20, it’s like saying the peak is 2000.)

    Hopefully that helps. The key here is that the MBH methodology is biased. Specific numbers only matter in that they can help us quantify how large an effect the bias has. That’s good to know, but only if you care. Most people are probably fine with just knowing the methodology is biased.
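
    To make this concrete, here is a minimal sketch of the effect (this is not MM05’s actual code; the series length, AR(1) coefficient, number of series and calibration window are all illustrative, and the hockey stick index below is just one possible definition):

    ```python
    # A minimal sketch: AR(1) "red noise" proxies run through a decentered
    # ("short centered") PCA versus a properly centered one. Every value here
    # is an illustrative choice, not MM05's or MBH's actual setup.
    import numpy as np

    rng = np.random.default_rng(0)
    n_years, n_series, rho, calib = 581, 70, 0.9, 79

    def ar1_series(n, rho, rng):
        """One AR(1) series: x[t] = rho * x[t-1] + white noise."""
        x = np.zeros(n)
        for t in range(1, n):
            x[t] = rho * x[t - 1] + rng.standard_normal()
        return x

    X = np.column_stack([ar1_series(n_years, rho, rng) for _ in range(n_series)])

    def pc1(data, short_center):
        """First principal component; short centering subtracts only the modern-period mean."""
        mean = data[-calib:].mean(axis=0) if short_center else data.mean(axis=0)
        centered = data - mean
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        return centered @ vt[0]

    def hockey_stick_index(x):
        """One possible HSI: modern-period mean minus full-period mean, in standard deviations."""
        return (x[-calib:].mean() - x.mean()) / x.std()

    print("decentered PC1 |HSI|:", abs(hockey_stick_index(pc1(X, True))))
    print("centered PC1 |HSI|:  ", abs(hockey_stick_index(pc1(X, False))))
    ```

    Run it a few times with different seeds and the decentered version will usually produce a PC1 with a noticeably larger |HSI| than the properly centered one, which is the bias being described.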

  136. While I’m at it, it’d be helpful to understand the bias we’re talking about is not for the reconstruction itself, but for a particular step in creating it. Feel free to skip this if you aren’t interested.

    The bias we’ve been talking about is not for the reconstruction itself. It is in a step used to create some of the proxies which go into the reconstruction. Once you have those proxies, you still have to combine them in some way. To easily see that MBH’s methodology of combining proxies is biased, look at the figure in this post:

    https://hiizuru.wordpress.com/2014/02/18/manns-screw-up-3-statistics-is-scary/

    It shows all 22 proxies that go back to 1400 AD. Only two have a hockey stick shape. One of them is NOAMER PC1, which is the proxy we’ve been talking about here. The other is known as Gaspe, which is actually part of the NOAMER data set. It is in the same data set used to create NOAMER PC1. Only, it was duplicated and artificially extended so it could be used on its own as a series going back to 1400 AD. This shows you don’t need complicated methodologies to cherry-pick your proxies. You can also just do it by hand. (See here for a full description.)

    Anyway, if only two of 22 proxies have a hockey stick shape, the final reconstruction should not have a hockey stick shape. It did because MBH used a strange process to combine their proxies. They said since we know modern temperatures went up thanks to the instrumental record, proxies which go up in the same period must be the good proxies. The bigger the increase the proxy shows, the more weight it deserves. That basically guarantees a hockey stick shape because it gives tons of weight to any series with a hockey stick shape.

    The key is remembering how the two steps interact with one another. MBH needed at least one proxy with a hockey stick shape to have their reconstruction be a hockey stick. They managed to get two. One (Gaspe) they got by duplicating it from another data set, artificially extending it and using it on its own. That’s manually cherry-picking. The other is NOAMER PC1, which they created by using the process we’ve discussed being biased toward producing hockey sticks.
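
    As a toy illustration of why that weighting step matters (made-up numbers, not MBH’s actual algorithm), you can compare a correlation-weighted composite to a plain average when only two of 22 series contain a hockey stick:

    ```python
    # Toy example: weight proxies by their calibration-period correlation with a rising
    # "instrumental" record. All numbers are invented for the sketch.
    import numpy as np

    rng = np.random.default_rng(1)
    n_years, n_proxies, calib = 581, 22, 100

    instr = np.linspace(0.0, 1.0, calib)                 # toy instrumental record: a steady rise
    proxies = rng.standard_normal((n_years, n_proxies))  # 20 of the 22 proxies are pure noise
    blade = np.concatenate([np.zeros(n_years - calib), np.linspace(0.0, 3.0, calib)])
    proxies[:, :2] += blade[:, None]                     # give only 2 proxies a hockey stick shape

    # Weight each proxy by its calibration-period correlation with the instrumental record,
    # dropping negatively correlated proxies (a crude stand-in for the screening/weighting step).
    corr = np.array([np.corrcoef(proxies[-calib:, j], instr)[0, 1] for j in range(n_proxies)])
    weights = np.clip(corr, 0.0, None)
    weighted = proxies @ weights / weights.sum()
    plain = proxies.mean(axis=1)

    def modern_rise(x):
        """Modern-period mean minus pre-modern mean."""
        return x[-calib:].mean() - x[:-calib].mean()

    print("correlation-weighted composite:", modern_rise(weighted))
    print("plain average of same proxies: ", modern_rise(plain))
    ```

    The plain average of the same 22 series shows only a small rise, while the correlation-weighted composite inherits much of the shape of the two hockey stick series.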

  137. Brandon

    I found the link to screw up statistics especially interesting.

    I was hoping to find graphs from a variety of people including Mann which split the proxies up into their categories and gave the overall global temperature result for that category as far back as possible. That is to say, for example, a global tree ring temperature back to 1000 AD, global borehole temperature to 1200 AD, corals to 1500 AD, etc.

    I can not readily tell whether those graphs you posted include that sort of detail. It is the end result I am interested in at present as I wish to see how well they reflect the variability of the instrumental record plus my own reconstruction of CET to 1538. Thanks

    Tony Brown

  138. tonyb, glad to hear it. I wrote a series of posts on Michael Mann, and I think I did a good job of explaining most of the central points in a clear manner anyone can understand. It’s good to know they weren’t a complete waste of time!

    I was hoping to find graphs from a variety of people including Mann which split the proxies up into their categories and gave the overall global temperature result for that category as far back as possible. That is to say, for example, a global tree ring temperature back to 1000 AD, global borehole temperature to 1200 AD, corals to 1500 AD, etc.

    I doubt you’ll find that. Most types of proxies don’t have enough information to create a reconstruction even with the errant methodologies used in the field. Even Mann’s papers, which (mis)use the most proxies, can’t get reconstructions (that pass even his weak tests for “validity”) without tree ring data. Plus, most multiproxy reconstructions aren’t even global. Most are on a hemispheric or smaller scale.

    If you want reconstructions with a decent spatial scale that are limited to individual proxy types, you’ll need to restrict yourself to 1600+, if not later. Otherwise you’re going to be stuck with small, regional reconstructions.

    I can not readily tell whether those graphs you posted include that sort of detail. It is the end result I am interested in at present as I wish to see how well they reflect the variability of the instrumental record plus my own reconstruction of CET to 1538. Thanks

    Those series are individual proxies which are supposed to measure temperatures for particular parts of the map. There are a few different proxy types in there. If you want more information, there’s a page about the data used in MBH. It has a list of the proxies used in each step, plus data files listing those proxies. It also has a file describing the types and locations of the proxies.

    Though I can’t say how many of those proxies actually manage to reflect temperatures.

  139. Thanks Brandon

    I had hoped that someone else had already done the work for me, but sadly that rarely appears to be true!

    I’ve discounted the viability of tree rings so am now looking specifically at boreholes. It seems strange when proxies are preferred to instrument readings.

    tonyb

  140. Those proxies are preferred to instrumental records because they are instrumental records. MBH used something like 22 instrumental records as proxies.

    Mann 2008 used a bunch of records, I believe the Luterbacher series, as proxies even though the modern portion of them had been calculated from instrumental records.

    I’ve never understood how instrumental records can be used as paleoclimate proxies, but there you have it.

  141. Brandon

    The reason for my interest was a graph showing borehole data posted by Fan. It shows the Earth warming for 300 years which is very much as I read it with my own reconstructions. I find it difficult to believe however that they can derive a proper temperature signal using borehole data, but….

    The borehole data graph is the third button along the top.

    http://www.earth.lsa.umich.edu/climate/index.html

    That is why I wanted to break the paleo proxy reconstructions down into their component parts: if boreholes show a 300 year warming but the overall reconstruction they are used within shows a slight downward trend over 800 years, there must be a very cool signal coming from one or more of the other paleo proxy groups used.

    I’m busy the next few days but will hope to get on to this again at the end of the week.

    tonyb

  142. That’s an interesting site. I wish they gave more information on the reconstruction you’re talking about though. It says it is “a global perspective of surface temperature change over the last five centuries, averaged from 979 individual reconstructions,” but do they really mean they just averaged the 979 individual records? That raises all sorts of potential problems.

    Plus there are the standard questions about how they calibrate their series to temperature. I think it’s funny one page says:

    It is a kind of direct temperature – temperature study. Therefore, it is free of any uncertainties due to conversion from proxy data to temperatures.

    It’s true they don’t have to convert proxy data to temperatures as part of their reconstruction process, but that doesn’t mean it is “free of any uncertainties due to [such a] conversion.” The fact the uncertainties arose in a previous step doesn’t make them disappear for the next one.

    I definitely like the page for it providing the data. I’m just not convinced of the accuracy of their analysis of it. I guess what I ought to do is look up the papers it references.

  143. Brandon

    You have hit on one of the reasons I don’t like global averages whether of sea levels or, as in this case, temperatures.

    Adding everything together and averaging it means all the nuances are lost. For example, in a record of 100 temperatures, 80 might be cooling very slightly, 10 might be static and 10 might be warming strongly, meaning a net average that is warming.

    There are many places in the world that illustrate the term ‘global warming’ is misleading.

    I did an article on this subject several years ago

    http://diggingintheclay.wordpress.com/2010/09/01/in-search-of-cooling-trends/

    Interestingly Dr Curry has a lot of time for borehole data but much less for tree rings.

    Knowing little as yet on the former I can’t comment. Having looked in depth at tree rings I just don’t get why anyone should consider them to be a good thermometer.

    tonyb

  144. Brandon

    In your screw up 3 post you link to individual graphs of 22 proxies which are not labelled, but you show the ones used in text form here

    http://www.nature.com/nature/journal/v430/n6995/extref/PROXY/datalist1400.txt

    Can you confirm if these are all tree rings or are they a mix of proxies? Which graph relates to what location and if not a tree ring what type of proxy is it?

    I am looking at the borehole data and am trying not to duplicate the readings they give with other proxies. Mann’s work is confusing enough anyway, but when you examine the groups of proxies and then the individual ones within that grouping AND then look at the composite end result it gets even more confusing.

    Tonyb

  145. tonyb – The MBH98 proxies are enumerated and described here. While the file names are not given there, one can make educated guesses that the “quelc*” files are the Quelccaya ice core records, “svalbard.dat” is the Svalbard ice melt, “westgreen-o18.dat” is Penny Greenland O18. I think the others in that list are all tree-ring records, or PCs from a group of tree-ring records.

  146. HaroldW

    Thanks for that.

    I have done work on these before but the trouble is that one really has to get back into the right mind set again and almost start right from scratch whilst concentrating hard in order to follow all the various strands. You can’t dabble in the intricacies for a few minutes then come back to it weeks later and pick it up again straight away.

    tonyb.

  147. Tonyb, there is more than just that. Some of those are temperature reconstructions not proxies, and some are instrumental records. Some are I think precipitation proxies as well.

  148. Your saying that PCA won’t produce hockey sticks from random data doesn’t make sense. If there is enough random data, there will be a hockey stick for PCA to mine.

  149. MikeN, PCA requires a certain degree of commonality across series to extract a “signal.” It’s unlikely you’ll get a hockey stick from white noise because of the low odds of having multiple white noise series with a hockey stick shape. It’s not impossible, but it generally won’t happen. For all practical purposes, white noise will not lead to hockey stick shaped proxies via PCA, whether it’s proper PCA or MBH’s screwed up version. Red noise, where there is some persistence, will.

    That said, I believe simulations were done which show MBH’s process is biased enough you can see it with white noise if you run at least 10,000 or so simulations. If I’m remembering right, that shows it can produce hockey sticks out of white noise, just at an incredibly low rate.
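
    A rough way to check that, sticking with the same kind of decentered calculation (again just a sketch, with an arbitrary noise model, dimensions and trial count rather than anyone’s published setup):

    ```python
    # Sketch: how often does a decentered PC1 look like a hockey stick (|HSI| >= 1)
    # under white noise versus AR(1) red noise? Exact rates depend heavily on these
    # arbitrary choices; only the contrast between the two cases matters here.
    import numpy as np

    rng = np.random.default_rng(2)
    n_years, n_series, calib, n_trials = 581, 70, 79, 100

    def make_noise(rho):
        x = rng.standard_normal((n_years, n_series))
        for t in range(1, n_years):
            x[t] = rho * x[t - 1] + x[t] * np.sqrt(1.0 - rho**2)
        return x

    def decentered_pc1_hsi(data):
        centered = data - data[-calib:].mean(axis=0)   # subtract the modern-period mean only
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        pc = centered @ vt[0]
        return abs((pc[-calib:].mean() - pc.mean()) / pc.std())

    for label, rho in [("white noise", 0.0), ("red noise (rho = 0.9)", 0.9)]:
        hsi = np.array([decentered_pc1_hsi(make_noise(rho)) for _ in range(n_trials)])
        print(f"{label}: mean |HSI| = {hsi.mean():.2f}, share with |HSI| >= 1: {(hsi >= 1).mean():.2f}")
    ```

    The exact numbers depend heavily on those choices, as discussed above, but the white noise case should come out near zero while the red noise case produces far larger |HSI| values.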

  150. MikeN, given a long enough time series, you will eventually get hockey sticks from PCA using ordinary white noise. The issue is that the actual noise (that is, the portion of the series that is not a temperature signal) in the tree ring proxies will likely produce hockey stick PCs at a much higher rate than if you assumed white noise.

    In other words, using white noise is a very bad method for testing the validity of your algorithm in this case. It will likely dramatically underestimate the “false positive rate” (Type I errors) associated with the actual data set.

    Brandon, I believe Lucia performed simulations using white noise originally.

  151. I don’t remember Lucia doing any PCA simulations. Are you sure you’re not thinking of her tests for the screening fallacy? She definitely used white noise there, and you don’t need red noise for the screening fallacy to be a problem.

    (I wish I knew a more general term for the class of approaches the screening fallacy is in. Weighting series by their correlation to the instrumental record isn’t actually screening out proxies, but the two are highly related. I’ve described it as the screening fallacy on steroids, though there’s bound to be a better description.)

  152. Brandon, yes I was thinking screening fallacy. For white noise, I don’t think there’s a difference using centered PCA, is there?

    By the way, I’ve put a comment up on ATTP’s blog. Probably this was a dumb thing to do but I did it anyway. I’ve included a copy below in case they borehole my comment.

    I could have mentioned that the direct relationship between Loehle and Moberg is substantially influenced by your comments on this post. Mentioning this probably would have ensured that the comment would never see daylight; otherwise I would have.

    So I’m mentioning it here.

    ———————————————

    Since some work I did came up here, I thought a few comments were in order regarding a graph of mine that is being discussed here. What I am providing is informational. I am not interested in food fights, but if there are genuine issues that people want to raise, I will respond to those.

    First, the curve being discussed is very similar to that found by Zeke:

    http://rankexploits.com/musings/2010/comparing-proxy-reconstructions/

    who includes a comparison with Loehle.

    SKS has a similar figure:

    http://www.skepticalscience.com/ljungqvist-broke-the-hockey-stick.htm

    but does not include a comparison with Loehle.

    There is nothing remarkable about the temperature series plotted in my figure. For the ensemble, I resampled all of the temperature series to a 10-year period (cubic-spline) and computed their arithmetic mean and standard deviation.

    I did one thing that will be somewhat controversial to people who aren’t versed in climate proxy lore, which is that I rescaled the temperature proxies to match each other (hence “pseudo-temperature”), and adjusted them to a common baseline (this should be non-controversial, though food throwers can likely think of specious arguments to fling). I used Loehle because, due to the simple arithmetic average over precalibrated proxies, it is less likely to suffer from scaling bias effects. However, any of the proxies would work as well.

    I will provide some justification for this below, but briefly, if what you want to really know is global mean temperature, there are multiple reasons to not expect the temperature scales to be the same between reconstructions (different spatial sampling and scaling bias associated with the algorithms used).

    If you don’t rescale, there are two things that happen. One is that reconstructions with larger absolute scales (relative to the “true” global mean temperature scale) will be weighted more heavily than reconstructions with smaller absolute scales. The second is that the standard deviation of the ensemble will be larger. If you don’t shift to a common baseline, the main effect is an increase in the standard deviation of the ensemble.
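
    Schematically, the resampling and ensemble statistics amount to something like the following sketch (the file names, the decadal grid and the baseline window are placeholders rather than the exact values used):

    ```python
    # Minimal sketch: resample reconstructions onto a common decadal grid with a cubic
    # spline, shift each to a common baseline, then take the mean and spread. File names
    # and the grid are placeholders; in practice the grid would be restricted to the
    # years each series actually covers.
    import numpy as np
    from scipy.interpolate import CubicSpline

    def load_series(path):
        """Assumed format: two-column text file of year and temperature anomaly."""
        data = np.loadtxt(path)
        return data[:, 0], data[:, 1]

    grid = np.arange(0, 2001, 10)  # common 10-year grid (illustrative)

    resampled = []
    for path in ["moberg2005.txt", "ljungqvist2010.txt", "loehle.txt"]:  # hypothetical files
        years, temps = load_series(path)
        y = CubicSpline(years, temps)(grid)
        y -= y[(grid >= 1000) & (grid <= 1900)].mean()   # shift to a common baseline
        resampled.append(y)

    ensemble = np.vstack(resampled)
    mean = ensemble.mean(axis=0)             # arithmetic mean across reconstructions
    spread = ensemble.std(axis=0, ddof=1)    # standard deviation of the ensemble
    ```

    The regression-based rescaling onto a common temperature scale is the extra step on top of this; leave it out and, as noted above, the spread of the ensemble simply comes out larger.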

    In terms of “cherry picking”, my selection criterion was for 2000-year global or north hemispheric reconstructions. I believe that I’ve included all peer-reviewed reconstructions that met this criterion.

    Mann 2008 stops at 200 AD and was not included in earlier versions of the graph for that reason. I got interested in how it did, and was pleased to see that it tracked well with the other long-duration reconstructions.

    The only other reconstructions of similar length are Christiansen and Ljungqvist (2012) and Hegerl et al. (2007). C&L seemed “too close” to L2010 so I stuck with L2010, and Hegerl cuts off at 558 AD (a bit too high for what I was looking at).

    The more recent PAGES 2k isn’t global or north hemispheric, so not usable for a meta-analysis.

    MBH 98 was added in response to comments from Boris on Lucia’s blog. I was making what should be an obvious point to any critical-thinking person, and is known anyway from the literature, which is that MBH 98 suffers from a complete loss of low-frequency information. (This makes the use of MBH98 to make comparisons about the relative warmth of the modern era to the MWP totally useless.)

    Regarding Mann 2008, the EIV is Mann’s preferred reconstruction. From Mann’s paper:

    By contrast, we find in these experiments that the EIV reconstructions are significantly more skillful, given a particular synthetic data network. Where the two methods no longer yield reconstructions that agree within uncertainties, it is therefore likely that the EIV reconstruction is the more reliable, although with the caveat that this finding has been demonstrated only under the assumptions implicit in the pseudoproxy analyses (e.g., that proxies have a linear, if noisy, relationship with local temperature variations). For this reason, we place greatest confidence in the EIV reconstructions, particularly back to A.D. 700, when a skillful reconstruction as noted earlier is possible without using tree-ring data at all.

    I did look at Mann’s CPS as well, and as Mann observes, for the early period it does not agree well with his EIV method. As I see it, there is very little value in including non-preferred reconstructions in a meta-analysis of this sort.

    Moberg is an obvious inclusion. It even gets the RC stamp of approval:

    http://www.realclimate.org/index.php/archives/2005/02/moberg-et-al-highly-variable-northern-hemisphere-temperatures/

    Ljungqvist, by any standard I’ve seen, is one of the more carefully done reconstructions, so its inclusion is a no-brainer. In the metrics I looked at, it has met or exceeded all of the other reconstructions. I’ll show one result below.

    Regarding Loehle, there is some confusion about this, partly due to Craig Loehle’s own words and his badly flawed first cut at this paper. Because it cuts off in 1935, Loehle and McCollough does not demonstrate that temperatures in the MWP were warmer than current temperatures.

    If you compare Loehle & McCollough to Moberg, analytically there is virtually no difference between what Loehle did and what Moberg did, for the low frequency portion of Moberg’s reconstruction, other than a difference in the weighting of proxies.

    Secondly, the argument over coverage for low-frequency applies equally to Moberg. Moberg has 11 low-frequency proxies (9 of these are used by Loehle). Loehle uses 9 more, all of which are considered to be temperature proxies and have published calibration values by their authors. While I agree that we can and should quibble over which proxies should be used (but I think Gavin is not the right person to be relying on for proxy selection, nor am I), fundamentally there is nothing wrong with the approach used in this paper. In fact, its relative simplicity and good agreement with more complex algorithms, rather than contradicting other work, act as a form of verification that the more complex algorithms are not losing low-frequency information.

    Loehle is limited to a low-frequency reconstruction (the rolloff is around a 50-year period) because of proxy selection, for which we expect and observe (with a few notable exceptions) a high degree of correlation between measurements at different locations on the Earth for long-duration measurements. Since I was only looking at the low-frequency portion of the reconstructions, issues raised about the spatial sampling of Loehle (which are similar in any case to the issues that exist for the other reconstructions) are largely irrelevant.

    When considering the effect of spatial sampling on the low-frequency portion of the reconstruction, we need to be aware of the effects of polar and land amplification on the estimated global (or hemispheric) reconstruction. As most of you know, land warms (and cools) more rapidly than ocean due to its lower thermal mass. For reasons I can partly explicate, polar (land) regions are more sensitive to changes in forcing than more tropical ones.

    The effect that sparse sampling has, for any reconstruction’s low-frequency portion of the signal, is that there will be a scaling bias for the “temperature scale” of a given proxy reconstruction compared to the temperature scale associated with global temperature.

    In addition, methods like Composite-Plus-Scale are prone to an overall scaling bias and offset bias for the reconstruction period. (Offset bias occurs when the temperature scale during the reconstruction period is offset relative to the temperature scale of the calibration period.)

    There are issues for high-frequency reconstructions because different areas start responding out of phase with respect to each other. However this is a fairly high-frequency phenomenon and most reconstructions get around this issue by low-pass filtering and only retaining periods longer than 10 years.

    Nonetheless, we can get some idea by looking at spectra of the various reconstructions (I used a 200-year window with Welch tapering for the reconstructions), which I am showing below:

    While the reconstructions have been scaled to match the low-frequency portion of Loehle & McCollough, the two temperature series are shown with no scale adjustment. Given uncertainties in the relationship of the pseudo-temperature scale to the global temperature scale, the level of agreement was a bit surprising to me.

    Also note that Moberg does not agree well with the other high-frequency reconstructions. Loehle predictably rolls off steeply below a 50-year period.

    Ljungqvist and Mann 2008 EIV appear to agree well with each other (given uncertainty) and with the global temperature series, both in slope and in magnitude.
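
    For anyone unfamiliar with the terminology, here is a minimal sketch of one way to compute such spectra (the 50% overlap, mean removal and normalization are assumptions made for the sketch, not necessarily what was done for the figure):

    ```python
    # Sketch: average periodograms over overlapping segments, each tapered with a Welch
    # (parabolic) window. dt is the sampling interval in years.
    import numpy as np

    def welch_taper(n):
        i = np.arange(n)
        half = (n - 1) / 2.0
        return 1.0 - ((i - half) / half) ** 2        # parabolic taper, zero at both ends

    def spectrum(x, dt, window_years=200):
        seg = int(round(window_years / dt))          # e.g. 20 samples for decadal data
        w = welch_taper(seg)
        step = max(seg // 2, 1)                      # 50% overlap (an assumption)
        psd = np.zeros(seg // 2 + 1)
        count = 0
        for start in range(0, len(x) - seg + 1, step):
            s = x[start:start + seg]
            s = (s - s.mean()) * w                   # remove the segment mean, then taper
            psd += np.abs(np.fft.rfft(s)) ** 2
            count += 1
        psd *= dt / (count * np.sum(w ** 2))         # rough normalization
        freqs = np.fft.rfftfreq(seg, dt)             # cycles per year
        return freqs, psd
    ```

    The series has to be at least one window long for this to work; shorter reconstructions would need a shorter window, at the cost of losing the lowest frequencies.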

  153. Carrick, as I understand it, there’s actually a small bias toward creating hockey sticks when you use white noise because white noise can generate series with something of a hockey stick shape for PCA to pick out. It’s the same effect as with red noise, just more uncommon and far more muted.

    For the comment you submitted, do you remember which thread you posted it in? I’m curious if they let it stand or not. I don’t see anything in it which should bother them (since you left me out of it), but there’s no telling when people apply moderation arbitrarily. I could go look through the recent threads, but I’m sober, and I’m not sure I could take it.

    By the way, that comment reminded me of something I meant to ask before. When making that graph you show, did you try examining different subsets of the reconstructions to see if the results were consistent throughout? I’m curious because some reconstructions have different numbers of proxies for different periods, and it’d be interesting to know how (or if) that affects their spectra. That’s especially true for the Mann 2008 reconstruction you show as it uses a significant amount of instrumental data (calling into question its agreement with the instrumental temperature record).

    Another thing I’m curious about is how the rescaling you performed affects the agreement you show. My impression is the agreement at periods of 100 years and above is largely guaranteed by the rescaling, but I’m not sure just how the lower periods are affected.

  154. Oh lord. I just tried looking for your comment Carrick, and one of the first I skimmed through had this remark by KR:

    AndyL – The Tiljander sediments were discussed as a potential issue in MBH, the bristlecones were an acceptable (although discussed) proxy at the time and have been further validated since then. Again, you are shot-gunning nonsense.

    Donald Graybill, the guy who collected the bristlecone data, specifically warned we couldn’t tell if they were appropriate for use as a temperature proxy. KR says they “were an acceptable… proxy at the time” despite this, offering no explanation why Graybill’s concerns were unjustified.

    The NAS panel specifically said bristlecones should be avoided in temperature reconstructions. KR says they “have been further validated since then” without explaining why specific warnings against using the data “validated” them.

    I’m getting the impression KR thinks anything which gives an answer he likes “validates” his answer.

  155. Brandon, it was on this thread. Apparently I have been quietly banned from the site as it has not appeared yet. Glad I cross-posted here. If these people continue with their off-base comments, I suppose a front-page blog post to clear the air is warranted.

    When making that graph you show, did you try examining different subsets of the reconstructions to see if the results were consistent throughout?

    I did a correlational analysis against Ljungqvist:

    Another thing I’m curious about is how does the rescaling you performed affect the agreement you show? My impression is the agreement at periods of 100 and above are largely guaranteed by the rescaling, but I’m not sure just how the lower periods are affected.

    I’m glad you asked, because I got a detail wrong. I said:

    I used Loehle because, due to the simple arithmetic average over precalibrated proxies, it is less likely to suffer from scaling bias effects. However, any of the proxies would work as well.

    I computed a single scaling constant and offset by regressing the series against Ljungqvist from 1000-1900. It wouldn’t make sense to use Loehle for the regression, because it’s a lower quality reconstruction and SNR matters here.

    Anyway, here’s the constants I found:

    SERIES OFFSET SCALE
    LoehleMcC.temp 0.24 1.200
    ljungqvist2010 0.00 1.000
    mann08.temp.cps -0.28 0.448
    mann08.temp.eiv 0.10 1.014
    mann1998 -0.11 0.513
    moberg2005 -0.04 1.071

    Also, you probably won’t ever use this, but here is an archive of the files for the ensemble. I’ve added a Readme.txt explaining some of the files.

  156. I should have mentioned that these are the regression coefficients for each series compared to Ljungqvist 2010. To compute the scaled version of the series, you use the formula:

    Tscaled = (T – offset)/scale
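
    In code terms, the fit and the rescaling amount to something like this sketch (the variable names are placeholders, and the series are assumed to already sit on a common grid covering the 1000-1900 overlap):

    ```python
    # Sketch of the regression-based rescaling: fit series ~ offset + scale * reference,
    # then map the series onto the reference scale with Tscaled = (T - offset) / scale.
    # `series` and `reference` are assumed to be numpy arrays on the same grid.
    import numpy as np

    def offset_and_scale(series, reference):
        scale, offset = np.polyfit(reference, series, 1)  # slope and intercept of the fit
        return offset, scale

    def rescale(series, offset, scale):
        return (series - offset) / scale

    # Hypothetical usage, with Ljungqvist 2010 as the reference:
    # offset, scale = offset_and_scale(mann08_eiv, ljungqvist2010)
    # mann08_eiv_scaled = rescale(mann08_eiv, offset, scale)
    ```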

    By the way, if we were to regard Loehle as being an accurate temperature scale, we can redo the series-offset-scale table as:

    SERIES OFFSET SCALE
    LoehleMcC.temp 0.00 1.000
    ljungqvist2010 -0.24 0.833
    mann08.temp.cps -0.52 0.373
    mann08.temp.eiv -0.14 0.845
    mann1998 -0.35 0.428
    moberg2005 -0.28 0.892

    I don’t know if you followed the discussion on Lucia’s blog about scale and offset biases. While that discussion relates to CPS, what we saw was typically a negative temperature offset and scaling biases less than one.

    It is curious that we see the same thing here.

  157. Carrick, thanks. I had actually wondered about the details of how you made the figure showing MBH is an outlier, but it never stood out enough that I thought to ask. It seems it was about what I thought.

    The correlational analysis you posted is interesting to me. I think it does well to show modern reconstructions are converging to a particular picture. Of course, I don’t think that means the picture is right. As you probably know, I don’t think results in the field converging means anything more than that’s the “accepted” result. I think if different people were working in the field, we might well find their results converge to a different picture.

  158. Oh, on the moderation thing. I’m not sure if you’re banned or not. I’ve heard from a couple of people that Anders disappeared comments they wrote without any note. That includes two people who had never commented there before. In neither of those cases were they “banned.” Their comments just went into the moderation queue for a while then vanished.

    When you’re truly banned, your comments won’t even show up in the moderation queue. That’s the thing to look for. And even then, you can’t assume anything off one comment. To be safe, you need to try a second comment which is short and doesn’t contain any words (or names) which might be filtered.

    It wouldn’t surprise me if you are banned, but I’m very hesitant to cry “censorship.” There are too many quirks which can come into play. As an example, you may remember Nick Stokes was having trouble at all WordPress blogs because he had been flagged as a spammer by Akismet.

    Incidentally, if you would ever like to post something related to the hockey stick debate which is being censored elsewhere, you’re welcome to here. This isn’t a high traffic blog, but I’d be happy to let people write posts here if it might help move discussions forward.
