Denying the Obvious

Sometimes people deny something that seems so obvious I don’t know how to respond. Today on Twitter, several people have been discussing issues related to paleoclimate reconstructions. The details aren’t important for this post. What is important is one participant, Jim Bouldin, claimed people have falsely accused others of cherry-picking. He specifically mentioned Steve McIntyre, so I tried to elicit more information, asking:

The second tweet pointed out one person even used the phrase cherry-picking to describe what they had done, including a link to this post to support that claim:

I don’t have the exact words here. (I’ll edit it if I get better notes.) But, for certain, D’Arrigo put up a slide about “cherry picking” and then she explained to the panel that that’s what you have to do if you want to make cherry pie.


The other link in the tweet is to a post quoting a person who said:

We strive to develop and use the best data possible. The criteria are good common low and high-frequency variation, absence of evidence of disturbance (either observed at the site or in the data), and correspondence or correlation with local or regional temperature. If a chronology does not satisfy these criteria, we do not use it. The quality can be evaluated at various steps in the development process. As we are mission oriented, we do not waste time on further analyses if it is apparent that the resulting chronology would be of inferior quality.

If we get a good climatic story from a chronology, we write a paper using it. That is our funded mission. It does not make sense to expend efforts on marginal or poor data and it is a waste of funding agency and taxpayer dollars. The rejected data are set aside and not archived.

As we progress through the years from one computer medium to another, the unused data may be neglected. Some [researchers] feel that if you gather enough data and n approaches infinity, all noise will cancel out and a true signal will come through. That is not true. I maintain that one should not add data without signal. It only increases error bars and obscures signal.

It’s difficult to imagine a more straightforward admission of cherry-picking. One person openly said you have to cherry-pick in order to make cherry pies. The other person said data which doesn’t give the expected/desired results is not kept.

Bouldin’s response to my tweets was:

Let’s check that claim. The first person I’ve referred to is Rosanne D’Arrigo. Her Columbia University profile says she works at the “Tree Ring Lab.” In fact, she’s a senior researcher at a university Tree Ring Research Laboratory. I think it’s safe to say she’s “in dendroclimatology.” The same is true for the other person I referred to, Gordon Jacoby, who previously worked at the same lab.

I could also cite the publication history of these two, or the tree ring data they’re responsible for, but I think it’s indisputable they are “in dendroclimatology.” I think it’s also indisputable one of them openly described what she had done as cherry-picking. I think it’s also quite clear that this:

Fifteen years is not a delay. It is a time for poorer quality data to be neglected and not archived.

Is an admission of cherry-picking. Even if one chooses not to use some data they collect, there is no excuse for destroying that data. I think it’s obvious that is cherry-picking.

Why does Jim Bouldin deny it? I don’t know. Maybe he thinks destroying data which doesn’t give the expected/desired results is okay. Maybe, somehow, he thinks that isn’t cherry-picking. Maybe he’s just in denial and refuses to look at the links people provide to support their claims.

I don’t know, but the “denial” idea seems most likely right now. After Bouldin denied my accusation, I responded:

He didn’t respond. He kept talking on Twitter, but he ignored this tweet and a couple other tweets of mine. That’s when I began writing this post. I decided not to post this though, thinking I’d try on Twitter again in order to prevent things from escalating more than necessary. As such, I tweeted:

There was a little interlude, then Bouldin tweeted:

I don’t know how one could argue the examples I discussed in this post aren’t examples of dendroclimatologists admitting to cherry-picking. Maybe I’m missing something though. That doesn’t matter. Whether or not one agrees with my interpretation of the evidence, it is absolutely undeniable I provided that evidence.

And yet, Bouldin denied it anyway. When I pointed out his denial was silly, he tweeted:

I agreed, telling him I’d have a post up within the hour. This is it, and this is the “coherent and succinct statement of [my] position”:

That data gives an expected answer does not mean that data is correct. That data fails to give an expected answer does not mean that data is incorrect. One cannot simply ignore data which fails to give an expected answer. One certainly cannot destroy data which fails to give an expected answer so only “good” data exists.

Gordon Jacoby and Rosanne D’Arrigo openly admitted to only caring about data which gave them the results they expected. That is an open admission of cherry-picking.


16 comments

  1. One more example of cherry picking, by Esper this time.

    http://climateaudit.org/2006/03/07/darrigo-making-cherry-pie/#comment-45703

    However as we mentioned earlier on the subject of biological growth populations, this does not mean that one could not improve a chronology by reducing the number of series used if the purpose of removing samples is to enhance a desired signal. The ability to pick and choose which samples to use is an advantage unique to dendroclimatology. That said, it begs the question: how low can we go?

    It is important to know that at least in distinct periods subsets of trees deviate from common trends recorded in a particular site. Such biased series represent a characteristic feature in the process of chronology building. Leaving these trees in the pool of series to calculate a mean site curve would result in a biased chronology as well. However if the variance between the majorities of trees in a site is common, the biased individual series can be excluded from the further investigation steps. This is generally done even if the reasons for uncommon growth reactions are unknown. [Esper et al., 2003]

    My emphasis.

  2. Thanks Les Johnson. I had forgotten about that one. I think you should have emphasized a bit more though. My favorite part of that quote is:

    this does not mean that one could not improve a chronology by reducing the number of series used if the purpose of removing samples is to enhance a desired signal.

    I’d love to hear Jim Bouldin explain how saying it’s okay to remove data in order to “enhance a desired signal” is not admitting to cherry-picking.

  3. Bouldin might plausibly argue that the referenced comment by Rosanne D’Arrigo was not “evidence” but “hearsay”. If the actual slide or text of an article by D’Arrigo making the same comment were available, the cite would be better than a cite to someone’s inexact recollection and recounting of that comment.

    I myself am not picky about hearsay, supposing that it’s a form of evidence even if a less reliable form than direct testimony. But the legal system which excludes this lesser form provides a basis for the very, very picky to reject claims based on such evidence.

    Otherwise it’s all pronoun-sensitive: I’m selective, you’re inconsistent, she’s cherry-picking.

  4. Pouncer, it doesn’t really help either as a defense.

    One can reasonably ask how in the world it came to be that the testimony of the witnesses was not recorded.

    That’s patently ridiculous.

    I’ve had a few run-ins with Jim, usually when he didn’t want to bother wasting time being accurate in what he claimed.

  5. Carrick, I’ve noticed Jim Bouldin seems to like dismissing things as “incoherent” when he simply doesn’t feel like dealing with them. An example which amuses me is he said the screening = cherry-picking concept is “strongly promoted at McIntyre’s blog.” After mentioning the examples I discuss in this post, I asked:

    Bouldin responded:

    Apparently the concept is “strongly promoted” yet so “incoherent” he can’t even say what it is. I’m not sure how that works.

    By the way, you should read this tweet of his. It’s not relevant to the topic of this post as it is about the screening fallacy in general, but it’s fascinatingly weird.

  6. Is not the issue with Bouldin, and many others who obfuscate rather than debate, their ‘belief’ system? Namely, that on-message results justify decisions made by researchers and their auditors.

    Bouldin ‘believes’ that the decision makers did the right thing, so debate is unnecessary, nay even a waste of time.

    A statement like this

    As we are mission oriented, we do not waste time on further analyses if it is apparent that the resulting chronology would be of inferior quality

    could translate to

    As we are mission oriented, we do not waste time extracting complex signal from noise if it is apparent that the resulting chronology would be off message

    Quis custodiet ipsos custodes?

  7. Gras Albert, I don’t know if what you describe is true for Jim Bouldin. It is certainly true for Gordon Jacoby (the guy you quote) and many others, but I think we may be seeing something else with Bouldin. I don’t get the impression Bouldin is saying these things to defend “the message.”

    Which isn’t to say I think Bouldin’s comments are made for legitimate reasons. For instance, his continued defense of the screening in Mann 2008’s CPS methodology is wrong. It is indisputable that screening in CPS introduces a bias. It is easy to figure out why and trivially easy to demonstrate. Even if one hadn’t been aware of the issue, Bouldin was provided a link to lucia’s demonstration of the bias multiple times. Despite that, when I criticized the screening, Bouldin’s only response has been:

    Actually, I need to take that back. He also addressed the issue by tweeting a single word, “Bullshit.” I don’t think that’s any better than the red herring in the tweet I show though. Bouldin could have attempted to address the argument. Instead, he hand-waved it away and changed the subject. That prevented him from having to deal with simple demonstrations of bias in the methodology he defends.

    But the problem goes beyond that. Bouldin’s tweet referred to an argument nobody in the conversation had made. When asked about this, he claimed to have gotten it from Steve McIntyre. I asked for a citation, saying I don’t think McIntyre has ever made that argument.* He, of course, ignored that. The reality is McIntyre has often discussed hockey sticks arising from Michael Mann’s improper implementation of PCA, but PCA (in any form) is not simply “screening.”

    As far as I can see, Bouldin simply, and willfully, ignores things he doesn’t want to deal with. He willfully ignored the demonstration of the bias he denies exists. This post shows he willfully ignored the evidence I provided, to the point he denied me having provided it. And after willfully ignoring things inconvenient to his views, he likes to call what people say “incoherent” as though it’s their fault he fails to understand things he refuses to look at.

    I don’t get it. I think it’s stupid. I think it indicates a serious problem, at least in his ability to hold a reasonable conversation. I just don’t see it as him trying to defend “the message.” I don’t even know what “message” he’d be defending with this sort of behavior.

  8. “Fifteen years is not a delay. It is a time for poorer quality data to be neglected and not archived.”

    “Is an admission of cherry-picking. Even if one chooses not to use some data they collect, there is no excuse for destroying that data. I think it’s obvious that is cherry-picking.

    Why does Jim Bouldin deny it? I don’t know. Maybe he thinks destroying data which doesn’t give the expected/desired results is okay. Maybe, somehow, he thinks that isn’t cherry-picking. Maybe he’s just in denial and refuses to look at the links people provide to support their claims.”

    #################

    Devil’s advocate is a fun game, so I will play devil’s advocate.

    First, what is cherry picking?

    Let’s start with Wikipedia:

    “Cherry picking, suppressing evidence, or the fallacy of incomplete evidence is the act of pointing to individual cases or data that seem to confirm a particular position, while ignoring a significant portion of related cases or data that may contradict that position. It is a kind of fallacy of selective attention, the most common example of which is the confirmation bias.[1] Cherry picking may be committed intentionally or unintentionally. This fallacy is a major problem in public debate.[2]”

    I would argue that the first meaning is the one most people think of: we cherry-pick the good evidence, the evidence that confirms our belief. The second of course follows from the first, in that if we choose to look at a subset of data, we are also ignoring the rest.

    The quote above (“Fifteen years is not a delay. It is a time for poorer quality data to be neglected and not archived.”) is not evidence of cherry picking. The data are being rejected and dumped because they are of low quality, not because they disconfirm the hypothesis.

    A simple example: suppose I want to investigate whether Republicans are stupid, so I give IQ tests to 1,000 Republicans.

    900 of the tests come back completely filled in and signed; the names and signatures match voter registration, so they are Republicans. 100 of the tests come back with names and signatures that don’t match voter registration.

    Now I do my report. For starters, I throw out the 100 because they are of poor quality; checking them, I see that they indicate Republicans have two-digit IQs. Then I look at my 900 and decide to focus only on the samples that come from the rural South. I cherry-pick the data.

    Deciding to rule data out because it is of low quality ISN’T ON ITS FACE cherry picking. Of course, if “low quality” is OPERATIONALLY determined by a test that focuses on my hypothesis, then it is effectively cherry picking. So, for example, if I looked at those 100 samples with bad signatures and used 50 of them because they had low IQ scores, then I would be using QC to cherry-pick.

    Destroying data that doesn’t meet QC is not cherry picking. It might not be best practice. Using QC as a ploy to identify data you don’t like is a different matter.

    So, on its face, “Fifteen years is not a delay. It is a time for poorer quality data to be neglected and not archived.” is not evidence of cherry picking. ON ITS FACE it’s not evidence of anything. How about: “Fifteen years is not a delay. It is a time for data that doesn’t fit our hypothesis to be neglected and not archived.”? That might be evidence that one should look at what they did, but it too is not evidence of cherry picking. Actual evidence of cherry picking requires access to all the cherries, not just comments about the cherries.
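    The distinction drawn above, filtering on a criterion independent of the outcome versus filtering on the outcome itself, can be sketched with a toy simulation. Everything here (the numbers, the QC setup) is hypothetical, my own illustration rather than anything from the comment:

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.normal(100, 15, size=100_000)  # true mean is 100 by construction

# Case 1: a QC flag independent of the score (e.g. a bad signature),
# dropping a random 10% of the sample.
qc_ok = rng.random(scores.size) > 0.1
independent_qc_mean = scores[qc_ok].mean()

# Case 2: "QC" that secretly keys on the outcome, keeping only low scores.
outcome_filter_mean = scores[scores < 100].mean()

# The first estimate stays near the true mean; the second is biased low,
# even though both discards could be described as "quality control."
print(independent_qc_mean, outcome_filter_mean)
```

    Independent QC leaves the estimate near the true mean of 100, while outcome-keyed filtering drags it well below. The selection rule, not the act of discarding, is what makes it cherry-picking, and only the discarded data lets anyone check which rule was actually applied.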

  9. Brandon, your Jacoby quote is an excerpt from a Jacoby letter posted up in one of the first Climate Audit posts here http://climateaudit.org/2005/02/06/jacoby-1-a-few-good-series/.

    Jacoby and D’Arrigo 1989 (Climatic Change) had stated that they had sampled 36 northern boreal forest sites within the preceding decade, of which the ten “judged to provide the best record of temperature-influenced tree growth” were selected. I requested the editor of Climatic Change to require Jacoby to provide the data that he had selected out, so that I could test the significance of the Jacoby HS relative to screening red noise. (This was one of the earliest CA observations that you could get a HS from persistent red noise simply by screening against a trend.)

    Jacoby refused to provide the data. His explanation included the statement that one of the criteria for not keeping a chronology was lack of “correspondence or correlation with local or regional temperature”:

    The inquiry is not asking for the data used in the paper (which is available), they are asking for the data that we did not use. We have received several requests of this sort and I guess it is time to provide a full explanation of our operating system to try to bring the question to closure.

    Speaking for myself and immediate colleagues who have been involved with my research: Most of our research has been mission-oriented, dendroclimatic research. That means to find climatically-sensitive, old-aged trees and sample them in order to extend the quantitative record of climatic variations. Also, to relate these records to the real world and investigate the climate system and its functioning.

    The first part produces absolutely-dated time series of tree-ring variations. We try to sample trees at sites where there is likely to be a strong climatic signal, usually temperature or precipitation. Sometimes we are successful, sometimes we are not. We compare the tree-ring series to climate records to test what the climate signal is. We sample latitudinal treeline and elevational treeline looking for temperature-sensitive trees with both a high-frequency and low-frequency response to temperature. A high-frequency temperature response to summer is most frequently found at these extreme locations. However, trees have much more information if one finds trees with a good communal high and low frequency variations that correspond or correlate to local or regional temperatures for longer seasons. There is abundant information to explain the physiological processes in cooler seasons and why trees can respond to more than just summer season. The sampling and development of a tree-ring chronology is an investment of research energy, time, and money.

    The best efforts in site selection and sampling do not always produce a good chronology. It is only as the samples are processed and analyzed that the quality, or lack thereof becomes evident. First is the dating: this is enabled by high-frequency common variation among the trees. The dating is achieved and tested by various methods. Then the chronology is developed from the correctly dated ring-width measurements and evaluated. Testing: Is there a common low-frequency signal among the trees? At a good temperature- sensitive site with good trees, there is. We conduct common period analyses of the low- frequency variation within the cores samples from a site.

    Sometimes, even with our best efforts in the field, there may not be a common low-frequency variation among the cores or trees at a site. This result would mean that the trees are influenced by other factors that interfere with the climate response. There can be fire, insect infestation, wind, or ice storm etc. that disturb the trees. Or there can be ecological factors that influence growth. We try to avoid the problems but sometimes cannot and it is in data processing that the non-climatic disturbances are revealed.

    We strive to develop and use the best data possible. The criteria are good common low and high-frequency variation, absence of evidence of disturbance (either observed at the site or in the data), and correspondence or correlation with local or regional temperature. If a chronology does not satisfy these criteria, we do not use it. The quality can be evaluated at various steps in the development process. As we are mission oriented, we do not waste time on further analyses if it is apparent that the resulting chronology would be of inferior quality.

    If we get a good climatic story from a chronology, we write a paper using it. That is our funded mission. It does not make sense to expend efforts on marginal or poor data and it is a waste of funding agency and taxpayer dollars. The rejected data are set aside and not archived.

    As we progress through the years from one computer medium to another, the unused data may be neglected. Some [researchers] feel that if you gather enough data and n approaches infinity, all noise will cancel out and a true signal will come through. That is not true. I maintain that one should not add data without signal. It only increases error bars and obscures signal.

    As an ex-marine I refer to the concept of a few good men.

    A lesser amount of good data is better without a copious amount of poor data stirred in. Those who feel that somewhere we have the dead sea scrolls or an apocrypha of good dendroclimatic data that they can discover are doomed to disappointment. There is none. Fifteen years is not a delay. It is a time for poorer quality data to be neglected and not archived. Fortunately our improved skills and experience have brought us to a better recent record than the 10 out of 36. I firmly believe we serve funding agencies and taxpayers better by concentrating on analyses and archiving of good data rather than preservation of poor data.

    Unsurprisingly, I thought Jacoby’s argument was ludicrous, writing at the time:

    It would be my position that, if they picked 10 of 36 sites, they used all 36 sites in their study. Imagine this argument in the hands of a drug trial. Let’s suppose that they studied 36 patients and picked the patients with the 10 best responses, and then refused to produce data on the other 26 patients on the grounds that they didn’t discuss these other patients in their study. It’s too ridiculous.

    I appealed to both Climatic Change and NSF without success on either front. Jacoby recently died. While the amount of archived material has considerably expanded since 2005, it remains incomplete and I wonder whether Jacoby’s unarchived work will ever be located.

    At the time, I also learned http://climateaudit.org/2005/02/09/the-updated-gaspe-series/ that updated data had been obtained at Gaspe that did not show the big HS of the series used in MBH98 (and Jacoby and D’Arrigo 1989) but had not been reported. I asked Jacoby and D’Arrigo either to archive or provide me this data. They refused on the basis that they preferred the earlier (big HS) series. My efforts to get the data were flatly refused. The newer data was blended into D’Arrigo et al 2006, but they refused to provide it even then. It was quietly archived in 2012, over 20 years after collection and six years after publication of D’Arrigo et al 2006 (by which time they could say that the article was out of date).

    Jacoby and D’Arrigo (and Briffa) were among the first authors to try to use tree ring chronologies as temperature proxies – prior to them, they had mostly (though not always) been used as precipitation proxies. Many years later, tree ring chronologies from Jacoby-D’Arrigo and Briffa remain disproportionately represented in multiproxy studies (e.g. PAGES2K Arctic) and within those collections, they are more “successful” in obtaining chronologies with a significant difference between 20th century (blade) and shaft values.

  10. Steve McIntyre, yup. I actually linked to that post. I’ve always found it incredible. If you only look for series which correlate to your desired answer, you’re bound to get your desired answer. If you then destroy the rest of the data, nobody can ever hope to check the validity of the answer you came up with.
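    As a rough illustration of this point (my own sketch, not code from Climate Audit or lucia), one can generate pure red noise containing no temperature signal at all, screen it for correlation with a modern warming trend, and watch a hockey stick appear in the average of the survivors:

```python
import numpy as np

rng = np.random.default_rng(42)
n_series, n_years, phi = 1000, 200, 0.7

# Persistent AR(1) "red noise" pseudo-proxies with no signal whatsoever.
shocks = rng.normal(size=(n_series, n_years))
series = np.zeros((n_series, n_years))
for t in range(1, n_years):
    series[:, t] = phi * series[:, t - 1] + shocks[:, t]

# "Screen": keep only series correlated with an upward trend over the final
# 50 "years" (the 0.3 threshold is arbitrary, chosen for this sketch).
trend = np.arange(50.0)
r = np.array([np.corrcoef(s[-50:], trend)[0, 1] for s in series])
kept = series[r > 0.3]

# The survivors' average is flat early on and rises at the end: a hockey
# stick manufactured entirely by the screening step.
shaft = kept[:, :150].mean()
blade = kept[:, -10:].mean()
print(f"kept {kept.shape[0]} of {n_series}; shaft {shaft:.2f}, blade {blade:.2f}")
```

    Since the blade here is purely a selection artifact, the only way to detect it in a real study is to examine the full population of series that were screened, which is exactly what becomes impossible once the rejected data are destroyed.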

    And sadly, the idea that you aren’t “using” data if you filter it out seems to have become widely accepted. It’s not even limited to paleoclimate. It’s been used to justify not releasing data for Lewandowsky’s work, and it’s how Cook et al’s defenders justify never having released data for several hundred of the papers they rated.

    It’s disturbing to think it could be accepted in many other parts of climate science as well. There are lots of topics I don’t follow. Who knows how often people use such a crazy argument?

    Jacoby and D’Arrigo (and Briffa) were among the first authors to try to use tree ring chronologies as temperature proxies – prior to them, they had mostly (though not always) been used as precipitation proxies. Many years later, tree ring chronologies from Jacoby-D’Arrigo and Briffa remain disproportionately represented in multiproxy studies (e.g. PAGES2K Arctic) and within those collections, they are more “successful” in obtaining chronologies with a significant difference between 20th century (blade) and shaft values.

    I’ve always found it darkly amusing cherry-picked series like those consistently find their way into these studies while many other potential series don’t. At some point I think they’ll find a limit to what they’re willing to do to get the “right” answer.

    Then again, maybe they won’t. I’ve seen the response to Soon & Baliunas where the two were criticized for conflating temperature and precipitation… by people who repeatedly use precipitation proxies as temperature proxies (sometimes explicitly so).

  11. That is an interesting coincidence. To tell the truth, I had barely even looked at the head post that exchange happened in. Until you mentioned it, I had no idea it was about anything Stephan Lewandowsky had written.

    That I mentioned Soon & Baliunas is another strange coincidence. The Lewandowsky article that post was about refers to a previous article, also by Lewandowsky, in which he gives the standard narrative about it being a garbage paper and people resigning because of that.

    I regret not having paid more attention now. The series of articles discussed in that post shows an… interesting worldview.

  12. I’ve been meaning to do a retrospective on Soon and Baliunas for a while, as CG2 provides a lot more insight than CG1. Surely one of the most despicable (and undercommented) Climategate incidents was Wigley’s observation (CG2-682) during the anti-Soon-Baliunas campaign that they appeared to have a point on precipitation proxies but the important thing was to ensure that they were denied credit:

    Well put! By chance SB03 may have got some of these precip things right, but we don’t want to give them any way to claim credit.

  13. Yup. I hadn’t paid much attention to the Soon and Baliunas controversy until sometime in the last couple years. I regret that. I think the events surrounding that paper are some of the most troubling I’ve seen in climate science.

    That it’s still spun in favor of the people who behaved horribly is despicable.
