TCP Reanalysis, An Example

Hey guys. I believe the performance issues with the site hosting my TCP Reanalysis project should be fixed now. With that taken care of, I figure now would be a good time to discuss the sorts of results this project can generate.

For one example, we can see raters rated two papers in opposite ways. Here is the summary of endorsement ratings for two papers (lower value indicating stronger endorsement of the “consensus”):

id 1 2 3 4 5
7  0 1 3 1 0
38 0 2 4 1 0

And quantification ratings:

id 1 2 3
7  0 4 1
38 0 7 0

As you can see, the raters don’t agree. Both papers were rated as endorsing and rejecting the greenhouse effect (with one also being rated as minimizing the effect). To see which rating is right, we can look at the abstracts. Number seven, Aerosol Size Confines Climate Response To Volcanic Super-eruptions, says:

Extremely large volcanic eruptions have been linked to global climate change, biotic turnover, and, for the Younger Toba Tuff (YTT) eruption 74,000 years ago, near-extinction of modern humans. One of the largest uncertainties of the climate effects involves evolution and growth of aerosol particles. A huge atmospheric concentration of sulfate causes higher collision rates, larger particle sizes, and rapid fall out, which in turn greatly affects radiative feedbacks. We address this key process by incorporating the effects of aerosol microphysical processes into an Earth System Model. The temperature response is shorter (9-10 years) and three times weaker (-3.5 K at maximum globally) than estimated before, although cooling could still have reached -12 K in some midlatitude continental regions after one year. The smaller response, plus its geographic patchiness, suggests that most biota may have escaped threshold extinction pressures from the eruption.

As far as I can see, the abstract neither endorses nor rejects the greenhouse effect. I don’t see any reference to it. To check my interpretation, here are three comments left by raters:

Temperature changes linked to volcanic eruptions.
It’s all about volcanos
Modeling study. If the Earth system is less responsive to aerosols, then climate sensitivity may be lower also.

I agree with each of those comments, but I don’t believe any of them make a case for rating the paper as anything other than neutral. I’m sure the paper itself takes a position, but I don’t think the abstract gives us enough information to be sure what position that is.

Abstract 38 says:

Conspicuous global stable carbon isotope excursions that are recorded in marine sedimentary rocks of Phanerozoic age and were associated with major extinctions have generally paralleled global stable oxygen isotope excursions. All of these phenomena are therefore likely to share a common origin through global climate change. Exceptional patterns for carbon isotope excursions resulted from massive carbon burial during warm intervals of widespread marine anoxic conditions. The many carbon isotope excursions that parallel those for oxygen isotopes can to a large degree be accounted for by the Q10 pattern of respiration for bacteria: As temperature changed along continental margins, where similar to 90% of marine carbon burial occurs today, rates of remineralization of isotopically light carbon must have changed exponentially. This would have reduced organic carbon burial during global warming and increased it during global cooling. Also contributing to the delta(13)C excursions have been release and uptake of methane by clathrates, the positive correlation between temperature and degree of fractionation of carbon isotopes by phytoplankton at temperatures below similar to 15 degrees, and increased phytoplankton productivity during ice-house conditions. The Q10 pattern for bacteria and climate-related changes in clathrate volume represent positive feedbacks for climate change.

One rater left this comment:

Assumes significant warming from CO2 and that increased CO2 and CH4 during warm periods creates a positive feedback.

Unfortunately, the rater who rated the abstract as rejecting the consensus didn’t leave an explanation. At the moment, I’m inclined to agree with the comment above. I could be wrong about either abstract though. With this system, we can look at the disagreements for ourselves and discuss them.
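
As a rough illustration of how disagreements like these could be flagged automatically, here is a minimal Python sketch. It uses only the endorsement counts from the tables above. The cutoff of 3 as the neutral rating is my assumption, based on the description of the scale, and the function names are made up for this example rather than taken from the project's code.

```python
# Endorsement rating counts per paper, copied from the tables above.
# Index 0 holds the count for rating 1, index 4 the count for rating 5.
endorsement_counts = {
    7: [0, 1, 3, 1, 0],
    38: [0, 2, 4, 1, 0],
}

NEUTRAL = 3  # assumption: 3 is the neutral midpoint of the 1-5 scale


def summarize(counts):
    """Count how many raters endorsed, stayed neutral, or rejected."""
    endorse = sum(c for rating, c in enumerate(counts, start=1) if rating < NEUTRAL)
    neutral = counts[NEUTRAL - 1]
    reject = sum(c for rating, c in enumerate(counts, start=1) if rating > NEUTRAL)
    return {"endorse": endorse, "neutral": neutral, "reject": reject}


def is_contested(counts):
    """A paper is contested if at least one rater endorsed and one rejected."""
    s = summarize(counts)
    return s["endorse"] > 0 and s["reject"] > 0


contested = [pid for pid, counts in endorsement_counts.items() if is_contested(counts)]
print(contested)  # prints [7, 38]
```

A stricter rule, such as requiring two raters on each side, might be more useful once more ratings come in.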

We can look at other things as well. For example, here are three different raters who have notably different rating patterns (rater ids removed):

1  2  3  4
6  5  2  0
0  0 15  0
6  6  8  0

The differences might change as the sample size increases, but this shows how easy it can be to check for systematic differences. If there were more data available right now, I’d even be able to make graphs showing the influence of each rater.
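
One simple way to put a number on those differences is to compare each rater's distribution of ratings to the pooled distribution across all three. The sketch below does that with the counts from the table above. Total variation distance is just one measure I picked for illustration, not something the project currently uses, and with samples this small the values should be read loosely.

```python
# Rating counts per rater, copied from the table above.
# Columns are ratings 1-4; rows are the three anonymous raters.
rater_counts = [
    [6, 5, 2, 0],
    [0, 0, 15, 0],
    [6, 6, 8, 0],
]


def proportions(counts):
    """Convert raw counts into shares that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def total_variation(p, q):
    """Half the sum of absolute differences between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Pooled distribution: column totals across all raters, then normalized.
pooled = proportions([sum(col) for col in zip(*rater_counts)])
distances = [total_variation(proportions(row), pooled) for row in rater_counts]

for i, d in enumerate(distances, start=1):
    print(f"rater {i}: distance from pooled distribution = {d:.3f}")
```

A distance of 0 would mean a rater's pattern matches the pooled pattern exactly, while values closer to 1 mean the rater uses the scale very differently from the group.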

And all that can be done with little effort. With more effort, it could even be made automated and integrated into a web page. I think that’d be pretty cool.

In other news, I’m considering changing the sample set available for rating later today. I’m thinking it’d be more interesting if I only offered papers rated by Cook et al as taking a position. That would prevent people from having to read so many papers which say nothing relevant. What do you guys think?

Also, if I do that, should I leave this set available too? I could leave this up as a demo set for people to play with/practice on/discuss standards with. It might be useful to have something like that when designing guidelines for this.



  1. You might consider some statistic for measuring internal consistency of ratings. Consult with somebody who’s knowledgeable about this since I’m not expert enough to recommend anything.

    As for the demo set of papers, it would be better to have a sample that covers the whole range of rating responses from opposition to endorsement, with irrelevant thrown in to clear the palate. You don’t want to bias this rudimentary “training” exposure with just one flavor. A demo serves to frame the exercise so that early ratings don’t suffer as much from naivete or other influences.

  2. Gary, internal consistency is one of the big reasons I want to do this project. I find it fascinating how many ways there are to look for patterns. There are obvious checks of internal consistency, like Cohen’s Kappa, but there are much more nuanced approaches. I would love to get to use them. And while I’m confident I know a number of interesting approaches, I’d always be open to hearing new ones. I’m not going to ask around just yet though. I’m still not sure how much effort this project is worth.

    On the issue of the demo, I agree this sample isn’t ideal for training purposes. What I may do is set it up as a demo, but add in more entries to balance out the distributions.

  3. I’m guessing the first paper was rated by Cook as endorsing? After all, it says ‘global climate change’ and thus assumes it exists.

  4. MikeN, nope. It’s actually kind of interesting. Cook et al only rated 10 of these 50 papers as taking a position (all endorsing the “consensus”). The other 40 were rated neutral. Participants in the reanalysis have seemed to rate more papers as endorsing than Cook et al. Only 15 of the 50 have been universally rated as neutral.

    Part of that may be because of the reconciliation phase Cook et al used, but it’s something I want to look at more closely. It’d be interesting if we found Cook et al excluded many papers which should have been included.

  5. Brandon
    Why don’t you try a split sample? One third rates all papers, one third rates Cook’s 1-3+5-7, and one third rates Cook’s 4s.

    I would suspect that the third batch of raters would get frustrated with the irrelevants and throw in information where there isn’t any.

  6. It’d take a large sample set to make that viable, and it would be unfair to the raters. Why should one rater have to get a less interesting sample set than the others?

    The best I could see doing is creating three sample sets like you describe then assigning people papers from a random sample set whenever they load the page. That wouldn’t really address people getting fed up though.
