Skeptical Science Leak Was Not a Hack

I’m calling it. The leaked Skeptical Science forum was not due to a hack.

I admit I can’t know that for sure. It’s impossible to prove a negative. However, I can observe the fact we have no evidence to support the idea this was because of a hack. I mean that. There is no evidence. What Skeptical Science claims to be evidence isn’t.

We can see this by applying some critical analysis to Bob Lacatena’s latest post, especially its update. Before we do though, I want to point out I’m cutting the snark for this post. This is us viewing his argument with pure analysis, no insults:

I misunderstood the timeline of the logging feature in the Skeptical Science code. The feature was actually implemented in early 2010. It does not, however, substantively change the reasoning. It would not have been possible to construct the deleted comments and user table contents from the logs, as they include information prior to 2010.

If you haven’t read his post or followed the discussions elsewhere, Skeptical Science had an unprotected directory on its website named logs/. This directory was created in 2010, and it logged every SQL query made at Skeptical Science from that point on. It stored each day’s logs in a file anyone could access if they had the right URL. We can see the URL of one such file was stored by the WayBack Machine (an internet archiving service). Anyone could have found that URL by trawling the WayBack Machine’s entries for Skeptical Science.

That means anyone could have stumbled upon the publicly accessible Skeptical Science logs which tracked every SQL query from 2010 on. With that in mind, let’s examine what the supposed hacker released:

1) A forum going back to March 2010.
2) A simple .gif file created in March, 2010.
3) A comment.html file, containing deleted comments back to 2007.
4) A users.csv file containing users back to the site’s beginning (in 2007?).

The notable thing here, and something which always puzzled me, is the forum only went back to March of 2010 even though it must have existed prior to that (it was an active forum and there are no introductory posts for it). If a hacker stole the entire Skeptical Science database, why would he have only released a subset of the forum comments? And why is it that subset just so happens to cover the exact period Skeptical Science had publicly viewable logs for?

That’s a heck of a coincidence. Or maybe it isn’t. We have only two items showing the hacker had information from prior to March 2010. He had a file listing deleted comments, and he had a file listing user information. Both of these files contained information from prior to March, 2010. According to Bob Lacatena, they are our proof the leak wasn’t generated just from the publicly viewable logs.

But are they? Does the fact an individual had information from prior to 2010 mean he stole more than the logs of transactions from 2010 on?

No. It is perfectly possible the information in those two files was re-transmitted sometime after its creation. For example, what if John Cook corrupted his user database somehow? What would he do? Odds are he’d restore it from a backup. That’s easy to do with SQL. Cook could have uploaded an entire table in one command. And if he did, that entire table could show up in a log of his SQL queries.
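To make the restore scenario concrete, here is a minimal Python sketch. Everything in it is hypothetical: the table layout, the usernames, the dates and the run_logged helper are invented for illustration, and none of it reflects how Skeptical Science's code actually worked. It only shows that a query log started well after 2007 can still end up holding 2007 records once old rows are re-sent as new statements.

```python
import re
import sqlite3

# Hypothetical stand-in for a site that writes every SQL statement to a dated log.
query_log = []  # list of (log_date, sql_text) pairs


def run_logged(conn, log_date, sql):
    """Record the statement text under that day's log, then execute it."""
    query_log.append((log_date, sql))
    conn.execute(sql)


conn = sqlite3.connect(":memory:")
run_logged(conn, "2011-05-02",
           "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, registered TEXT)")

# A restore from backup: old rows are re-sent as literal values in new statements.
# (SQL is built by string formatting here only to mimic what a text log would hold.)
backup_rows = [
    (1, "first_user", "2007-06-03"),
    (2, "second_user", "2008-11-19"),
    (3, "third_user", "2010-04-27"),
]
for user_id, username, registered in backup_rows:
    run_logged(conn, "2011-05-02",
               f"INSERT INTO users VALUES ({user_id}, '{username}', '{registered}')")


def years_in_logged_sql(log):
    """Collect the year of every quoted YYYY-MM-DD literal found in the logged SQL."""
    years = set()
    for _log_date, sql in log:
        years.update(int(y) for y in re.findall(r"'(\d{4})-\d{2}-\d{2}'", sql))
    return years


earliest_log_day = min(day for day, _sql in query_log)
earliest_record_year = min(years_in_logged_sql(query_log))
print(earliest_log_day, earliest_record_year)
```

In this toy setup the log itself only begins in 2011, yet anyone reading it can rebuild a user table reaching back to 2007. Whether anything like this happened at Skeptical Science is exactly what we cannot verify.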

Similarly, suppose John Cook kept a copy of all deleted comments in an unwieldy format. Now suppose he eventually decided to store them in a table in his database so it was more convenient. Is it unreasonable to think he might have used SQL to transfer those comments some time after they were first made? Of course not.

There are dozens of possible reasons John Cook might have retransmitted data after that data was first created. He could have been performing housekeeping, restructuring his database, restoring a backup, or any number of other things. All of these could have resulted in pre-2010 information being stored in logs created in 2010 or later. With that in mind, we should revisit the comment which leaked the Skeptical Science forum. Specifically, we should note it says:

These files detail everything that happens on the site, from forum conversations to user accounts. I have collated some of the data in a more readable form.

Why has SkS chosen to publish all this on the public internet? Is it the first step towards transparency, or a catastrophic error? This is what I first intended to ask Mr. Cook.

Thankfully I realized what my question would have looked from the climate ethics perspective- highly inappropriate and unethical. I would have been seen as a denialist attacking Mr. Cook’s work with these bizarre claims about database logs.

This “leak” is just a format conversion of already public material.

Bob Lacatena claims “this statement is a falsehood,” but we have no evidence to indicate such. This person’s story fits perfectly with the evidence we have. It’s possible the story is a lie, but it’s also possible the story is the truth.

The only other argument Bob Lacatena has to claim this was a hack is his narrative of the hack which supposedly happened to Skeptical Science. In this narrative, Lacatena refers to a list of evidence, but none of that evidence can be seen or verified by anyone. All we have is his word it exists. We cannot assume his narrative is correct based upon nothing but his word.

Moreover, even if Lacatena’s narrative is true, nobody has done anything to link that supposed hack to the release of the Skeptical Science forum. Skeptical Science could have been hacked by one person while another person released material he gathered from publicly accessible logs (after reformatting it so it was legible).

We cannot know the truth. Skeptical Science refuses to provide us any actual evidence. As such, all we can go on is what we can see and read. When we do so, and when we examine it, the only conclusion that can be reached is there is no conclusive evidence as to what happened.



  1. Yeah. This so sounds plausible. The “two things happening at the same time” theory was already in the cards when we read parts I and II. Now that they admit the logs started in 2010, that would seem to explain why the forum logs don’t go back before 2010. Otherwise, the “break” at that point is inexplicable. If “The German” got the whole database, why not divulge stuff before 2010? Why would he pick 2010– the date that seems to just happen to coincide with the date when John Cook started posting all the SQL requests publicly?

    And as you note: if Cook re-generated his user file (for whatever reason) that would end up in these SQL logs– and that would include “old” stuff.

    So, the “disclosure was a leak” theory is still not knocked out of contention. I’m back where I was before which is “we don’t know”.

  2. The crazy part is they’ve posted something like eight thousand words now, and they haven’t provided a story that supports their view much less a single shred of evidence to make us believe it.

    I find that mind-boggling.

  3. From Bishop-Hill

    Mar 25, 2012 at 5:01 PM | Tom Curtis
    “For this to have not been a hack, that means the forum must have been open to the public since mid 2010, which is simply not the case”

  4. The SkS server was down for maintenance only a week before the database was leaked. If that maintenance required a full restore of the database then the “hacker” didn’t need to be watching the log directory for very long.

    From Facebook

    Your site was down earlier, just FYI.
    March 13, 2012 at 3:52pm · Like

    Skeptical Science Scheduled maintenance. I believe the glossary plug-in may be being implemented.
    March 13, 2012 at 4:09pm · Like

  5. 4) A users.csv file containing users back to the site’s beginning (in 2007?).

    Precisely what info was contained in User.csv? How do we know it’s back to the beginning?

    Note: User.csv file was incomplete. Also, in March 2010, Cook announced there had been a hack of some sort and suggested everyone change their password. The reaction to that could, hypothetically, have resulted in numerous people (especially regulars) visiting and changing their user information. When one updates, they first log in. They get to the update page and then enter “Username, password, email and (optionally) url”. Lots of people might have updated.

  6. Haven’t read the whole thing – sorry don’t have the attention span right now – but just want to pick up on something.

    “(it was an active forum and there are no introductory posts for it)”

    If you search for *.* in the forum folder you get all files and folders contained. Order this list by name (not date). It’s rather convenient they used a file naming convention which places the date first formatted such that alphabetic is also date ordered.

    The first such named file is “2010-08-08-Welcome to the Authors Forum.html”. It starts:

    Welcome to the Authors Forum. This is a private forum available only to Skeptical Science users who’ve been upgraded to Author status. There are several ways to use the forum. [list follows]

    Within a week of that we have “2010-08-14-Getting to know Skeptical Science authors.html”, where everyone introduces themselves.

    So to me it does seem the tree hut started 2010-8-8.

  7. Lucia –
    That would certainly peg the irony meter, if Cook’s 2010 concern about hacking caused passwords to be changed and thereby exposed them.

  8. With the benefit of some sleep after a long drive to ski country I understand your post more clearly this AM.

    Nothing to add except to say this is much more plausible than the tale we’ve been reading.

  9. lucia, there’s a user ID (auto-incremented integer), username, user level (I believe this is tied to forum access), e-mail address and registration date for each person. There’s also binary entries for scientist and moderator. Other fields were filled in for some people, such as first name, last name, website and bio. I believe that’s all the file contained.

    As for how we know it goes back to 2007, we can see registration dates that far back, and since they’re given with an auto-incremented integer, we can tell the order is right. John Cook is the first user (though his registration date got changed somehow), and several other known early members follow shortly after. One interesting thing about the file is that the ID column shows when values are missing. The missing values in that column refer to data which was either deleted from the database, removed by the person who leaked the data, or never collected by him.

    HaroldW, that’s actually a common trick for stealing information. If you can’t get somebody’s login information, trick them into resetting it and monitor them when they do. It’s especially worrying because of how easy it can be to listen in on phone calls or watch someone type via binoculars or the like.

    DGH, glad to hear it! I was always suspect of their story, but until this recent admission, I couldn’t articulate a plausible alternative because of the lack of information.

  10. Danderson, first off, sorry about the delay in approving your comment. I slept in today.

    As for what you observed, interesting. I wouldn’t have figured the Authors subforum would predate the rest. Different people had different amounts of access to the forum so I just figured that introductory post was for a subset of the people they had invited. It turns out they must have created the tiered system sometime after the forum was first created.

    That means we do have the entire forum. That means it isn’t a coincidence we have what we have. In that case, I’d say we don’t have affirmative evidence the leak wasn’t due to a hack. We’re back to being stuck with no real evidence in either direction.

  11. Brandon,

    That means we do have the entire forum. That means it isn’t a coincidence we have what we have. In that case, I’d say we don’t have affirmative evidence the leak wasn’t due to a hack. We’re back to being stuck with no real evidence in either direction.

    It also means that the MySQL statements creating comments, logging in and so on at the Forum were always published in logs– from day 1 of its existence.

    The users.csv sounds like it contains more information than one would expect to get reading SQL queries involved in changing passwords…. maybe. Presumably the registration dates would not be involved in any update command. But of course this is not certain. The precise command used to update the database sort of depends on what the person coding thought to do. (One must never discount the possibility that things were coded in an unusual or even needlessly complex way that passes more information than required to create a successful update. Both can happen.)

  12. OK, I’ve never really looked at it in terms of sub-forums except for purposes of categorization. We do find in the Authors ‘Welcome’ John describes reviewing and writing rebuttals being a feature, and they do appear in their own sub-forum folders. Also “This is a private forum available only to Skeptical Science users who’ve been upgraded to Author status” implies that becoming an author is the primary-first promotion step for normal public forum users (who then access the tree hut).

    I’m seeing now there are other perspectives wrt this but still feel it’s more probable by Authors they mean the whole lot (minus Admin – in Admin you can read about that special access).

  13. lucia, yup. That’s the hilarious part. Even if we accept Skeptical Science was hacked, and even if we accept the one responsible for the hack is the one who released the forum, the information released is still (pretty much?) just information Skeptical Science had already made publicly accessible.

    That means the most they can say is they were hacked by a guy who released information they had already released.

    Ooh. So scary.

  14. Danderson, there are a number of people who commented in that forum but have never written an article for Skeptical Science. That’s why I took it as referring not to everyone, but only to authors. It seems strange to think they might have an “Authors” subforum which gives access to non-authors. They might have though.

    Incidentally, John Cook has previously said the forum is tiered. As I recall, he’s said translators have access only to the Translators subforum and most people don’t have access to the Moderation or Admin subforums.

    None of that really matters though. I agree the entire forum was released. I just misunderstood its initial structure.

