Survey Response Analysis, Replicable Results

Some people doubt the results I published in an earlier post are valid. To help reconcile views, I offer code and data in an immediately accessible form.

I use the R programming language. I highly recommend it to anyone wishing to do statistical work. Its learning curve is not insignificant, but it is far more powerful than something like Excel. For people who are used to mathematical programming languages, I still recommend R as it is open and free.

I’ve previously posted the responses from my survey. This is a cleaned version: I’ve removed two incomplete responses and added headers. Each header corresponds to a question as follows (with a 1-5 disagree/agree scale for the first eight questions and a 1-5 bad/good scale for the last five):

Survey_Headers

In R, you can load the data with the command (make sure the file is in your working directory):

responses = as.matrix(read.csv("Survey_Data_Clean.xls"))

The as.matrix command is not necessary to read the data. It merely reformats the data into a more convenient form for a few future steps. Before we get to them though, use this command:

if (!require("Hmisc")) { install.packages("Hmisc"); library("Hmisc") }

This command ensures the Hmisc package is loaded. That package provides us the rcorr command, which we’ll then apply to the data:

calc = rcorr(responses)
correlations = calc$r
significance = calc$P

The rcorr command returns the pairwise correlations for all columns of data, along with a p-value for each pair. To make things simpler, we put each into its own variable. Rather than compare the two tables by eye, we’ll use a simple command to combine them:

ifelse(significance<.01,round(correlations,2),NA)

This command displays every correlation that is significant at the 99% level (p<.01) and replaces the rest with NA. To make things more legible, it rounds the displayed correlations to two decimal places.
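To see what that filter does in isolation, here is a minimal sketch using two small made-up matrices in place of calc$r and calc$P. The labels A, B and C and every number are invented for illustration; they are not survey results. The p-value matrix has NA on its diagonal, which is how rcorr reports it as far as I know, so the diagonal drops out of the result too.

```r
# Made-up correlation and p-value matrices standing in for calc$r and calc$P.
labels <- c("A", "B", "C")
correlations <- matrix(c(1.000, 0.452, 0.131,
                         0.452, 1.000, 0.164,
                         0.131, 0.164, 1.000),
                       nrow = 3, byrow = TRUE,
                       dimnames = list(labels, labels))
significance <- matrix(c(NA,     0.0004, 0.2100,
                         0.0004, NA,     0.0300,
                         0.2100, 0.0300, NA),
                       nrow = 3, byrow = TRUE,
                       dimnames = list(labels, labels))

# Keep a rounded correlation only where p < .01; everything else becomes NA.
filtered <- ifelse(significance < .01, round(correlations, 2), NA)
print(filtered)
```

Only the A/B pair survives the filter. The B/C pair has p = .03, which would pass a 95% threshold but not the 99% one used here.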

From here, one can do any sort of testing one might want. R offers a lot of options, and to show one, we can reproduce the results published in the previous post. All it takes is this command:

rcorr(responses[,c("Real","Threat","Infallible","Genocide")])
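The column selection inside that call is ordinary indexing by column name. A small sketch with an invented matrix shows the behaviour on its own (the names Q1 to Q3 and the values are hypothetical, not the survey data):

```r
# Invented 4 x 3 matrix with named columns.
m <- matrix(1:12, nrow = 4,
            dimnames = list(NULL, c("Q1", "Q2", "Q3")))

# Select columns by name; the order given is the order returned.
subset_m <- m[, c("Q3", "Q1")]
print(colnames(subset_m))
print(dim(subset_m))
```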

Adding and removing columns is just as easy:

rcorr(responses[,c("Threat","Genocide","Aliens")])

Which gives the table:

         Threat Genocide Aliens
Threat     1.00     0.13   0.16
Genocide   0.13     1.00   0.45
Aliens     0.16     0.45   1.00

This shows that believing global warming is a serious threat has a statistically significant correlation with believing one has been abducted by aliens.
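Any single entry in a table like this can be cross-checked without Hmisc. The sketch below uses base R's cor and cor.test on simulated data; the variables x and y, the seed, and the sample size of 100 are assumptions made for illustration and say nothing about the survey itself. For a Pearson correlation, cor.test should give the same p-value that rcorr stores in its P matrix.

```r
# Simulated stand-in data; NOT the survey responses.
set.seed(1)
x <- rnorm(100)
y <- 0.3 * x + rnorm(100)

r <- cor(x, y)            # Pearson correlation coefficient
test <- cor.test(x, y)    # t-test of the null hypothesis that the true correlation is 0

cat("r =", round(r, 2), " p =", signif(test$p.value, 2), "\n")
cat("significant at p < .01:", test$p.value < .01, "\n")
```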
