This is the second post in a row commenting on something Rob Salmond has posted over at Polity. I’m not trying to pick a fight – he’s just being a bit unfair to Key Research in this instance, and he has also managed to touch on something that grates on me slightly. As I’ve blogged before, the potential sources of error in a survey are numerous – anyone pointing to just one or two is seeing only the tip of the iceberg.
In this post, Rob calls the recent Herald on Sunday/Key Research poll a rogue poll. He offers three points in support of this:
- The sample of 500 was small.
- Key Research only weighted by age and gender.
- The results were odd.
Let’s take these one at a time.
Is a sample of 500 small?
No, it’s not. I’ve posted on this before.
The ideal sample size depends on what you want to do with the results. A sample size of 500 is still fairly robust at the overall sample level. As can be seen in the chart, the maximum margin of error at the 95% confidence level for a sample of 500 (roughly +/- 4.4 percentage points) is only about 1.3 points larger than for a sample of 1,000 (roughly +/- 3.1 points).
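The arithmetic behind that comparison is straightforward to sketch, assuming simple random sampling and the worst-case proportion of 50% (the function name here is mine, not anything from the poll's methodology):

```python
import math

def max_margin_of_error(n, z=1.96):
    """Maximum margin of error for a proportion at the 95% confidence
    level (z = 1.96), using the worst case p = 0.5, so p(1-p) = 0.25.
    Assumes simple random sampling."""
    return z * math.sqrt(0.25 / n)

for n in (500, 1000):
    print(f"n = {n}: +/- {max_margin_of_error(n) * 100:.1f} percentage points")
# n = 500:  +/- 4.4 percentage points
# n = 1000: +/- 3.1 percentage points
```

Note the square root: to halve the margin of error you need four times the sample, which is why doubling from 500 to 1,000 only buys you about 1.3 points.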
The main disadvantages of a sample of 500 are:
- Relative to a sample of 1,000, your sub-group analyses will be less robust.
- Relative to a sample of 1,000, results over time will fluctuate more due to random sample variation.
If you’re not comparing results over time, and it’s mainly the overall result you’re interested in (rather than the results for subgroups), then a sample size of 500 is a useful one. Yes, the Herald story did claim the result for Labour was an increase over the previous one a year ago (the sample size was 1,000 for that poll, by the way). Assuming other aspects of the methodology were consistent, the difference in the result for Labour between the two polls is statistically significant at the 99% confidence level. It is fair for the Herald to claim the results are different between the two polls.
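For readers curious how a claim like that is checked, a standard approach is a two-proportion z-test on the two poll results. This is a minimal sketch of that test; the 36% and 28% figures below are made up for illustration and are not the actual Labour results from either poll:

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Z statistic for the difference between two independent sample
    proportions, using a pooled estimate of the common proportion."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical figures for illustration only -- not the actual poll results.
z = two_proportion_z(0.36, 500, 0.28, 1000)
print(f"z = {z:.2f}")
# A |z| above 2.58 is significant at the 99% confidence level (two-tailed).
```

With an 8-point gap on samples of this size, z comes out well above 2.58, which is the kind of calculation that lets you call a change significant at the 99% level.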
A sample size of 500 is not a good reason for saying a poll is rogue.
Key Research only weight by age and gender
Firstly, Key Research say their results are representative by age and gender. They don’t say they weight by age and gender. It’s entirely possible they solely used fieldwork quotas, so there was no post-stratification weighting involved at all (this is similar to the method used by Reid Research for the TV3 poll).
Secondly, if Key Research did only weight by age and gender, how is this bad exactly? All survey researchers need to make a careful decision when weighting their data. Weighting corrects for bias, but it increases variance (the margin of error). Basically, you need to inspect your data, collected via your given methodology, and make a decision about whether it’s worth sacrificing variance to correct for a certain bias.
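The variance cost of weighting can actually be quantified. A standard diagnostic is Kish's effective sample size: the more your weights depart from 1, the smaller your sample effectively becomes. The weights below are invented purely for illustration, not taken from any Key Research data:

```python
def effective_sample_size(weights):
    """Kish's approximation of the effective sample size after weighting:
    (sum of weights)^2 / (sum of squared weights). Heavier weighting
    inflates variance, shrinking the effective sample."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Unweighted: every respondent counts equally.
print(effective_sample_size([1.0] * 500))  # 500.0

# Hypothetical mild weights correcting an under-represented group.
weights = [0.8] * 250 + [1.2] * 250
print(effective_sample_size(weights))  # about 481
```

Mild demographic weights cost little; aggressive weighting by many variables can shrink a sample of 500 much further. That trade-off is exactly why a pollster might sensibly stop at age and gender.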
I have no issue at all if Key Research decided to weight by only age and gender. This is not a good reason for saying a poll is rogue.
The results were odd.
Yes, they were unusual relative to both the recent Roy Morgan poll and the trends we’ve observed for minor party results for some time. This is a valid reason for suspecting a poll is rogue. It’s not proof, because all polls contain error and no measurement is precise. But based on what we’ve seen recently I’d be a bit cautious about the conclusions I draw from this poll. Personally I’d call the results interesting, rather than rogue, because I don’t really know much about the poll methodology and, well, at times the Roy Morgan results seem a wee bit more volatile than other continuously running surveys I’ve seen.
So if I agree, what was it that troubled me?
What troubled me was Rob’s assessment of the poll method based on just two high-level details, one of which was an assumption. Neither of those points, on its own or in combination with the out-of-the-ordinary results, is evidence that the poll is rogue or that the method is bad. Based on the information we have, the unusual results themselves are the only good reason for suspecting this might be a rogue poll.
It would be great to have more information about how Key Research conducted their poll. That’s the only suggestion I would offer them. It would be good, for instance, to know if and how they filtered party vote preferences by likelihood to vote.
One of the reasons I write this blog is to convey that the pollsters and survey researchers I know (including those working for competing companies!) take their jobs very seriously. They don’t just sit around calling up people from the phone book. Some spend hours thinking about sources of error, and considering ways to reduce it, cancel it out, or otherwise adjust for it. They won’t always get it right, but that’s the nature of measurement in a context where there are so many variables!
Anyone at all can bang a survey together. It’s very difficult to do a survey well.