In trying to explain the results of the Scottish Referendum, the Telegraph, a major British newspaper, claims that the unemployed supported independence. The evidence? Aggregate results that show Scottish counties with high unemployment rates supporting independence at a higher rate than counties with low unemployment rates. Not to be outdone, the BBC, the source of record in Britain, has argued that party affiliation did not predict voting behavior in the referendum. The evidence? Counties with Labour members of parliament (the party was strongly against Scottish independence) were no more likely to support continued union than counties with SNP representatives (the Scottish nationalist party pushing for independence). Both examples are straightforward cases of an ecological fallacy – aggregate-level data is being used to explain individual-level behavior – a big no-no in the social sciences. The question I want to address is why such fallacies are so common (and they are certainly not limited to the British media, as anyone paying attention to CNN post-election coverage has undoubtedly figured out).
How obvious are the ecological fallacies in post-referendum coverage? Let’s examine the Telegraph story. In that article, we see a chart comparing unemployment rates in three counties where the independence vote had majority support with unemployment rates in three counties where the No vote dominated. There is seemingly a strong correlation. The three Yes counties have an unemployment rate of roughly 10%, while the three No counties average roughly 5%. Based on this data alone, the author claims that the unemployed disproportionately voted Yes. However, the data shows no such thing. It is possible that the unemployed voted Yes, while the employed swung in the opposite direction. Based on the aggregate-level data, it’s also possible that 55% of both the employed and the unemployed voted for independence in the three Yes counties. We might have good reason for believing one of those interpretations over the other, but those reasons are external to the data presented. Without individual-level data, we simply have no way to test the accuracy of each interpretation.
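To make the point concrete, here is a minimal Python sketch. It takes the rough figures above (10% unemployment and a 55% Yes share in a Yes county) and shows that very different individual-level stories add up to the same county total. The function names and the three scenarios are hypothetical illustrations, not numbers from the Telegraph or from any poll. The second function computes the widest range of Yes rates among the unemployed that the county total allows, sometimes called the method of bounds.

```python
def aggregate_yes_share(unemployment_rate, yes_among_unemployed, yes_among_employed):
    """County-wide Yes share implied by the Yes rates of the two groups."""
    return (unemployment_rate * yes_among_unemployed
            + (1 - unemployment_rate) * yes_among_employed)


def bounds_on_unemployed_yes(unemployment_rate, county_yes_share):
    """Range of Yes rates among the unemployed consistent with the county total."""
    u, y = unemployment_rate, county_yes_share
    lower = max(0.0, (y - (1 - u)) / u)  # extreme case: every employed voter chose Yes
    upper = min(1.0, y / u)              # extreme case: every employed voter chose No
    return lower, upper


# Hypothetical scenarios: (Yes rate among unemployed, Yes rate among employed).
scenarios = [
    (0.55, 0.55),  # both groups vote identically
    (1.00, 0.50),  # the unemployed vote Yes unanimously
    (0.10, 0.60),  # the unemployed overwhelmingly vote No
]

for yes_unemployed, yes_employed in scenarios:
    share = aggregate_yes_share(0.10, yes_unemployed, yes_employed)
    print(f"unemployed {yes_unemployed:.0%}, employed {yes_employed:.0%} "
          f"-> county Yes share {share:.0%}")

print(bounds_on_unemployed_yes(0.10, 0.55))
```

All three scenarios produce a 55% county-wide Yes share, and the bounds come out as 0 to 1. Under these assumed figures the aggregate number says nothing at all about how the unemployed voted.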
The question is why, other than a lack of statistical sophistication, such analyses are so common in major news publications. I would argue that the rationale mirrors that of the opponents of the scientific approach in international relations, Hedley Bull chief amongst them. In simple terms, the fallacious logic produces results that are consistent with common sense. We could think of many reasons why people who are poor and/or unemployed would want to live under a different political system, particularly when that means living in a polity far more supportive of redistribution, as is the case in Scotland. We might seek to confirm or refute this common sense with individual-level polls, but those have not been particularly accurate for such a rare event.
The bigger problem, as those studying international relations know all too well, is that there is often a common sense explanation that argues the exact opposite. The unemployed might be more risk-averse, for instance, and prefer the certainty of current benefits to the potential of larger benefits offered by a future independent Scottish government. It is also entirely possible that the employed middle class in poverty-stricken areas might be unhappy with the quality of Britain’s services and thus opt for an independent Scotland, even if they are personally doing fine. By sticking to common sense rhetoric at the expense of statistical accuracy, the media are reinforcing the common sense approach to politics and cheating their viewers/readers out of a more nuanced debate about why some people might want to preserve a 300-year union and others might not. With that said, the US is only a month and a half from an election that will produce its own abundance of ecological fallacies. To expect otherwise would be to defy common sense.
So what do we know about the individual-level determinants of voting in this referendum? The quick answer is that we knew very little before the vote, but know more now thanks to post-election polls. First, we know that age matters, but likely due to an intervening variable: political affiliation. 16- and 17-year-olds, who don’t have the right to vote in British elections and tend to lean to the left, were allowed to take part in the referendum as part of a compromise made by the British Prime Minister and the leader of the SNP. A post-election poll shows this group supporting independence by a 3-1 margin, but the sample size for this age cohort is only 14. A survey with a much larger sample size, albeit taken before the spike in support for independence and including 14-17-year-olds, actually showed under-18s supporting continued union by a large margin. At the other extreme, we know that voters over the age of 65 overwhelmingly opposed independence, but those voters also tend to support non-nationalist parties, particularly the strongly pro-union Conservative Party. This is particularly ironic given that one “common sense” reason for Scottish independence is the supposed mistreatment of Scotland by Margaret Thatcher. In fact, the people who would have suffered from that mistreatment are currently in their 50s and 60s.
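How little can a sample of 14 tell us? The sketch below computes a 95% Wilson score interval for a proportion. It assumes that a 3-1 margin means roughly 75% support and that the 14 respondents can be treated as a simple random sample; both are my simplifications, not details reported by the pollster.

```python
import math


def wilson_interval(proportion, n, z=1.96):
    """Wilson score interval for a sample proportion (z=1.96 gives roughly 95%)."""
    denom = 1 + z**2 / n
    center = (proportion + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(proportion * (1 - proportion) / n
                               + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumed inputs: 75% Yes (a 3-1 margin) among 14 respondents.
low, high = wilson_interval(0.75, 14)
print(f"95% interval for Yes support: {low:.1%} to {high:.1%}")
```

Under these assumptions the interval stretches from just under 50% to about 90%, so a sample this small cannot even rule out an evenly split cohort.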
On the topic of political affiliation, the Ashcroft poll shows that political affiliation did matter, and quite substantially so, contrary to the BBC article relying on aggregate data. In Table 2, we see that 86% of SNP supporters voted for independence, in contrast to 37% of Labour supporters. Did employment status matter as much as suggested by the Telegraph aggregate data? While this question can’t be answered directly (due to a lack of data), we can see the reasons that the Yes supporters provided for wanting independence. In Table 5 of the Ashcroft poll, we see that jobs were only the 6th biggest reason provided for voting Yes or No, and were given as one of the three main reasons for their vote by only 20% of the respondents (in contrast to 45% worried about the state of the NHS, a concern that is clouded more by partisan politics than by one’s social status). In sum, we see that the aggregate data provided a common sense story that is at least partially refuted by the limited individual-level data that is available. Though it might be tempting to use aggregate data where individual data is non-existent, the end result is worse than saying nothing at all.