As an election forensics analyst, I have frequently been called upon to explain and defend pattern evidence indicating the targeted mistabulation of votes as a probable cause of pervasive anomalies and disparities. While far more detailed explanations can be found in a number of studies my colleagues and I have conducted, I think it may be useful to set out the fundamental bases for reliance on the approaches we have taken and the conclusions we have reached.
First, a word about the need to rely on such “indirect” methods of election verification. It is not something that has been thought about or talked about much, but the vote counting process in the United States is designed for concealment. Most absurdly, the code that counts or miscounts votes has been ruled a corporate trade secret that cannot be divulged or examined under any circumstances.
Nor does the concealment stop with the code. All the “hard” evidence—memory cards, programming code, server logs, and actual cast ballots—is strictly off-limits to the public and, in most cases, to election administrators as well. Given that two corporations supply nearly 80 percent of the hardware and software used to count votes in the U.S., and given that this handful of equipment suppliers and their programming/distributing satellite contractors constitute a consolidated, easily targeted, and indeed virtually hermetic closed system, no comfort should be taken from erroneous reassurances that the process is somehow too “decentralized” to be vulnerable to either hacking or insider manipulation.
It is precisely because of the secretive nature of the American vote counting process, and because all the hard evidence is inaccessible, that the forensic investigation of election security and authenticity perforce has come down primarily to numerical, statistical, and pattern analysis. Following along after the election circus with a forensic pooper-scooper may not strike you as the best way to try to ensure democracy; but until the public reclaims its right of access to voted-on ballots and the counting process, it just happens to be the only way we’ve got.
That said, such numerical, statistical, and pattern analysis is relied upon routinely in fields ranging from aerospace to economics, climate science, epidemiology, and disease control. It is also routinely applied, often with the sanction of the government of the United States, to elections pretty much everywhere on Earth other than in the United States, periodically leading to official calls for electoral investigations and indeed electoral re-dos. Exit poll disparities have figured in the overturning of elections from Ukraine to Peru and are relied upon for validation of votecounts in Western democracies such as Germany.
Disparities, whether in a bank audit or in an election, require explanation. Which is to say, when measurements of what is ostensibly the same phenomenon fail to agree, there exists some cause: one or both of the measurements are inaccurate. In the case of elections, it has generally been assumed that the votecounts are accurate and any other incongruent measure is therefore erroneous. There is, however, given the known vulnerabilities of the vote counting process to manipulation, little or no basis for that assumption of accuracy. As for the other, incongruent measures, there is also no reason to assume their accuracy. If they are to serve as baselines for assessing the accuracy of vote counts, these other measures must themselves be validated. Much of our work as forensic analysts goes into that process—and it is that work that is often ignored or misunderstood in the rush to dismiss red flags and “protect the shield” of our elections.
When official votecounts come out to the right of other measures of voters’ intent—such as exit polls, pre-election polls, post-election polls, and handcounts—forensic analysts refer to it as a “red shift.”1 Since 2002, when the computers took over the counting, the red shift has been pervasive: election after election, in competitive contests bearing national significance,2 the official votecount has been to the right of every baseline measure. We very rarely see the reverse, which we would call a “blue shift.” There is a tremendous amount of data and it all points in the same direction.3 It is critical to grasp the enormous difference in probative value between a single statistical red flag and this years-long parade of unidirectional red flags. The latter rules out chance, glitches, flukes as cause, leaving only systemic inaccuracies and distortions of either the votecount or the baseline measures as possible explanations for the pattern.
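For readers who like to see the arithmetic spelled out, the sketch below shows, with invented numbers, how the red shift for a single contest is simply the gap between a baseline margin (here an exit poll) and the official votecount margin. The function names and figures are mine, chosen purely for illustration; they are not drawn from any actual study.

```python
# Illustrative sketch (not code from any study): the red shift for a single
# contest, computed as the baseline (exit poll) margin minus the official
# votecount margin, each expressed as Democratic share minus Republican share.
# All figures are hypothetical placeholders.

def margin(dem_share: float, rep_share: float) -> float:
    """Two-party margin in percentage points; positive means a Democratic lead."""
    return dem_share - rep_share

def red_shift(exit_poll: tuple[float, float], votecount: tuple[float, float]) -> float:
    """Positive values mean the official count sits to the right of the poll."""
    return margin(*exit_poll) - margin(*votecount)

# Hypothetical contest: exit poll D 52 / R 48, official count D 49.5 / R 50.5.
shift = red_shift(exit_poll=(52.0, 48.0), votecount=(49.5, 50.5))
print(f"red shift = {shift:.1f} points")  # -> red shift = 5.0 points
```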
From a forensic standpoint, as noted above, much of our work goes into determining whether those baselines from which the official votecounts keep diverging are themselves valid. Naturally, if you simply assume all votecounts are valid, you would then look for reasons to dismiss any data that disagrees with them. You could, for example, disparage all the incongruent exit polls as “off again” because they “oversampled Democrats.” However, we have examined exit poll samples and other baselines closely and found that such is not the case—the problem is definitely not that all these other measures of voter intent are chronically incompetent or corrupted.
In 2006, for instance, we examined the national exit poll sample and found that it was to the right of every other independent measure of the national electorate. We knew, therefore, that the massive red shift we found in the 2006 election could not have been a function of a faulty (i.e., left-skewed) exit poll baseline, leaving mistabulation of the votes as the only explanation for the shift that could not be discounted.4 We went further in 2006 (and again in 2008), recognizing that competitive races are natural targets for rigging (the outcome can be altered with a modest manipulation, yielding a high reward/risk ratio) while noncompetitive races are not (to alter the outcome you would have to shift so large a percentage of the votes that the result would fail the smell test), and compared competitive with noncompetitive races relative to an identical baseline. We found that the more competitive a race, the more likely it was to be red shifted—the correlation was dramatic.5
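The competitive-versus-noncompetitive comparison can likewise be made concrete. The sketch below, with wholly invented per-race figures, pairs each race's final margin (a rough competitiveness measure) with its red shift and computes an ordinary Pearson correlation; it illustrates the kind of relationship the 2006 and 2008 studies report, not their actual data or methodology.

```python
# Minimal sketch of the competitive-vs-noncompetitive comparison: pair each
# race's final margin (smaller = more competitive) with its red shift and ask
# whether the two move together. The figures are invented placeholders, not
# data from the 2006 or 2008 studies.
from statistics import correlation  # available in Python 3.10+

races = [
    # (absolute final margin in points, red shift in points) - hypothetical
    (1.2, 4.1), (2.5, 3.6), (4.0, 3.0), (8.3, 1.9),
    (12.7, 1.1), (20.4, 0.5), (28.9, 0.2), (35.0, 0.1),
]
margins = [m for m, _ in races]
shifts = [s for _, s in races]

# A strongly negative coefficient means: the closer the race, the larger the
# red shift - the pattern described above.
print(f"Pearson r = {correlation(margins, shifts):.2f}")
```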
In 2010,6 we were able to compare hand-counted to computer-counted ballots in a critical U.S. Senate race (Massachusetts: Coakley vs. Brown) and again found an outcome-altering red shift of the computer-counted votes, one that we were unable to explain by any factor other than strategically mistabulated votecounts.7
More recently, in 2016, our analysis of the respective party primaries found that, while the exit poll results were consistently accurate throughout nearly all the Republican primaries, they were wildly and broadly inaccurate in the Democratic primaries, exhibiting a pervasive intra-party “red shift” to the detriment of Bernie Sanders. It seems very unlikely that the same pollsters, employing the same methodological techniques and interviewing voters at the same precincts on the same days, would be competent and consistently successful with Republicans but somehow incompetent and consistently unsuccessful with Democrats.
In the 2016 general election, the critical “swing” states that provided Trump’s electoral college majority—including Wisconsin, Pennsylvania, Ohio, and North Carolina—were among the most egregiously red-shifted of all the states, with poll-votecount disparities far outside the margins of error.8 As in the election of 2004 and the 2016 primaries, it was the overall contrasting pattern that was most remarkable: the National Exit Poll, which incorporated the many “safe” states where manipulation was not suspected, was not red-shifted outside the margin of error. That is, the pollsters “got it right,” except in those states with close Trump victories that produced his Electoral College majority. We can of course choose, at our peril, to believe that, election after election, such things “just keep happening.”
E2016 in fact offered up a quintessence of what is wrong with whatever debate there is over indicators of electoral foul play, and of the general under-appreciation of the subtlety of forensic analysis. Much was made of the apparently egregious over-representation of college graduates in the National Exit Poll sample. With an “Aha!” that could be heard on Mars, the poll was declared “garbage” and tossed hastily and permanently in the shredder because 50 percent of its respondents had declared themselves to be college grads. The impact of education level on candidate choice was modest (about the equivalent of gender and far below race), but this did not stop the critics from fastening on the 50 percent figure (which, it must be said, would not even have been available to fasten on were the exit polls as opaque in their revelations as are the votecounts), which, they calculated, implied an unrealistic rate of turnout among college grads.
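To make the critics’ back-of-the-envelope calculation explicit, the sketch below reconstructs it with round, purely illustrative numbers; the totals are assumptions of mine, not actual 2016 turnout or Census figures.

```python
# The critics' implied-turnout arithmetic, reconstructed with illustrative
# round numbers (NOT actual 2016 turnout or Census figures): if half of all
# voters were college graduates, how high would turnout among graduates
# have had to be?
total_votes_cast = 137_000_000     # assumed total ballots cast, illustrative
grad_share_of_poll = 0.50          # the exit-poll figure the critics seized on
college_grad_adults = 72_000_000   # assumed number of adult college graduates

implied_grad_voters = total_votes_cast * grad_share_of_poll
implied_grad_turnout = implied_grad_voters / college_grad_adults
print(f"implied turnout among college grads: {implied_grad_turnout:.0%}")  # ~95%
```

On these assumed numbers the implied turnout approaches 95 percent, the kind of implausible figure the critics pointed to; what follows explains why the poll could nonetheless land on the correct result.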
What the scoffing and whewing herd apparently failed to notice was that the exit poll they had just trashed—along, it soon became clear, with every other exit poll ever conducted or to be conducted in the United States—was accurate! That’s right, accurate. The unadjusted National Exit Poll approximated Clinton’s popular vote victory margin to within 1.1 percent. It was accurate enough to require hardly any adjustment—and, if it hadn’t been for the major disparities in the Trump table-run battleground states, would not have required any adjustment at all.
How, then, to read this riddle? How could a poll with such an apparent demographic goof wind up so close to the mark? What no critic apparently understood, or wanted to understand, is something very basic and essential to exit poll methodology: multiple stratification (weighting). Exit pollsters know enough not to expect equivalent response rates across race, age, gender, income, education, and partisanship groups. They use data-rich models, as in many other sciences, to weight their samples accordingly. It has been my observation that the aggregate impact of these multiple weightings—because they are grounded at least in part on demographic data derived from prior elections’ exit polls that have been adjusted rightward to congruence with red-shifted votecounts—tends to be rightward. That is, there are factors in the exit pollsters’ weighting algorithm that tend to chronically push the sample a few points to the right. The over-representation of college grads pushed the sample a point or two to the left. Such weightings tug against one another—so, for example, the sample might wind up over-representing the college-educated but under-representing non-white voters. The art and science of exit polling lies in getting those balances right, and they’ve sure enough had a lot of practice (in fact, prior to the computerized voting era, the main problem with exit polls was that they were so accurate that the pollsters had to agree to withhold their results until polls had closed in order not to discourage late-day turnout).
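For the technically inclined, the sketch below shows, in the simplest possible form, what one such weighting step looks like: respondents are re-weighted so that a group’s share of the sample matches the pollster’s model of the electorate. Real exit-poll weighting rakes across many variables simultaneously; the groups, target shares, and responses here are hypothetical placeholders of my own.

```python
# Minimal sketch of a single post-stratification step of the kind described
# above, weighting on one variable (education). Real exit-poll weighting rakes
# across many variables at once; everything below is a hypothetical placeholder.
raw_respondents = [
    # (education group, candidate choice) - invented sample of ten interviews
    ("college", "Clinton"), ("college", "Clinton"), ("college", "Trump"),
    ("college", "Clinton"), ("college", "Trump"),
    ("no_college", "Trump"), ("no_college", "Clinton"), ("no_college", "Trump"),
    ("no_college", "Trump"), ("no_college", "Clinton"),
]
# Pollster's model of the electorate (assumed): 40% college, 60% non-college.
target_share = {"college": 0.40, "no_college": 0.60}

sample_share = {
    group: sum(1 for edu, _ in raw_respondents if edu == group) / len(raw_respondents)
    for group in target_share
}
weights = {group: target_share[group] / sample_share[group] for group in target_share}

weighted_clinton = sum(weights[edu] for edu, choice in raw_respondents if choice == "Clinton")
weighted_total = sum(weights[edu] for edu, _ in raw_respondents)
# The raw sample splits 50/50; down-weighting the over-represented college
# group nudges the weighted result, just as the multiple weightings described
# above tug the full sample one way or the other.
print(f"weighted Clinton share: {weighted_clinton / weighted_total:.1%}")
```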
It’s a complex process and you could say, I suppose, “the secret’s in the sauce” (although, again, this sauce is far less secret than the votecounts themselves—the numbers are there to inspect and compare, at https://www.cnn.com/election/2016/results/exit-polls/national/president for the adjusted National poll and on this website for the unadjusted screencaptures, along with a more complete analysis of the polling methodology).
But you can also say “the proof’s in the pudding.” The fact is that the National Exit Poll—the one torn apart by a posse of critics sorely lacking in understanding of exit poll methodology, many of whom have been hell-bent on discrediting exit polls as a verification tool since 2004—got it right, while the exit polls by the same firm, using the same methodological “sauce,” in the critical battleground states table-run by Trump, were way off, all in the same direction. That is a damning second-order comparative, and the best evidence we can get from a system determined to withhold all its “hard” evidence, a process designed for concealment. So far, to my knowledge, no one has established a benign explanation for this, or numerous other, telling patterns of disparity.9
I hope that any reader troubled by the evidence summarized here, and/or by its facile dismissal by those who would prefer not to grapple with its implications, will take the time to examine the studies included in the “Evidence and Analysis” chapter of CODE RED, all of which are fully accessible to the non-statistician. In conclusion, the key point is that the red shift is not just a few instances or an equivocal pattern, nor can it be attributed to skewed baselines—it is pervasive. It is difficult to look at all this data gathered together and not emerge gravely concerned that elections have been systematically manipulated, and strongly moved to investigate that possibility further.
1. The term “red shift” was in fact coined by this author in reference to the exit poll-votecount disparities favoring Bush in E2004; it has been adopted into general usage when describing such disparities.
2. Contests bearing national significance include, obviously, the federal elections for president, Senate, and House (where majority or super-majority control is at stake), but also key governorships, state administrative posts, and state legislative control. On rare occasions a contest without direct bearing on national politics—an example being the 2011 Wisconsin Supreme Court election—will take on national significance as a proxy or bellwether test of political strength. The special election for Georgia’s Sixth Congressional District in June 2017 is a more recent example of a contest with great proxy significance.
3. Much of this evidence is contained within studies compiled in CODE RED, including: The 2004 Presidential Election: Who Won the Popular Vote? An Examination of the Comparative Validity of Exit Poll and Vote Count Data (2004); Landslide Denied: Exit Polls vs. Vote Count 2006, Demographic Validity of the National Exit Poll and the Corruption of the Official Vote Count (2007); Fingerprints of Election Theft: Were Competitive Contests Targeted? (2007).
If, in noting the “ancient” dates of these studies, you encounter any temptation to comfort yourself with a “that was then, this is now” qualification, note as well that nothing of significance has been done at any point along the way to give current elections any more protection than the ones subject to these early forensic examinations. Evidence of a vulnerability and its exploitation gathered from 2004 applies with equal force to 2018, and indeed indefinitely until such time as the counting process is made public and observable.
4. Simon J, O’Dell B: Landslide Denied: Exit Polls vs. Vote Count 2006, Demographic Validity of the National Exit Poll and the Corruption of the Official Vote Count (2007), http://electiondefensealliance.org/files/LandslideDenied_v.9_071507.pdf.
5. Simon J, et al: Fingerprints of Election Theft: Were Competitive Contests Targeted? (2007), http://electiondefensealliance.org/files/FingerprintsOfElectionTheft_2011rev_.pdf.
6. Simon J: Believe it (Or Not): The Massachusetts Special Election for U.S. Senate (2010), http://electiondefensealliance.org/files/BelieveIt_OrNot_100904.pdf.
7. It is greatly instructive to compare this analysis with one undertaken, also in Massachusetts, of the 2016 Democratic primary. That analysis, undertaken by a less careful colleague, concluded that Bernie Sanders “inexplicably” far outperformed Hillary Clinton in the towns where the votes were hand-counted. It was fatally flawed, however, by failing (indeed, not attempting) to account for the demographic (particularly racial) divergences between the hand-count and computer-count towns, which provided a ready explanation for the seized-upon disparity. By contrast, in our 2010 analysis of Coakley-Brown, we took great pains to examine the impact of any such demographic (and political) divergences and thus were able to validate the baseline (handcounts) from which the computer tabulations diverged. This difference goes to the essence of evaluating the credibility of forensic analyses.
8. Florida and Michigan, which completed Trump’s improbable table-run of must-win states, fell out of this dramatic-red-shift group only because of a geographical oddity: a tiny piece of each state crosses from the Eastern to the Central time-zone. This means the polls at the extreme western tips of the Florida panhandle and the Michigan upper peninsula close an hour later than those in the rest of the state—and exit polls are not posted until that time, an hour after the polls have closed in 99 percent of each state. This in turn permits the “adjustment” of those polls almost all the way to congruence with the votecounts prior to public posting, effectively eliminating the tell-tale red shift.
9. Exit poll-votecount and similar comparisons constitute an extrinsic analysis, comparing the votecounts with alternate measures of voter intent. Analysts have also, more recently, begun to employ a powerful tool of intrinsic measurement known as Cumulative Vote Share (“CVS”) analysis, “intrinsic” because it measures and analyzes only the votecount itself (see, e.g., the work of Wichita State University statistician Elizabeth Clarkson, at www.bethclarkson.com). A peculiar and consistent pattern has emerged from analysis of precinct-level votecount data from suspect elections: the cumulative vote share of the candidate who is the suspected beneficiary of votecount manipulation unexpectedly increases with increasing precinct size. This “CVS Upslope” does not appear to reflect either demographic or partisanship tendencies of the precincts. It does, however, fit perfectly with what would be a highly rational tactical decision to shift votes in larger rather than smaller precincts: the “splash” made by a vote theft of equal size is correspondingly smaller and less noticeable the larger the pool from which the votes are taken.
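A minimal sketch of the CVS computation this note describes, under illustrative assumptions: precincts are sorted from smallest to largest, and a candidate’s cumulative share of the vote is tracked as each successively larger precinct is added. The precinct figures are invented; with them, the share climbs steadily, the “upslope” pattern described above.

```python
# Minimal sketch of a Cumulative Vote Share (CVS) curve: sort precincts from
# smallest to largest, then track a candidate's cumulative vote share as each
# successively larger precinct is added. The precinct data are invented
# placeholders, not from any real election.
precincts = [
    # (total votes in precinct, votes for candidate A) - hypothetical
    (120, 60), (250, 128), (400, 212), (800, 432),
    (1500, 840), (2600, 1482), (4000, 2320), (6000, 3540),
]
precincts.sort(key=lambda p: p[0])  # smallest precincts first

running_total = 0
running_candidate = 0
for size, candidate_votes in precincts:
    running_total += size
    running_candidate += candidate_votes
    share = running_candidate / running_total
    print(f"after precincts up to size {size:>5}: cumulative share = {share:.1%}")

# With these invented figures the printed share climbs steadily as larger
# precincts are folded in - the "CVS upslope" pattern described above.
```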