http://www.washingtonpost.com/wp-dyn/content/article/2009/06/20/AR2009062000004.html
Interesting probability-based approach to the Iran election results.
And...also incorrect if I'm not mistaken. For instance:
"Not so in the data from Iran: Only 62 percent of the pairs contain non-adjacent digits. This may not sound so different from 70 percent, but the probability that a fair election would produce a difference this large is less than 4.2 percent."
62 percent works out to 18 of the 29 provinces.
Treating each province as an independent 70% trial, I'm calculating a 23% chance of getting 18 or fewer, and a 10% chance of getting exactly 18. I dunno how they managed to get 4%.
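For anyone who wants to check those two figures, here's a minimal sketch. It assumes a plain binomial model: 29 independent provinces, each with a 70% chance of showing a non-adjacent pair. That's my reading of the setup, not necessarily how the article's authors did it, and the function names are made up for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n_provinces = 29      # assumption: one pair of digits per province
p_nonadjacent = 0.70  # the article's figure for a fair election

# Chance of 18 or fewer non-adjacent pairs (about 0.23)
print(round(binom_cdf(18, n_provinces, p_nonadjacent), 3))
# Chance of exactly 18 non-adjacent pairs (about 0.10)
print(round(binom_pmf(18, n_provinces, p_nonadjacent), 3))
```

Swapping in a different value for p_nonadjacent or n_provinces shows how sensitive the result is to those two assumptions.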
Of course, my numbers rely on an incorrect assumption of theirs:
"To check for deviations of this type, we examined the pairs of last and second-to-last digits in Iran's vote counts. On average, if the results had not been manipulated, 70 percent of these pairs should consist of distinct, non-adjacent digits."
That would be true only if 9 and 0 counted as adjacent to each other. I'm guessing they don't in psychological terms, which means it's actually 72 percent. (That would change my numbers: 16% of elections would hit 18 or fewer provinces, and 8% would hit exactly 18 provinces.)
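The 70 versus 72 percent difference can be checked by brute force over all 100 possible two-digit endings. This is just a sketch: it assumes both digits are uniform and independent in a fair count, and the function names and the wrap flag are mine, not anything from the article.

```python
def is_adjacent(a, b, wrap):
    """True if digits a and b are neighbours; wrap=True also treats 9 and 0 as neighbours."""
    diff = abs(a - b)
    return diff == 1 or (wrap and diff == 9)

def share_nonadjacent(wrap):
    """Fraction of the 100 digit pairs that are distinct and non-adjacent."""
    pairs = [(a, b) for a in range(10) for b in range(10)]
    good = [(a, b) for a, b in pairs if a != b and not is_adjacent(a, b, wrap)]
    return len(good) / len(pairs)

print(share_nonadjacent(wrap=True))   # 0.7  (9 and 0 count as adjacent)
print(share_nonadjacent(wrap=False))  # 0.72 (9 and 0 do not count as adjacent)
```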
"The numbers look suspicious. We find too many 7s and not enough 5s in the last digit. We expect each digit (0, 1, 2, and so on) to appear at the end of 10 percent of the vote counts. But in Iran's provincial results, the digit 7 appears 17 percent of the time, and only 4 percent of the results end in the number 5. Two such departures from the average -- a spike of 17 percent or more in one digit and a drop to 4 percent or less in another -- are extremely unlikely. Fewer than four in a hundred non-fraudulent elections would produce such numbers."
This feels like BS too, although my calculation is not 100% accurate here. For any single digit, it's about a 6% chance of a spike to 17 percent, and about a 6% chance of a drop to 4 percent. However, what they seem to be looking for is outliers, in which case it's a 12% chance for a given digit to be above 17% OR below 4%.
What's the chance of two 12% events occurring? Well, here's the thing: there are 10 digits, so on average you'll see 10 × 12% = 1.2 of these events (not exactly, mind; I did say my calculation here is an estimate and not exact). So...the chance of at least two outliers is...34%.
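Here's a rough sketch of that last step. It takes the 12% per-digit figure as given and treats the 10 digits as independent trials, which is only an approximation (the digit counts have to add up to the total, so they aren't truly independent). The function name is hypothetical.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """Probability of k_min or more successes in n independent trials."""
    below = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min))
    return 1 - below

n_digits = 10     # the digits 0 through 9
p_outlier = 0.12  # estimated chance that one digit lands above 17% or below 4%

# Expected number of outlier digits: 10 * 0.12 = 1.2
print(n_digits * p_outlier)
# Chance of at least two outlier digits under this approximation (about 0.34)
print(round(prob_at_least(2, n_digits, p_outlier), 3))
```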
"As a point of comparison, we can analyze the state-by-state vote counts for John McCain and Barack Obama in last year's U.S. presidential election. The frequencies of last digits in these election returns never rise above 14 percent or fall below 6 percent, a pattern we would expect to see in seventy out of a hundred fair elections."
Extremely misleading analogy. 29 =/= 50. If you rolled one die, you could say "OMG, must be weighted dice, I got a 4 100% of the time!" The more dice you roll, the closer the observed frequencies will be to the average.
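To put a number on the sample-size point, here's a small sketch of how much a single digit's share is expected to wobble around 10% for 29 counts versus 50. It assumes each count's last digit is an independent uniform draw, and it uses 29 and 50 as the sample sizes purely because those are the figures above. The function name is made up.

```python
from math import sqrt

def share_sd(n, p=0.10):
    """Standard deviation of one digit's observed share across n independent counts."""
    return sqrt(p * (1 - p) / n)

for n in (29, 50):
    # A smaller n gives a wider spread around the expected 10%
    print(n, round(share_sd(n), 4))
```

With fewer counts the typical deviation is larger, so a range that is rarely breached with 50 counts gets breached much more often with 29.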
Incidentally, I get an 11% chance for that 6 percent, and a 12% chance for that 14 percent. Which...does indeed mean roughly 70% of elections would stay inside that range (I get 68%). Well, that's one spot where our numbers line up. Maybe they were using a probability table designed for the US and didn't realize that 29 < 50 matters? Yeah...when I redo the calculations that way I get numbers rather close to theirs. -_-