Matt Yglesias is back on the margin of error thing, but he mixes up Bayesian and frequentist interpretations of statistics.
Very simply speaking, frequentists assume that the number that they're looking for is fixed. The fact that they don't get the number every single time is because of the vagaries of the measuring equipment, or that they haven't done enough statistical experiments to get close enough to the real number. This is usually blamed on "chance" getting in the way. For example, a frequentist would say that there is a fixed level of support for Kerry in the population - call it p% for Kerry - and that 95 times out of 100 a 95% CI will include the number p. The other 5 times that it wasn't included were due to "chance". A Bayesian would say, well, we've got this unknown number p, and we describe what we believe about it before seeing any data with a "prior" probability density. We then combine the prior with the "likelihood function" (how probable the observed data are for each possible value of p) to get the "posterior" probability density. That gives us, for each candidate value of p, a probability that that's really the number we're looking for.
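To make the frequentist claim concrete, here is a minimal simulation sketch in Python. The true support (48%), the sample size (1,000), and the helper names are all made up for illustration, and the interval is the textbook normal-approximation one, not necessarily what any pollster uses.

```python
import math
import random

def wald_interval(successes, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def coverage(true_p, n, polls, seed=0):
    """Fraction of simulated polls whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(polls):
        # One simulated poll: n independent respondents.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n)
        if low <= true_p <= high:
            hits += 1
    return hits / polls

# Hypothetical numbers: true support of 48%, 1,000 respondents per poll.
print(coverage(true_p=0.48, n=1000, polls=2000))
```

Under these assumptions the printed fraction should land near 0.95. Note that p never moves; it is the intervals that vary from poll to poll.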
So, Bayesians believe that the original assumption of p is a belief, not a fixed number. With a coin flip for example, frequentists would say "Look at the coin. It has two sides, therefore the probability of heads must be 0.5." Then they would go off and flip a coin 100 times, and say that they get 0.5 within their confidence interval for p 95 out of 100 times (or whatever). Boom - they've found a model that fits their believed prior of 0.5.
A Bayesian says "No no no! You have no idea whether that coin is fair or not, or whether it really is a coin at all. So you can't assume anything about what the probability of heads is - all you can do is find what the probability of heads is most likely to be (the maximum likelihood). You're basing your idea of p on a subjective view of what the coin has always done in the past - but you can't always rely on that. Better not to trust your a priori assumptions about the coin." Here, the Bayesian is trying to find what p really is without making any assumptions about what p is in the first place, instead looking for a likelihood function that will make an interpretation about how probable it is that p is 0.5, or 0.3, or 0.01.
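Here is a rough sketch of what "how probable it is that p is 0.5, or 0.3, or 0.01" can look like in code. The 58 heads in 100 flips are invented, and the flat prior is an assumption standing in for not favoring any value of p beforehand; a different prior would give different numbers.

```python
def grid_posterior(heads, tails, prior, grid_size=1001):
    """Posterior over p on an evenly spaced grid: prior times likelihood, normalized."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    unnormalized = [prior(p) * p**heads * (1 - p)**tails for p in grid]
    total = sum(unnormalized)
    return grid, [w / total for w in unnormalized]

def flat_prior(p):
    """Assumed prior: every value of p is equally plausible before the flips."""
    return 1.0

# Hypothetical data: 58 heads and 42 tails.
grid, post = grid_posterior(heads=58, tails=42, prior=flat_prior)
best = max(range(len(grid)), key=lambda i: post[i])
print("most plausible p:", grid[best])

# Plausibility of a few candidate values relative to the most plausible one.
for p in (0.5, 0.3, 0.01):
    i = round(p * (len(grid) - 1))
    print(p, post[i] / post[best])
```

The output is a whole distribution over p rather than a single yes-or-no verdict on whether the coin is fair.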
This is all relatively esoteric, and in matters such as presidential polls, there isn't any difference in practice between Bayesian and frequentist approaches. It's a philosophical difference rather than a methodological fallacy, and although it makes a huge difference once you start doing higher-level statistics, it's not really something to consider with survey results like these.
In practice, Bayesians accept frequentist models for simple examples, often because they are more intuitive and end up with the same result in the long run. But don't worry about it for presidential polls - worry more about stupid TV journalists saying that people are "statistically tied" when they really aren't. That's far more applicable to the problem at hand.
Bravo! As a statistical consultant and a political scientist (grad school pre-PhD variety), this is an excellent post about frequentists vs. Bayesians.
In my experience, it's rare to find someone who can explain these things with efficiency, accuracy and elegance.
Posted by: Patrick | August 19, 2004 at 12:39 PM
Oh Patrick - I'm blushing! :-D
Posted by: Zoe | August 19, 2004 at 12:46 PM
Nerd.
Posted by: praktike | August 19, 2004 at 01:39 PM
I take that as a big fat compliment, praktike. So there.
Posted by: Zoe | August 19, 2004 at 01:41 PM
Actually, this was a great explanation.
I, too, proclaim my eternal allegiance to the Ancient and Hermetic Order of the Bayesians.
Posted by: praktike | August 19, 2004 at 01:45 PM
Ack! Everywhere I go I find praktike, and he or she has already made the hilarious comment of the section.
But okay. You say that the 'Frequentist' vs. 'Bayesian' distinction is probably too technical. And yet what Kevin Drum (the originator of this thread) is doing is talking about 'probabilities' that Kerry is really ahead. Bu dwei le!
This is where the distinction does matter, or am I high? It seems to be such an easy misstatement to fall into, that probabilities trap.
Posted by: djangone | August 19, 2004 at 02:09 PM
djangone,
Your savvy use of Mandarin has warmed my heart.
Anyway, what Kevin Drum is saying could be interpreted in either a frequentist or a Bayesian way, depending on your personal mindset.
A frequentist would take his assertion of a "75% chance that Kerry is ahead" and say "Aha! This means that if this same poll were done 100 times, then 75 times we would find a number greater than 0 for Kerry's lead (which is a fixed number). Thus Kerry's lead does not equal 0 at the 25% level."
A Bayesian would say "Aha! This means that given our prior and our likelihood function, our estimate of the probability that Kerry's lead is more than 0 is 75%."
Both interpretations arrive at the same point - we're 75% sure that Kerry's lead is more than 0. It's just the way you think about how you got there that makes it Bayesian or frequentist. Kevin's distinction between his view and the TV presenter's "statistical tie" view does matter, but whether his view only holds up under a Bayesian or a frequentist view (as Matt asserted) is too technical a distinction at this stage in the game.
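As a sketch of how a figure like 75% can come out of a poll, here is one common normal-approximation calculation. The poll numbers (Kerry 47%, Bush 45%, 1,000 respondents) and the function names are hypothetical, and this is not necessarily how Kevin arrived at his number.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chance_ahead(share_a, share_b, n):
    """Normal-approximation probability that A's true lead over B is above 0."""
    lead = share_a - share_b
    # Standard error of the difference of two shares from the same sample.
    se = math.sqrt((share_a + share_b - lead ** 2) / n)
    return normal_cdf(lead / se)

# Hypothetical poll: Kerry 47%, Bush 45%, 1,000 respondents.
print(round(chance_ahead(0.47, 0.45, 1000), 3))
```

With these made-up inputs the result is roughly 75%. This is the calculation only; whether to read the result as a long-run frequency or as a degree of belief is the interpretive question being discussed here.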
Actually, this whole conversation was started by Tom Lang writing at Campaign Desk - Kevin, Matt and I have been trading responses about it for a while.
Posted by: Zoe | August 19, 2004 at 02:40 PM
Xie xie, wo syang...
(Rustily mixing up my spelling methods.)
Wo bu tsungming about this stuff, and in this if in nothing else, I'm happy to say I join the rest of humanity. Hen nán le!
Posted by: djangone | August 19, 2004 at 02:59 PM
I completely agree that whether or not polling numbers hold up under a Bayesian or a frequentist paradigm is too technical a distinction to haggle over at this point in the analysis.
I am militantly skeptical of opinion polling. Nothing drives me battier than watching some blow-dried tool on cable news jabber away about the latest opinion polling numbers as if they were iron-clad gospel. If you're going to place opinion polls in such a prominent position in public discourse, then there ought to be a social science version of the Fairness Doctrine from broadcasting. You should be required to discuss the assumptions and potential pitfalls as much as you discuss the results and interpretation.
The only times I've either advised on or used a Bayesian approach are for very large data sets generated via Monte Carlo simulation or for epidemiological research.
Come to think of it, I don't think I've ever seen opinion polling approached from a Bayesian point-of-view before. Anyone either seen or heard of such a beast?
Posted by: Patrick | August 19, 2004 at 03:29 PM
It isn't quite true that both the frequentist and Bayesian analyses would agree that there's a 75% chance that Kerry's lead is greater than 0. The difference is in what "75% chance" means.
The frequentist interpretation of the 75% is in Zoe's most recent comment. But the Bayesian thinks of the 75% as a measure of belief, not as a long-run frequency. (The Bayesian belief interpretation has a precise definition that does not depend at all on long-run frequencies). The belief and frequency interpretations _seem_ to be the same only because our common-sense view of probability is closer to the belief interpretation to begin with.
The Bayesian analysis might not even yield the same figure of 75%, depending on what the analyst's prior distribution for p (the population percentage for Kerry) was. It isn't possible to do a Bayesian analysis without a prior distribution for p; I didn't quite understand the 4th paragraph of the original post.
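A small sketch of the prior-dependence point, under heavy simplification: only respondents who picked Kerry or Bush are counted, p is Kerry's share among them, and the prior is taken to be a Beta distribution purely for convenience. The counts, the two priors, and the function name are invented for illustration.

```python
import math

def prob_ahead(kerry, bush, prior_a=1.0, prior_b=1.0, grid_size=20001):
    """Posterior P(p > 0.5) for a Beta(prior_a, prior_b) prior, by grid integration."""
    a = prior_a + kerry   # the posterior is Beta(a, b)
    b = prior_b + bush
    # Interior grid points only, so log(p) and log(1 - p) are finite.
    grid = [i / grid_size for i in range(1, grid_size)]
    log_w = [(a - 1) * math.log(p) + (b - 1) * math.log(1 - p) for p in grid]
    top = max(log_w)
    w = [math.exp(lw - top) for lw in log_w]   # rescale to avoid underflow
    above = sum(wi for p, wi in zip(grid, w) if p > 0.5)
    return above / sum(w)

# Hypothetical data: 470 respondents for Kerry, 450 for Bush.
flat = prob_ahead(470, 450)                                   # flat Beta(1, 1) prior
skeptical = prob_ahead(470, 450, prior_a=500, prior_b=500)    # strong prior belief in a tie
print(round(flat, 3), round(skeptical, 3))
```

The same data give a noticeably lower probability that Kerry is ahead under the prior that strongly expects a tie than under the flat prior.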
Posted by: Wes | August 19, 2004 at 03:40 PM
While I have no idea whether Kevin Drum has thought about this subject very much, he comes up with "there's a 75% probability that [Kerry's] genuinely ahead of Bush (i.e., that his lead in the poll isn't just due to sampling error)."
What I was trying to say was that, given that, you can think about it in a Bayesian way or a frequentist way. You can say what I said above in the frequentist manner, or you can say that that is a measure of the strength of probability (or belief, whatever) that Kerry is really ahead. Since Kevin Drum certainly didn't calculate a prior distribution or mention his likelihood function, I think it's safe to say that he didn't arrive at these numbers in a purely Bayesian fashion.
When I said in the 4th paragraph "don't trust your a priori assumptions about the coin", I didn't mean "Throw out your prior distribution." I just meant that in contrast to the frequentist method, you don't start with your fixed number and then find a structure to fit it - you start off with some prior knowledge and then from that you calculate what the statistic is likely to be (with a measure of belief in that statistic).
Posted by: Zoe | August 19, 2004 at 03:51 PM
Zoe,
Thank you for clarifying the fourth paragraph. I was confused about the Bayesian analysis not making any assumptions about what p is. It is the Bayesian analysis that needs the extra assumption of the prior distribution.
The orthodox (frequentist) way to express the original poll result would be something like "We are 75% _confident_ that Kerry is ahead of Bush," where "confident" means "long-run frequency." Most people beat their introductory students over the head with this distinction. I don't. I beat them over the head with other distinctions. :)
Patrick's comment about Bayesian opinion polling is interesting. I haven't seen it done, but it must be out there somewhere, because there is a theory of Bayesian sampling from finite populations. You could try to use a Bayesian analysis to correct for nonsampling biases.
Posted by: Wes | August 19, 2004 at 04:20 PM
Wes - do I infer from your email address that you're at a School of Public Health (at Johns Hopkins, perhaps)? I took survey methods with Prof. Alan Zaslavsky of the School of Public Health here at Harvard two semesters ago and we learned about Bayesian sampling. It's too bad that I was far more interested in the cognitive statistical side of sampling though - I remember very little about the Bayesian stuff :-)
Posted by: Zoe | August 19, 2004 at 04:29 PM
Yes, I was at Johns Hopkins until October of last year, when I (somehow) finished my degree and got a job at a liberal arts college. My research was on philosophical issues in statistics (great dinner party conversation!), but not specifically about interpretations of probability.
Posted by: Wes | August 19, 2004 at 04:37 PM
I never even remotely knew about this Bayesian way of looking at things, and I don't really get where the difference is. I'm a mathematician, not a statistician, so what I know about probability and statistics isn't much more than a typical upper-division class at the undergrad level. Is there just a "frequentist" assumption through all that, with a Bayesian approach being more philosophical in its differences, or is there really a difference in the math that's involved? That's probably not a very good question, now that I think about it.
Posted by: Haggai | August 19, 2004 at 05:48 PM
There is a frequentist assumption throughout all of traditional statistics education, except in some graduate departments. The assumption shows up most clearly in introductory textbooks' zealous adherence to the long-run frequency interpretation of probability. Every year one question on the AP Statistics exam focuses on the distinction: use the "p word" (probability) instead of "confidence" and you're marked (partially) wrong.
The other difference: frequentist analyses use probabilities of data values that could have been observed but were not. These probabilities generate inconsistencies and paradoxes in certain situations with confidence intervals and significance tests. Bayesian analyses do not use probabilities of unobserved data values.
Frequentist and Bayesian analyses will often yield the same numerical answers in the simple, low-dimensional, symmetric problems of Stat 101. In more complex problems they can yield substantially different answers.
This comment is too long already, without even mentioning the "priors" of Bayesian analyses. *Sigh* Now try to explain any of this in a one-semester statistics course. :)
Posted by: Wes | August 19, 2004 at 06:18 PM
Thanks for trying, Wes. I'll look into this more on my own if the desire strikes me. :)
Posted by: Haggai | August 19, 2004 at 07:38 PM
I feel like I ought to be able to understand this, but I don't. Any possibility of a pointer to an example comparing the two approaches to the same data set?
Posted by: masaccio | August 19, 2004 at 10:22 PM
masaccio - try this, the third chapter. i can't find anything better with a quick google search, and i don't really know of any good textbooks (the one i learned from was good but extraordinarily dry).
Posted by: Zoe | August 20, 2004 at 12:08 AM
Masaccio, one example is "Investigating Therapies of Potentially Great Benefit: ECMO" by J.H. Ware and discussants from Volume 4, Issue 4 of Statistical Science (1989). The paper applied both frequentist and Bayesian methods to data from a study of infant respiratory failure. The contrast with Bayesian statistics is particularly telling, because the researchers had some evidence (before the study) that their new therapy was greatly superior to the old therapy. Statistical Science is on JSTOR.
Various web sites and books on Bayesian statistics (Berry's introductory book, Carlin & Louis, Gelman/Carlin/Stern/Rubin, many others) should have other examples, but no specifics come to mind.
Posted by: Wes | August 20, 2004 at 12:25 AM
These margins of error only state statistical sampling error. There's also the possibility of systematic bias in the various polls to consider. I know that when you plot national political polls in the aggregate (as on the Prof. Pollkatz site) the spread in systematics between various polls tends to be two or three times as large as the sampling error.
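For reference, the sampling-only margin of error usually quoted with a poll can be sketched as below, assuming a simple random sample and the worst case p = 0.5 (real polls use weighting and other adjustments that this ignores). The function name and sample sizes are illustrative.

```python
import math

def sampling_margin(n, p=0.5, z=1.96):
    """95% margin of error from sampling alone, for a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# Margin of error in percentage points for a few hypothetical sample sizes.
for n in (500, 1000, 1500):
    print(n, round(100 * sampling_margin(n), 1))
```

Nothing in this formula accounts for systematic differences between pollsters.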
That raises still more questions of what number we're truly trying to measure-- something like current support for Kerry among people who are going to vote in the election, I suppose... or is it support for Kerry among people who are currently planning to vote? The pollsters typically make no claim to be predicting the actual future vote for Kerry, so it's not that, but there's the question of whether or not they are trying to predict the future turnout, which gets into the whole registered voters versus likely voters question.
Posted by: Matt McIrvin | August 20, 2004 at 09:09 AM
Huh huh... she said "posterior"...
Posted by: JP | August 20, 2004 at 10:43 AM
JP, Barbie said it best with "math is hard," right? Can't have all this pointy-headed high-falutin' stuff going on here. :)
Posted by: Haggai | August 20, 2004 at 11:19 AM
Thanks, Wes and Zoe. I will have to seek out the article, but the web site is helpful.
Posted by: masaccio | August 20, 2004 at 11:25 PM
ah, i have resurrected the Demon Of Conflict once more. yes, indeed, soon you will be one of us!
Posted by: ekzept | September 02, 2006 at 11:28 PM