No Prior Information

A lot of work has gone into trying to find uninformative prior distributions that describe complete ignorance. My opinion is that the idea of “complete ignorance” isn’t very useful. Jeffreys priors are interesting, and thinking about them can sometimes give you insight. But that doesn’t make them right. Jaynes’s transformation groups are cool, but they don’t represent ignorance; they are symmetry arguments. And MaxEnt is for updating probabilities, not assigning them. I don’t know much about reference or default priors, so I can’t comment on those.

These attempts come from a desire for scientific objectivity and for letting the data “speak for themselves” (Fisher). But data never speak for themselves. You need prior information about what the data mean; otherwise the data only tell you about themselves. The goal isn’t objectivity but rather transparency. Don’t claim you’re making no assumptions. Instead, clearly communicate what the assumptions are.

A Bayesian model is made up of a prior $p(\theta)$ for some parameters and a prior (sometimes called a sampling distribution) for the data, $p(x|\theta)$. When you multiply these together you get a joint prior $p(\theta, x) = p(\theta)p(x|\theta)$, which is a model for your state of knowledge before you obtained the actual value of the data. If you are completely ignorant, then not only do you not know the value of the parameter, you also don’t know of any connection between the data and the parameter. Your joint prior would be $p(\theta, x) = p(\theta)p(x)$, i.e. independence, which means getting data does not affect your knowledge about the parameter. That’s what complete ignorance really means, and it’s why the concept is not helpful.
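To make the independence argument concrete: with $p(\theta, x) = p(\theta)p(x)$, Bayes’ rule gives $p(\theta|x) = p(\theta)p(x)/p(x) = p(\theta)$, so the posterior equals the prior whatever $x$ turns out to be. Here is a minimal numerical sketch of that. Everything in it is an illustrative assumption of mine rather than part of the argument above: the three-point grid for $\theta$, the prior weights, the binomial sampling distribution used for comparison, and the uniform $p(x)$.

```python
from math import comb

import numpy as np

# Hypothetical discrete setup: theta takes three values, x counts successes in n trials.
theta = np.array([0.2, 0.5, 0.8])
prior = np.array([0.3, 0.4, 0.3])  # p(theta), an arbitrary illustrative choice
n = 3


def posterior(joint, x_obs):
    """Condition a joint table p(theta, x) on the observed value x_obs."""
    column = joint[:, x_obs]
    return column / column.sum()


# Model with a known connection: x | theta ~ Binomial(n, theta).
likelihood = np.array(
    [[comb(n, k) * t**k * (1 - t) ** (n - k) for k in range(n + 1)] for t in theta]
)
joint_informed = prior[:, None] * likelihood  # p(theta) p(x | theta)

# "Complete ignorance": no known connection, so p(theta, x) = p(theta) p(x).
p_x = np.full(n + 1, 1.0 / (n + 1))  # any p(x) would do; uniform is an assumption
joint_ignorant = prior[:, None] * p_x[None, :]

x_obs = 3  # pretend we observed three successes
post_informed = posterior(joint_informed, x_obs)
post_ignorant = posterior(joint_ignorant, x_obs)

print("prior:            ", prior)
print("posterior (model):", post_informed)
print("posterior (indep):", post_ignorant)
```

Under the binomial model the posterior shifts toward larger $\theta$, while under the independent joint prior it is identical to the prior for every possible observation.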

To emphasize this point, here is a link to a video of someone analyzing some data without using prior information.