Ed Jaynes frequently criticised what he called the mind projection fallacy. This is the implicit assumption that our concepts of, say, gaussianity, randomness, and so on, reflect properties of to be found “out there in nature”, whereas they often reside only in the structure of our thinking. I’ve brought this up a few times on the blog, in old posts such as Stochastic and Testing Model Assumptions.
Learning basic Haskell has given me another way to think and talk about this issue without getting mired in metaphysics or discussions about subjectivity and objectivity. This post about random sequences is an instance of what I’m talking about in this post.
Haskell is a strongly typed language, like C++. Whenever you define something, you always have to say what type of thing it is. This is pretty much the same idea as in mathematics when you say what set something is in. For example, in the following code
x is an integer and
f is a function that takes two integers as input and returns a string. Each definition is preceded by a type signature which just describes the type. Functions and variables (i.e., constants ;-)) both have types.
x :: Int
x = 5
f :: Int -> Int -> String
f x y = if x > y then "Hello" else "Goodbye"
Jared wrote his PhD dissertation about how Haskell’s sophisticated type system can be used to represent probability distributions in a very natural way. For example, in your program you might have a type
Prob Int to represent probability distributions over
Prob Double to represent probability distributions over
Doubles, and so on. The
Prob type has a type parameter, like templates in C++ (e.g.,
std::vector<T> defines vectors of all types, and
std::vector<double> have definite values for the type parameter
Prob type has instances for
Monad for those who know what that means. It took me about six months to understand this and now that I roughly do, it all seems obvious. But hopefully I can make my point without much of that stuff.
Many instances of the mind projection fallacy turn out to be simple type errors. More specifically, they are attempts to apply functions which require an argument of type
Prob a to input that’s actually of type
a. For example, suppose I define a function which tests whether a distribution over a vector of doubles is gaussian or not:
isGaussian :: Prob (Vector Double) -> Bool
If I tried to apply that to a
V.Vector Double, I’d get a compiler error. There’s no mystery to that, or need for any confusion as to why you can’t do it. It’s like trying to take the logarithm of a string.
Some of the confusion is understandable because our theories have many functions in them, with some functions being similar to each other in name and concept. For example, I might have two similarly-named functions
isGaussian2, which take a
Prob (Vector Double) and a
Vector Double as input respectively:
isGaussian1 :: Prob (Vector Double) -> Bool
isGaussian1 pdf = if pdf is a product of iid gaussians then True else False
isGaussian2 :: Vector Double -> Bool
isGaussian2 vec = if a histogram of vec's values looks gaussian then True else False
It would be an instance of the mind projection fallacy if I started mixing these two functions up willy-nilly. For example if I say “my statistical test assumed normality, so I looked at a histogram of the data to make sure that assumption was correct”, I am jumping from
isGaussian2 mid-sentence, assuming that they are connected (which is often true) and that the connection works in the way I guessed it does (which is often false).
Here’s another example using statistics notions. I could define two standard deviation functions, for the so-called ‘population’ and ‘sample’ standard deviations:
sd1 :: Prob Double -> Double
sd2 :: Vector Double -> Double
Statisticians are typically good at keeping these conceptually distinct, since they learn them at the very beginning of their training. On the other hand, here’s an example that physicists tend to get mixed up about:
entropy :: Prob Microstate -> Double
disorder :: Microstate -> Double
While there is no standard definition of disorder, no definition can make it equivalent to Gibbs/Shannon entropy, which is a property of the probability distribution, not the state itself. Physicists with their concepts straight should recognise the compiler error right away.
I mentioned above that
Prob has a
Functor instance. That means you can take a function with type
a -> b and apply it to a
Prob a, yielding a
Prob b (this is analogous to how you might apply a function to every element of a vector, yielding a vector output where every element has the new type). The interpretation in probability is that you get the probability distribution for the transformed version of the input variable. So, any time you have a function
f :: a -> b, you also get a function
fmap f :: Prob a -> Prob b. This doesn’t get you a way around the mind projection fallacy, though. For that, the type would need to be
Prob a -> b.