This week I gave a presentation to the astronomy group here at the University of Auckland about some work I’ve been doing over the past year. As usual, that work involves fitting fairly complex models to datasets.
One question that I got related to overfitting. I find all the warnings we hear about overfitting, and all the methods we supposedly have to use to avoid it, a little odd. These messages completely clash with my own experience, which is that overfitting basically never happens and you don’t have to do anything special to avoid it.

In fact, the first time I ever fitted a gravitational lens model to an astronomical image, my big problem was underfitting, caused by a naive prior. I had used a Uniform(0, 1E6) prior, applied independently to some pixels, and it turns out that implies a very strong commitment to the sky being bright (each pixel’s prior expected value is 5E5, and almost all of the prior mass is on large values), which it isn’t.
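To see that commitment concretely, here is a minimal sketch (hypothetical code, not the original lensing model) that draws images from independent Uniform(0, 1E6) pixel priors and checks what kind of sky the prior actually believes in, before any data arrive:

```python
import random

random.seed(0)

def prior_image(num_pixels=1000):
    # One draw from the prior: each pixel brightness ~ Uniform(0, 1e6),
    # independently of all the others.
    return [random.uniform(0.0, 1e6) for _ in range(num_pixels)]

# Mean brightness of 100 images drawn from the prior.
means = [sum(img) / len(img) for img in (prior_image() for _ in range(100))]

# Every single prior draw has mean brightness close to 5e5: the prior is
# firmly committed to a bright sky, so a mostly dark image is 'surprising'
# and the fit struggles to get faint enough.
print(min(means), max(means))
```

With many independent pixels the image-average brightness concentrates tightly around 5E5, which is exactly the strong (and wrong) prior commitment described above.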
Most examples of ‘overfitting’ are caused by attempts to solve inference problems with optimisation methods. If an optimisation-based method breaks (overfits), that’s telling you something important: inference is not an optimisation problem, so you’re using the wrong tool for the job.
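A toy illustration of that failure mode (my own example, not one from the talk): take eight noisy points from a straight line and drive the misfit of a maximally flexible model, the degree-7 polynomial, all the way to zero. The optimiser ‘succeeds’ perfectly on the data and then predicts nonsense just outside it:

```python
# Eight points from y = x plus deterministic alternating 'noise' of +/- 0.5.
xs = list(range(8))
ys = [x + 0.5 * (-1) ** x for x in xs]

def interpolant(x):
    # The degree-7 polynomial through all eight points (Lagrange form):
    # the global optimum of the training misfit, which it drives to zero.
    total = 0.0
    for i, xi in enumerate(xs):
        term = ys[i]
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def line(x):
    # Ordinary least-squares straight line through the same points.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) \
        / sum((a - mx) ** 2 for a in xs)
    return my + slope * (x - mx)

# Zero misfit on the training points...
print(max(abs(interpolant(x) - y) for x, y in zip(xs, ys)))
# ...but an absurd prediction half a unit outside the data, where the
# sensible answer is about -0.5. The modest straight line stays sane.
print(interpolant(-0.5), line(-0.5))
```

The interpolant is the ‘best’ fit by the optimisation criterion, yet its prediction at x = -0.5 is off by an order of magnitude; that gap between fitting and predicting is what gets labelled overfitting.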