Recently I popped across the road to AUT to give a talk for their statisticians and applied mathematicians. In the Q&A session, I got asked about overfitting, and had the same experience I described in this old post. This motivated me to elaborate on that post, which I did in the form of a video talk.
Here it is – I hope you enjoy it.
There’s one thing I wasn’t completely clear about towards the end of the talk, in the bit with the red and green bars where I discuss trans-dimensional models. The green parts are meant to represent the regions of parameter space that fit the data. The regions that overfit the data are only a tiny subset of the green bars, even in the complex model on the right-hand side of the slides. So even if you conditioned on the model all the way on the right, you wouldn’t get overfitting unless you optimised within that model, because the posterior spreads its probability over the whole green region rather than concentrating on the best-fitting point.
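To make that last point concrete, here is a minimal sketch in Python with NumPy. It is not taken from the talk: the toy data, the degree-9 polynomial, the known noise level `sigma`, and the Gaussian prior width `tau` are all assumptions chosen for illustration. It compares optimising within an over-complex model (least squares, which here passes through every data point) with conditioning on that same model and sampling from the posterior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: a straight line plus Gaussian noise of known size.
n, sigma = 10, 0.5
x = np.linspace(-1.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, sigma, n)

# Deliberately over-complex model: a degree-9 polynomial,
# i.e. 10 coefficients for 10 data points.
X = np.vander(x, N=n, increasing=True)

# Optimising within the model: least squares is the maximum-likelihood fit.
beta_ml, *_ = np.linalg.lstsq(X, y, rcond=None)

# Conditioning on the model instead: with an assumed prior
# beta ~ N(0, tau^2 I) and known sigma, the posterior is Gaussian.
tau = 3.0
precision = X.T @ X / sigma**2 + np.eye(n) / tau**2
cov = np.linalg.inv(precision)
cov = 0.5 * (cov + cov.T)  # remove tiny asymmetries from the inversion
mean = cov @ X.T @ y / sigma**2
samples = rng.multivariate_normal(mean, cov, size=2000)


def rms_residual(beta):
    """Root-mean-square misfit between the data and the curve given by beta."""
    return np.sqrt(np.mean((y - X @ beta) ** 2))


rms_ml = rms_residual(beta_ml)
rms_samples = np.array([rms_residual(b) for b in samples])

print("RMS residual of the optimised fit:", rms_ml)
print("Median RMS residual of posterior samples:", np.median(rms_samples))
print("Fraction of samples fitting nearly as tightly as the optimum:",
      np.mean(rms_samples < 0.1 * sigma))
```

The optimised fit has the smallest residual by construction, since least squares minimises exactly that quantity. The interesting number is the last one: it estimates how much posterior probability sits in the part of parameter space that hugs the data as closely as the optimum does, under these particular assumptions about the prior and the noise.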