
What I don’t see clearly addressed here is whether the test data these networks are validated against comes from the same larger data set that was used for the initial training. I’m guessing the validation data is usually drawn from that same data set, in which case it’s not really a surprise that a massively overfitted network would perform pretty well against it. Whereas an alternative data set produced by different people under even slightly different conditions would introduce many new, unexpected variables that the network isn’t equipped to handle, and that’s when the “overfitting” to the original data set would become obvious. But I’m going to guess that in practice, useful data sets vary so much that this sort of cross-checking is impractical (and in reality it wouldn’t happen anyway, because nobody wants to publish a negative result).


I'm not in the field (nor an academic, nor particularly smart), but my impression is that this is implicit in nearly everything people are doing. Almost all of the papers talk about interpolation, not extrapolation. More specifically, training data and test data are assumed to be partitions of a single existing data set into training and test portions. "Generalization" is measured only by success at fitting the test portion.
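To make that concrete, here is a minimal sketch of the usual setup, using scikit-learn with placeholder synthetic data (the features, labels, and model choice are all illustrative, not from any particular paper): "generalization" ends up meaning accuracy on a held-out slice of the very same data set.

```python
# Minimal sketch: "generalization" as accuracy on a held-out slice of the
# *same* data set. X and y are synthetic stand-ins for whatever data a
# given paper happens to use.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                          # placeholder features
y = (X[:, 0] + 0.1 * rng.normal(size=1000) > 0).astype(int)  # placeholder labels

# Both portions come from the same collection process and distribution.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```

The key point is that the test portion, however carefully held out, still shares the population, collection process, and time period of the training portion.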

Of course, most actual applications immediately break out of that model by running live against previously unobserved data from different populations, different times (just think: post-2020 vs pre-2020!), and often different purposes. And much of the error that arises because you're now extrapolating probably just gets treated as an engineering problem?
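A toy illustration of that gap, again with synthetic data and an arbitrary model choice (nothing here comes from the thread or any paper): the underlying rule never changes, only the input distribution moves, and a model that interpolates well inside its training range falls apart once it has to extrapolate.

```python
# Minimal sketch of interpolation vs extrapolation: the label rule is fixed
# everywhere, but the "live" inputs fall outside the range the model was
# fit on. A tree ensemble scores well on held-out data from the training
# distribution and degrades badly on the shifted range.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

def make_data(n, low, high):
    X = rng.uniform(low, high, size=(n, 1))
    y = np.sin(X[:, 0])                      # same underlying rule everywhere
    return X, y

X_train, y_train = make_data(2000, -3.0, 3.0)   # original data range
X_in,    y_in    = make_data(500,  -3.0, 3.0)   # held-out, same population
X_out,   y_out   = make_data(500,   3.0, 6.0)   # previously unobserved range

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
print("R^2 on same distribution:   ", model.score(X_in, y_in))
print("R^2 on shifted distribution:", model.score(X_out, y_out))
```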



