I love to hate on U-net. It works, but it's just so inelegant. That it's not a true convolution and only works for particular 'patch' sizes bothers me to no end.
I am not super up to date with the field, but has anyone caught on to using 'wavenet'-like architectures yet? That is, dilated convolutions.
You have to be a little clever to get residual connections to work properly, but it's a true convolution that works for any patch size, is super parameter-efficient, and captures the same multi-scale features U-net was designed for.
Anecdotally, I used such an arch for some (unfortunately proprietary) 3D imaging work and achieved some nice results.
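For anyone who wants to poke at the idea, here is a rough pure-Python sketch of a stack of dilated convolutions with residual connections. It is not the proprietary architecture mentioned above. The function names, the centered zero-padded ('same') convolution, and the ReLU on the residual branch are all assumptions made for illustration; WaveNet itself uses causal convolutions and gated activations.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_conv1d(signal, kernel, dilation):
    """Centered, zero-padded 1D dilated convolution (cross-correlation form).

    The output has the same length as the input, whatever that length is.
    """
    k = len(kernel)
    assert k % 2 == 1, "odd kernel size keeps the output centered"
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - k // 2) * dilation  # taps spaced `dilation` apart
            if 0 <= idx < len(signal):         # out-of-range taps count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def residual_dilated_stack(signal, kernel, dilations):
    """Apply one dilated conv per dilation, adding each result back to its input."""
    x = list(signal)
    for d in dilations:
        y = dilated_conv1d(x, kernel, d)
        # Residual connection: shapes match because the conv preserves length.
        x = [a + max(b, 0.0) for a, b in zip(x, y)]
    return x


if __name__ == "__main__":
    # Doubling dilations grow the receptive field exponentially with depth.
    print(receptive_field(3, [1, 2, 4, 8]))
    # Any input length works; there is no constraint from pooling/upsampling.
    print(len(residual_dilated_stack([1.0] * 13, [0.25, 0.5, 0.25], [1, 2, 4])))
```

Because every layer preserves the input length, the residual add needs no cropping, and the same weights run on inputs of any size. The receptive field follows `1 + sum((k - 1) * d)`, so doubling the dilation per layer covers multiple scales with few parameters.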
Well that's sorta the point. Personally I'm not a super huge fan of creating a super specific network architecture that results in a 2-3% difference in performance. Certainly it's fine if you're doing something where a particular configuration makes sense (LSTM for time series, for example), but I think there needs to be a rethinking of the Grand Theory of Deep Learning Architecture TM.
And frankly I think an unsaid reason why U-net is so popular is that it generalizes reasonably well with limited data, and in many fields the available data is nowhere near as massive as COCO.
I realize it's sorta asking too much (I want a NN that works out of the box, super easily, and doesn't require a TON of data), but I think that's where the current pain points are for really explosive growth in AI.
> I think there needs to be a rethinking of the Grand Theory of Deep Learning Architecture TM.
Strong agree. Although perhaps not so much a rethinking as a theory at all. Huge dearth of theory in the field. Daily practice involves regular use of black-magic intuition for arch, problem posing, and debugging. Weird times.