1. This is not a justification; it's a motivation / intuition. It proves little, since only the first visual layer (there are 6 visual cortex layers) does convolution-esque things.
2. This justifies convolution over fully-connected layers; it does not justify convolution in the first place.
None of these justify ReLU, max pooling, or batch norm.
For example, Geoffrey Hinton (probably the most neuroscience-motivated of all deep learning researchers) actually believes max pooling is terrible, that the brain doesn't do it, and that it's a shame it works as well as it does.
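For readers unfamiliar with the operations being debated here, a minimal numpy sketch of ReLU and 2×2 max pooling (batch norm omitted for brevity) — the feature map values are arbitrary placeholders, not from any real network:

```python
import numpy as np

def relu(x):
    # Element-wise rectified linear unit: max(0, x)
    return np.maximum(0.0, x)

def max_pool_2x2(x):
    # Non-overlapping 2x2 max pooling on a (H, W) feature map;
    # assumes H and W are even.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

feature_map = np.array([
    [-1.0,  2.0,  0.5, -0.5],
    [ 3.0, -2.0,  1.0,  4.0],
    [ 0.0,  1.0, -1.0,  2.0],
    [-3.0,  5.0,  0.0, -1.0],
])

# Each output cell is the max of one 2x2 block after ReLU; note that
# pooling discards exact position within the block, which is precisely
# the information loss Hinton objects to.
pooled = max_pool_2x2(relu(feature_map))
```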
Regarding your first point, I agree that it doesn't constitute a justification for "this is the best way" so much as justification for "this is worth trying".
When you say
> It proves little since only the first visual layer (there are 6 visual cortex layers) does convolution-esque things.
I admit I'm not clear on the roles of the various layers of visual cortex. But I should note that there are several convolution-like steps which in an artificial network would be implemented as separate layers: center-surround detection (which happens to occur before V1, but that's not really a problem for what I'm saying; it's a detail I skipped for concision's sake) feeds into the convolution-like operations of oriented-edge detection, corner detection, etc. It's not as if our visual system in toto performs just one convolution-like step.
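To make the "several convolution-like steps" point concrete, here's a toy numpy sketch stacking a crude center-surround filter and a crude oriented-edge filter; the kernels are hand-picked for illustration, not fitted to anything biological:

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Naive 'valid'-mode 2D convolution (cross-correlation, as in most
    # deep learning frameworks); for illustration only.
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Stage 1: a crude on-center / off-surround kernel (pre-V1 style).
center_surround = np.array([
    [-1.0, -1.0, -1.0],
    [-1.0,  8.0, -1.0],
    [-1.0, -1.0, -1.0],
])

# Stage 2: a crude vertical-edge detector (V1 style).
vertical_edge = np.array([
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
])

image = np.zeros((8, 8))
image[:, 4:] = 1.0  # a vertical luminance edge

stage1 = conv2d_valid(image, center_surround)  # responds at the edge only
stage2 = conv2d_valid(stage1, vertical_edge)   # responds along the edge
```

The point the sketch mirrors: even a bare-bones model of early vision involves more than one convolution-like stage composed in sequence.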
My second point must have been expressed poorly, because I have apparently failed to convey its meaning. It is meant to justify the particular choice of ordering convolutions before fully-connected layers: if you have already chosen to include both types, my point is that it makes sense to put the convolutions first.
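A toy numpy sketch of that ordering — convolutional feature extraction feeding a fully-connected readout. All shapes and weights are arbitrary placeholders, chosen only to show the structure:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(image, kernel):
    # Naive 'valid'-mode 2D convolution, for illustration only.
    kh, kw = kernel.shape
    h, w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(w)] for i in range(h)])

image = rng.standard_normal((8, 8))

# Convolution first: local, translation-tolerant feature extraction.
kernel = rng.standard_normal((3, 3))
features = np.maximum(0.0, conv2d_valid(image, kernel))  # conv + ReLU

# Fully-connected last: a global combination of the extracted features.
flat = features.ravel()                   # flatten the (6, 6) feature map
W = rng.standard_normal((10, flat.size))  # placeholder readout weights
logits = W @ flat                         # 10 class scores
```

The design point being argued: the convolutional stage exploits local structure cheaply, and only then does the fully-connected stage combine features globally.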
The things you described are all within V1 of the visual cortex — the first visual area (not actually the first layer, as I wrote above; there are 5 other visual areas, and they aren't convolution-esque).