1. This is not a justification; it's a motivation / intuition. It proves little, since only the first visual layer (there are 6 visual cortex layers) does convolution-esque things.
2. This justifies convolution over fully-connected layers; it does not justify convolution in the first place.
None of these justify ReLU, max pooling, or batch norm.
For example, Geoffrey Hinton (probably the most neuroscience-motivated of all deep learning researchers) actually believes max pooling is terrible, that the brain doesn't do it, and that it's a shame it works as well as it does.
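For readers unfamiliar with the operations being debated here, a minimal numpy sketch of ReLU and 2×2 max pooling (batch norm omitted for brevity) — the feature map values are arbitrary placeholders, not from any real network:

```python
import numpy as np

def relu(x):
    # Element-wise rectified linear unit: max(0, x)
    return np.maximum(0.0, x)

def max_pool_2x2(x):
    # Non-overlapping 2x2 max pooling on a (H, W) feature map;
    # assumes H and W are even.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

feature_map = np.array([
    [-1.0,  2.0,  0.5, -0.5],
    [ 3.0, -2.0,  1.0,  4.0],
    [ 0.0,  1.0, -1.0,  2.0],
    [-3.0,  5.0,  0.0, -1.0],
])

# Each output cell is the max of one 2x2 block after ReLU; note that
# pooling discards exact position within the block, which is precisely
# the information loss Hinton objects to.
pooled = max_pool_2x2(relu(feature_map))
```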
Regarding your first point, I agree that it doesn't constitute a justification for "this is the best way" so much as justification for "this is worth trying".
When you say
> It proves little since only the first visual layer (there are 6 visual cortex layers) does convolution-esque things.
I admit I'm not clear on the roles of the various layers of visual cortex. But I should note that there are several convolution-like steps which in an artificial network would be implemented as separate layers: center-surround detection (which happens to occur before V1, but that's not really a problem for what I'm saying; it's a detail I skipped for concision's sake) feeds into the convolution-like operations of oriented-edge detection, corner detection, etc. It's not as if our visual system in toto performs just one convolution-like step.
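To make the "several convolution-like steps" point concrete, here's a toy numpy sketch stacking a crude center-surround filter and a crude oriented-edge filter; the kernels are hand-picked for illustration, not fitted to anything biological:

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Naive 'valid'-mode 2D convolution (cross-correlation, as in most
    # deep learning frameworks); for illustration only.
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Stage 1: a crude on-center / off-surround kernel (pre-V1 style).
center_surround = np.array([
    [-1.0, -1.0, -1.0],
    [-1.0,  8.0, -1.0],
    [-1.0, -1.0, -1.0],
])

# Stage 2: a crude vertical-edge detector (V1 style).
vertical_edge = np.array([
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
])

image = np.zeros((8, 8))
image[:, 4:] = 1.0  # a vertical luminance edge

stage1 = conv2d_valid(image, center_surround)  # responds at the edge only
stage2 = conv2d_valid(stage1, vertical_edge)   # responds along the edge
```

The point the sketch mirrors: even a bare-bones model of early vision involves more than one convolution-like stage composed in sequence.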
My second point must have been expressed poorly, because I have apparently failed to convey its meaning. It is meant to justify the particular choice of ordering convolutions before fully-connected layers: if you have already chosen to include both types, my point is that it makes sense to put the convolutions first.
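A toy numpy sketch of that ordering — convolutional feature extraction feeding a fully-connected readout. All shapes and weights are arbitrary placeholders, chosen only to show the structure:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(image, kernel):
    # Naive 'valid'-mode 2D convolution, for illustration only.
    kh, kw = kernel.shape
    h, w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(w)] for i in range(h)])

image = rng.standard_normal((8, 8))

# Convolution first: local, translation-tolerant feature extraction.
kernel = rng.standard_normal((3, 3))
features = np.maximum(0.0, conv2d_valid(image, kernel))  # conv + ReLU

# Fully-connected last: a global combination of the extracted features.
flat = features.ravel()                   # flatten the (6, 6) feature map
W = rng.standard_normal((10, flat.size))  # placeholder readout weights
logits = W @ flat                         # 10 class scores
```

The design point being argued: the convolutional stage exploits local structure cheaply, and only then does the fully-connected stage combine features globally.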
The things you described are all within V1 of the visual cortex — the first visual area (not actually the first layer, as I wrote above; there are 5 other visual areas, and they aren't convolution-esque).