For classification, I think of it simply as a nonlinear transformation + multivariable logistic regression, where the parameters are learned jointly. In particular, the nonlinear transformation is assumed to take the form of some number of affine transformations, each followed by a nonlinear component-wise mapping. I tend to intentionally avoid brain comparisons because: 1) there's more than enough of that already, and 2) I don't know enough of the neurophysiology to speculate. I'd like to see some mathematical analysis of which classes of functions are more efficiently represented (and/or learned) by networks with increasing numbers of layers.
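That view can be written down directly. Below is a minimal NumPy sketch (layer sizes, the ReLU choice of nonlinearity, and all names are made up for illustration): a "network" is just a stack of affine maps, each followed by a component-wise nonlinearity, with a softmax (multinomial logistic regression) on the transformed features at the end.

```python
import numpy as np

rng = np.random.default_rng(0)

def affine(x, W, b):
    # One affine transformation: x -> xW + b
    return x @ W + b

def relu(x):
    # Component-wise nonlinear mapping (ReLU chosen arbitrarily here)
    return np.maximum(x, 0.0)

def softmax(z):
    # Multinomial logistic regression output: normalized class probabilities
    z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(x, params):
    # The "nonlinear transformation": affine layers, each followed by a
    # component-wise nonlinearity...
    *hidden, (W_out, b_out) = params
    for W, b in hidden:
        x = relu(affine(x, W, b))
    # ...then logistic regression on the transformed features.
    return softmax(affine(x, W_out, b_out))

# Made-up sizes: 4 input features, one hidden layer of width 8, 3 classes.
params = [
    (rng.standard_normal((4, 8)), np.zeros(8)),
    (rng.standard_normal((8, 3)), np.zeros(3)),
]
probs = forward(rng.standard_normal((5, 4)), params)  # shape (5, 3)
```

"Learned jointly" then just means gradient descent on the cross-entropy loss with respect to every `W` and `b` at once, rather than fitting the transformation and the logistic regression separately.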

