Hacker News

What are these core principles?


The main characters of both papers are many-layered neural network architectures, autoencoders, and stochastic gradient descent. The interesting thing is that all of these ideas date from the 80s; the breakthrough was figuring out how to use unsupervised learning to seed a many-layered neural network so that it did not get mired in local optima.

The key idea is that if you train each layer in an unsupervised manner and then feed its outputs as features to the next layer, the network performs better when you go on to train it in a supervised way. That is, back-propagation on the pre-trained neural net learns a far more robust set of weights than it would without pretraining. Stochastic gradient descent (SGD) is a very simple optimization technique that is useful when you are working with massive data.
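To make the idea concrete, here is a minimal sketch of greedy layer-wise pretraining using simple one-layer autoencoders trained with plain SGD. All the sizes, learning rates, and the linear decoder are illustrative assumptions, not details from either paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder(X, hidden, epochs=50, lr=0.1):
    """Train a one-layer autoencoder with plain SGD; return encoder weights."""
    n_in = X.shape[1]
    W = rng.normal(0, 0.1, (n_in, hidden))   # encoder weights
    V = rng.normal(0, 0.1, (hidden, n_in))   # decoder weights
    for _ in range(epochs):
        for x in X:                          # SGD: update on one example at a time
            h = sigmoid(x @ W)               # encode
            r = h @ V                        # reconstruct (linear decoder)
            err = r - x                      # reconstruction error
            # Gradients of squared reconstruction error w.r.t. V and W
            V -= lr * np.outer(h, err)
            dh = (err @ V.T) * h * (1 - h)
            W -= lr * np.outer(x, dh)
    return W

def pretrain_stack(X, layer_sizes):
    """Greedy layer-wise pretraining: each layer's codes feed the next layer."""
    weights, activations = [], X
    for hidden in layer_sizes:
        W = train_autoencoder(activations, hidden)
        weights.append(W)
        activations = sigmoid(activations @ W)  # outputs become next layer's inputs
    return weights

X = rng.random((20, 8))          # toy unlabeled data
stack = pretrain_stack(X, [6, 4])
# `stack` would then seed a feedforward net for supervised backprop fine-tuning.
```

The point is only the shape of the procedure: train layer 1 unsupervised, freeze its encoding, train layer 2 on those codes, and so on, then fine-tune the whole stack with supervised backprop.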

Dahl's architecture uses layers of RBMs (very similar to autoencoders) to seed a regular but many-layered feedforward network; SGD is then used for back-propagation. The RBMs themselves are trained using a generative technique; see contrastive divergence for more.
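For a sense of what contrastive divergence looks like, here is a toy CD-1 update for a binary RBM. The network sizes and learning rate are made-up illustrations; real systems use momentum, weight decay, and mini-batches on top of this:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, a, b, v0, lr=0.05):
    """One contrastive-divergence (CD-1) step for a binary RBM.
    W: visible-by-hidden weights, a: visible biases, b: hidden biases."""
    # Positive phase: hidden probabilities and a sample, given the data vector
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back to visibles, then hiddens again
    pv1 = sigmoid(h0 @ W.T + a)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)
    # Approximate gradient: <v h> under the data minus under the model's sample
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    a += lr * (v0 - v1)
    b += lr * (ph0 - ph1)
    return W, a, b

n_vis, n_hid = 6, 3
W = rng.normal(0, 0.1, (n_vis, n_hid))
a = np.zeros(n_vis)
b = np.zeros(n_hid)
data = (rng.random((50, n_vis)) < 0.5).astype(float)  # toy binary data
for epoch in range(10):
    for v in data:
        W, a, b = cd1_update(W, a, b, v)
```

The trained weights W are exactly what gets copied into a layer of the feedforward network before backprop fine-tuning.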

The Google architecture is more complex and based on biological models. It is not trying to learn an explicit classifier; instead they train a many-layered autoencoder network to learn features. I only skimmed the paper, but they have multiple sublayers, each specialized for a particular type of processing (think Photoshop, not Intel), and using SGD they optimize an objective that essentially learns an effective decomposition of the data.
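To illustrate the "specialized sublayers" idea, here is a toy sketch of one filtering / pooling / normalization stage, which is the kind of fixed-function pipeline the paper stacks (the dense weights, sizes, and exact normalization here are my own simplifying assumptions, not the paper's local receptive fields):

```python
import numpy as np

def local_filter(x, W):
    """Filtering sublayer: learned linear features (dense here for simplicity)."""
    return x @ W

def pool(h, size=2):
    """Pooling sublayer: L2-pool neighboring features for invariance."""
    h = h.reshape(-1, size)
    return np.sqrt((h ** 2).sum(axis=1))

def contrast_normalize(p, eps=1e-5):
    """Normalization sublayer: subtract the mean, divide by the std."""
    return (p - p.mean()) / (p.std() + eps)

rng = np.random.default_rng(2)
x = rng.random(16)                       # toy input patch
W = rng.normal(0, 0.1, (16, 8))          # toy filter bank
out = contrast_normalize(pool(local_filter(x, W)))
# One filter -> pool -> normalize stage; the full network stacks several,
# with only the filtering weights learned via SGD on a reconstruction objective.
```

Each sublayer does one job, which is the "Photoshop, not Intel" point: a pipeline of specialized stages rather than one general-purpose transformation.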

The main takeaway is that if you can find an effective way to build layered abstractions, you will learn robustly.


There's a very good Google Tech Talk by Geoff Hinton (who has worked closely with Dahl on a lot of this research and developed some of the key algorithms in this field) that explains how to build deep belief networks using layers of RBMs: http://www.youtube.com/watch?v=AyzOUbkUf3M

That video focuses on handwritten digit recognition, but it's great for understanding the basics. There's a second Google Tech Talk video from a few years later that talks directly about phoneme recognition as well: http://www.youtube.com/watch?v=VdIURAu1-aU



