Sigh. No, it doesn't "outperform humans" at "face recognition". In particular, see my previous comments [1] and [2] for discussion of why this method might be doing well.
As for "outperforming humans", a more accurate statement might be, "this algorithm outperforms (for this simplistic task) one experiment done with a limited set of humans on this one particular dataset which has been in the community for 10 years now and is thus highly gameable."
But I realize that's a lot less pithy.
In particular, this dataset is nearing saturation, and whenever that happens, differences in the accuracy numbers often don't mean much. So for example with Facebook's number at 97.53% and this paper's at 98.52%, you're talking about the difference between getting 148 pairs of faces wrong vs 89 pairs wrong. In practical terms, as a researcher working with a dataset like this, you very quickly learn to focus on just the pairs your algorithm gets wrong, and it's impossible not to subconsciously optimize for getting those few cases correct, even if those tweaks wouldn't actually help in the general case.
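To see where those error counts come from, here's a quick back-of-the-envelope check, assuming the standard LFW verification protocol of 6,000 face pairs (the specific accuracies are the ones quoted above):

```python
# Back-of-the-envelope: number of misclassified pairs implied by a
# given accuracy on the LFW verification protocol (6,000 face pairs).
PAIRS = 6000

def pairs_wrong(accuracy):
    """Pairs misclassified at the given accuracy, rounded to whole pairs."""
    return round(PAIRS * (1 - accuracy))

print(pairs_wrong(0.9753))  # Facebook's number -> 148 pairs wrong
print(pairs_wrong(0.9852))  # this paper's number -> 89 pairs wrong
```

So the headline-grabbing ~1% accuracy gap is a difference of about 59 pairs out of 6,000.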
It depends on the lighting, camera setup, CPU, available processing time, and finally your intended purpose.
Face detection/recognition is a broad subject, and the applications are broad.
There are things like Haar cascades, which can pick out a face-like object from a scene (or anything else they've been "trained" on), but they can't tell faces apart. They can be tuned to run in real-time apps (like the autofocus on cameras).
Haar cascades are limited in a number of ways: they can't tell faces apart, you need a different cascade for different views (frontal/profile/other), and they can also have trouble with skin colour (depending on training).
More advanced algorithms can work out face orientation in real time (Google Hangouts moustaches and the like), but once again they aren't able to tell two faces apart.
However, there are no accurate real-time (or anywhere near real-time) algorithms that can tell faces apart (i.e. put a name to a face in a crowd). In fact, I would go so far as to say that there are no accurate non-real-time ones either.
[1] https://news.ycombinator.com/item?id=7637866
[2] https://news.ycombinator.com/item?id=7638269