For those looking to implement a similar service, a good starting point is this project from researchers at UC Irvine: http://www.ics.uci.edu/~xzhu/face/
This work was just presented at the Conference on Computer Vision and Pattern Recognition (CVPR), the leading conference in vision. This paper was an oral presentation (< 5% of submissions are), and having worked in this space, I can say this is the real deal -- their results are incredible, and I've already heard that they are indeed reproducible.
Normally, face algorithms operate independently, in a sort of pipeline: face detection ("where in the image are the faces?") -> pose detection ("which direction are the faces looking?") -> fiducial detection ("where are the eyes, nose, mouth, etc. on each face?") -> alignment ("warp the faces to make them have a more similar pose") -> recognition ("what is the identity of each face?").
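To make the pipeline concrete, here's a minimal sketch in Python. Every function body is a placeholder -- in a real system each stage would be a trained detector or model -- and all the names and return values are illustrative, not from the paper:

```python
# Sketch of the traditional face pipeline: detection -> pose -> fiducials
# -> alignment -> recognition. All stage implementations are stubs.

from dataclasses import dataclass, field

@dataclass
class Face:
    box: tuple                                       # (x, y, w, h) from detection
    pose: float = 0.0                                # e.g. yaw angle in degrees
    fiducials: dict = field(default_factory=dict)    # e.g. {"left_eye": (x, y)}
    identity: str = ""                               # filled in by recognition

def detect_faces(image):
    """Stage 1: where in the image are the faces?"""
    return [Face(box=(10, 10, 50, 50))]  # placeholder: one fixed detection

def estimate_pose(image, face):
    """Stage 2: which direction is the face looking?"""
    face.pose = 0.0  # placeholder: assume frontal
    return face

def locate_fiducials(image, face):
    """Stage 3: where are the eyes, nose, mouth, etc.?"""
    face.fiducials = {"left_eye": (25, 25), "right_eye": (45, 25)}
    return face

def align(image, face):
    """Stage 4: warp the face crop toward a canonical pose."""
    return image  # placeholder: identity warp

def recognize(aligned_crop, face):
    """Stage 5: what is the identity of the face?"""
    face.identity = "unknown"  # placeholder classifier
    return face

def pipeline(image):
    results = []
    for face in detect_faces(image):
        face = estimate_pose(image, face)
        face = locate_fiducials(image, face)
        crop = align(image, face)
        results.append(recognize(crop, face))
    return results
```

The point of the paper, in these terms, is that stages 1-3 are handled jointly by a single model rather than by three independent components chained together.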
This method does the first three in an integrated way (face, pose, and fiducial detection), and with many fewer training examples than most commercial systems (including face.com and Google's Picasa) achieves really impressive performance. This is a huge deal, because normally more training data = better performance, and also because by doing all three steps together, the numbers reported are much less optimistic than they normally are in such papers (which always assume that things "upstream" happen perfectly).
The need for less data matters especially for open systems, like the ones many commenters here are suggesting to build, because sharing images of faces runs into copyright and privacy issues. As a company, you can collect a large dataset of images, and as long as you never share it with anyone, it's okay to use for training your algorithms. But if you're building an open consortium/system, then almost by definition the training images have to be shared, which is a big problem because now you're limited to the very small set of available data that is cleared for such use.
As far as code goes, there is Matlab code available on the linked page, but it's not clear what its license is. By default, I would assume it's "for research purposes only", but the paper goes into enough detail on the method that people who are worried could reproduce it from scratch. The approach itself is quite similar to the traditional "deformable part model" that is the basis for most top-performing object recognition methods (co-invented by one of the authors of this paper, btw), and the modifications to deal with faces are not very complicated.
Face recognition is still very much an unsolved problem, and while the face.com guys had some interesting approaches, it's not clear those approaches were the right way to go. And a large part of getting recognition right is getting all the previous steps right, so building on this work would be a good place to start.
Also, one way of building a recognition system ("whose face is this?") is using verification ("are these two faces of the same person?") as the key inner loop. If you take this approach, then you should pay close attention to the results on the Labeled Faces in the Wild (LFW) benchmark, which is the current de facto standard that vision researchers evaluate on: http://vis-www.cs.umass.edu/lfw/results.html
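Here's a rough sketch of what "verification as the inner loop" means in code. The `embed` function is a stand-in for any face-embedding model, and the distance threshold is made up for illustration -- a real system would learn both:

```python
# Recognition built on top of verification: verify(a, b) answers
# "same person?", and recognize() runs it against a gallery of
# known faces. Embedding and threshold are illustrative stubs.

import math

def embed(face_pixels):
    # Stand-in: a real system would run a learned embedding model here.
    # For this sketch we assume the input is already a feature vector.
    return face_pixels

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(face_a, face_b, threshold=0.5):
    """Are these two faces the same person?"""
    return distance(embed(face_a), embed(face_b)) < threshold

def recognize(query, gallery, threshold=0.5):
    """Whose face is this? Nearest gallery match, accepted only if
    it would also pass verification against the query."""
    q = embed(query)
    best_name, best_dist = None, float("inf")
    for name, exemplar in gallery.items():
        d = distance(q, embed(exemplar))
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist < threshold else None
```

For example, with a toy gallery `{"alice": [0.0, 0.0], "bob": [1.0, 1.0]}`, a query near the origin recognizes as "alice", while a query far from everyone returns None (an "unknown person" answer, which a pure classifier can't give you as naturally).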
Since this is my area of expertise, I'm happy to answer any other questions that people might have.