Results


Introduction


LFW provides information for supervised learning under two different training paradigms: image-restricted and unrestricted. Under the image-restricted setting, only binary "matched" or "mismatched" labels are given for pairs of images. Under the unrestricted setting, the identity of the person appearing in each image is also available, allowing one to potentially form additional image pairs. For more information, see the readme.
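As a rough illustration of the difference between the two settings, the sketch below forms matched pairs from identity labels alone, which is only possible in the unrestricted setting. The mapping from person names to image ids is a hypothetical layout for illustration; the actual LFW file formats are described in the readme.

```python
from itertools import combinations

def matched_pairs_from_identities(identities):
    """Form every same-person image pair from identity labels.

    `identities` maps a person name to that person's image ids
    (a hypothetical layout). Under the image-restricted setting
    this is not possible: only the pre-specified labeled pairs
    may be used.
    """
    pairs = []
    for person, images in identities.items():
        pairs.extend(combinations(images, 2))
    return pairs

# Three images of one person yield three matched pairs.
pairs = matched_pairs_from_identities(
    {"George_W_Bush": ["img_0001", "img_0002", "img_0003"]})
```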

Algorithms designed for LFW often also make use of additional, external sources of training information. The issue first arose when facial landmark detectors were used to align the images (Huang et al.4). Because these detectors were pre-trained on face part images outside of LFW, the algorithm implicitly made use of an additional source of information. Since such outside sources of training data can have a large impact on recognition accuracy, their use must be considered when comparing algorithm performance. We have therefore roughly divided the image-restricted results into several classes based on the amount of outside training data used. There are also additional notes on this issue.

Results in red indicate methods that have been accepted but not yet published (e.g. accepted to an upcoming conference). Results in green indicate commercial recognition systems whose algorithms have not been published and peer-reviewed. We emphasize that researchers should not feel compelled to compare against either of these types of results.

Image-Restricted Training Results


Strict LFW, no outside training data used: [see notes on the use of outside training data]

û ± SE
Eigenfaces1, original 0.6002 ± 0.0079
Nowak2, original 0.7245 ± 0.0040
Nowak2, funneled3 0.7393 ± 0.0049
Hybrid descriptor-based5, funneled 0.7847 ± 0.0051
3x3 Multi-Region Histograms (1024)6 0.7295 ± 0.0055
Pixels/MKL, funneled7 0.6822 ± 0.0041
V1-like/MKL, funneled7 0.7935 ± 0.0055
APEM (fusion), funneled25 0.8408 ± 0.0120
MRF-MLBP30 0.7908 ± 0.0014
Fisher vector faces32 0.8747 ± 0.0149

Outside training data used for alignment or feature extraction: [notes]

(commercial system, see note at top)
MERL4 0.7052 ± 0.0060
MERL+Nowak4, funneled 0.7618 ± 0.0058

LDML, funneled8 0.7927 ± 0.0060
Hybrid, aligned9 0.8398 ± 0.0035
Combined b/g samples based methods, aligned10 0.8683 ± 0.0034
Single LE + holistic14 0.8122 ± 0.0053
LBP + CSML, aligned15 0.8557 ± 0.0052
CSML + SVM, aligned15 0.8800 ± 0.0037
High-Throughput Brain-Inspired Features, aligned16 0.8813 ± 0.0058
LARK supervised20, aligned 0.8510 ± 0.0059
DML-eig SIFT21, funneled 0.8127 ± 0.0230
DML-eig combined21, funneled & aligned 0.8565 ± 0.0056
Convolutional DBN37 0.8777 ± 0.0062
SFRD+PMML28 0.8935 ± 0.0050
Pose Adaptive Filter (PAF)31 0.8777 ± 0.0051
Sub-SML35 0.8973 ± 0.0038
VMRS36 0.9110 ± 0.0059

Outside training data in recognition system (beyond alignment/feature extraction): [notes]

Attribute classifiers11 0.8362 ± 0.0158
Simile classifiers11 0.8414 ± 0.0131
Attribute and Simile classifiers11 0.8529 ± 0.0123
NReLU13 0.8073 ± 0.0134
Multiple LE + comp14 0.8445 ± 0.0046
Associate-Predict18 0.9057 ± 0.0056
Tom-vs-Pete23 0.9310 ± 0.0135
Tom-vs-Pete + Attribute23 0.9330 ± 0.0128
combined Joint Bayesian26 0.9242 ± 0.0108
high-dim LBP27 0.9517 ± 0.0113
TL Joint Bayesian34 0.9633 ± 0.0108

Human performance, measured through Amazon Mechanical Turk:

Human, funneled11 0.9920
Human, cropped11 0.9753
Human, inverse mask11 0.9427
Table 1: Mean classification accuracy û and standard error of the mean SE.
Fig 1a: ROC curves averaged over 10 folds of View 2, all methods*.
Fig 1b: ROC curves averaged over 10 folds of View 2, best performing*.
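The û ± SE values in the tables above are computed from the per-fold accuracies on the 10 folds of View 2. A minimal sketch of that computation, using made-up fold accuracies for illustration (not from any real method):

```python
import math

def mean_and_se(fold_accuracies):
    """Mean classification accuracy and standard error of the mean,
    as reported in the tables (u-hat +/- SE)."""
    n = len(fold_accuracies)
    mean = sum(fold_accuracies) / n
    # Sample standard deviation over the folds, then SE = s / sqrt(n).
    s = math.sqrt(sum((a - mean) ** 2 for a in fold_accuracies) / (n - 1))
    return mean, s / math.sqrt(n)

# Illustrative per-fold accuracies over 10 folds.
accs = [0.71, 0.73, 0.72, 0.74, 0.73, 0.72, 0.71, 0.74, 0.73, 0.72]
mean, se = mean_and_se(accs)
```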


Unrestricted Training Results


û ± SE
LDML-MkNN, funneled8 0.8750 ± 0.0040
Combined multishot, aligned9 0.8950 ± 0.0051
LBP multishot, aligned9 0.8517 ± 0.0061
LBP PLDA, aligned17 0.8733 ± 0.0055
combined PLDA, funneled & aligned17 0.9007 ± 0.0051
combined Joint Bayesian26 0.9090 ± 0.0148
high-dim LBP27 0.9318 ± 0.0107
Fisher vector faces32 0.9303 ± 0.0105
Sub-SML35 0.9075 ± 0.0064
VMRS36 0.9205 ± 0.0045

(commercial system, see note at top)
face.com r2011b19 0.9130 ± 0.0030
CMD, aligned22 0.9170 ± 0.0110
SLBP, aligned22 0.9000 ± 0.0133
CMD+SLBP, aligned22 0.9258 ± 0.0136
VisionLabs ver. 1.0, aligned38 0.9290 ± 0.0031
Aurora, aligned39 0.9324 ± 0.0044
Face++40 0.9727 ± 0.0065
Table 2: Mean classification accuracy û and standard error of the mean SE.
Fig 2a: ROC curves averaged over 10 folds of View 2, published*.

Fig 2b: ROC curves averaged over 10 folds of View 2, all*.


Unsupervised Results


û ± SE
SD-MATCHES, 125x12512, aligned 0.6410 ± 0.0062
H-XS-40, 81x15012, aligned 0.6945 ± 0.0048
GJD-BC-100, 122x22512, aligned 0.6847 ± 0.0065
LARK unsupervised20, aligned 0.7223 ± 0.0049
LHS29, aligned 0.7340 ± 0.0040
I-LPQ*24, aligned 0.8620 ± 0.0046
Pose Adaptive Filter (PAF)31 0.8777 ± 0.0051
MRF-MLBP30 0.8008 ± 0.0013
DFD33 0.8402 ± 0.0044
VMRS36 0.8857 ± 0.0037
Table 3: Mean classification accuracy û and standard error of the mean SE.
Fig 3: ROC curves over View 2*.


Notes


* Each point on the curve represents the average over the 10 folds of (false positive rate, true positive rate) for a fixed threshold.
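In code, the averaging described in this note might look like the following sketch, where each fold supplies (similarity score, is-match) pairs. This layout is hypothetical, for illustration only; it is not the actual LFW file format.

```python
def averaged_roc(fold_scores, thresholds):
    """For each fixed threshold, average (FPR, TPR) over the folds.

    `fold_scores` is a list of folds, each a list of
    (similarity_score, is_match) pairs (hypothetical layout).
    Returns one (avg FPR, avg TPR) point per threshold.
    """
    curve = []
    for t in thresholds:
        fprs, tprs = [], []
        for fold in fold_scores:
            pos = [s for s, m in fold if m]
            neg = [s for s, m in fold if not m]
            tprs.append(sum(s >= t for s in pos) / len(pos))
            fprs.append(sum(s >= t for s in neg) / len(neg))
        curve.append((sum(fprs) / len(fprs), sum(tprs) / len(tprs)))
    return curve
```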

(u) indicates that the ROC curve is for the unrestricted setting.

On the use of outside training data:

The use of training data outside of LFW can have a significant impact on recognition performance. For instance, it was shown in Wolf et al.10 that using LFW-a, the version of LFW aligned using a trained commercial alignment system, improved the accuracy of the early Nowak and Jurie method2 from 0.7393 on the funneled images to 0.7912, despite the fact that this method was designed to handle some misalignment.

To enable the fair comparison of different algorithms on LFW, we ask that researchers be specific about what type of outside training data was used in the experiments. We have also roughly separated the results into three categories.

The first class of results uses only the training data provided in LFW. The second class makes implicit use of outside training data through trained facial feature detectors, which are used either to align the images (as in LFW-a) or to determine where in the image to extract features. The third class makes explicit use of outside training data in the recognition system itself, beyond the alignment/feature extraction stage of the second class.

Notes on the type of outside training data used for specific systems can be found in the list of methods at the bottom of the page. Details regarding training data falling under the second class are marked by sections beginning with a †, and under the third class by sections beginning with a ‡.

Generating ROC Curves


The following script can be used to generate ROC curves using gnuplot: create_lfw_all_roc.p (only restricted / unrestricted / unsupervised).

The script takes one text file per method, each line of which contains a point on the ROC curve: the average true positive rate, followed by the average false positive rate, separated by a single space. Additional methods can be added by extending the plot command, e.g.
plot "nowak-original-roc.txt" using 2:1 with lines title "Nowak, original", \
     "nowak-funneled-roc.txt" using 2:1 with lines title "Nowak, funneled", \
     "new-method-roc.txt" using 2:1 with lines title "New Method"
Existing ROC files can be downloaded here.

Notes: gnuplot is multi-platform and freely distributed, and can be downloaded here. create_lfw_roc.p can either be run as a shell script on Unix/Linux machines (e.g. chmod u+x create_lfw_roc.p; ./create_lfw_roc.p) or loaded through gnuplot (e.g. at the gnuplot command line gnuplot> load "create_lfw_roc.p").
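For those generating their own input files, the following minimal Python sketch writes a method's ROC points in the format the gnuplot script expects: average true positive rate first, then average false positive rate, one space-separated pair per line. The filename and the sample points are illustrative only.

```python
def write_roc_file(path, points):
    """Write ROC points in the format the gnuplot script expects:
    'avg_tpr avg_fpr' on each line, separated by a single space."""
    with open(path, "w") as f:
        for fpr, tpr in points:  # points given as (FPR, TPR)
            f.write(f"{tpr:.6f} {fpr:.6f}\n")

# Illustrative points for a hypothetical method.
write_roc_file("new-method-roc.txt", [(0.0, 0.0), (0.1, 0.8), (1.0, 1.0)])
```

The plot command then reads such a file with `using 2:1`, placing the false positive rate on the x-axis and the true positive rate on the y-axis, matching the existing entries in the script.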

Methods

  1. Matthew A. Turk and Alex P. Pentland.
    Face Recognition Using Eigenfaces.
    Computer Vision and Pattern Recognition (CVPR), 1991.
    [pdf]

  2. Eric Nowak and Frederic Jurie.
    Learning visual similarity measures for comparing never seen objects.
    Computer Vision and Pattern Recognition (CVPR), 2007.
    [pdf]
    [webpage]

    Results were obtained using the binary available from the paper's webpage. View 1 of the database was used to compute the cut-off threshold used in computing mean classification accuracy on View 2. For each of the 10 folds of View 2 of the database, 9 of the sets were used as training, the similarity measures were computed for the held out test set, and the threshold value was used to classify pairs as matched or mismatched. This procedure was performed both on the original images as well as the set of aligned images from the funneled parallel database.

    We used the same parameters given on the paper's webpage, with C=1 for the SVM, specifically:

    pRazSimiERCF -verbose 2 -ntrees 5 -maxleavesnb 25000 -nppL 100000 -ncondtrial 1000 -nppT 1000 -wmin 15 -wmax 100 -neirelsize 1 -svmc 1

  3. Gary B. Huang, Vidit Jain, and Erik Learned-Miller.
    Unsupervised joint alignment of complex images.
    International Conference on Computer Vision (ICCV), 2007.
    [pdf]
    [webpage]

    Face images were aligned using the publicly available source code from the project webpage.

  4. Gary B. Huang, Michael J. Jones, and Erik Learned-Miller.
    LFW Results Using a Combined Nowak Plus MERL Recognizer.
    Faces in Real-Life Images Workshop in European Conference on Computer Vision (ECCV), 2008.
    [pdf]
    [commercial system, see note at top]

    Face images were aligned using a commercial system that attempts to identify nine facial landmark points through Viola-Jones type landmark detectors.

  5. Lior Wolf, Tal Hassner, and Yaniv Taigman.
    Descriptor Based Methods in the Wild.
    Faces in Real-Life Images Workshop in European Conference on Computer Vision (ECCV), 2008.
    [pdf]
    [webpage]

  6. Conrad Sanderson and Brian C. Lovell.
    Multi-Region Probabilistic Histograms for Robust and Scalable Identity Inference.
    International Conference on Biometrics (ICB), 2009.
    [pdf]

  7. Nicolas Pinto, James J. DiCarlo, and David D. Cox.
    How far can you get with a modern face recognition test set using only simple features?
    Computer Vision and Pattern Recognition (CVPR), 2009.
    [pdf]

  8. Matthieu Guillaumin, Jakob Verbeek, and Cordelia Schmid.
    Is that you? Metric Learning Approaches for Face Identification.
    International Conference on Computer Vision (ICCV), 2009.
    [pdf]
    [webpage]

    SIFT features were extracted at nine facial feature points using the detector of Everingham, Sivic, and Zisserman, 'Hello! My name is... Buffy' - automatic naming of characters in TV video, BMVC, 2006.

  9. Yaniv Taigman, Lior Wolf, and Tal Hassner.
    Multiple One-Shots for Utilizing Class Label Information.
    British Machine Vision Conference (BMVC), 2009.
    [pdf]
    [webpage]

    Used LFW-a, a version of LFW aligned using a commercial, fiducial-points based alignment system.

  10. Lior Wolf, Tal Hassner, and Yaniv Taigman.
    Similarity Scores based on Background Samples.
    Asian Conference on Computer Vision (ACCV), 2009.
    [pdf]

    Used LFW-a.

  11. Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar.
    Attribute and Simile Classifiers for Face Verification.
    International Conference on Computer Vision (ICCV), 2009.
    [pdf]
    [webpage]

    A commercial face detector - Omron, OKAO vision - was used to detect fiducial point locations. These locations were used to align the images and extract features from particular face regions.

    Attribute classifiers (e.g. Brown Hair) were trained using outside data and Amazon Mechanical Turk labelings, and simile classifiers (e.g. mouth similar to Angelina Jolie) were trained using images from PubFig. The outputs of these classifiers on LFW images were used as features in the recognition system.

    The computed attributes for all images in LFW can be obtained in this file: lfw_attributes.txt. The file format and meaning are described on this page, and further information on the attributes can be found on the project website.

  12. Javier Ruiz-del-Solar, Rodrigo Verschae, and Mauricio Correa.
    Recognition of Faces in Unconstrained Environments: A Comparative Study.
    EURASIP Journal on Advances in Signal Processing (Recent Advances in Biometric Systems: A Signal Processing Perspective), Vol. 2009, Article ID 184617, 19 pages.
    [pdf]

  13. Vinod Nair and Geoffrey E. Hinton.
    Rectified Linear Units Improve Restricted Boltzmann Machines.
    International Conference on Machine Learning (ICML), 2010.
    [pdf]

    Used the Machine Perception Toolbox from MPLab, UCSD, to detect eye locations; manually corrected the eye coordinates for the worst ~2000 detections; and used the coordinates to rotate and scale the images.

    Used face data outside of LFW for unsupervised feature learning.

  14. Zhimin Cao, Qi Yin, Xiaoou Tang, and Jian Sun.
    Face Recognition with Learning-based Descriptor.
    Computer Vision and Pattern Recognition (CVPR), 2010.
    [pdf]

    Landmarks are detected using the fiducial point detector of Liang, Xiao, Wen, Sun, Face Alignment via Component-based Discriminative Search, ECCV, 2008, which are then used to extract face component images for feature computation.

    The "+ comp" method uses a pose-adaptive approach, where LFW images are labeled as being frontal, left facing, or right facing, using three images selected from the Multi-PIE data set.

  15. Hieu V. Nguyen and Li Bai.
    Cosine Similarity Metric Learning for Face Verification.
    Asian Conference on Computer Vision (ACCV), 2010.
    [pdf]

    Used LFW-a.

  16. Nicolas Pinto and David Cox.
    Beyond Simple Features: A Large-Scale Feature Search Approach to Unconstrained Face Recognition.
    International Conference on Automatic Face and Gesture Recognition (FG), 2011.
    [pdf]

    Used LFW-a.

  17. Peng Li, Yun Fu, Umar Mohammed, James H. Elder, and Simon J.D. Prince.
    Probabilistic Models for Inference About Identity.
    IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 1, pp. 144-157, Jan. 2012.
    [pdf]
    [webpage]

  18. Qi Yin, Xiaoou Tang, and Jian Sun.
    An Associate-Predict Model for Face Recognition.
    Computer Vision and Pattern Recognition (CVPR), 2011.
    [pdf]

    Four landmarks are detected using a standard facial point detector and used to determine twelve facial components.

    The recognition system makes use of 200 identities from the Multi-PIE data set, covering 7 poses and 4 illumination conditions for each identity.

  19. Yaniv Taigman and Lior Wolf.
    Leveraging Billions of Faces to Overcome Performance Barriers in Unconstrained Face Recognition.
    ArXiv e-prints, 2011.
    [pdf]
    [webpage]
    [commercial system, see note at top]

    † ‡ A commercial recognition system, making use of outside training data, is tested on LFW.

  20. Hae Jong Seo and Peyman Milanfar.
    Face Verification Using the LARK Representation.
    IEEE Transactions on Information Forensics and Security, 2011.
    [pdf]

    Used LFW-a.

  21. Yiming Ying and Peng Li.
    Distance Metric Learning with Eigenvalue Optimization.
    Journal of Machine Learning Research (Special Topics on Kernel and Metric Learning), 2012.
    [pdf]

    Used LFW-a, and features extracted from facial feature points of Guillaumin et al., 2009.

  22. Chang Huang, Shenghuo Zhu, and Kai Yu.
    Large Scale Strongly Supervised Ensemble Metric Learning, with Applications to Face Verification and Retrieval.
    NEC Technical Report TR115, 2011.
    [pdf]
    [arxiv]

    Used LFW-a.

  23. Thomas Berg and Peter N. Belhumeur.
    Tom-vs-Pete Classifiers and Identity-Preserving Alignment for Face Verification.
    British Machine Vision Conference (BMVC), 2012.
    [pdf]

    † ‡ Outside training data is used in the alignment and recognition systems.

  24. Sibt ul Hussain, Thibault Napoléon, and Fréderic Jurie.
    Face Recognition Using Local Quantized Patterns.
    British Machine Vision Conference (BMVC), 2012.
    [pdf]
    [webpage/code]

    Used LFW-a.

  25. Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, and Jianchao Yang.
    Probabilistic Elastic Matching for Pose Variant Face Verification.
    Computer Vision and Pattern Recognition (CVPR), 2013.
    [pdf]

  26. Dong Chen, Xudong Cao, Liwei Wang, Fang Wen, and Jian Sun.
    Bayesian Face Revisited: A Joint Formulation.
    European Conference on Computer Vision (ECCV), 2012.
    [pdf]

    Used outside training data in recognition system.

  27. Dong Chen, Xudong Cao, Fang Wen, and Jian Sun.
    Blessing of Dimensionality: High-dimensional Feature and Its Efficient Compression for Face Verification.
    Computer Vision and Pattern Recognition (CVPR), 2013.
    [pdf]

    Used outside training data in recognition system.

  28. Zhen Cui, Wen Li, Dong Xu, Shiguang Shan, and Xilin Chen.
    Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild.
    Computer Vision and Pattern Recognition (CVPR), 2013.
    [pdf]

    Used commercial face alignment software.

  29. Gaurav Sharma, Sibt ul Hussain, Fréderic Jurie.
    Local Higher-Order Statistics (LHS) for Texture Categorization and Facial Analysis.
    European Conference on Computer Vision (ECCV), 2012.
    [pdf]

    Used LFW-a.

  30. Shervin Rahimzadeh Arashloo and Josef Kittler.
    Efficient Processing of MRFs for Unconstrained-Pose Face Recognition.
    Biometrics: Theory, Applications and Systems, 2013.

  31. Dong Yi, Zhen Lei, and Stan Z. Li.
    Towards Pose Robust Face Recognition.
    Computer Vision and Pattern Recognition (CVPR), 2013.

    Used outside training data for alignment.

  32. Karen Simonyan, Omkar M. Parkhi, Andrea Vedaldi, and Andrew Zisserman.
    Fisher Vector Faces in the Wild.
    British Machine Vision Conference (BMVC), 2013.
    [pdf]
    [webpage]

    Used face landmark detector, trained using Everingham et al., "Taking the bite out of automatic naming of characters in TV video", Image and Vision Computing, 2009.

  33. Zhen Lei, Matti Pietikainen, and Stan Z. Li.
    Learning Discriminant Face Descriptor.
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 24 July 2013.

  34. Xudong Cao, David Wipf, Fang Wen, and Genquan Duan.
    A Practical Transfer Learning Algorithm for Face Verification.
    International Conference on Computer Vision (ICCV), 2013.
    [webpage]

    Used outside training data in recognition system.

  35. Qiong Cao, Yiming Ying, and Peng Li.
    Similarity Metric Learning for Face Recognition.
    International Conference on Computer Vision (ICCV), 2013.
    [pdf]

    Used LFW-a.

  36. Oren Barkan, Jonathan Weill, Lior Wolf, and Hagai Aronowitz.
    Fast High Dimensional Vector Multiplication Face Recognition.
    International Conference on Computer Vision (ICCV), 2013.
    [pdf]

    Used LFW-a.

  37. Gary B. Huang, Honglak Lee, and Erik Learned-Miller.
    Learning Hierarchical Representations for Face Verification with Convolutional Deep Belief Networks.
    Computer Vision and Pattern Recognition (CVPR), 2012.
    [pdf]
    [webpage]

    Used LFW-a.

  38. VisionLabs ver. 1.0
    Brief method description:
    The method makes use of metric learning and dense local image descriptors. Results are reported for the unrestricted training setup, using LFW-a aligned images. External data is only used implicitly for face alignment.
    [webpage]
    [commercial system, see note at top]

  39. Aurora Computer Services Ltd: Aurora-c-2014-1
    [pdf] Technical Report
    [webpage]
    Brief method description:
    The face recognition technology comprises Aurora's proprietary algorithms, machine learning, and computer vision techniques. We report results using the unrestricted training protocol, applied to the View 2 ten-fold cross-validation test, using images provided by the LFW website, including the aligned and funneled sets, with some external data used solely for alignment purposes.
    [commercial system, see note at top]

  40. Face++
    [pdf] Technical Report
    [webpage]
    Brief method description:
    Our system leverages billions of images and adopts a deep learning framework for face verification. The face representation is extracted from a set of facial landmarks. A new deep learning structure is designed to generate a highly abstract and expressive representation of faces. Based on this representation, a covariance-based recognition model is built to predict whether a pair of faces has the same identity.
    [commercial system, see note at top]