
ML | Implement Face recognition using k-NN with scikit-learn

Last Updated : 15 Mar, 2019

k-Nearest Neighbors:

k-NN is one of the most basic classification algorithms in machine learning and belongs to the supervised learning category. It is often used in search applications where you are looking for “similar” items. Similarity is measured by creating a vector representation of each item and then comparing the vectors with an appropriate distance metric (the Euclidean distance, for example). A new item is assigned the label that is most common among its k nearest neighbours in the training data.
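
The idea above can be sketched in a few lines of plain Python. This is only an illustration, not the code used later in the article: the function name `knn_predict` and the toy points are made up for this example, and the vote is a simple majority among the k points with the smallest Euclidean distance.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # keep the k closest points and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# toy 2-D data: two well-separated groups (made-up values)
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

In the face-recognition scripts below, each "point" is a flattened 100 x 100 grayscale face (10000 values) and each label is a person's name.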

It is generally used in data mining, pattern recognition, recommender systems and intrusion detection.

Libraries used are:

OpenCV (cv2)
pandas
NumPy
scikit-learn

Dataset used:
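We use haarcascade_frontalface_default.xml, which is a pre-trained Haar cascade for frontal-face detection rather than a dataset in the usual sense. The file is distributed with OpenCV and is easily available online. The scripts below expect it in a ../dataset/ folder.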

Scikit-learn:
scikit-learn provides a range of supervised and unsupervised learning algorithms via a consistent interface in Python.
The library is built on top of NumPy and SciPy, so both must be installed in order to use scikit-learn.

Face-Recognition :
The project consists of three Python files: the first detects a face and stores it in a list, the second saves that data to a ‘.csv’ file, and the third recognizes faces using the saved data.

facedetect.py –




# this file is used to detect face 
# and then store the data of the face
import cv2
import numpy as np
  
# import the file where data is
# stored in a csv file format
import npwriter
  
name = input("Enter your name: ")
  
# this is used to access the web-cam
# in order to capture frames
cap = cv2.VideoCapture(0)
  
# this classifier is used to detect the faces; it is loaded
# from the haarcascade_frontalface_default.xml file
classifier = cv2.CascadeClassifier("../dataset/haarcascade_frontalface_default.xml")
  
# list of the captured faces
f_list = []
  
while True:
    ret, frame = cap.read()
  
    # skip this iteration if no frame
    # could be read from the web-cam
    if not ret:
        continue
      
    # converting the image into gray
    # scale as it is easy for detection
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
      
    # detectMultiScale returns the bounding box (x, y, w, h)
    # of each detected face; 1.5 is the scale factor and
    # 5 is the minimum number of neighbours
    faces = classifier.detectMultiScale(gray, 1.5, 5)
      
    # sort by area so that the largest face, which is
    # usually the one closest to the web-cam, comes first
    faces = sorted(faces, key = lambda x: x[2]*x[3],
                                     reverse = True)
  
    # only the first detected face is used
    faces = faces[:1]  
       
    # len(faces) is now 1 if a face
    # was found and 0 otherwise
    if len(faces) == 1:   
        # taking the single face out of the list      
        face = faces[0]   
        
        # storing the coordinates of the
        # face in different variables
        x, y, w, h = face 
  
        # cropping the detected face
        # out of the frame and showing it     
        im_face = frame[y:y + h, x:x + w] 
  
        cv2.imshow("face", im_face)
  
    cv2.imshow("full", frame)
  
    key = cv2.waitKey(1)
  
    # pressing 'q' stops the program and
    # pressing 'c' captures the current face
    if key & 0xFF == ord('q'):
        break
    elif key & 0xFF == ord('c'):
        if len(faces) == 1:
            gray_face = cv2.cvtColor(im_face, cv2.COLOR_BGR2GRAY)
            gray_face = cv2.resize(gray_face, (100, 100))
            print(len(f_list), type(gray_face), gray_face.shape)
  
            # this will append the flattened pixels of the face to f_list
            f_list.append(gray_face.reshape(-1)) 
        else:
            print("face not found")
  
        # stop once 10 faces have been stored; several
        # samples of the same face improve accuracy
        if len(f_list) == 10:
            break
  
# write is declared in npwriter; nothing is
# written if no face was captured
if len(f_list) > 0:
    npwriter.write(name, np.array(f_list)) 
  
  
cap.release()
cv2.destroyAllWindows()
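
The "largest face first" step in facedetect.py can be looked at in isolation. The sketch below uses plain tuples in place of the boxes returned by detectMultiScale; the helper name `largest_box` and the sample boxes are hypothetical and only serve to illustrate the area comparison.

```python
def largest_box(boxes):
    """Return the (x, y, w, h) box with the largest area, or None if empty."""
    if len(boxes) == 0:
        return None
    # area of a box is width * height, i.e. box[2] * box[3]
    return max(boxes, key=lambda box: box[2] * box[3])

# three made-up detections in (x, y, w, h) form
detections = [(10, 20, 40, 40), (200, 50, 120, 120), (300, 300, 60, 80)]
print(largest_box(detections))
```

Using the largest box as a stand-in for "closest to the web-cam" is a heuristic: it assumes that a nearer face covers more pixels than one further away.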
 
 

 
npwriter.py – Create/Update ‘.csv’ file




import pandas as pd
import numpy as np
import os.path
  
f_name = "face_data.csv"
  
# storing the data into a csv file
def write(name, data):
  
    # each row is one flattened 100 x 100 face image, so there
    # are 10000 pixel columns followed by a "name" column
    latest = pd.DataFrame(data, columns = [str(i) for i in range(10000)])
    latest["name"] = name
  
    if os.path.isfile(f_name):
  
        # append the new faces to the data already in the file
        df = pd.read_csv(f_name, index_col = 0)
        df = pd.concat((df, latest), ignore_index = True, sort = False)
  
    else:
  
        df = latest
  
    df.to_csv(f_name)
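
The layout of the file written by npwriter.py explains why recog.py later slices the data as data[:, 1:-1] and data[:, -1]: each row holds an index, then the pixel values, then the name. The following sketch reproduces that layout with the standard csv module and tiny 2 x 2 images so it stays readable. The helper names, the image size and the names "alice" and "bob" are assumptions made for this illustration, not part of the article's code.

```python
import csv
import io

SIDE = 2  # hypothetical tiny image side; the article uses 100

def flatten(image):
    """Turn a 2-D list of pixel rows into one flat list, row by row."""
    return [pixel for row in image for pixel in row]

def to_csv_text(names, images):
    """Write rows as: index, pixel columns "0".."N-1", then "name"."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([""] + [str(i) for i in range(SIDE * SIDE)] + ["name"])
    for index, (name, image) in enumerate(zip(names, images)):
        writer.writerow([index] + flatten(image) + [name])
    return buffer.getvalue()

def split_features_labels(csv_text):
    """Drop the index column and split pixels (X) from names (Y)."""
    rows = list(csv.reader(io.StringIO(csv_text)))[1:]  # skip the header row
    X = [[int(value) for value in row[1:-1]] for row in rows]
    Y = [row[-1] for row in rows]
    return X, Y

text = to_csv_text(["alice", "bob"], [[[0, 255], [10, 20]], [[5, 6], [7, 8]]])
X, Y = split_features_labels(text)
print(X)
print(Y)
```

The first column exists only because the DataFrame index is saved along with the data, which is why it is skipped when the features are read back.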
 
 

 
recog.py – Face-recognizer




# this file is used to recognize the
# face after training a knn model
# on the data we stored earlier
import cv2
import numpy as np
import pandas as pd
  
from npwriter import f_name
from sklearn.neighbors import KNeighborsClassifier
  
  
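# reading the data
data = pd.read_csv(f_name).values
  
# data partition: column 0 is the index written by
# npwriter, the last column is the name and the
# columns in between are the pixel values
X, Y = data[:, 1:-1], data[:, -1]
  
print(X, Y)
  
# Knn function calling with k = 5
model = KNeighborsClassifier(n_neighbors = 5)
  
# training of model
model.fit(X, Y)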
  
cap = cv2.VideoCapture(0)
  
classifier = cv2.CascadeClassifier("../dataset/haarcascade_frontalface_default.xml")
  
while True:
  
    ret, frame = cap.read()
  
    # skip this iteration if no frame
    # could be read from the web-cam
    if not ret:
        continue
  
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
  
    faces = classifier.detectMultiScale(gray, 1.5, 5)
  
    X_test = []
  
    # Testing data
    for face in faces:
        x, y, w, h = face
        im_face = gray[y:y + h, x:x + w]
        im_face = cv2.resize(im_face, (100, 100))
        X_test.append(im_face.reshape(-1))
  
    if len(faces) > 0:
        # prediction of result using knn
        response = model.predict(np.array(X_test))
  
        for i, face in enumerate(faces):
            x, y, w, h = face
  
            # drawing a rectangle on the detected face
            cv2.rectangle(frame, (x, y), (x + w, y + h),
                                         (255, 0, 0), 3)
  
            # adding detected/predicted name for the face;
            # str() makes sure putText receives a plain string
            cv2.putText(frame, str(response[i]), (x-50, y-50),
                              cv2.FONT_HERSHEY_DUPLEX, 2,
                                         (0, 255, 0), 3)
     
    cv2.imshow("full", frame)
  
    key = cv2.waitKey(1)
  
    if key & 0xFF == ord("q") :
        break
  
cap.release()
cv2.destroyAllWindows()
 
 

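Output:
The recognizer opens a window named "full" that shows the web-cam feed with a rectangle around each detected face and the predicted name drawn next to it.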




ngrover241