Face recognition using Artificial Intelligence

Last Updated : 10 Jun, 2023

Modern technology keeps producing innovations that make everyday life simpler. Face recognition has proven over time to be one of the least intrusive and fastest forms of biometric verification. The software uses deep learning algorithms to compare a live captured image with a stored face print to verify a person's identity. Image processing and machine learning are the backbones of this technology. Face recognition has received substantial attention from researchers because of its many security applications, such as airport screening, criminal detection, face tracking, and forensics. Compared to other biometric traits such as palm print, iris, and fingerprint, face biometrics can be non-intrusive.

Face images can be captured even without the subject's knowledge and then used in security applications such as criminal detection, face tracking, airport security, and forensic surveillance systems. Face recognition starts by capturing face images from a video or a surveillance camera. The system is trained on known images, each labeled with a known class (the person's identity), and the results are stored in a database. When a test image is given to the system, it is classified by comparing it with the stored database.

Face recognition

Face recognition using Artificial Intelligence (AI) is a computer vision technology that is used to identify or verify a person from an image or video. It uses a combination of techniques including deep learning, computer vision algorithms, and image processing. These technologies enable a system to detect, recognize, and verify faces in digital images or videos.

The technology has become increasingly popular in a wide variety of applications such as unlocking a smartphone, unlocking doors, passport authentication, security systems, medical applications, and so on. There are even models that can detect emotions from facial expressions.

Difference between Face recognition & Face detection

Face detection is the process of locating a face in an image or video feed, while face recognition is the process of identifying whose face it is. Face detection is the simpler task and the first step of the face recognition process. On its own it can be used for applications such as image tagging or adjusting the angle of a photo based on the detected face. Face recognition differentiates people based on their facial features. It involves more advanced processing, such as feature point extraction and comparison algorithms, and can be used for applications such as automated attendance systems or security checks.

Image Processing and Machine learning

Image processing by computers falls under the field of Computer Vision, which deals with a high-level understanding of digital images or videos. The goal is to automate tasks that the human visual system can do. So, a computer should be able to recognize objects such as the face of a human being, a lamppost, or even a statue.

Image reading: 

The computer reads an image as a grid of numbers, each typically in the range 0 to 255. A color image has 3 primary colors: red, green, and blue. One matrix is formed for each primary color, and together these matrices give every pixel its R, G, and B values. Each element of a matrix describes the intensity of that color at the corresponding pixel.
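As a minimal sketch of this representation, the snippet below writes out a tiny color image by hand and splits it into one matrix per primary color. The 2x2 image and the `channel` helper are hypothetical, chosen only for illustration; in practice a library such as OpenCV loads the pixel values from a file.

```python
# A hypothetical 2x2 color image written out by hand.
# Each pixel holds three intensities (R, G, B), each in the range 0..255.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 224, 189)],
]


def channel(img, index):
    """Return the matrix of intensities for one primary color."""
    return [[pixel[index] for pixel in row] for row in img]


red = channel(image, 0)
green = channel(image, 1)
blue = channel(image, 2)

print(red)          # matrix of red intensities
print(image[1][1])  # R, G, B values of the bottom-right pixel
```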

OpenCV is an open-source library designed to solve computer vision problems. It is written in C++ and provides Python bindings. OpenCV was originally developed by Intel in 1999 and was later supported by Willow Garage.

Machine learning

Every Machine Learning algorithm takes a dataset as input and learns from it. In other words, the algorithm is given example inputs together with their expected outputs, identifies the patterns in the data, and produces a model that maps new inputs to outputs. For instance, to identify whose face is present in a given image, multiple things can be looked at as a pattern:

  • Height and width of the face. The absolute values may not be reliable, since the image could be rescaled. However, the ratios remain unchanged after rescaling: the ratio of the height of the face to the width of the face won’t change.
  • Color of the face.
  • Width of other parts of the face like lips, nose, etc.
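The point about ratios can be checked with a few lines of arithmetic. This is a minimal sketch: the pixel measurements and the scale factor are made-up values, and `aspect_ratio` is a hypothetical helper.

```python
def aspect_ratio(height, width):
    """Ratio of face height to face width."""
    return height / width


# Hypothetical face measured in pixels in the original image.
height, width = 462.0, 316.0

# The same face after the image is scaled down to half its size.
scale = 0.5
scaled_height, scaled_width = height * scale, width * scale

# The absolute sizes differ, but the ratio is the same.
print(aspect_ratio(height, width))
print(aspect_ratio(scaled_height, scaled_width))
```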

There is a pattern involved: different faces have different dimensions like the ones above, and similar faces have similar dimensions. Machine Learning algorithms only understand numbers, so the challenge is to represent a face numerically. This numerical representation of a “face” (or of any element in the training set) is termed a feature vector. A feature vector comprises various numbers in a specific order.

As a simple example, we can map a “face” into a feature vector which can comprise various features like:

  • Height of face (cm)
  • Width of the face (cm)
  • Average color of face (R, G, B)
  • Width of lips (cm)
  • Height of nose (cm)

Essentially, given an image, we can convert it into a feature vector like:

Height of face (cm) | Width of the face (cm) | Average color of face (R, G, B) | Width of lips (cm) | Height of nose (cm)
23.1                | 15.8                   | (255, 224, 189)                 | 5.2                | 4.4

So, the image is now a vector that could be represented as (23.1, 15.8, 255, 224, 189, 5.2, 4.4). Countless other features could be derived from the image, for instance hair color, facial hair, spectacles, etc.
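The flattening step can be sketched as follows. The function name `to_feature_vector` and its parameter names are assumptions made for this example; only the feature values come from the table above.

```python
def to_feature_vector(height_cm, width_cm, avg_color, lip_width_cm, nose_height_cm):
    """Flatten the face measurements into a list of numbers in a fixed order."""
    return [height_cm, width_cm, *avg_color, lip_width_cm, nose_height_cm]


vector = to_feature_vector(
    height_cm=23.1,
    width_cm=15.8,
    avg_color=(255, 224, 189),  # average (R, G, B) color of the face
    lip_width_cm=5.2,
    nose_height_cm=4.4,
)
print(vector)  # [23.1, 15.8, 255, 224, 189, 5.2, 4.4]
```

The order of the entries matters: two vectors can only be compared if the same position always holds the same feature.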

Machine Learning performs two major functions in face recognition technology:

  1. Deriving the feature vector: It is difficult to list all of the features manually because there are so many. A Machine Learning algorithm can learn to extract many such features on its own. For instance, a complex feature could be the ratio of the height of the nose to the width of the forehead.
  2. Matching algorithms: Once the feature vectors have been obtained, a Machine Learning algorithm needs to match a new image with the set of feature vectors present in the corpus.
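One simple way to picture the matching step is a nearest-neighbor search: compute the distance from the new feature vector to every stored vector and pick the closest. This is only a sketch under assumptions. Euclidean distance is one possible choice among many, and the corpus entries, the names `person_a` and `person_b`, and the query values are all invented for illustration.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical corpus: one stored feature vector per known person.
corpus = {
    "person_a": [23.1, 15.8, 255, 224, 189, 5.2, 4.4],
    "person_b": [21.0, 14.2, 141, 85, 36, 4.8, 5.1],
}


def best_match(query, corpus):
    """Return the name whose stored vector is closest to the query."""
    return min(corpus, key=lambda name: euclidean_distance(query, corpus[name]))


# A new image of person_a, measured slightly differently.
query = [23.0, 15.9, 250, 220, 190, 5.1, 4.5]
print(best_match(query, corpus))  # person_a
```

With raw measurements like these, the color values dominate the distance because they are much larger than the lengths in centimeters. Real systems typically normalize features or use learned embeddings instead.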

Face Recognition Operations

Facial recognition systems vary in their details, and different software applies different methods to achieve face recognition. The general stepwise method is as follows:

  • Face Detection: To begin with, the camera detects and locates a face. A face is detected most reliably when the person is looking directly at the camera. Advances in the technology now allow a face to be detected even when it is turned slightly away from the camera.
  • Face Analysis: Then a photo of the face is captured and analyzed. Most facial recognition relies on 2D images rather than 3D because they are more convenient to match against the database. Facial recognition software analyzes features such as the distance between your eyes or the shape of your cheekbones.
  • Image to Data Conversion: The analyzed face is then converted to a mathematical formula, and the facial features become numbers. This numerical code is known as a face print. Just as every person has a unique fingerprint, every person has a unique face print.
  • Match Finding: Then the code is compared against a database of other face prints. This database holds photos with attached identification. The technology looks for a match for your features in the database and returns it together with the attached information, such as name and address, depending on what is stored for each individual.
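The match-finding step above can be sketched as a lookup that returns the stored record for the closest face print, or nothing when no stored print is close enough. Everything in this example is an assumption made for illustration: the three-number face prints, the records, the `find_match` helper, and the `max_distance` threshold of 0.1. Real face prints are much longer vectors and thresholds are tuned per system.

```python
import math

# Hypothetical database: each record pairs a face print with stored information.
database = [
    {"face_print": [0.12, 0.80, 0.33], "name": "Alice", "address": "1 Example Street"},
    {"face_print": [0.90, 0.10, 0.45], "name": "Bob", "address": "2 Example Street"},
]


def find_match(face_print, database, max_distance=0.1):
    """Return the closest record, or None if nothing is within max_distance."""
    if not database:
        return None
    best = min(database, key=lambda rec: math.dist(face_print, rec["face_print"]))
    if math.dist(face_print, best["face_print"]) <= max_distance:
        return best
    return None


known = find_match([0.10, 0.82, 0.30], database)
stranger = find_match([0.50, 0.50, 0.50], database)
print(known["name"] if known else "no match")        # Alice
print(stranger["name"] if stranger else "no match")  # no match
```

The threshold is what lets the system answer "unknown person" instead of always returning the nearest entry.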

Implementations

Steps:

  • Import the necessary packages
  • Load the known face image and compute its face encoding
  • Launch the live camera
  • Read images from the live camera frame by frame
  • Detect faces using the face_locations function of the face_recognition library
  • Draw a rectangle around each face
  • Compute the face encoding for the faces captured by the camera
  • If the faces match, plot the person's image; otherwise continue

Python3

# Import the necessary packages
import sys

import cv2 as cv
import face_recognition
import matplotlib.pyplot as plt

# Load the known image and compute its face encoding
known_image = face_recognition.load_image_file("pawankrgunjan.jpeg")
known_encodings = face_recognition.face_encodings(face_image=known_image,
                                                  num_jitters=50,
                                                  model='large')
if not known_encodings:
    sys.exit("No face found in the known image")
known_face = known_encodings[0]

# Launch the live camera
cam = cv.VideoCapture(0)
# Check the camera
if not cam.isOpened():
    sys.exit("Camera not working")

# While the camera is open
while True:

    # Capture the image frame by frame
    ret, frame = cam.read()

    # Check whether the frame was read
    if not ret:
        print("Can't receive the frame")
        break

    # OpenCV delivers BGR frames; face_recognition expects RGB
    rgb_frame = cv.cvtColor(frame, cv.COLOR_BGR2RGB)

    # Face detection in the frame
    face_locations = face_recognition.face_locations(rgb_frame)

    # Encode every detected face
    # (fewer jitters than for the known image, to keep the loop responsive)
    live_encodings = face_recognition.face_encodings(face_image=rgb_frame,
                                                     known_face_locations=face_locations,
                                                     num_jitters=1,
                                                     model='large')

    matched = False
    for (top, right, bottom, left), live_encoding in zip(face_locations, live_encodings):
        # Draw a rectangle with red borders (BGR order) of thickness 2 px
        cv.rectangle(frame, (left, top), (right, bottom), color=(0, 0, 255), thickness=2)
        # compare_faces returns one boolean per known encoding
        if face_recognition.compare_faces([known_face], live_encoding)[0]:
            matched = True

    # Label the frame according to the match result
    label = 'PawanKrgunjan' if matched else 'Not PawanKrgunjan'
    cv.putText(frame, label, (30, 55), cv.FONT_HERSHEY_SIMPLEX, 1,
               (255, 0, 0), 2, cv.LINE_AA)

    if matched:
        print('Pawan Kumar Gunjan Enter....')
        # Matplotlib expects RGB, so convert before plotting
        plt.imshow(cv.cvtColor(frame, cv.COLOR_BGR2RGB))
        plt.show()
        break

    # Display the resulting frame
    cv.imshow('frame', frame)
    # End the streaming when 'q' is pressed
    if cv.waitKey(1) == ord('q'):
        break

# Release the capture
cam.release()
cv.destroyAllWindows()

Output:

Pawan Kumar Gunjan Enter....
Figure: Face Recognition

The model accuracy can be improved further using deep learning and other methods.

Face Recognition Software

Many renowned companies are constantly innovating and improving to develop face recognition software that is reliable and dependable. Some prominent products are discussed below:

a. Deep Vision AI

Deep Vision AI is a front-runner in facial recognition software. The company owns advanced computer vision technology that can understand images and videos automatically. It turns the visual content into real-time analytics and provides valuable insights.

Deep Vision AI provides a plug-and-play platform to its users worldwide. Users are given real-time alerts and faster responses based on the analysis of camera streams through various AI-based modules. The product identifies individuals on a watch list with high accuracy by continuously monitoring target zones. The software is flexible enough to be connected to any existing camera system or deployed through the cloud.

At present, Deep Vision AI is positioned as a high-performance solution, supporting real-time processing at 15+ streams per GPU.

The software supports business intelligence gathering by providing real-time data on customers and their frequency of visits, and it can also enhance security and safety. Further, the output from the software can provide attributes like count, age, gender, etc. that improve the understanding of consumer behavior, changing preferences, and shifts over time and conditions, which can guide future marketing efforts and strategies. Users also combine the face recognition capabilities with other AI-based features of Deep Vision AI, like vehicle recognition, to get more correlated data about consumers.

The company complies with international data protection laws and applies significant measures to keep the processing of its customers' data transparent and secure, with attention to data privacy and ethics.

The potential markets include cities, public venues, public transportation, educational institutes, large retailers, etc. Deep Vision AI is a certified partner for NVIDIA’s Metropolis, Dell Digital Cities, Amazon AWS, Microsoft, Red Hat, and others.

b. SenseTime

  • SenseTime is a leading platform developer that builds solutions on innovations in AI and big data analysis. The technology offered by SenseTime is multifunctional and expanding. It includes facial recognition, image recognition, intelligent video analytics, autonomous driving, and medical image recognition. SenseTime software includes different subparts, namely SensePortrait-S, SensePortrait-D, and SenseFace.
  • SensePortrait-S is a Static Face Recognition Server. It includes face detection from an image source, extraction of features, extraction and analysis of attributes, and target retrieval from a vast facial image database.
  • SensePortrait-D is a Dynamic Face Recognition Server. Its capabilities include face detection, face tracking, extraction of features, and comparison and analysis of data from multiple surveillance video streams.
  • SenseFace is a Face Recognition Surveillance Platform. It is a face recognition technology that uses a deep learning algorithm. SenseFace is efficient in integrated solutions for intelligent video analysis. It can be used extensively for target surveillance, analysis of the trajectory of a person, population management and the associated data analysis, etc.
  • SenseTime has provided its services to many companies and government agencies including Honda, Qualcomm, China Mobile, UnionPay, Huawei, Xiaomi, OPPO, Vivo, and Weibo.

c. Amazon Rekognition

Amazon Rekognition is a cloud-based computer vision service provided by Amazon. It offers an easy way to add image and video analysis to various applications. It uses highly scalable and proven deep learning technology. The user is not required to have any machine learning expertise to use this software. The platform can be used to identify objects, text, people, activities, and scenes in images and videos. It can also detect inappropriate content. The user gets highly accurate facial analysis and facial search capabilities. Hence, the software can be used for verification, counting of people, and public safety through detection, analysis, and comparison of faces.

Organizations can use Amazon Rekognition Custom Labels to generate data about specific objects and scenes in images according to their business needs. For example, a model may be built to classify specific machine parts on an assembly line or to detect unhealthy plants. The user simply provides images of the objects or scenes they want to identify, and the service handles the rest.

d. FaceFirst

The FaceFirst software is aimed at community safety, secure transactions, and better customer experiences. FaceFirst describes its software as secure, accurate, private, fast, and scalable. Plug-and-play solutions are also included for physical security, authentication of identity, access control, and visitor analytics. It can be integrated into existing systems. This computer vision platform has been used for face recognition and automated video analytics by many organizations to prevent crime and improve customer engagement.

As a provider of facial recognition systems, FaceFirst serves retail, transportation, event security, casinos, and other industries and public spaces. FaceFirst integrates artificial intelligence with existing surveillance systems to help prevent theft, fraud, and violence.

e. Trueface

Trueface is a computer vision platform that helps people understand their camera data and convert it into actionable information. Trueface is an on-premise computer vision solution, which enhances data security and performance speed. The platform-based solutions are trained specifically for the requirements of each deployment and operate effectively in a variety of ecosystems. The software places the utmost priority on the diversity of training data to ensure equivalent performance for all users, irrespective of their widely different requirements.

Trueface has developed a suite consisting of SDKs and a dockerized container solution based on machine learning and artificial intelligence. The suite can convert camera data into actionable intelligence. It can help organizations create a safer and smarter environment for their employees, customers, and guests using facial recognition, weapon detection, and age verification technologies.

f. Face++  

  • Face++ is an open platform enabled by the Chinese company Megvii. It offers computer vision technologies and allows users to easily integrate deep learning-based image analysis and recognition technologies into their applications.
  • Face++ uses AI and machine vision to detect and analyze faces and to confirm a person’s identity accurately. Face++ is also developer-friendly: being an open platform, it lets any developer create apps using its algorithms. This has made Face++ the most extensive facial recognition platform in the world, with 300,000 developers from 150 countries using it.
  • The most significant usage of Face++ has been its integration into Alibaba’s City Brain platform. This has allowed the analysis of the CCTV network in cities to optimize traffic flows and to direct the attention of medics and police to observed incidents.

g. Kairos

  • Kairos is a state-of-the-art and ethical face recognition solution available to developers and businesses across the globe. Kairos can be used for face recognition via the Kairos cloud API, or users can host Kairos on their own servers. Self-hosting gives users control over data, security, and privacy. Organizations can ensure a safer and more accessible experience for their customers.
  • Kairos Face Recognition On-Premises has the added advantage of controlling data privacy and security, keeping critical data in-house and safe from potential third parties and hackers. The speed of face recognition-enabled products is improved because they avoid the delays and other risks associated with public cloud deployment.
  • Kairos has an ultra-scalable architecture, such that a search across 10 million faces takes approximately the same time as a search for 1 face. It has been well received by the market.

h. Cognitec

Cognitec’s FaceVACS Engine enables users to develop new applications for face recognition. The engine is versatile and provides a clear and logical API for easy integration into other software programs. Cognitec offers the FaceVACS Engine through customized software development kits. The platform can be tailored through a set of functions and modules specific to each use case and computing platform. The capabilities of this software include image quality checks, secure document issuance, and access control by accurate verification.

The distinct features include:  

  • A very powerful face localization and face tracking
  • Efficient algorithms for enrollment, verification, and identification
  • Accurate checking of age, gender, exposure, pose deviation, glasses, eyes closed, uniform lighting detection, unnatural color, and image and face geometry
  • Fulfills the requirements of ePassports by providing ISO 19794-5 full-frontal image type checks and formatting

Utilization of Face Recognition

While facial recognition may seem futuristic, it’s currently being used in a variety of ways. Here are some surprising applications of this technology.

Genetic Disorder Identification:

There are healthcare apps such as Face2Gene and software like Deep Gestalt that use facial recognition to detect genetic disorders. The patient's face is analyzed and matched against an existing database of disorders.

Airline Industry:

Some airlines use facial recognition to identify passengers. Face scanning helps save time and spares passengers the hassle of keeping track of a ticket.

Hospital Security:

Facial recognition can be used in hospitals to keep records of patients, which is more convenient than looking records up by name and address. Staff can recognize a patient and retrieve their details within seconds. It can also be used for security purposes, to check whether a person is genuinely who they claim to be and whether they are a patient.

Detection of emotions and sentiments:

Real-time emotion detection is yet another valuable application of face recognition in healthcare. It can be used to detect emotions that patients exhibit during their stay in the hospital and analyze the data to determine how they are feeling. The results of the analysis may help to identify if patients need more attention in case they’re in pain or sad.

Problems and Challenges

Face recognition technology is facing several challenges. The common problems and challenges that a face recognition system can have while detecting and recognizing faces are discussed in the following paragraphs.   

  • Pose: A face recognition system can tolerate small rotation angles, but detection becomes difficult when the angle is large. If the database does not contain the face at all angles, recognition can fail.
  • Expressions: Moods and emotions produce different facial expressions. These expressions can cause the machine to make mistakes when determining a person’s identity.
  • Aging: A face changes with time and age, so it may be difficult to identify a person who is now 60 years old from an image captured decades earlier.
  • Occlusion: Occlusion means blockage. It is caused by occluding objects on the face such as glasses, a beard, or a mustache, so that some parts of the face are missing from the captured image. Such a problem can severely affect the classification process of the recognition system.
  • Illumination: Illumination means light variations. Illumination changes can alter the overall magnitude of light intensity reflected from an object, as well as the pattern of shading and shadows visible in an image. Recognizing faces under changing illumination is widely acknowledged to be difficult for both humans and algorithms, and it remains a challenge for automatic face recognition systems.
  • Identifying similar faces: Different persons may look so similar that it is sometimes impossible to distinguish them.

Disadvantages of Face Recognition

  1. The danger of automated blanket surveillance
  2. Lack of clear legal or regulatory framework
  3. Violation of the principles of necessity and proportionality
  4. Violation of the right to privacy
  5. Effect on democratic political culture


author
shreya_garg
Improve
Article Tags :
  • AI-ML-DS
  • Machine Learning
  • Artificial Intelligence
Practice Tags :
  • Machine Learning

Similar Reads

  • Computer Vision Tutorial
    Computer Vision is a branch of Artificial Intelligence (AI) that enables computers to interpret and extract information from images and videos, similar to human perception. It involves developing algorithms to process visual data and derive meaningful insights. Why Learn Computer Vision?High Demand
    8 min read
  • Introduction to Computer Vision

    • Computer Vision - Introduction
      Ever wondered how are we able to understand the things we see? Like we see someone walking, whether we realize it or not, using the prerequisite knowledge, our brain understands what is happening and stores it as information. Imagine we look at something and go completely blank. Into oblivion. Scary
      3 min read

    • A Quick Overview to Computer Vision
      Computer vision means the extraction of information from images, text, videos, etc. Sometimes computer vision tries to mimic human vision. It’s a subset of computer-based intelligence or Artificial intelligence which collects information from digital images or videos and analyze them to define the a
      3 min read

    • Applications of Computer Vision
      Have you ever wondered how machines can "see" and understand the world around them, much like humans do? This is the magic of computer vision—a branch of artificial intelligence that enables computers to interpret and analyze digital images, videos, and other visual inputs. From self-driving cars to
      6 min read

    • Fundamentals of Image Formation
      Image formation is an analog to digital conversion of an image with the help of 2D Sampling and Quantization techniques that is done by the capturing devices like cameras. In general, we see a 2D view of the 3D world. In the same way, the formation of the analog image took place. It is basically a c
      7 min read

    • Satellite Image Processing
      Satellite Image Processing is an important field in research and development and consists of the images of earth and satellites taken by the means of artificial satellites. Firstly, the photographs are taken in digital form and later are processed by the computers to extract the information. Statist
      2 min read

    • Image Formats
      Image formats are different types of file types used for saving pictures, graphics, and photos. Choosing the right image format is important because it affects how your images look, load, and perform on websites, social media, or in print. Common formats include JPEG, PNG, GIF, and SVG, each with it
      5 min read

    Image Processing & Transformation

    • Digital Image Processing Basics
      Digital Image Processing means processing digital image by means of a digital computer. We can also say that it is a use of computer algorithms, in order to get enhanced image either to extract some useful information. Digital image processing is the use of algorithms and mathematical models to proc
      7 min read

    • Difference Between RGB, CMYK, HSV, and YIQ Color Models
      The colour spaces in image processing aim to facilitate the specifications of colours in some standard way. Different types of colour models are used in multiple fields like in hardware, in multiple applications of creating animation, etc. Let’s see each colour model and its application. RGBCMYKHSV
      3 min read

    • Image Enhancement Techniques using OpenCV - Python
      Image enhancement is the process of improving the quality and appearance of an image. It can be used to correct flaws or defects in an image, or to simply make an image more visually appealing. Image enhancement techniques can be applied to a wide range of images, including photographs, scans, and d
      15+ min read

    • Image Transformations using OpenCV in Python
      In this tutorial, we are going to learn Image Transformation using the OpenCV module in Python. What is Image Transformation? Image Transformation involves the transformation of image data in order to retrieve information from the image or preprocess the image for further usage. In this tutorial we
      5 min read

    • How to find the Fourier Transform of an image using OpenCV Python?
      The Fourier Transform is a mathematical tool used to decompose a signal into its frequency components. In the case of image processing, the Fourier Transform can be used to analyze the frequency content of an image, which can be useful for tasks such as image filtering and feature extraction. In thi
      5 min read

    • Python | Intensity Transformation Operations on Images
      Intensity transformations are applied on images for contrast manipulation or image thresholding. These are in the spatial domain, i.e. they are performed directly on the pixels of the image at hand, as opposed to being performed on the Fourier transform of the image. The following are commonly used
      5 min read

    • Histogram Equalization in Digital Image Processing
      A digital image is a two-dimensional matrix of two spatial coordinates, with each cell specifying the intensity level of the image at that point. So, we have an N x N matrix with integer values ranging from a minimum intensity level of 0 to a maximum level of L-1, where L denotes the number of inten
      6 min read

    • Python - Color Inversion using Pillow
      Color Inversion (Image Negative) is the method of inverting pixel values of an image. Image inversion does not depend on the color mode of the image, i.e. inversion works on channel level. When inversion is used on a multi color image (RGB, CMYK etc) then each channel is treated separately, and the
      4 min read

    • Image Sharpening Using Laplacian Filter and High Boost Filtering in MATLAB
      Image sharpening is an effect applied to digital images to give them a sharper appearance. Sharpening enhances the definition of edges in an image. The dull images are those which are poor at the edges. There is not much difference in background and edges. On the contrary, the sharpened image is tha
      4 min read

    • Wand sharpen() function - Python
      The sharpen() function is an inbuilt function in the Python Wand ImageMagick library which is used to sharpen the image. Syntax: sharpen(radius, sigma) Parameters: This function accepts four parameters as mentioned above and defined below: radius: This parameter stores the radius value of the sharpn
      2 min read

    • Python OpenCV - Smoothing and Blurring
      In this article, we are going to learn about smoothing and blurring with python-OpenCV. When we are dealing with images at some points the images will be crisper and sharper which we need to smoothen or blur to get a clean image, or sometimes the image will be with a really bad edge which also we ne
      7 min read

    • Python PIL | GaussianBlur() method
      PIL is the Python Imaging Library which provides the python interpreter with image editing capabilities. The ImageFilter module contains definitions for a pre-defined set of filters, which can be used with the Image.filter() method. The PIL.ImageFilter.GaussianBlur() method creates a Gaussian blur filter.
      1 min read

    • Apply a Gauss filter to an image with Python
      A Gaussian Filter is a low-pass filter used for reducing noise (high-frequency components) and for blurring regions of an image. This filter uses an odd-sized, symmetric kernel that is convolved with the image. The kernel weights are highest at the center and decrease as you move towards the periphe
      2 min read

    • Spatial Filtering and its Types
      The spatial filtering technique is applied directly to the pixels of an image. The mask is usually taken to be odd in size so that it has a specific center pixel. This mask is moved over the image such that the center of the mask traverses all image pixels. Classification on the basis of linearity: There are two
      3 min read

    • Python PIL | MedianFilter() and ModeFilter() method
      PIL is the Python Imaging Library which provides the python interpreter with image editing capabilities. The ImageFilter module contains definitions for a pre-defined set of filters, which can be used with the Image.filter() method. PIL.ImageFilter.MedianFilter() method creates a median filter. Pick
      1 min read

    • Python | Bilateral Filtering
      A bilateral filter is used for smoothening images and reducing noise, while preserving edges. This article explains an approach using the averaging filter, while this article provides one using a median filter. However, these convolutions often result in a loss of important edge information, since t
      2 min read

    • Python OpenCV - Morphological Operations
      Python OpenCV morphological operations are image processing techniques that process an image based on shape. This processing strategy is usually performed on binary images. Morphological operations based on OpenCV are as follows: Erosion, Dilation, Opening, Closing, Morphological Gradient, Top hat, B
      7 min read

    • Erosion and Dilation of images using OpenCV in python
      Morphological operations are a set of operations that process images based on shapes. They apply a structuring element to an input image and generate an output image. The two most basic morphological operations are erosion and dilation. Basics of Erosion: Erodes away the boundaries of the foreground
      2 min read

    • Introduction to Resampling methods
      While reading about Machine Learning and Data Science we often come across a term called Imbalanced Class Distribution, which generally happens when observations in one of the classes are much higher or lower than in other classes. As Machine Learning algorithms tend to increase accuracy by reducing
      8 min read

    • Python | Image Registration using OpenCV
      Image registration is a digital image processing technique that helps us align different images of the same scene. For instance, one may click the picture of a book from various angles. Below are a few instances that show the diversity of camera angles. Now, we may want to "align" a particular image
      3 min read

    Feature Extraction and Description

    • Feature Extraction Techniques - NLP
      Introduction : This article focuses on basic feature extraction techniques in NLP to analyse the similarities between pieces of text. Natural Language Processing (NLP) is a branch of computer science and machine learning that deals with training computers to process a large amount of human (natural)
      11 min read

    • SIFT Interest Point Detector Using Python - OpenCV
      SIFT (Scale Invariant Feature Transform) Detector is used in the detection of interest points on an input image. It allows the identification of localized features in images which is essential in applications such as: Object Recognition in Images, Path detection and obstacle avoidance algorithms, Gestur
      4 min read

    • Feature Matching using Brute Force in OpenCV
      In this article, we will do feature matching using Brute Force in Python by using OpenCV library. Prerequisites: OpenCV OpenCV is a python library which is used to solve the computer vision problems. OpenCV is an open source Computer Vision library. So computer vision is a way of teaching intelligen
      13 min read

    • Feature detection and matching with OpenCV-Python
      In this article, we are going to look at feature detection in computer vision with OpenCV in Python. Feature detection is the process of finding the important features of an image; here, features can be edges, corners, ridges, and blobs. In OpenCV, there are a nu
      5 min read

    • Feature matching using ORB algorithm in Python-OpenCV
      ORB is a fusion of FAST keypoint detector and BRIEF descriptor with some added features to improve the performance. FAST is Features from Accelerated Segment Test used to detect features from the provided image. It also uses a pyramid to produce multiscale-features. Now it doesn’t compute the orient
      2 min read

    • Mahotas - Speeded-Up Robust Features
      In this article we will see how we can get the speeded up robust features of image in mahotas. In computer vision, speeded up robust features (SURF) is a patented local feature detector and descriptor. It can be used for tasks such as object recognition, image registration, classification, or 3D rec
      2 min read

    • Create Local Binary Pattern of an image using OpenCV-Python
      In this article, we will discuss the image and how to find a binary pattern using the pixel values of the image. As we all know, an image is also known as a set of pixels. When we store an image in computers or digitally, its corresponding pixel values are stored. So, when we read an image to a variabl
      5 min read

    Deep Learning for Computer Vision

    • Image Classification using CNN
      The article is about creating an image classifier for identifying cats vs. dogs using TFLearn in Python. Machine Learning is now one of the hottest topics around the world. It has even been called the new electricity of today's world. But what, precisely, is Machine Learning? Well, it's just one
      7 min read

    • What is Transfer Learning?
      Transfer learning is a machine learning technique where a model trained on one task is repurposed as the foundation for a second task. This approach is beneficial when the second task is related to the first or when data for the second task is limited. Leveraging learned features from the initial ta
      11 min read

    • Top 5 PreTrained Models in Natural Language Processing (NLP)
      Pretrained models are deep learning models that have been trained on huge amounts of data before fine-tuning for a specific task. The pre-trained models have revolutionized the landscape of natural language processing as they allow the developer to transfer the learned knowledge to specific tasks, e
      7 min read

    • ML | Introduction to Strided Convolutions
      Let us begin this article with a basic question - "Why are padding and strided convolutions required?" Assume we have an image with dimensions of n x n. If it is convolved with an f x f filter, then the dimensions of the image obtained are (n-f+1) x (n-f+1). Example: Consider a 6 x 6 ima
      2 min read

    • Dilated Convolution
      Prerequisite: Convolutional Neural Networks Dilated Convolution: It is a technique that expands the kernel (input) by inserting holes between its consecutive elements. In simpler terms, it is the same as convolution but it involves pixel skipping, so as to cover a larger area of the input.  Dilated
      5 min read

    • Continuous Kernel Convolution
      Continuous Kernel Convolution was proposed by researchers at Vrije Universiteit Amsterdam in collaboration with the University of Amsterdam in a paper titled 'CKConv: Continuous Kernel Convolution For Sequential Data'. The motivation behind that is to propose a model that uses the properties of co
      6 min read

    • CNN | Introduction to Pooling Layer
      Pooling layer is used in CNNs to reduce the spatial dimensions (width and height) of the input feature maps while retaining the most important information. It involves sliding a two-dimensional filter over each channel of a feature map and summarizing the features within the region covered by the fi
      5 min read

    • CNN | Introduction to Padding
      During convolution, the size of the output feature map is determined by the size of the input feature map, the size of the kernel, and the stride. If we simply apply the kernel on the input feature map, then the output feature map will be smaller than the input. This can result in the loss of inform
      5 min read

    • What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?
      Padding is a technique used in convolutional neural networks (CNNs) to preserve the spatial dimensions of the input data and prevent the loss of information at the edges of the image. It involves adding additional rows and columns of pixels around the edges of the input data. There are several diffe
      14 min read

    • Convolutional Neural Network (CNN) Architectures
      Convolutional Neural Network (CNN) is a neural network architecture in Deep Learning, used to recognize patterns from structured arrays. However, over many years, CNN architectures have evolved. Many variants of the fundamental CNN architecture have been developed, leading to amazing advances in t
      11 min read

    • Deep Transfer Learning - Introduction
      Deep transfer learning is a machine learning technique that utilizes the knowledge learned from one task to improve the performance of another related task. This technique is particularly useful when there is a shortage of labeled data for the target task, as it allows the model to leverage the know
      8 min read

    • Introduction to Residual Networks
      Recent years have seen tremendous progress in the field of Image Processing and Recognition. Deep Neural Networks are becoming deeper and more complex. It has been proved that adding more layers to a Neural Network can make it more robust for image-related tasks. But it can also cause them to lose a
      4 min read

    • Residual Networks (ResNet) - Deep Learning
      After the first CNN-based architecture (AlexNet) won the ImageNet 2012 competition, every subsequent winning architecture used more layers in a deep neural network to reduce the error rate. This works for a small number of layers, but when we increase the number of layers, there is a common proble
      9 min read

    • ML | Inception Network V1
      Inception net achieved a milestone in CNN classifiers when previous models were just going deeper to improve the performance and accuracy but compromising the computational cost. The Inception network, on the other hand, is heavily engineered. It uses a lot of tricks to push performance, both in ter
      4 min read

    • Understanding GoogLeNet Model - CNN Architecture
      GoogLeNet (or Inception V1) was proposed by researchers at Google (with the collaboration of various universities) in 2014 in the research paper titled "Going Deeper with Convolutions". This architecture was the winner at the ILSVRC 2014 image classification challenge. It has provided a significant de
      4 min read

    • Image Recognition with Mobilenet
      Introduction: Image recognition plays an important role in many fields, such as medical disease analysis. In this article, we will mainly focus on how to recognize what is being displayed in a given image. We assume prior knowledge of Tensorflow, Keras, Python, MachineLearni
      5 min read

    • VGG-16 | CNN model
      A Convolutional Neural Network (CNN) architecture is a deep learning model designed for processing structured grid-like data, such as images. It consists of multiple layers, including convolutional, pooling, and fully connected layers. CNNs are highly effective for tasks like image classification, o
      7 min read

    • Autoencoders in Machine Learning
      An autoencoder is a type of artificial neural network that learns to represent data in a compressed form and then reconstructs it as closely as possible to the original input. An autoencoder consists of two components: Encoder: This compresses the input into a compact representation and captures the mo
      9 min read

    • How Autoencoders works ?
      An autoencoder is a type of neural network used for unsupervised learning, particularly for tasks like dimensionality reduction, anomaly detection and feature extraction. It consists of two main parts: an encoder and a decoder. The goal of an autoencoder is to learn a more efficient representation of t
      7 min read

    • Difference Between Encoder and Decoder
      Combinational logic is the concept in which two or more input states define one or more output states. The encoder and decoder are combinational logic circuits, in which we implement combinational logic with the help of Boolean algebra. To encode something is to convert a piece of information into
      9 min read

    • Implementing an Autoencoder in PyTorch
      Autoencoders are neural networks that learn to compress and reconstruct data. In this guide we’ll walk you through building a simple autoencoder in PyTorch using the MNIST dataset. This approach is useful for image compression, denoising and feature extraction. Implementation of Autoencoder in PyTor
      4 min read

    • Generative Adversarial Network (GAN)
      Generative Adversarial Networks (GANs) were introduced by Ian Goodfellow and his colleagues in 2014. GANs are a class of neural networks that autonomously learn patterns in the input data to generate new examples resembling the original dataset. GAN's architecture consists of two neural networks: Ge
      12 min read

    • Deep Convolutional GAN with Keras
      Deep Convolutional GAN (DCGAN) was proposed by researchers from indico Research and Facebook AI Research. It is widely used in many convolution-based generation techniques. The focus of this paper was to make training GANs stable. Hence, they proposed some architectural changes in the computer vision pro
      9 min read

    • StyleGAN - Style Generative Adversarial Networks
      Generative Adversarial Networks (GANs) are a type of neural network that consist of two neural networks: a generator that creates images and a discriminator that evaluates them. The generator tries to produce realistic data while the discriminator tries to differentiate between real and generated data.
      6 min read

    Object Detection and Recognition

    • Detect an object with OpenCV-Python
      Object detection refers to identifying and locating objects within images or videos. OpenCV provides a simple way to implement object detection using Haar Cascades, a classifier trained to detect objects based on positive and negative images. In this article we will focus on detecting objects using i
      4 min read

    • Haar Cascades for Object Detection - Python
      Haar Cascade classifiers are a machine learning-based method for object detection. They use a set of positive and negative images to train a classifier, which is then used to detect objects in new images. Positive Images: These images contain the objects that the classifier is trained to detect. Nega
      3 min read

    • R-CNN - Region-Based Convolutional Neural Networks
      R-CNN (Region-based Convolutional Neural Network) was introduced by Ross Girshick et al. in 2014. R-CNN revolutionized object detection by combining the strengths of region proposal algorithms and deep learning, leading to remarkable improvements in detection accuracy and efficiency. This article de
      9 min read

    • YOLO v2 - Object Detection
      In terms of speed, YOLO is one of the best models in object recognition, able to recognize objects and process frames at a rate of up to 150 FPS for small networks. However, in terms of accuracy, YOLO was not the state-of-the-art model, but it has a fairly good mean Average Precision (mAP) of 63% when
      6 min read

    • Face recognition using Artificial Intelligence
      The current technology amazes people with amazing innovations that not only make life simple but also bearable. Face recognition has over time proven to be the least intrusive and fastest form of biometric verification. The software uses deep learning algorithms to compare a live captured image to t
      15+ min read

    • Deep Face Recognition
      DeepFace is the facial recognition system used by Facebook for tagging images. It was proposed by researchers at Facebook AI Research (FAIR) at the 2014 IEEE Computer Vision and Pattern Recognition Conference (CVPR). In modern face recognition there are 4 steps: Detect, Align, Represent, Classify. This app
      8 min read

    • ML | Face Recognition Using Eigenfaces (PCA Algorithm)
      In 1991, Turk and Pentland suggested an approach to face recognition that uses dimensionality reduction and linear algebra concepts to recognize faces. This approach is computationally less expensive and easy to implement and thus used in various applications at that time such as handwritten recogni
      4 min read

    • Emojify using Face Recognition with Machine Learning
      In this article, we will learn how to implement a modification app that will show an emoji of expression which resembles the expression on your face. This is a fun project based on computer vision in which we use an image classification model in reality to classify different expressions of a person.
      7 min read

    • Object Detection with Detection Transformer (DETR) by Facebook
      Facebook released its state-of-the-art object detection model on 27 May 2020. They call it DETR, which stands for Detection Transformer, as it uses transformers to detect objects. This is the first time that a transformer is used for such a task of object detection along with a Convolutional Ne
      7 min read

    Image Segmentation

    • Image Segmentation Using TensorFlow
      Image segmentation refers to the task of annotating a single class to different groups of pixels. While the input is an image, the output is a mask that draws the region of the shape in that image. Image segmentation has wide applications in domains such as medical image analysis, self-driving cars,
      8 min read

    • Thresholding-Based Image Segmentation
      Image segmentation is the technique of subdividing an image into constituent sub-regions or distinct objects. The level of detail to which subdivision is carried out depends on the problem being solved. That is, segmentation should stop when the objects or the regions of interest in an application h
      7 min read

    • Region and Edge Based Segmentation
      Segmentation is the separation of one or more regions or objects in an image based on a discontinuity or a similarity criterion. A region in an image can be defined by its border (edge) or its interior, and the two representations are equivalent. There are prominently three methods of perfor
      4 min read

    • Image Segmentation with Watershed Algorithm - OpenCV Python
      Image segmentation is a fundamental computer vision task that involves partitioning an image into meaningful and semantically homogeneous regions. The goal is to simplify the representation of an image or make it more meaningful for further analysis. These segments typically correspond to objects or
      9 min read

    • Mask R-CNN | ML
      The article provides a comprehensive understanding of the evolution from basic Convolutional Neural Networks (CNN) to the sophisticated Mask R-CNN, exploring the iterative improvements in object detection, instance segmentation, and the challenges and advantages associated with each model. What is R
      9 min read

    3D Reconstruction

    • Python OpenCV - Depth map from Stereo Images
      OpenCV is a huge open-source library for computer vision, machine learning, and image processing, and it now plays a major role in real-time operation, which is very important in today’s systems. Note: For more information, refer to Introduction to OpenCV. Depth Map: A depth map is a picture wher
      2 min read

    • Top 7 Modern-Day Applications of Augmented Reality (AR)
      Augmented Reality (or AR) in simpler terms means intensifying the reality of real-time objects which we see through our eyes or gadgets like smartphones. You may wonder why it is trending so much. The answer is that it can impactfully offer an unforgettable experience either of learning, measuring the
      9 min read

    • Virtual Reality, Augmented Reality, and Mixed Reality
      Virtual Reality (VR): The word 'virtual' means something that is conceptual and does not exist physically and the word 'reality' means the state of being real. So the term 'virtual reality' is itself conflicting. It means something that is almost real. We will probably never be on the top of Mount E
      3 min read

    • Camera Calibration with Python - OpenCV
      Prerequisites: OpenCV. A camera plays a major role in several domains like robotics, space exploration, etc. It helps to capture each and every moment and is helpful for many analyses. In order to use the camera as a visual sensor, we should know the parameters of the cam
      4 min read

    • Python OpenCV - Pose Estimation
      What is Pose Estimation? Pose estimation is a computer vision technique that is used to predict the configuration of the body (pose) from an image. It is important because of the abundance of applications that can benefit from the technology. Human pose estimation localizes body key points to accu
      6 min read

  • 40+ Top Computer Vision Projects [2025 Updated]
    Computer Vision is a branch of Artificial Intelligence (AI) that helps computers understand and interpret context of images and videos. It is used in domains like security cameras, photo editing, self-driving cars and robots to recognize objects and navigate real world using machine learning. This a
    4 min read