
Python | Named Entity Recognition (NER) using spaCy

Last Updated : 03 Apr, 2025

Named Entity Recognition (NER) is used in Natural Language Processing (NLP) to identify and classify important information within unstructured text. These "named entities" include proper nouns like people, organizations, locations and other meaningful categories such as dates, monetary values and products. By tagging these entities, we can transform raw text into structured data that can be analyzed, indexed or used in applications.

[Figure: Representation of Named Entity Recognition]

Use of spaCy in NER

spaCy is an open-source Python library for NLP that is designed for efficient, large-scale text processing. It offers:

  • Optimized performance: spaCy is built for high-speed text processing making it ideal for large-scale NLP tasks.
  • Pre-trained models: It includes various pre-trained NER models that recognize multiple entity types out of the box.
  • Ease of use: A user-friendly API lets developers implement NER with minimal effort.
  • Deep learning integration: The library works seamlessly with deep learning frameworks like TensorFlow and PyTorch.
  • Efficient pipeline processing: It can efficiently handle text processing tasks, including tokenization, part-of-speech tagging, dependency parsing and named entity recognition.
  • Customizability: We can train custom models or manually define new entities.

Implementation of NER using spaCy

Here is the step by step procedure to do NER using spaCy:

1. Install spaCy

First we install spaCy and download en_core_web_sm, a lightweight English pipeline that includes a pre-trained NER component. Note that the small model does not ship with static word vectors; the larger en_core_web_md and en_core_web_lg models do. spaCy supports various entity types, including:

  • PERSON – Names of people
  • ORG – Organizations
  • GPE – Countries, cities, states
  • DATE – Absolute or relative dates or periods (times of day have their own TIME label)
  • MONEY – Monetary values
  • PRODUCT – Products and brand names
  • EVENT – Events (e.g., "Olympics")
  • LAW – Named documents made into laws

A full list of entity types can be found in the spaCy documentation.

!pip install spacy
!python -m spacy download en_core_web_sm
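Because `spacy download` installs the model as a regular Python package, a quick check can confirm that both spaCy and the model are importable before going further. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is a hypothetical name chosen for this illustration.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be found."""
    # find_spec locates the package without importing it.
    return importlib.util.find_spec(package_name) is not None


# Report on spaCy itself and on the model package used in this article.
for name in ("spacy", "en_core_web_sm"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports "missing", re-run the corresponding install command above in the same environment.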

2. Run NER on a Sentence

The following code demonstrates how to perform NER using spaCy:

  • spacy.load("en_core_web_sm") loads the pre-trained English model.
  • nlp(text) tokenizes the input text, runs it through the pipeline components (including NER) and returns a Doc object.
  • doc.ents contains all recognized named entities.
Python
import spacy

# Load the small pre-trained English pipeline
nlp = spacy.load('en_core_web_sm')

sentence = "Why Apple is looking at buying U.K. startup for $1 billion ?"
doc = nlp(sentence)

# Print each entity with its character offsets and label
for ent in doc.ents:
    print(ent.text, ent.start_char, ent.end_char, ent.label_)

Output:

Apple 4 9 ORG
U.K. 31 35 GPE
$1 billion 48 58 MONEY

Here Apple is classified as an Organization (ORG), U.K. as a Geopolitical Entity (GPE) and $1 billion as Money (MONEY).
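The two numbers printed for each entity are character offsets into the original string: start_char is inclusive and end_char is exclusive, so they behave like a Python slice. The sketch below does not call spaCy at all; it copies the three rows from the output above into a list (an assumption made so the example runs on its own), checks the offsets against the sentence and groups the entity texts by label.

```python
from collections import defaultdict

sentence = "Why Apple is looking at buying U.K. startup for $1 billion ?"

# (text, start_char, end_char, label) rows copied from the output above.
entities = [
    ("Apple", 4, 9, "ORG"),
    ("U.K.", 31, 35, "GPE"),
    ("$1 billion", 48, 58, "MONEY"),
]

by_label = defaultdict(list)
for text, start, end, label in entities:
    # Slicing the sentence with the offsets recovers the entity text.
    assert sentence[start:end] == text
    by_label[label].append(text)

print(dict(by_label))
# → {'ORG': ['Apple'], 'GPE': ['U.K.'], 'MONEY': ['$1 billion']}
```

Grouping by label like this is one simple way to turn the recognized entities into structured data for indexing or analysis.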

3. Effect of Case Sensitivity

Here we examine how capitalization affects entity recognition. Lowercasing an entity name may prevent it from being recognized correctly.

Python
# Same sentence as before, but with "apple" in lowercase
sentence = "Why apple is now looking at buying U.K. startup for $1 billion ?"
doc = nlp(sentence)

for ent in doc.ents:
    print(ent.text, ent.start_char, ent.end_char, ent.label_)

Output:

U.K. 35 39 GPE
$1 billion 52 62 MONEY

Since "apple" is in lowercase, the model treats it as an ordinary word and no longer recognizes it as an organization.
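One possible workaround, when a known name must be caught regardless of case, is a rule-based match on the lowercased token. The sketch below is an illustration rather than part of the original walkthrough: it uses spaCy's entity ruler on a blank English pipeline (tokenizer only, so no model download is needed), and the variable name `nlp_rules` is chosen here to avoid clashing with `nlp` above.

```python
import spacy

# A blank English pipeline: tokenizer only, no statistical NER component.
nlp_rules = spacy.blank("en")

# The entity ruler tags token patterns; LOWER compares the lowercased token text,
# so "apple", "Apple" and "APPLE" all match.
ruler = nlp_rules.add_pipe("entity_ruler")
ruler.add_patterns([{"label": "ORG", "pattern": [{"LOWER": "apple"}]}])

doc = nlp_rules("Why apple is now looking at buying U.K. startup for $1 billion ?")
print([(ent.text, ent.label_) for ent in doc.ents])
```

A rule like this cannot tell the company from the fruit, so it trades precision for recall. In a trained pipeline the ruler could be added alongside the statistical component, for example with nlp.add_pipe("entity_ruler", before="ner").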

4. Customizing Named Entity Recognition

Here we manually add a new named entity to spaCy's output. This technique is useful when you want to recognize specific terms that are not in the pre-trained model.

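  • We use Span to define the new entity; Span(doc, 0, 1, ...) covers token 0 up to, but not including, token 1.
  • Assigning to doc.ents sets the entities for this Doc. The assignment replaces any entities the model predicted rather than appending to them.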
Python
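from spacy.tokens import Span

# Reuses the nlp pipeline loaded earlier
doc = nlp("Tesla is planning to launch a new product.")

custom_label = "ORG"
# Token 0 ("Tesla") becomes the only entity; this overwrites the model's predictions
doc.ents = (Span(doc, 0, 1, label=custom_label),)

for ent in doc.ents:
    print(ent.text, ent.label_)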


Output:

Tesla ORG

Here "Tesla" was manually added as an organization. In a full NER training setup you can retrain the model using annotated datasets.
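To keep existing entities while adding a new one, the current doc.ents can be combined with the new Span before assigning. The sketch below assumes a blank English pipeline (so there are no model predictions that could overlap) and a slightly extended, hypothetical sentence; `nlp_blank` is a name chosen for this illustration. It also uses doc.char_span, which builds a Span from character offsets instead of token indices.

```python
import spacy
from spacy.tokens import Span

# Blank pipeline: tokenizer only, so doc.ents starts out empty.
nlp_blank = spacy.blank("en")
text = "Tesla is planning to launch a new product in Berlin."
doc = nlp_blank(text)

# First entity, defined by token indices as in the example above.
doc.ents = [Span(doc, 0, 1, label="ORG")]

# Second entity, defined by character offsets.
start = text.index("Berlin")
berlin = doc.char_span(start, start + len("Berlin"), label="GPE")

# Combine the existing entities with the new one instead of overwriting them.
doc.ents = list(doc.ents) + [berlin]

print([(ent.text, ent.label_) for ent in doc.ents])
```

Entity spans in doc.ents must not overlap, so with a trained pipeline a new Span that covers tokens the model already tagged would need the conflicting entity removed first.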

Named Entity Recognition (NER) is an essential tool for extracting valuable insights from unstructured text for better automation and analysis across industries. spaCy’s flexible capabilities allow developers to quickly implement and customize entity recognition for specific applications. It also offers an efficient and scalable solution for handling named entity recognition in real-world text processing.

Anannya Uberoi 1