
Porter Stemmer Technique in Natural Language Processing

Last Updated : 21 Dec, 2024

The Porter Stemmer, proposed by Martin Porter in 1980, is one of the most popular stemming methods. It simplifies words by reducing them to a common base form, a process known as "stemming." For example, "running" and "runs" are both reduced to "run." The result is a stem rather than a dictionary word, and irregular forms such as "ran" are left unchanged because the algorithm only strips suffixes. In this article we will explore the Porter stemming technique and how to perform stemming in Python.

Prerequisites: NLP Pipeline, Stemming

Implementing Porter Stemmer

You can easily implement the Porter Stemmer using Python's Natural Language Toolkit (NLTK).

Python
import nltk
from nltk.stem import PorterStemmer

# Create a Porter Stemmer instance
porter_stemmer = PorterStemmer()

# Example words for stemming
words = ["running", "jumps", "happily", "programming"]

# Apply stemming to each word
stemmed_words = [porter_stemmer.stem(word) for word in words]

print("Original words:", words)
print("Stemmed words:", stemmed_words)

Output:

Original words: ['running', 'jumps', 'happily', 'programming']

Stemmed words: ['run', 'jump', 'happili', 'program']
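
The same stemmer can be applied to running text rather than a hand-picked word list. The sketch below is illustrative only: the sample sentence is made up, and the regex tokenizer is a simplifying assumption (a real pipeline would more likely use a proper tokenizer). It compares the number of distinct tokens before and after stemming.

```python
import re

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# Hypothetical sample text, chosen only for illustration.
text = "She jumps while he jumped, and they keep jumping and running as she runs."

# Naive tokenizer (an assumption): lowercase the text and keep alphabetic runs.
tokens = re.findall(r"[a-z]+", text.lower())

# Stem every token with the Porter algorithm.
stems = [stemmer.stem(token) for token in tokens]

print("Unique tokens:", len(set(tokens)))
print("Unique stems: ", len(set(stems)))
```

Because inflected forms such as "jumps," "jumped," and "jumping" share a stem, the second count should be smaller than the first.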

How the Porter Stemmer Works

The Porter Stemmer works by applying a series of rules to remove suffixes from words in five steps. Each step identifies and strips (or rewrites) common endings, gradually reducing a word to its base form (stem). For example, "eating" becomes "eat" and "happily" becomes "happili," a stem that is not itself an English word. This helps in text analysis by standardizing word forms.
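
To make the idea of a rule step concrete, here is a minimal sketch of Step 1a (plural endings) as described in Porter's original 1980 algorithm. The function name `step_1a` is hypothetical, and this is not NLTK's implementation: NLTK's default mode adds a few extensions, so its handling of some words may differ.

```python
def step_1a(word: str) -> str:
    """Sketch of Step 1a of the original Porter algorithm (plural endings)."""
    # Rules are tried in order, so the longest matching suffix wins.
    rules = [
        ("sses", "ss"),  # caresses -> caress
        ("ies", "i"),    # ponies   -> poni
        ("ss", "ss"),    # caress   -> caress (unchanged)
        ("s", ""),       # cats     -> cat
    ]
    for suffix, replacement in rules:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word


for w in ["caresses", "ponies", "caress", "cats"]:
    print(w, "->", step_1a(w))
```

The later steps follow the same pattern of ordered suffix rules, but most of them also check a condition on the remaining stem before a suffix is removed.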

Key Features & Benefits of Porter Stemmer

  • The algorithm takes off common endings like "-ing," "-ed," and "-s," changing "running" to "run" and "jumps" to "jump."
  • The stemming process uses several steps to deal with different suffixes, making sure only the right ones are removed.
  • It computes a "measure" for each candidate stem, roughly the number of vowel-consonant sequences it contains, to help decide if certain endings should be taken off.
  • The Porter Stemmer is easy to implement and understand, making it beginner-friendly.
  • It processes text quickly, which is useful for handling large amounts of data.
  • It provides good results for most common English words and is widely used in NLP projects.
  • By simplifying words to their base forms, it reduces the number of unique words in a dataset, making analysis easier.
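
The measure mentioned in the list above can be sketched in a few lines. In Porter's definition, every word has the form [C](VC)^m[V], where C is a run of consonants, V is a run of vowels, and m is the measure. The helper names `is_consonant` and `measure` below are hypothetical and the code is an illustrative sketch, not NLTK's internal implementation.

```python
def is_consonant(word: str, i: int) -> bool:
    """Return True if word[i] is a consonant under Porter's definition."""
    ch = word[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        # "y" is a vowel only when it follows a consonant (as in "ivy").
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem: str) -> int:
    """Count the vowel-run -> consonant-run transitions (the m in [C](VC)^m[V])."""
    m = 0
    prev_is_vowel = False
    for i in range(len(stem)):
        cons = is_consonant(stem, i)
        if cons and prev_is_vowel:
            m += 1  # a vowel run has just been closed by a consonant
        prev_is_vowel = not cons
    return m


for w in ["tree", "by", "trouble", "oats", "private", "orrery"]:
    print(w, measure(w))
```

Many rules only fire when the measure of the remaining stem exceeds a threshold, which is what keeps short words from being stripped down too far.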

Limitations of Porter Stemmer

  • It can produce stems that are not meaningful, such as turning "iteration" into "iter."
  • The algorithm is primarily designed for English and may not work well with other languages.
  • It can over-stem, removing suffixes aggressively enough that distinct words become more similar to each other, although it is generally considered less aggressive than the Lancaster stemmer.
  • Different words may be reduced to the same stem, resulting in a loss of meaning.
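
The limitations above are easy to observe directly. The following sketch groups a small, hand-picked word list (an assumption made for illustration) by Porter stem, so that words collapsing to the same stem appear together.

```python
from collections import defaultdict

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# Hand-picked words (illustrative only): three related but distinct words,
# plus the "iteration" example from the list above.
words = ["universe", "university", "universal", "iteration"]

# Map each stem to the original words that produced it.
groups = defaultdict(list)
for word in words:
    groups[stemmer.stem(word)].append(word)

for stem, members in groups.items():
    print(f"{stem!r} <- {members}")
```

With NLTK's default settings, "universe," "university," and "universal" should all fall under the single stem "univers," which shows how distinct meanings can be merged. If that loss matters for a task, lemmatization may be a better fit than stemming.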


ayushimalm50