
ML | Text Summarization of links based on user query

Last Updated : 30 Nov, 2018

Whenever a user searches for particular information on the internet, multiple results are returned, each explaining the topic in a different way. Reading through all of them to understand the information is difficult and time-consuming.

For example, when a user searches for “machine learning” on Google, a number of results are returned, and each explains “machine learning” in a different way. Working through the various definitions is difficult and time-consuming. Given people’s busy schedules and the immense amount of information available on the Internet, there is a need for automatic summarization of links based on the user’s query.

Introduction to Text Summarization:
Text summarization is the process of creating a shorter version of a text that retains only the vital information, so the user can understand the text in less time. Its main advantage is that it reduces the time the user spends searching for the important details in a document.

There are two main approaches to summarizing text documents –

  1. Extractive Method: It involves selecting phrases and sentences from the original text and including them in the final summary.

    Example:

    Original Text : Python is a high-level, interpreted, interactive and object-oriented scripting language. Python is a great language for the beginner-level programmers.

    Extractive Summary : Python is a high-level scripting language and a great language for beginner-level programmers.

  2. Abstractive Method: It involves generating entirely new phrases and sentences that capture the meaning of the source document.

    Example:

    Original Text : Python is a high-level, interpreted, interactive and object-oriented scripting language. Python is a great language for the beginner-level programmers.

    Abstractive Summary : Python is an interpreted, interactive language that is easy to learn.

    Comparing the two summaries, the abstractive one reads more like a summary a human would write, which is why the abstractive method is considered the better of the two. Despite this, comparatively little progress has been made on the abstractive method.
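To make the extractive method more concrete, here is a minimal sketch of one simple way to do it: score each sentence by the average frequency of its words and keep the top-scoring sentences. This is an illustration only, not the method used by the article; the function names, the tiny stop-word list, and the scoring rule are all assumptions chosen for brevity.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"a", "an", "and", "the", "is", "for", "of", "to", "in", "it", "on"}


def split_sentences(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic words that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP_WORDS]


def extractive_summary(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole text.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    text = (
        "Python is a high-level, interpreted, interactive and object-oriented "
        "scripting language. Python is a great language for the beginner-level "
        "programmers."
    )
    print(extractive_summary(text, max_sentences=1))
```

Because the summary is built only from sentences that already exist in the input, it can never rephrase or merge ideas; that is the limitation the abstractive method tries to overcome.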

Solution:

The problem of having to read through many search results can be addressed with the following steps:

  • Allow the user to enter a query (in a web application or a mobile app).
  • If the query is valid, search for it on Google.
  • Google returns multiple results related to the query; extract all the links on the first page, because these links are the most relevant to the user's query.
  • Scrape and clean the data from all the links and store it in a text file.
  • Send the data to a machine learning model to generate an abstractive summary.

Reference:
  • https://machinelearningmastery.com/gentle-introduction-text-summarization/
  • https://ai.googleblog.com/2016/08/text-summarization-with-tensorflow.html
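The middle steps of the solution (validating the query, extracting links from a results page, and cleaning scraped pages) can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the rule for what counts as a "valid" query is invented for illustration, and the HTML is passed in as a string, so fetching the pages and running the summarization model are left out.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


def is_valid_query(query):
    """Assumed rule: a query is valid if it contains at least one letter or digit."""
    return bool(re.search(r"[A-Za-z0-9]", query))


class LinkExtractor(HTMLParser):
    """Collect unique absolute http(s) links from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and urlparse(href).scheme in ("http", "https") and href not in self.links:
            self.links.append(href)


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def extract_links(html):
    """Return the absolute links found in an HTML page, in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def clean_text(html):
    """Strip markup from an HTML page and collapse runs of whitespace."""
    parser = TextExtractor()
    parser.feed(html)
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


if __name__ == "__main__":
    results_page = '<a href="https://example.com/ml">ML</a> <a href="/about">About</a>'
    article_page = "<h1>Machine learning</h1><script>var x = 1;</script><p>is a field of study.</p>"
    if is_valid_query("machine learning"):
        print(extract_links(results_page))
        print(clean_text(article_page))
```

The cleaned text from every link could then be concatenated, written to a text file, and handed to whichever summarization model is chosen.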



Author: kirtanbhatt
Article Tags: Machine Learning
