Scraping websites with Newspaper3k in Python

Last Updated : 23 Jan, 2022

Web scraping is a powerful way to gather information from websites. To scrape articles from one or more URLs, we can use a Python library called Newspaper3k. Newspaper3k is built on top of requests for fetching pages and lxml for parsing them. It is the Python 3 successor to the older Newspaper module, which served the same purpose.

Installation:

To install this module, run the following command in the terminal.

pip install newspaper3k

Step-by-step Approach:

  1. First, define a list containing the URLs, or assign a single URL.
  2. Create an Article object, passing in the URL and optional parameters such as language='en' for English.
  3. Download and parse the article.
  4. Finally, display the extracted data.

Below are some examples based on the above approach:

Example 1

Below is a program to scrape data from a given URL.

Python3

# Import required module
import newspaper

# Assign url
url = 'https://www.geeksforgeeks.org/top-5-open-source-online-machine-learning-environments/'

# Create the Article object, then download and parse the page
article = newspaper.Article(url=url, language='en')
article.download()
article.parse()

# Display scraped data
print(article.text)

Output: the extracted text of the article is printed to the console.
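
Besides the body text, a parsed Article object exposes a few other fields, such as title, authors, publish_date and top_image. The sketch below wraps the download-and-parse steps from Example 1 in a helper that returns these fields as a dictionary. The helper name extract_fields and its article_cls parameter are hypothetical and not part of Newspaper3k; article_cls only exists so that the helper can be tried with a stand-in class when no network is available.

```python
def extract_fields(url, language="en", article_cls=None):
    """Download and parse one article and return selected fields as a dict.

    extract_fields and article_cls are illustrative names, not library API.
    """
    if article_cls is None:
        # Imported lazily so the helper can be exercised without the library.
        import newspaper
        article_cls = newspaper.Article

    article = article_cls(url=url, language=language)
    article.download()
    article.parse()

    # These attributes are populated by parse(); some may be empty or None
    # when the page does not provide them.
    return {
        "title": article.title,
        "authors": article.authors,
        "publish_date": article.publish_date,
        "top_image": article.top_image,
        "text": article.text,
    }
```

Calling extract_fields(url) with the URL from Example 1 should return a dictionary whose "text" entry matches what Example 1 prints. Which of the other fields are filled in depends on the page being scraped.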

Example 2

Here, we scrape data from multiple URLs and then display it.

Python3

# Import required module
import newspaper

# Define list of urls
list_of_urls = ['https://www.geeksforgeeks.org/how-to-get-the-magnitude-of-a-vector-in-numpy/',
                'https://www.geeksforgeeks.org/3d-wireframe-plotting-in-python-using-matplotlib/',
                'https://www.geeksforgeeks.org/difference-between-small-data-and-big-data/']

# Download and parse each url, then display its content
for url in list_of_urls:
    article = newspaper.Article(url=url, language='en')
    article.download()
    article.parse()
    print(article.text)

Output: the extracted text of each article is printed in turn.
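
When looping over several URLs, a single unreachable page would otherwise stop the whole run with an exception. A minimal sketch of one way to guard against that is shown below: each URL is processed inside a try block, and failures are collected instead of raised. The function name scrape_many and the article_cls parameter are hypothetical, and catching Exception broadly is a simplification chosen for this sketch; in real code you would likely narrow it to the specific errors you expect.

```python
def scrape_many(urls, language="en", article_cls=None):
    """Return (texts, failures) for a list of URLs.

    texts maps each successfully scraped URL to its article text;
    failures maps each failed URL to the error message.
    scrape_many and article_cls are illustrative names, not library API.
    """
    if article_cls is None:
        # Imported lazily so the helper can be exercised without the library.
        import newspaper
        article_cls = newspaper.Article

    texts, failures = {}, {}
    for url in urls:
        try:
            article = article_cls(url=url, language=language)
            article.download()
            article.parse()
        except Exception as exc:  # broad on purpose in this sketch
            failures[url] = str(exc)
            continue
        texts[url] = article.text
    return texts, failures
```

With the list from Example 2, texts, failures = scrape_many(list_of_urls) would let you print every article that was fetched and then inspect failures for the ones that were not.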



sangy987