How to Scrape Text from Tag in Python

Last Updated : 03 Jul, 2024

In this article, we will scrape text data from an HTML tag. Specifically, we will extract all the text that appears inside the strong tags of a web page. Each method is covered with a short, clear example.

Scraping Text from Tag

Scraping text from HTML tags can be easily done by using:

Selenium
BeautifulSoup along with the requests library

With Selenium, we use the By class to locate all the strong tags on the page and then loop over them to display their text. With the second method, we first request the HTML content of the webpage using the requests library, then parse it with Python's BeautifulSoup library, and finally display the text.

Scraping Text from Tag using Selenium

Here we use Selenium to fetch the text data from the tag. Let's see the code implementation.

Example

The By class in Selenium provides a set of locator strategies that help us find web elements.
Chrome configuration options:
- --headless runs Chrome without a GUI.
- --no-sandbox disables Chrome's sandbox, which is often needed when Chrome runs as root, for example inside a container.
- --disable-dev-shm-usage makes Chrome use a temporary directory instead of /dev/shm for shared memory, which avoids crashes in environments where /dev/shm is too small.
Then, we pass the desired URL to the web driver object and display the text under the strong tag using a for loop.
At the end, we close the browser.

Python

from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')

dr = webdriver.Chrome(options=options)
dr.get("https://www.geeksforgeeks.org/machine-learning-types-of-artificial-intelligence/")

st = dr.find_elements(By.TAG_NAME, 'strong')
for i in st:
    print(i.text)
dr.quit()

Output

The script prints the text of each strong tag on the page, one per line.
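The loop in the Selenium script prints every match as-is. Selenium's text property returns only the visible text of an element, so hidden or empty strong tags show up as blank lines, and repeated labels are printed more than once. Below is a minimal sketch of a cleanup step. The helper name unique_texts is hypothetical, and it only assumes that each element exposes a text attribute, as Selenium's WebElement does.

```python
from types import SimpleNamespace


def unique_texts(elements):
    """Return stripped, non-empty element texts in first-seen order."""
    seen = set()
    result = []
    for element in elements:
        text = element.text.strip()
        # Skip blank entries and anything already collected.
        if text and text not in seen:
            seen.add(text)
            result.append(text)
    return result


# Stand-in objects so the sketch runs without a browser. In the Selenium
# script you would pass the list returned by
# dr.find_elements(By.TAG_NAME, 'strong') instead.
sample = [
    SimpleNamespace(text=t)
    for t in ("Supervised", "", "  Supervised ", "Unsupervised")
]
print(unique_texts(sample))  # ['Supervised', 'Unsupervised']
```

Because the helper relies only on the text attribute, the same function should also work on the tags returned by BeautifulSoup's find_all().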

Scraping Text from Tag using BeautifulSoup

Here we scrape the text from the tag using Python's requests library and BeautifulSoup.

Example

We first import the installed libraries. Then we fetch the HTML content of the webpage using Python's requests module. After getting the data, we parse it with the BeautifulSoup library and use the find_all() function to find every strong tag. Finally, we display the text of each tag and return from the function.

Python

# importing necessary libraries
import requests
from bs4 import BeautifulSoup


# function that accepts a URL, fetches the HTML content from it
# using requests, and applies the parser to it
def strongText(url):
    r = requests.get(url)
    TextData = BeautifulSoup(r.content, 'html.parser')
    st = TextData.find_all('strong')

    # displaying the data
    for data in st:
        print(data.text)


if __name__ == "__main__":
    # input url
    url = 'https://www.geeksforgeeks.org/machine-learning-types-of-artificial-intelligence/'

    # function calling
    strongText(url)

Output

[Image: output of the BeautifulSoup scraping script]
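To see what find_all() returns without making a network request, the same parsing step can be tried on a small inline HTML string. This is only a sketch: the sample markup and the helper name strong_texts are made up for illustration. It uses get_text(" ", strip=True) rather than the text attribute so that surrounding whitespace is trimmed and nested fragments are joined with a single space.

```python
from bs4 import BeautifulSoup

# Inline sample markup (made up for this sketch) instead of a live page.
html = """
<p>Types of <strong>Machine Learning</strong>:
<strong> Supervised </strong>, <strong>Unsupervised</strong> and
<strong>Reinforcement <em>Learning</em></strong>.</p>
"""


def strong_texts(markup):
    """Return the text of every <strong> tag found in the markup."""
    soup = BeautifulSoup(markup, "html.parser")
    # get_text(" ", strip=True) trims each text fragment and joins the
    # fragments of nested tags with a single space.
    return [tag.get_text(" ", strip=True) for tag in soup.find_all("strong")]


print(strong_texts(html))
# ['Machine Learning', 'Supervised', 'Unsupervised', 'Reinforcement Learning']
```

Passing r.content from requests.get() in place of the inline string gives the same kind of list for a live page.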

Best Practices of Web Scraping

Follow ethical rules while scraping the data. Do not scrape any sensitive or private information from any website.
To store the scraped data, use efficient data storage techniques such as databases or structured files such as CSV or JSON.
Make sure that websites do not block you. You can use randomized intervals or proxies to avoid getting blocked.
Write your code so that it handles errors such as 404 (page not found) instead of crashing.
Do not overload the server by making too many requests in a short period. Keep some delays between your requests.
Make sure you follow the guidelines suggested by the websites before scraping the data.
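Two of the practices above, handling errors and spacing out requests, can be sketched as a small loop. This is an illustrative outline rather than a complete scraper: scrape_politely, its parameters, and the fetch callable are hypothetical names chosen for this sketch, and the delay range is an arbitrary assumption.

```python
import random
import time


def scrape_politely(urls, fetch, min_delay=1.0, max_delay=3.0, sleep=time.sleep):
    """Call fetch(url) for each URL with a random pause between requests.

    fetch is any callable that returns the page body or raises an
    exception on failure (a hypothetical interface for this sketch).
    Returns (pages, failures), two dicts keyed by URL.
    """
    pages, failures = {}, {}
    for index, url in enumerate(urls):
        if index > 0:
            # Randomized delay so the server is not hit in a tight loop.
            sleep(random.uniform(min_delay, max_delay))
        try:
            pages[url] = fetch(url)
        except Exception as exc:  # broad on purpose: fetch is caller-supplied
            # Record the failure and move on instead of stopping the run.
            failures[url] = str(exc)
    return pages, failures
```

With requests, fetch could be a small wrapper that calls requests.get(url, timeout=10), then raise_for_status() on the response, and returns response.text, so that a 404 ends up in failures rather than stopping the whole run.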

Conclusion

Web scraping is an efficient way to extract the desired data from a website. We can scrape text, files, links, and much more. However, we need to consider some ethical rules before scraping the data. In this article, we covered how to scrape the text from a tag using two methods. In the first method, we used Selenium to scrape the data, whereas in the second method, we used BeautifulSoup along with Python's requests library. Both methods give a clear and concise way to scrape the text.

vishuvaishnav3001
