[파이썬] requests-html `find()` 메서드 활용

06 Sep 2023

requests-html

The requests-html library is a powerful tool in Python for making HTTP requests and parsing the HTML responses. It provides a wide range of functionalities to extract data from web pages effortlessly.

One of the most useful methods in requests-html is the find() method. This method allows you to easily locate HTML elements within the parsed HTML content. It provides a convenient and intuitive way to navigate and retrieve specific elements based on various criteria.

Here’s an example that demonstrates how to use the find() method in requests-html:

from requests_html import HTMLSession

# Create an instance of HTMLSession
session = HTMLSession()

# Send a GET request to a web page
response = session.get('https://www.example.com')

# Access the parsed HTML content
html_content = response.html

# Find the first occurrence of a specific HTML element by tag name
element = html_content.find('h1', first=True)
print(element.text)

# Find all occurrences of a specific HTML element by class name
elements = html_content.find('.intro')
for element in elements:
    print(element.text)

# Find all occurrences of a specific HTML element by attribute value
elements = html_content.find('[name="description"]')
for element in elements:
    print(element.text)

In the above example, we first create an instance of HTMLSession to initiate an HTTP session. Then, we send a GET request to a web page and access the parsed HTML content using the html attribute of the response object.

Next, we use the find() method to locate HTML elements within the parsed HTML content. We demonstrate different ways to find elements based on tag name, class name, and attribute value. By setting the parameter first=True, we retrieve only the first occurrence of an element.

Finally, we display the text content of the retrieved elements using the text attribute.

Using the find() method in requests-html, you can easily scrape and extract data from web pages based on specific criteria. This powerful functionality makes it possible to automate web scraping tasks and gather valuable information for a wide range of applications.