[Python] Extracting Image Links with Beautiful Soup 4

Beautiful Soup is a Python library for parsing HTML and XML documents, which makes it a popular choice for web scraping. In this blog post, we will focus on how to extract image links from a webpage using Beautiful Soup 4.

Installing Beautiful Soup

To get started, you need to install Beautiful Soup, along with the requests library that is used later to fetch the page. Open your terminal and run the following command:

pip install beautifulsoup4 requests

Importing the necessary libraries

Next, let’s import the necessary libraries - Beautiful Soup and requests:

from bs4 import BeautifulSoup
import requests

To extract image links, you first need to fetch the HTML content of the webpage. You can use the requests library to send an HTTP GET request and get the HTML response. Here’s an example:

url = "https://example.com"
response = requests.get(url)
html_content = response.text
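In practice a request can hang or come back with an error page, so a slightly more defensive version is often useful. The sketch below wraps the call in a hypothetical helper, fetch_html (a name made up for this post, not part of requests), and assumes a 10-second timeout is acceptable for your use case:

```python
import requests


def fetch_html(url, timeout=10):
    """Return the body of `url` as text (hypothetical helper, not part of requests)."""
    # The 10-second timeout is an assumed default; without a timeout,
    # requests can wait indefinitely for a slow server.
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx responses instead of
    # silently parsing an error page.
    response.raise_for_status()
    return response.text
```

With this helper, html_content = fetch_html("https://example.com") could stand in for the three lines above.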

Once you have the HTML content, you can create a BeautifulSoup object by passing it the HTML content and specifying the parser library. In our case, we will use html.parser, the parser built into Python, which needs no extra installation:

soup = BeautifulSoup(html_content, 'html.parser')

To extract all the image tags from the HTML document, you can use the find_all method and pass it the tag name (img in our case). This returns a list-like ResultSet containing every image tag on the webpage:

image_tags = soup.find_all('img')
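To see what find_all returns without touching the network, here is a minimal sketch that parses a small, made-up HTML snippet. It also shows the src=True filter, which keeps only the tags that actually carry a src attribute. The sample markup and variable names are assumptions for illustration only:

```python
from bs4 import BeautifulSoup

# Made-up sample markup, used only for illustration.
sample_html = """
<html><body>
  <img src="/images/logo.png" alt="Logo">
  <img src="https://example.com/photo.jpg">
  <img alt="placeholder without a src">
</body></html>
"""

soup = BeautifulSoup(sample_html, "html.parser")

all_images = soup.find_all("img")          # every <img> tag
with_src = soup.find_all("img", src=True)  # only tags that have a src attribute

print(len(all_images), len(with_src))  # prints: 3 2
```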

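Now that we have the list of image tags, we can loop through each tag and read its src attribute, which contains the URL of the image. Using img.get('src') rather than img['src'] avoids a KeyError on tags that have no src attribute:

for img in image_tags:
    image_url = img.get('src')  # None if the tag has no src attribute
    if image_url:
        print(image_url)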
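Many pages use relative src values such as /images/logo.png, which are not usable on their own. One way to handle this is to resolve each value against the page URL with urljoin from the standard library. The helper name extract_image_urls and the sample markup below are hypothetical:

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_image_urls(html, base_url):
    """Return absolute image URLs found in `html` (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for img in soup.find_all("img"):
        src = img.get("src")  # None when the attribute is missing
        if not src:
            continue  # skip tags with no usable src
        # Resolve relative paths against the URL of the page itself.
        urls.append(urljoin(base_url, src))
    return urls


# Made-up markup: one root-relative path, one relative path, one tag without src.
sample_html = '<img src="/images/logo.png"><img src="photo.jpg"><img alt="no src">'
image_urls = extract_image_urls(sample_html, "https://example.com/blog/post.html")
print(image_urls)
# ['https://example.com/images/logo.png', 'https://example.com/blog/photo.jpg']
```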

You can perform any further processing on the extracted image URLs, such as saving them to a file or downloading the images.
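As one example of further processing, the sketch below downloads a single image to disk. Both function names, the images directory, and the 10-second timeout are assumptions chosen for illustration, and the code expects an absolute URL:

```python
import os
from urllib.parse import urlparse

import requests


def filename_from_url(image_url, default="image"):
    """Derive a local file name from the URL path (hypothetical helper)."""
    name = os.path.basename(urlparse(image_url).path)
    # Fall back to a default when the path ends in "/" or is empty.
    return name or default


def download_image(image_url, directory="images", timeout=10):
    """Download one image into `directory` and return the saved path."""
    os.makedirs(directory, exist_ok=True)
    response = requests.get(image_url, timeout=timeout)
    response.raise_for_status()  # stop on 4xx/5xx instead of saving an error page
    path = os.path.join(directory, filename_from_url(image_url))
    with open(path, "wb") as f:
        f.write(response.content)  # raw bytes of the response body
    return path
```

Calling download_image on each extracted URL would save the files under images/. Note that two URLs ending in the same file name would overwrite each other in this simple version.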

Conclusion

In this blog post, we learned how to extract image links from a webpage using Beautiful Soup 4 in Python. By combining the power of Beautiful Soup with other libraries like requests, you can easily scrape and extract data from HTML or XML documents.

Remember to respect the website’s terms of service and check if web scraping is allowed before scraping any website.

Happy coding!