[파이썬] Scrapy CSV 출력 설정

Scrapy is a powerful and flexible framework for web scraping in Python. One of the ways to save your scraped data is by exporting it to a CSV file. In this blog post, we will learn how to configure Scrapy to output scraped data to a CSV file.

Setting up Scrapy project

To begin with, let’s create a Scrapy project by running the following command in your terminal:

scrapy startproject myproject

This will create a new project folder named myproject.

Creating a Scrapy spider

Inside the project folder, navigate to the spiders directory using the following command:

cd myproject/spiders

Next, create a new spider by creating a Python file, for example my_spider.py. In this file, import the necessary modules such as scrapy, and define your spider class that inherits from scrapy.Spider.

import scrapy

class MySpider(scrapy.Spider):
    name = "my_spider"

    def start_requests(self):
        # define your start URLs here

    def parse(self, response):
        # parse the response and extract data

Exporting data to CSV

To export the extracted data to a CSV file, we need to configure the FEEDS setting in our Scrapy project’s settings.py file. Open the file and locate the FEEDS configuration section.

FEEDS = {
    'data.csv': {
        'format': 'csv',
        'encoding': 'utf8',
        'overwrite': True,
    }
}

In this example, we will save the data to a file named data.csv. You can adjust the filename and other options based on your requirements.

Running the Scrapy spider

Now that everything is set up, it’s time to run our Scrapy spider and scrape the desired data.

Navigate back to the project’s root directory and run the following command:

scrapy crawl my_spider

This will start the spider and begin scraping the data. Once the process is complete, you will find the CSV file with the exported data in the project’s root directory.

Conclusion

In this blog post, we learned how to configure Scrapy to output scraped data to a CSV file. This allows us to easily save and analyze the scraped data using popular data analysis tools like Excel or pandas. With Scrapy’s powerful web scraping capabilities and the ability to export data in different formats, we can efficiently gather and process data for various applications.