[파이썬] textblob 개체명 인식 (NER)

Named Entity Recognition (NER) is a subtask of natural language processing (NLP) that aims to identify and classify named entities within text into predefined categories such as person names, organization names, location names, etc. TextBlob is a Python library that provides simple and intuitive API for NLP tasks such as part-of-speech tagging, noun phrase extraction, and of course, named entity recognition.

In this blog post, we will explore how to perform named entity recognition using TextBlob in Python.

Installation

To get started with TextBlob, you need to install it using pip:

$ pip install textblob

TextBlob also requires additional NLTK corpora, so you will need to download them by running the following command:

$ python -m textblob.download_corpora

Usage

To perform named entity recognition using TextBlob, we first need to create a TextBlob object with the text we want to analyze. Then, we can use the noun_phrases and entities attributes to extract noun phrases and named entities, respectively.

Here’s an example code snippet that demonstrates how to use TextBlob for NER:

from textblob import TextBlob

text = "Apple Inc. was founded by Steve Jobs and Steve Wozniak in 1976. It is headquartered in Cupertino, California."

blob = TextBlob(text)

# Extract noun phrases
print("Noun Phrases:")
for np in blob.noun_phrases:
    print(np)

# Extract named entities
print("\nNamed Entities:")
for entity, tag in blob.tags:
    if tag == "NNP": # Proper noun
        print(entity)

In the above code, we create a TextBlob object with the given text. Then, we iterate over the noun_phrases and entities attributes to extract and print the noun phrases and named entities, respectively.

Conclusion

TextBlob provides a convenient way to perform named entity recognition in Python. With its simple API, you can quickly extract important information from text such as person names, organization names, and location names.

If you are working on a project that involves NLP and need to perform NER, give TextBlob a try! It can save you a lot of time and effort in implementing your own NER system.

I hope you found this blog post helpful. Happy coding!