[파이썬] 음성 데이터의 음성 합성에서의 음성 텍스트 전달

05 Sep 2023

python

Google Cloud Text-to-Speech API

Google Cloud Text-to-Speech API는 Python 개발자가 음성 텍스트를 음성 데이터로 변환할 수 있는 강력한 도구입니다. 이 API를 사용하려면 다음 단계를 따르면 됩니다.

Google Cloud Console에 접속한 후, 프로젝트를 생성하고 Billing을 활성화합니다.
Google Cloud Text-to-Speech API를 활성화합니다.
필요한 인증 정보를 설정하고, Python 클라이언트 라이브러리를 설치합니다.
다음 예제 코드를 사용하여 음성 텍스트를 음성 데이터로 변환합니다:

from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

text = "Hello, welcome to my blog."
synthesis_input = texttospeech.SynthesisInput(text=text)

voice = texttospeech.VoiceSelectionParams(language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL)

audio_config = texttospeech.AudioConfig(audio_encoding=texttospeech.AudioEncoding.MP3)

response = client.synthesize_speech(request={"input": synthesis_input, "voice": voice, "audio_config": audio_config})

with open("output.mp3", "wb") as out:
    out.write(response.audio_content)
    print("Audio content written to file 'output.mp3'")

위의 코드는 “Hello, welcome to my blog.”라는 텍스트를 음성 데이터로 변환하여 output.mp3 파일로 저장하는 예시입니다. 이를 실행하면 해당 텍스트의 음성 데이터가 생성됩니다.

pyttsx3

pyttsx3는 Python의 간단한 음성 합성 라이브러리입니다. 이 라이브러리는 기본적으로 사용자의 시스템 음성 엔진을 활용하여 텍스트를 음성으로 변환합니다.

다음은 pyttsx3를 사용하여 텍스트를 음성으로 출력하는 간단한 예제 코드입니다:

import pyttsx3

engine = pyttsx3.init()
text = "Hello, welcome to my blog."
engine.say(text)
engine.runAndWait()

위의 코드에서 text 변수에 원하는 텍스트를 입력하면 시스템 음성 엔진을 통해 해당 텍스트가 음성으로 출력됩니다.

이상으로 Python을 사용하여 음성 데이터의 음성 합성에서 음성 텍스트를 전달하는 방법을 알아보았습니다. 각각의 라이브러리나 API를 사용하여 다양한 활용 범위에서 음성 합성을 수행할 수 있습니다. 연구, 개발, 혹은 다양한 프로젝트에서 이러한 도구들을 적절히 활용하여 음성 인터페이스의 편의성과 기능을 향상시킬 수 있습니다.