Building a Real-Time Sound Analysis System with Python Audio Libraries

In today’s blog post, we will explore how to build a real-time sound analysis system using Python and its audio libraries. We will walk through capturing audio, analyzing the captured sound, and visualizing the results. So let’s dive in!

Capturing Audio with Python

To capture audio with Python, we can make use of the pyaudio library. It provides a convenient way to access audio input devices and record audio data. Here’s an example code snippet that demonstrates how to capture audio using pyaudio:

import pyaudio

# Initialize the audio stream
audio = pyaudio.PyAudio()
stream = audio.open(format=pyaudio.paInt16, channels=1, rate=44100, input=True, frames_per_buffer=1024)

# Start capturing audio
print("Capturing audio...")
frames = []
for _ in range(0, int(44100 / 1024 * 5)):  # Capture audio for 5 seconds
    data = stream.read(1024)
    frames.append(data)

# Stop capturing audio
print("Finished capturing audio.")

# Close the audio stream
stream.stop_stream()
stream.close()
audio.terminate()

In the above code, we initialize the audio stream with the desired format, number of channels, sample rate, and frames per buffer, and mark it as an input stream. We then capture audio by reading data from the stream and appending it to a list of frames. After recording for the desired duration, we stop and close the stream and terminate PyAudio.
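The snippet above records in one blocking loop, which is fine for a fixed-length clip. For continuous capture that is closer to real time, pyaudio also offers a callback mode in which PortAudio calls your function for every incoming buffer. Here is a minimal sketch that hands each buffer to a queue for a separate analysis step; the queue-based layout is just one possible arrangement, not the only way to do it:

import pyaudio
import queue

audio_queue = queue.Queue()

def callback(in_data, frame_count, time_info, status):
    # Hand each incoming buffer to the analysis side via a thread-safe queue
    audio_queue.put(in_data)
    return (None, pyaudio.paContinue)

audio = pyaudio.PyAudio()
stream = audio.open(format=pyaudio.paInt16, channels=1, rate=44100,
                    input=True, frames_per_buffer=1024,
                    stream_callback=callback)
stream.start_stream()  # Returns immediately; buffers arrive in the background

# ...consume audio_queue in your analysis loop, then:
# stream.stop_stream(); stream.close(); audio.terminate()

Because the callback runs on the audio thread, it should do as little work as possible; pushing the raw bytes onto a queue lets slower analysis code run elsewhere without dropping buffers.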

Performing Sound Analysis

Now that we have captured the audio, let’s move on to performing sound analysis. For this, we can leverage the power of the librosa library in Python, which provides a wide range of audio analysis functions. Here’s an example code snippet that demonstrates how to perform spectrogram analysis using librosa:

import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

# Convert captured frames to a floating-point waveform that librosa can process
audio_data = np.frombuffer(b''.join(frames), dtype=np.int16).astype(np.float32) / 32768.0

# Perform spectrogram analysis
spectrogram = librosa.amplitude_to_db(np.abs(librosa.stft(audio_data)), ref=np.max)

# Display the spectrogram
plt.figure(figsize=(12, 8))
librosa.display.specshow(spectrogram, sr=44100, x_axis='time', y_axis='log')
plt.title('Spectrogram')
plt.colorbar(format='%+2.0f dB')
plt.show()

In the above code, we join the captured frames, interpret them as 16-bit samples, and scale them to a floating-point waveform that librosa can process. We then compute the Short-Time Fourier Transform (STFT) of the audio data and convert the magnitudes to decibels. Finally, we visualize the spectrogram using librosa and matplotlib.
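The same waveform can feed other librosa analysis functions as well. As a small illustrative sketch, here is how frame-wise RMS energy and spectral centroid could be computed from the audio_data array above; the printed summary values are only for illustration:

import librosa
import numpy as np

# Frame-wise RMS energy: a rough measure of loudness over time
rms = librosa.feature.rms(y=audio_data)[0]

# Frame-wise spectral centroid: the "center of mass" of the spectrum, in Hz
centroid = librosa.feature.spectral_centroid(y=audio_data, sr=44100)[0]

print(f"Mean RMS energy: {rms.mean():.4f}")
print(f"Mean spectral centroid: {centroid.mean():.1f} Hz")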

Visualizing the Results

Now that we have performed sound analysis and obtained the spectrogram, let’s visualize the results. Visualizing audio data can reveal useful characteristics of the sound. Common visualizations include waveform plots, frequency spectrum plots, and spectrograms. Here, we’ll focus on displaying the spectrogram.
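As a quick aside, a basic waveform plot needs nothing beyond matplotlib. Here is a minimal sketch using the audio_data array from earlier:

import matplotlib.pyplot as plt
import numpy as np

# Time axis in seconds for the captured samples
times = np.arange(len(audio_data)) / 44100

plt.figure(figsize=(12, 4))
plt.plot(times, audio_data, linewidth=0.5)
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.title('Waveform')
plt.show()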

The spectrogram is a 2D representation of the audio signal, where the x-axis represents time, the y-axis represents frequency, and the color represents the intensity of the sound at each time-frequency point. It allows us to visualize the changes in frequency content over time.

To visualize the spectrogram, we make use of the librosa.display.specshow() function and additional plotting functions from matplotlib. By adjusting the parameters and settings, we can customize the appearance of the spectrogram to suit our needs.
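For example, one common variant is a mel-scaled spectrogram with explicit FFT and hop settings. Here is a sketch of that approach; the specific parameter choices (n_fft=2048, hop_length=512, n_mels=128) are only illustrative, not values from the original setup:

import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

# Mel spectrogram with explicit FFT/hop settings (one possible customization)
mel = librosa.feature.melspectrogram(y=audio_data, sr=44100,
                                     n_fft=2048, hop_length=512, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)

plt.figure(figsize=(12, 8))
librosa.display.specshow(mel_db, sr=44100, hop_length=512,
                         x_axis='time', y_axis='mel')
plt.title('Mel spectrogram')
plt.colorbar(format='%+2.0f dB')
plt.show()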

Conclusion

In this blog post, we have learned how to build a real-time sound analysis system using Python audio libraries. We explored capturing audio, analyzing it with librosa, and visualizing the results with matplotlib. By leveraging Python and these audio libraries, we can gain valuable insights into sound data and build interesting applications in domains such as music and speech recognition.

#Python #AudioAnalysis