[파이썬] scipy 희소 행렬 (scipy.sparse)

Scipy is a powerful scientific computing library in Python that provides various modules and submodules for performing mathematical operations efficiently. One of the prominent submodules in Scipy is scipy.sparse, which is used for handling sparse matrices.

What is a Sparse Matrix?

A sparse matrix is a matrix where most of the elements are zero. Unlike dense matrices, which store all elements in memory, sparse matrices store only the non-zero elements and their positions. This efficient representation is particularly useful when dealing with large matrices with a large number of zero elements.

Why use Sparse Matrices?

Sparse matrices offer several advantages over dense matrices:

  1. Reduced memory usage: Sparse matrices store only the non-zero elements, saving memory compared to dense matrices that store all elements.
  2. Faster computations: Performing calculations on sparse matrices can be significantly faster, as the focus is on the non-zero elements rather than unnecessary zero elements.
  3. Better scalability: Sparse matrices allow efficient processing of large matrices that would be infeasible with dense matrices due to the memory requirements.

Working with Scipy Sparse Matrices

The scipy.sparse subpackage provides different types of sparse matrices, such as csr_matrix, csc_matrix, coo_matrix, etc. Each type is optimized for specific operations, and the choice depends on the specific use case.

Here’s an example of creating a sparse matrix using csr_matrix:

import scipy.sparse as sp

# Create a 3x3 sparse matrix
data = [1, 2, 3, 4, 5, 6]
row = [0, 1, 1, 2, 2, 2]
col = [0, 1, 2, 0, 1, 2]

sparse_matrix = sp.csr_matrix((data, (row, col)), shape=(3, 3))
print(sparse_matrix.toarray())

Output:

[[1 2 0]
 [0 3 4]
 [0 0 5]]

In the above example, we create a sparse matrix with non-zero data elements [1, 2, 3, 4, 5, 6], corresponding row indices [0, 1, 1, 2, 2, 2], and column indices [0, 1, 2, 0, 1, 2]. The shape parameter specifies the dimensions of the resulting matrix.

Operations on Sparse Matrices

Scipy provides a wide range of operations for working with sparse matrices. Some commonly used operations include:

Conclusion

Sparse matrices in Scipy provide a memory-efficient and scalable way to handle large matrices with a large number of zero elements. By utilizing the scipy.sparse submodule, you can perform various operations on sparse matrices efficiently, taking advantage of the optimized algorithms and data structures specifically designed for handling sparse matrices.