[pandas] Series Basics

04 Dec 2020

pandas

Series

A Series is a data structure made up of multiple elements that share a single data type.

It is one-dimensional.

It is built on top of NumPy's ndarray.
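As a small illustration of the "single data type" point, the sketch below builds a couple of throwaway Series (the names `mixed` and `loose` are made up for this example) and inspects the dtype pandas settles on. It is only meant to show the idea, not to cover every coercion rule.

```python
import numpy as np
import pandas as pd

# Integers mixed with a float: pandas picks one dtype that can hold every element.
mixed = pd.Series([1, 2, 3.5])
print(mixed.dtype)    # float64

# Numbers mixed with a string fall back to the generic object dtype.
loose = pd.Series([1, "a"])
print(loose.dtype)    # object

# A Series always has exactly one dimension.
print(mixed.ndim)     # 1
print(mixed.shape)    # (3,)
```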

Series basics

A Series is created with pandas' Series constructor.

import numpy as np
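import pandas as pd # conventionally abbreviated as pd

# float64 is not a built-in name, so it has to be referenced as np.float64
s = pd.Series([-1, 4, 5, 99], dtype=np.float64)
print(s)
## 0    -1.0
## 1     4.0
## 2     5.0
## 3    99.0
## dtype: float64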
print(s.values)    # [-1.  4.  5. 99.]
print(s.index)     # RangeIndex(start=0, stop=4, step=1)
print(s.dtype)     # float64

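print(s) : prints everything: the index, the values, and the dtype. If a name has been set, it is printed as well.
print(s.values) : prints the values of the Series as an ndarray.
print(s.index) : prints the index of the Series. For the default index the individual labels are not listed; it is shown as a RangeIndex, as above.
print(s.dtype) : prints the dtype of the Series, that is, the dtype of its values.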
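To make the attribute descriptions above concrete, here is a minimal sketch that checks what each attribute actually returns. The name "sample" is an arbitrary label chosen for this example only.

```python
import numpy as np
import pandas as pd

# "sample" is an arbitrary name used only for this illustration.
s = pd.Series([-1, 4, 5, 99], dtype=np.float64, name="sample")

values = s.values
print(type(values))     # <class 'numpy.ndarray'>

# Converting the default RangeIndex to a list reveals the actual labels.
print(list(s.index))    # [0, 1, 2, 3]

print(s.name)           # sample
```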

index

The index can be set explicitly with the index option when creating a Series. The labels are passed as a list, and duplicate labels are allowed.

s = pd.Series([1, -8, 5, 10], dtype=np.float64, index=['b','a','c','d'] )
print(s)
## b     1.0
## a    -8.0
## c     5.0
## d    10.0
## dtype: float64
print(s['c'])   # 5.0
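print(s.iloc[2])   # 5.0  (position-based access)

print(s) : the specified labels are shown in place of the default numeric index.
print(s['c']) : elements can be accessed by label like this.
print(s.iloc[2]) : positional access is still available. Plain s[2] also returns 5.0 on older pandas, but recent versions deprecate treating an integer key as a position when the index holds other labels, so .iloc is the safer choice.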
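As a minimal sketch of label-based versus position-based access, the following uses the explicit `.loc` and `.iloc` accessors and also shows what a duplicated label returns. The `dup` Series and its labels are invented for this example.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, -8, 5, 10], dtype=np.float64, index=['b', 'a', 'c', 'd'])

by_label = s.loc['c']       # look up by index label
by_position = s.iloc[2]     # look up by integer position
print(by_label, by_position)    # 5.0 5.0

# Hypothetical Series with a duplicated label 'x'.
dup = pd.Series([1, 2, 3], index=['x', 'x', 'y'])
hits = dup.loc['x']         # a duplicated label returns every matching row as a Series
single = dup.loc['y']       # a unique label returns a scalar
print(len(hits), single)    # 2 3
```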

Slicing

A Series can be sliced by integer position, just like other indexable sequence types, and it can also be sliced by label.

print(s[1:3])
## a   -8.0
## c    5.0
## dtype: float64
print(s['a':'c'])
## a   -8.0
## c    5.0
## dtype: float64

Unlike a positional slice, a label slice includes the stop label in the result.

(Note)

**Because a Series is built on an ndarray, Boolean indexing, fancy indexing, and many NumPy functions and methods work on it much as they do on an array.**
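The note above can be sketched roughly as follows, reusing the same values as the earlier example. The variable names are chosen for this illustration only.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, -8, 5, 10], dtype=np.float64, index=['b', 'a', 'c', 'd'])

# Boolean indexing: keep only the elements where the mask is True.
positive = s[s > 0]
print(list(positive.index))     # ['b', 'c', 'd']

# Fancy indexing: pass a list of labels to pick several elements in that order.
picked = s[['d', 'b']]
print(list(picked))             # [10.0, 1.0]

# NumPy functions accept a Series; the result is a Series with the same index.
roots = np.sqrt(positive)
print(type(roots).__name__)     # Series

# Familiar ndarray-style methods are available as well.
print(s.sum(), s.max())         # 8.0 10.0
```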

Series from a dictionary

A Series can also be built from a Python dictionary.

The keys of the dictionary become the index of the Series, and the values become its values.

import pandas as pd
import numpy as np

# snack-bar menu items (tteokbokki, sundae, fried snacks) mapped to their prices
my_dict = {"떡볶이":1500, "순대":3500, "튀김":2500}
s = pd.Series(my_dict)
print(s)
## 떡볶이    1500
## 순대     3500
## 튀김     2500
## dtype: int64

A Series can be given a name attribute. A name can be set both on the Series itself and on its index.

s.name = "분식집"        # "snack bar"
s.index.name = '메뉴'    # "menu"
print(s)
## 메뉴
## 떡볶이    1500
## 순대     3500
## 튀김     2500
## Name: 분식집, dtype: int64

Adding and removing elements

Elements can be added to or removed from a Series.

Adding : a rice ball (주먹밥) item is added to the menu.

print(s)
## 메뉴
## 떡볶이    1500
## 순대     3500
## 튀김     2500
## Name: 분식집, dtype: int64

s['주먹밥']=1000
print(s)
## 메뉴
## 떡볶이    1500
## 순대     3500
## 튀김     2500
## 주먹밥    1000
## Name: 분식집, dtype: int64

Removing : the rice ball item was added, but the decision was made to take it off the menu again. The drop method is used.

print(s)
## 메뉴
## 떡볶이    1500
## 순대     3500
## 튀김     2500
## 주먹밥    1000
## Name: 분식집, dtype: int64

s = s.drop('주먹밥')
print(s)
## 메뉴
## 떡볶이    1500
## 순대     3500
## 튀김     2500
## Name: 분식집, dtype: int64

Caution : drop does not modify the Series in place. It returns a new Series, so the result has to be assigned back to s.
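The caution above can be checked with a short sketch. It rebuilds the menu Series from scratch and compares the original with the result of drop. The `inplace=True` variant at the end is shown only as an alternative to reassignment, not as the approach this post uses.

```python
import pandas as pd

s = pd.Series({"떡볶이": 1500, "순대": 3500, "튀김": 2500, "주먹밥": 1000})

# drop returns a new Series and leaves the original untouched.
dropped = s.drop("주먹밥")
still_there = "주먹밥" in s      # membership is tested against the index labels
gone = "주먹밥" in dropped
print(still_there, gone)        # True False

# Alternative: ask drop to modify s itself (it then returns None).
s.drop("주먹밥", inplace=True)
print(len(s))                   # 3
```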