pandas Series

pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)

First, the basic parameters:

data : array-like, dict, or scalar value (typically an array)

index : array-like or Index (1d)

dtype : numpy.dtype or None

copy : boolean, default False
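The parameters above can be illustrated with a minimal sketch. The variable names and sample values are made up for illustration; only `pd.Series` and its `data`, `index`, `dtype` and `copy` parameters come from the signature above.

```python
import numpy as np
import pandas as pd

# data as a list: a default integer index 0..n-1 is created.
from_list = pd.Series([10, 20, 30])

# data as a dict: the keys become the index labels.
from_dict = pd.Series({"a": 1, "b": 2, "c": 3})

# data as a scalar: the value is repeated once per index label.
from_scalar = pd.Series(7, index=["x", "y", "z"])

# dtype forces the element type of the result.
as_float = pd.Series([1, 2, 3], dtype="float64")

# copy=True: later changes to the source array do not reach the Series.
arr = np.array([1, 2, 3])
copied = pd.Series(arr, copy=True)
arr[0] = 99  # copied still holds 1 in its first position
```

Note that a scalar or a dict does not need to match the length of `index`; the same-length requirement applies to array-like data.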


When a Series is constructed from array-like data and an index, the two must have the same length; otherwise a ValueError is raised:

>>> pd.Series(range(4),index=list("list"))
l    0
i    1
s    2
t    3
dtype: int32

>>> pd.Series(range(5),index=list("list"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "E:\Python3\lib\site-packages\pandas\core\series.py", line 245, in __init__
    data = SingleBlockManager(data, index, fastpath=True)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 4070, in __init__
    fastpath=True)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 2685, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 109, in __init__
    len(self.mgr_locs)))
ValueError: Wrong number of items passed 5, placement implies 4

>>> pd.Series(range(4),index=list("lists"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "E:\Python3\lib\site-packages\pandas\core\series.py", line 245, in __init__
    data = SingleBlockManager(data, index, fastpath=True)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 4070, in __init__
    fastpath=True)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 2685, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 109, in __init__
    len(self.mgr_locs)))
ValueError: Wrong number of items passed 4, placement implies 5
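The length check can also be handled in code rather than read off a traceback. The following is only a sketch: `try_build` is a hypothetical helper, not part of pandas, and the exact wording of the error message differs between pandas versions, so it is captured rather than compared.

```python
import pandas as pd

def try_build(values, labels):
    """Hypothetical helper: return the Series, or the error text on failure."""
    try:
        return pd.Series(values, index=labels)
    except ValueError as exc:
        # The message text depends on the installed pandas version.
        return f"ValueError: {exc}"

ok = try_build(range(4), list("list"))         # 4 values, 4 labels
too_many = try_build(range(5), list("list"))   # 5 values, 4 labels
too_few = try_build(range(4), list("lists"))   # 4 values, 5 labels
```

Only the first call returns a Series; the other two return the captured error text.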

Create a Series and give it a name:

>>> se = pd.Series(range(5))
>>> se.name = "values"
>>> se = pd.Series(range(5),name="values")
>>> se
0    0
1    1
2    2
3    3
4    4
Name: values, dtype: int32
# Assigning se.name afterwards and passing name= to the constructor are equivalent

The index can be replaced:

>>> se.index
RangeIndex(start=0, stop=5, step=1)

>>> se.index = list("abcde")
>>> se
a    0
b    1
c    2
d    3
e    4
Name: values, dtype: int32

Give the index a name:

>>> se.index.name = "id"
>>> se
id
a    0
b    1
c    2
d    3
e    4
Name: values, dtype: int32
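The naming steps above modify the Series in place. As a sketch of a non-mutating alternative, `Series.rename` and `Series.rename_axis` return new objects instead; the names "scores" and "key" below are arbitrary examples.

```python
import pandas as pd

se = pd.Series(range(5), name="values")
se.index = list("abcde")   # the new index needs exactly len(se) labels
se.index.name = "id"

# rename with a scalar returns a new Series with a different name.
renamed = se.rename("scores")

# rename_axis returns a new Series with a different index name.
re_axis = se.rename_axis("key")
```

The original `se` keeps its name "values" and its index name "id" after both calls.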

Convert to a DataFrame:

>>> se.to_frame()
    values
id
a        0
b        1
c        2
d        3
e        4
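A short sketch of how the Series name and index name carry over into the DataFrame. The column name "v" is an arbitrary example; `to_frame(name=...)` and `reset_index()` are standard Series methods.

```python
import pandas as pd

se = pd.Series(
    range(5),
    index=pd.Index(list("abcde"), name="id"),
    name="values",
)

df = se.to_frame()                # one column named after the Series
df_named = se.to_frame(name="v")  # override the column name
flat = se.reset_index()           # the index becomes an ordinary column "id"
```

`to_frame` keeps "id" as the index, while `reset_index` produces a two-column frame with a fresh integer index.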

Select a single element by label:

>>> se["b"]
1
>>> se.loc["b"]
1

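With a string-labelled index, an integer inside [] is not looked up as a label; it is treated as a position instead. Recent pandas versions deprecate this fallback, so se.iloc[1] is the explicit way to select by position:

>>> se[1]   # position 1 exists
1

>>> se[5]   # position 5 does not exist, so this raises an error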
Traceback (most recent call last):
  File "E:\Python3\lib\site-packages\pandas\indexes\base.py", line 2169, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas\index.pyx", line 98, in pandas.index.IndexEngine.get_value (pandas\index.c:3557)
  File "pandas\index.pyx", line 106, in pandas.index.IndexEngine.get_value (pandas\index.c:3240)
  File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:4279)
  File "pandas\src\hashtable_class_helper.pxi", line 732, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13742)
  File "pandas\src\hashtable_class_helper.pxi", line 740, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13696)
KeyError: 5

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "E:\Python3\lib\site-packages\pandas\core\series.py", line 603, in __getitem__
    result = self.index.get_value(self, key)
  File "E:\Python3\lib\site-packages\pandas\indexes\base.py", line 2175, in get_value
    return tslib.get_value_box(s, key)
  File "pandas\tslib.pyx", line 946, in pandas.tslib.get_value_box (pandas\tslib.c:19053)
  File "pandas\tslib.pyx", line 962, in pandas.tslib.get_value_box (pandas\tslib.c:18770)
IndexError: index out of bounds

>>> se[5] = "s"   # assignment fails as well: position 5 is out of bounds
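Because a bare integer in [] is ambiguous on a string-labelled index, the explicit accessors are usually the safer choice. The sketch below assumes the same `se` as above; the label "z" and the new label "f" are arbitrary examples.

```python
import pandas as pd

se = pd.Series(
    range(5),
    index=pd.Index(list("abcde"), name="id"),
    name="values",
)

by_label = se.loc["b"]      # label-based lookup
by_position = se.iloc[1]    # position-based lookup

# An out-of-range position raises IndexError with iloc.
try:
    se.iloc[5]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True

# get returns a default instead of raising for a missing label.
missing = se.get("z", default=None)

# Assigning to a new label through loc enlarges the Series.
grown = se.copy()
grown.loc["f"] = 5
```

Using `loc` with a new label is one way to add an element, whereas assigning to a position that does not exist is an error.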