A Quick Introduction to pandas (Deep Learning Primer, Part 2)

Source: http://pandas.pydata.org/pandas-docs/stable/10min.html#min

pandas (Python Data Analysis Library) is a tool built on top of NumPy, created to make data analysis tasks easier. It incorporates a large number of libraries and several standard data models, and provides the tools needed to operate efficiently on large datasets. pandas offers a wealth of functions and methods that let us process data quickly and conveniently. You will soon find that it is one of the key reasons Python is such a powerful and efficient environment for data analysis.

pandas data structures:

Series: a one-dimensional array, similar to a one-dimensional NumPy array. Both also resemble Python's built-in list, with one difference: a list may hold elements of different types, whereas a NumPy array and a Series store a single data type. This uses memory more efficiently and speeds up computation.
 
Time Series: a Series indexed by time.
 
DataFrame: a two-dimensional tabular data structure, similar in many ways to R's data.frame. A DataFrame can be thought of as a container of Series. The rest of this guide focuses mainly on DataFrame.
 
Panel: a three-dimensional array, which can be thought of as a container of DataFrames. (Panel was deprecated in pandas 0.20 and removed in 1.0; a DataFrame with a MultiIndex is the usual replacement.)
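The dtype homogeneity that distinguishes a Series from a plain list can be seen directly. A minimal sketch (the variable names are just for illustration):

```python
import pandas as pd

# A Python list may freely mix element types; a Series coerces everything
# to a single dtype, falling back to 'object' when it cannot.
s_int = pd.Series([1, 2, 3])        # homogeneous ints -> int64
s_mix = pd.Series([1, 'a', 3.0])    # mixed input -> object dtype

print(s_int.dtype)  # int64
print(s_mix.dtype)  # object
```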
 

The conventional imports for pandas are:

In [69]:
import pandas as pd
import numpy  as np
import matplotlib.pyplot as plt
 

Object creation

Creating a Series: in pandas, a Series is a labeled array. Build one from a list, and pandas will generate a default integer index automatically:

In [70]:
s = pd.Series([1,3,5,np.nan,6,8])
s
Out[70]:
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64
 

Creating a DataFrame by passing a NumPy array, with a datetime index and labeled columns:

In [71]:
dates = pd.date_range('20130101', periods=6)
dates
Out[71]:
DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06'],
              dtype='datetime64[ns]', freq='D')
In [72]:
df = pd.DataFrame(np.random.randn(6,4), index=dates, columns=list('ABCD'))
df
Out[72]:
                   A         B         C         D
2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163
2013-01-02 -1.377782 -0.801207 2.163044 0.481614
2013-01-03 0.345764 0.031672 0.660530 0.240518
2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732
2013-01-05 0.119948 0.932712 -0.375521 -1.901977
2013-01-06 1.153657 1.556955 -1.172664 -0.097324
 

Creating a DataFrame by passing a dict of objects that can be converted to something Series-like:

In [73]:
df2 = pd.DataFrame({ 'A' : 1.,
                     'B' : pd.Timestamp('20130102'),
                     'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
                     'D' : np.array([3] * 4,dtype='int32'),
                     'E' : pd.Categorical(["test","train","test","train"]),
                     'F' : 'foo' })
df2
Out[73]:
     A          B    C  D      E    F
0 1.0 2013-01-02 1.0 3 test foo
1 1.0 2013-01-02 1.0 3 train foo
2 1.0 2013-01-02 1.0 3 test foo
3 1.0 2013-01-02 1.0 3 train foo
 

The columns of the resulting DataFrame have different dtypes:

In [74]:
df2.dtypes
Out[74]:
A           float64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object
 

Viewing data

View the first or last few rows of the data:

In [75]:
df.head(2)
Out[75]:
                   A         B         C         D
2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163
2013-01-02 -1.377782 -0.801207 2.163044 0.481614
In [76]:
df.tail(3)
Out[76]:
                   A         B         C         D
2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732
2013-01-05 0.119948 0.932712 -0.375521 -1.901977
2013-01-06 1.153657 1.556955 -1.172664 -0.097324
 

Display the index, the columns, and the underlying NumPy data:

In [77]:
df.index
Out[77]:
DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06'],
              dtype='datetime64[ns]', freq='D')
In [78]:
df.columns
Out[78]:
Index(['A', 'B', 'C', 'D'], dtype='object')
In [79]:
df.values
Out[79]:
array([[-0.23452598, -0.75850137, -0.60955962, -0.54116297],
       [-1.37778166, -0.80120685,  2.16304408,  0.48161429],
       [ 0.34576422,  0.03167245,  0.66053017,  0.24051834],
       [-0.89690453, -1.61403402, -0.53950167, -1.5127315 ],
       [ 0.11994836,  0.93271156, -0.37552097, -1.90197732],
       [ 1.1536568 ,  1.55695533, -1.17266403, -0.097324  ]])
 

describe() shows a quick statistical summary of the data:

In [80]:
df.describe()
Out[80]:
              A         B         C         D
count 6.000000 6.000000 6.000000 6.000000
mean -0.148307 -0.108734 0.021055 -0.555177
std 0.904502 1.187138 1.207575 0.964228
min -1.377782 -1.614034 -1.172664 -1.901977
25% -0.731310 -0.790530 -0.592045 -1.269839
50% -0.057289 -0.363414 -0.457511 -0.319243
75% 0.289310 0.707452 0.401517 0.156058
max 1.153657 1.556955 2.163044 0.481614
 

Transposing the data:

In [81]:
df.T
Out[81]:
   2013-01-01 00:00:00  2013-01-02 00:00:00  2013-01-03 00:00:00  2013-01-04 00:00:00  2013-01-05 00:00:00  2013-01-06 00:00:00
A -0.234526 -1.377782 0.345764 -0.896905 0.119948 1.153657
B -0.758501 -0.801207 0.031672 -1.614034 0.932712 1.556955
C -0.609560 2.163044 0.660530 -0.539502 -0.375521 -1.172664
D -0.541163 0.481614 0.240518 -1.512732 -1.901977 -0.097324
 

Sorting by an axis:

In [82]:
df.sort_index(axis=1, ascending=False)
Out[82]:
                   D         C         B         A
2013-01-01 -0.541163 -0.609560 -0.758501 -0.234526
2013-01-02 0.481614 2.163044 -0.801207 -1.377782
2013-01-03 0.240518 0.660530 0.031672 0.345764
2013-01-04 -1.512732 -0.539502 -1.614034 -0.896905
2013-01-05 -1.901977 -0.375521 0.932712 0.119948
2013-01-06 -0.097324 -1.172664 1.556955 1.153657
 

Sorting by values:

In [83]:
df.sort_values(by='B')
Out[83]:
                   A         B         C         D
2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732
2013-01-02 -1.377782 -0.801207 2.163044 0.481614
2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163
2013-01-03 0.345764 0.031672 0.660530 0.240518
2013-01-05 0.119948 0.932712 -0.375521 -1.901977
2013-01-06 1.153657 1.556955 -1.172664 -0.097324
 

Selection

Note: while standard Python/NumPy expressions for selecting and setting data are intuitive and handy for interactive work, for production code we recommend the optimized pandas access methods: .at, .iat, .loc and .iloc (the older .ix indexer has since been deprecated).
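The division of labor among these indexers can be sketched in a few lines (the DataFrame here is made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4),
                  index=['r0', 'r1', 'r2'], columns=list('ABCD'))

# .loc/.at select by label, .iloc/.iat by integer position;
# .at/.iat are the fast scalar-access variants of .loc/.iloc.
by_label    = df.loc['r1', 'B']
by_position = df.iloc[1, 1]
fast_scalar = df.at['r1', 'B']

print(by_label, by_position, fast_scalar)  # 5 5 5
```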

 

Getting data

Selecting the single column labeled 'A', which yields a Series:

In [84]:
df['A']
Out[84]:
2013-01-01   -0.234526
2013-01-02   -1.377782
2013-01-03    0.345764
2013-01-04   -0.896905
2013-01-05    0.119948
2013-01-06    1.153657
Freq: D, Name: A, dtype: float64
 

Slicing rows with []:

In [85]:
df[0:3]
Out[85]:
                   A         B         C         D
2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163
2013-01-02 -1.377782 -0.801207 2.163044 0.481614
2013-01-03 0.345764 0.031672 0.660530 0.240518
In [86]:
df['20130102':'20130104']
Out[86]:
                   A         B         C         D
2013-01-02 -1.377782 -0.801207 2.163044 0.481614
2013-01-03 0.345764 0.031672 0.660530 0.240518
2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732
 

Selection by label

Getting a cross-section using a label:

In [87]:
df.loc[dates[0]]
Out[87]:
A   -0.234526
B   -0.758501
C   -0.609560
D   -0.541163
Name: 2013-01-01 00:00:00, dtype: float64
 

Selecting multiple columns by label:

In [88]:
df.loc[:,['A','B']]
Out[88]:
                   A         B
2013-01-01 -0.234526 -0.758501
2013-01-02 -1.377782 -0.801207
2013-01-03 0.345764 0.031672
2013-01-04 -0.896905 -1.614034
2013-01-05 0.119948 0.932712
2013-01-06 1.153657 1.556955
 

Selecting by both row and column labels:

In [89]:
df.loc['20130102':'20130104',['A','B']]
Out[89]:
                   A         B
2013-01-02 -1.377782 -0.801207
2013-01-03 0.345764 0.031672
2013-01-04 -0.896905 -1.614034
 

Reducing the dimensions of the returned object:

In [90]:
df.loc['20130102', ['A','B']]
Out[90]:
A   -1.377782
B   -0.801207
Name: 2013-01-02 00:00:00, dtype: float64
 

Getting a scalar value:

In [91]:
df.loc[dates[0], 'A']
Out[91]:
-0.23452598284858567
 

Fast access to a scalar (equivalent to the previous method):

In [92]:
df.at[dates[0], 'A']
Out[92]:
-0.23452598284858567
 

Selection by position

Selecting via the position of the passed integers:

In [93]:
df.iloc[3]
Out[93]:
A   -0.896905
B   -1.614034
C   -0.539502
D   -1.512732
Name: 2013-01-04 00:00:00, dtype: float64
 

Selecting by integer slices of rows and columns, acting similarly to Python/NumPy:

In [94]:
df.iloc[3:5, 0:2]
Out[94]:
                   A         B
2013-01-04 -0.896905 -1.614034
2013-01-05 0.119948 0.932712
 

Selecting specific rows and columns with lists of integer positions:

In [95]:
df.iloc[[0, 2, 4], [0, 2]]
Out[95]:
                   A         C
2013-01-01 -0.234526 -0.609560
2013-01-03 0.345764 0.660530
2013-01-05 0.119948 -0.375521
 

Slicing rows explicitly:

In [96]:
df.iloc[1:3,:]
Out[96]:
                   A         B         C         D
2013-01-02 -1.377782 -0.801207 2.163044 0.481614
2013-01-03 0.345764 0.031672 0.660530 0.240518
 

Slicing columns explicitly:

In [97]:
df.iloc[:,1:3]
Out[97]:
                   B         C
2013-01-01 -0.758501 -0.609560
2013-01-02 -0.801207 2.163044
2013-01-03 0.031672 0.660530
2013-01-04 -1.614034 -0.539502
2013-01-05 0.932712 -0.375521
2013-01-06 1.556955 -1.172664
 

Getting a single value by row and column position:

In [98]:
df.iloc[1, 1]
Out[98]:
-0.80120684660851138
 

Fast access to a scalar (equivalent to the previous method):

In [99]:
df.iat[1,1]
Out[99]:
-0.80120684660851138
 

Boolean (conditional) indexing

Using a single column's values to select data:

In [100]:
df[df.A > 0]
Out[100]:
                   A         B         C         D
2013-01-03 0.345764 0.031672 0.660530 0.240518
2013-01-05 0.119948 0.932712 -0.375521 -1.901977
2013-01-06 1.153657 1.556955 -1.172664 -0.097324
 

Selecting values from the whole DataFrame where a boolean condition is met:

In [101]:
df[df > 0]
Out[101]:
                   A         B         C         D
2013-01-01 NaN NaN NaN NaN
2013-01-02 NaN NaN 2.163044 0.481614
2013-01-03 0.345764 0.031672 0.660530 0.240518
2013-01-04 NaN NaN NaN NaN
2013-01-05 0.119948 0.932712 NaN NaN
2013-01-06 1.153657 1.556955 NaN NaN
 

Filtering with the isin() method:

In [102]:
df2 = df.copy()
df2['E'] = ['one', 'two', 'three', 'four', 'five', 'six']
df2
Out[102]:
                   A         B         C         D      E
2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163 one
2013-01-02 -1.377782 -0.801207 2.163044 0.481614 two
2013-01-03 0.345764 0.031672 0.660530 0.240518 three
2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732 four
2013-01-05 0.119948 0.932712 -0.375521 -1.901977 five
2013-01-06 1.153657 1.556955 -1.172664 -0.097324 six
In [103]:
df2[df2['E'].isin(['two','four'])]
Out[103]:
                   A         B         C         D     E
2013-01-02 -1.377782 -0.801207 2.163044 0.481614 two
2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732 four
 

Setting data

 

Setting a new column automatically aligns the data by the index:

In [104]:
s1 = pd.Series([1,2,3,4,5,6], index = pd.date_range('20130102', periods = 6))
s1
Out[104]:
2013-01-02    1
2013-01-03    2
2013-01-04    3
2013-01-05    4
2013-01-06    5
2013-01-07    6
Freq: D, dtype: int64
In [105]:
df['F'] = s1
df
Out[105]:
                   A         B         C         D    F
2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163 NaN
2013-01-02 -1.377782 -0.801207 2.163044 0.481614 1.0
2013-01-03 0.345764 0.031672 0.660530 0.240518 2.0
2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732 3.0
2013-01-05 0.119948 0.932712 -0.375521 -1.901977 4.0
2013-01-06 1.153657 1.556955 -1.172664 -0.097324 5.0
 

Setting values by label:

In [106]:
df.at[dates[0],'A'] = 0
df
Out[106]:
                   A         B         C         D    F
2013-01-01 0.000000 -0.758501 -0.609560 -0.541163 NaN
2013-01-02 -1.377782 -0.801207 2.163044 0.481614 1.0
2013-01-03 0.345764 0.031672 0.660530 0.240518 2.0
2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732 3.0
2013-01-05 0.119948 0.932712 -0.375521 -1.901977 4.0
2013-01-06 1.153657 1.556955 -1.172664 -0.097324 5.0
 

Setting values by position:

In [107]:
df.iat[0, 1] = 0
df
Out[107]:
                   A         B         C         D    F
2013-01-01 0.000000 0.000000 -0.609560 -0.541163 NaN
2013-01-02 -1.377782 -0.801207 2.163044 0.481614 1.0
2013-01-03 0.345764 0.031672 0.660530 0.240518 2.0
2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732 3.0
2013-01-05 0.119948 0.932712 -0.375521 -1.901977 4.0
2013-01-06 1.153657 1.556955 -1.172664 -0.097324 5.0
 

Setting an entire column with a NumPy array:

In [109]:
df.loc[:,'D'] = np.array([5] * len(df))
df
Out[109]:
                   A         B         C  D    F
2013-01-01 0.000000 0.000000 -0.609560 5 NaN
2013-01-02 -1.377782 -0.801207 2.163044 5 1.0
2013-01-03 0.345764 0.031672 0.660530 5 2.0
2013-01-04 -0.896905 -1.614034 -0.539502 5 3.0
2013-01-05 0.119948 0.932712 -0.375521 5 4.0
2013-01-06 1.153657 1.556955 -1.172664 5 5.0
 

Setting values with a where operation:

In [112]:
df2 = df.copy()
df2[df2 > 0] = -df2
df2
Out[112]:
                   A         B         C  D    F
2013-01-01 0.000000 0.000000 -0.609560 -5 NaN
2013-01-02 -1.377782 -0.801207 -2.163044 -5 -1.0
2013-01-03 -0.345764 -0.031672 -0.660530 -5 -2.0
2013-01-04 -0.896905 -1.614034 -0.539502 -5 -3.0
2013-01-05 -0.119948 -0.932712 -0.375521 -5 -4.0
2013-01-06 -1.153657 -1.556955 -1.172664 -5 -5.0
 

Missing data

 

pandas primarily uses np.nan to represent missing data; missing values are excluded from computations by default.
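This default can be seen with a tiny Series (a minimal sketch):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

# Aggregations skip NaN by default...
print(s.sum())              # 4.0
print(s.mean())             # 2.0 -- NaN is excluded from the count as well
# ...unless skipna=False is requested explicitly:
print(s.sum(skipna=False))  # nan
```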

 

Reindexing allows you to change/add/delete the index on a specified axis; it returns a copy of the data:

In [113]:
df1 = df.reindex(index=dates[0:4], columns=list(df.columns) + ['E'])
df1.loc[dates[0]:dates[1],'E'] = 1
df1
Out[113]:
                   A         B         C  D    F    E
2013-01-01 0.000000 0.000000 -0.609560 5 NaN 1.0
2013-01-02 -1.377782 -0.801207 2.163044 5 1.0 1.0
2013-01-03 0.345764 0.031672 0.660530 5 2.0 NaN
2013-01-04 -0.896905 -1.614034 -0.539502 5 3.0 NaN
 

Dropping any rows that contain missing data:

In [114]:
df1.dropna(how='any')
Out[114]:
                   A         B         C  D    F    E
2013-01-02 -1.377782 -0.801207 2.163044 5 1.0 1.0
 

Filling in missing data:

In [116]:
df1.fillna(value=5)
Out[116]:
                   A         B         C  D    F    E
2013-01-01 0.000000 0.000000 -0.609560 5 5.0 1.0
2013-01-02 -1.377782 -0.801207 2.163044 5 1.0 1.0
2013-01-03 0.345764 0.031672 0.660530 5 2.0 5.0
2013-01-04 -0.896905 -1.614034 -0.539502 5 3.0 5.0
 

Getting a boolean mask: True where a value is NaN, False otherwise:

In [117]:
pd.isnull(df1)
Out[117]:
                A      B      C      D      F      E
2013-01-01 False False False False True False
2013-01-02 False False False False False False
2013-01-03 False False False False False True
2013-01-04 False False False False False True
 

Operations

 

Statistics

 

These operations generally exclude missing data.

 

Column-wise mean:

In [118]:
df.mean()
Out[118]:
A   -0.109219
B    0.017683
C    0.021055
D    5.000000
F    3.000000
dtype: float64
 

Row-wise mean (the same operation along the other axis):

In [119]:
df.mean(1)
Out[119]:
2013-01-01    1.097610
2013-01-02    1.196811
2013-01-03    1.607593
2013-01-04    0.989912
2013-01-05    1.935428
2013-01-06    2.307590
Freq: D, dtype: float64
 

Operating with objects that have different dimensionality and need alignment; pandas automatically broadcasts along the specified dimension:

In [120]:
s = pd.Series([1,3,5,np.nan,6,8], index=dates).shift(2)
s
Out[120]:
2013-01-01    NaN
2013-01-02    NaN
2013-01-03    1.0
2013-01-04    3.0
2013-01-05    5.0
2013-01-06    NaN
Freq: D, dtype: float64
In [121]:
df.sub(s, axis='index')
Out[121]:
                   A         B         C    D    F
2013-01-01 NaN NaN NaN NaN NaN
2013-01-02 NaN NaN NaN NaN NaN
2013-01-03 -0.654236 -0.968328 -0.339470 4.0 1.0
2013-01-04 -3.896905 -4.614034 -3.539502 2.0 0.0
2013-01-05 -4.880052 -4.067288 -5.375521 0.0 -1.0
2013-01-06 NaN NaN NaN NaN NaN
 

Apply

 

Applying functions to the data:

In [122]:
df.apply(np.cumsum)
Out[122]:
                   A         B         C   D     F
2013-01-01 0.000000 0.000000 -0.609560 5 NaN
2013-01-02 -1.377782 -0.801207 1.553484 10 1.0
2013-01-03 -1.032017 -0.769534 2.214015 15 3.0
2013-01-04 -1.928922 -2.383568 1.674513 20 6.0
2013-01-05 -1.808974 -1.450857 1.298992 25 10.0
2013-01-06 -0.655317 0.106098 0.126328 30 15.0
In [123]:
df.apply(lambda x: x.max() - x.min())
Out[123]:
A    2.531438
B    3.170989
C    3.335708
D    0.000000
F    4.000000
dtype: float64
 

Histogramming (value counts):

In [125]:
s = pd.Series(np.random.randint(0, 7, size=10))
s
Out[125]:
0    5
1    4
2    0
3    6
4    1
5    0
6    4
7    1
8    1
9    4
dtype: int32
In [126]:
s.value_counts()
Out[126]:
4    3
1    3
0    2
6    1
5    1
dtype: int64
 

String methods

 

Series is equipped with a set of string-processing methods (via the str attribute) that make it easy to operate on each element of the array:

In [128]:
s = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'])
s
Out[128]:
0       A
1       B
2       C
3    Aaba
4    Baca
5     NaN
6    CABA
7     dog
8     cat
dtype: object
In [129]:
s.str.lower()
Out[129]:
0       a
1       b
2       c
3    aaba
4    baca
5     NaN
6    caba
7     dog
8     cat
dtype: object
 

Combining data

 

Concat

 

pandas provides a variety of facilities for combining Series, DataFrame, and Panel objects:

In [133]:
df = pd.DataFrame(np.random.randn(10, 4))
df
Out[133]:
          0         1         2         3
0 -0.382355 -1.457945 -1.058879 1.106354
1 1.120769 -0.559993 -2.319053 1.404413
2 -0.382069 1.362170 0.289857 1.009210
3 -1.280824 1.158049 -0.356128 0.023975
4 0.797376 0.384624 0.152443 -1.392915
5 -1.886364 -0.276635 -1.159430 2.355427
6 -1.851654 0.781897 1.054916 0.323811
7 0.708013 1.648491 -0.031482 -1.245125
8 0.502539 -1.013665 -1.653738 0.748612
9 -0.773031 -1.085371 -1.070529 -0.190790
In [135]:
pieces = [df[:3], df[3:7], df[7:]]
pd.concat(pieces)
Out[135]:
          0         1         2         3
0 -0.382355 -1.457945 -1.058879 1.106354
1 1.120769 -0.559993 -2.319053 1.404413
2 -0.382069 1.362170 0.289857 1.009210
3 -1.280824 1.158049 -0.356128 0.023975
4 0.797376 0.384624 0.152443 -1.392915
5 -1.886364 -0.276635 -1.159430 2.355427
6 -1.851654 0.781897 1.054916 0.323811
7 0.708013 1.648491 -0.031482 -1.245125
8 0.502539 -1.013665 -1.653738 0.748612
9 -0.773031 -1.085371 -1.070529 -0.190790
 

Join

In [137]:
left = pd.DataFrame({'key': ['foo', 'foo'], 'lval': [1, 2]})
right = pd.DataFrame({'key': ['foo', 'foo'], 'rval': [4, 5]})
left
Out[137]:
   key  lval
0 foo 1
1 foo 2
In [138]:
right
Out[138]:
   key  rval
0 foo 4
1 foo 5
In [139]:
pd.merge(left, right, on='key')
Out[139]:
   key  lval  rval
0 foo 1 4
1 foo 1 5
2 foo 2 4
3 foo 2 5
 

Another example:

In [141]:
left= pd.DataFrame({'key': ['foo', 'bar'], 'lval': [1, 2]})
right = pd.DataFrame({'key': ['foo', 'bar'], 'rval': [4, 5]})
left
Out[141]:
   key  lval
0 foo 1
1 bar 2
In [142]:
right
Out[142]:
   key  rval
0 foo 4
1 bar 5
In [143]:
pd.merge(left, right, on='key')
Out[143]:
   key  lval  rval
0 foo 1 4
1 bar 2 5
 

Append

In [145]:
df = pd.DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])
df
Out[145]:
          A         B         C         D
0 -0.789698 0.807736 -0.535839 1.633274
1 1.122605 0.550968 -0.337497 -0.287827
2 0.231271 0.273175 0.891461 -0.531991
3 0.902182 0.802540 -0.319774 0.131826
4 -1.999558 -0.114146 0.449653 0.174982
5 1.688851 0.789424 -0.194151 2.017002
6 0.738474 -0.682955 -1.420737 -0.978726
7 -0.234849 0.682447 0.753980 0.327666
In [146]:
s = df.iloc[3]
df.append(s, ignore_index=True)
Out[146]:
          A         B         C         D
0 -0.789698 0.807736 -0.535839 1.633274
1 1.122605 0.550968 -0.337497 -0.287827
2 0.231271 0.273175 0.891461 -0.531991
3 0.902182 0.802540 -0.319774 0.131826
4 -1.999558 -0.114146 0.449653 0.174982
5 1.688851 0.789424 -0.194151 2.017002
6 0.738474 -0.682955 -1.420737 -0.978726
7 -0.234849 0.682447 0.753980 0.327666
8 0.902182 0.802540 -0.319774 0.131826
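Note that DataFrame.append was later deprecated and removed in pandas 2.0; the same row-append can be written with pd.concat. A small sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
row = df.iloc[1]

# pd.concat replaces the removed DataFrame.append for adding a row:
# turn the Series back into a one-row frame, then concatenate.
appended = pd.concat([df, row.to_frame().T], ignore_index=True)
print(appended.shape)  # (3, 2)
```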
 

Grouping

 

By "group by" we mean a process involving one or more of the following steps: 1. splitting the data into groups based on some criteria;

In [147]:
df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B' : ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C' : np.random.randn(8),
                   'D' : np.random.randn(8)})
df
Out[147]:
     A      B         C         D
0 foo one 0.645354 0.110824
1 bar one 1.090199 -1.103622
2 foo two 0.492488 0.918727
3 bar three -1.938242 0.122475
4 foo two -0.416253 -1.056679
5 bar two -0.616998 0.482990
6 foo one 0.311077 0.641402
7 foo three -1.400911 -0.126283
 

2. applying a function to each group independently, e.g. sum:

In [148]:
df.groupby('A').sum()
Out[148]:
            C         D
A
bar -1.465042 -0.498157
foo -0.368245 0.487990
 

3. combining the results into a data structure:

In [149]:
df.groupby(['A','B']).sum()
Out[149]:
                  C         D
A   B
bar one    1.090199 -1.103622
    three -1.938242  0.122475
    two   -0.616998  0.482990
foo one    0.956431  0.752225
    three -1.400911 -0.126283
    two    0.076235 -0.137952
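The same split-apply-combine flow also supports several aggregations at once via agg(). A small sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar'],
                   'C': [1.0, 2.0, 3.0, 4.0]})

# Split on 'A', apply both sum and mean to each group, combine the results
# into a frame with one column per aggregation.
result = df.groupby('A')['C'].agg(['sum', 'mean'])
print(result)
```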
 

Reshaping

 

Stack

In [151]:
tuples = list(zip(*[['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
                    ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(8, 2), index=index, columns=['A', 'B'])
df2 = df[:4]
df2
Out[151]:
                     A         B
first second
bar   one     0.482226 -1.085646
      two     0.805270  1.785375
baz   one    -0.860945 -1.018662
      two     0.921586 -0.216946
 

The stack() method "compresses" a level in the DataFrame's columns:

In [152]:
stacked = df2.stack()
stacked
Out[152]:
first  second   
bar    one     A    0.482226
               B   -1.085646
       two     A    0.805270
               B    1.785375
baz    one     A   -0.860945
               B   -1.018662
       two     A    0.921586
               B   -0.216946
dtype: float64
 

unstack() is the inverse of stack(); by default it unstacks the last level:

In [153]:
stacked.unstack()
Out[153]:
                     A         B
first second
bar   one     0.482226 -1.085646
      two     0.805270  1.785375
baz   one    -0.860945 -1.018662
      two     0.921586 -0.216946
In [154]:
stacked.unstack(1)
Out[154]:
second        one       two
first
bar   A  0.482226  0.805270
      B -1.085646  1.785375
baz   A -0.860945  0.921586
      B -1.018662 -0.216946
In [155]:
stacked.unstack(0)
Out[155]:
first         bar       baz
second
one    A  0.482226 -0.860945
       B -1.085646 -1.018662
two    A  0.805270  0.921586
       B  1.785375 -0.216946
 

Pivot tables

In [156]:
df = pd.DataFrame({'A' : ['one', 'one', 'two', 'three'] * 3,
                   'B' : ['A', 'B', 'C'] * 4,
                   'C' : ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 2,
                   'D' : np.random.randn(12),
                   'E' : np.random.randn(12)})
df
Out[156]:
        A  B    C         D         E
0 one A foo -1.863191 -0.420588
1 one B foo 1.308556 1.373186
2 two C foo 0.969745 -0.244350
3 three A bar 0.308731 0.904415
4 one B bar 1.918195 -0.113832
5 one C bar 0.121828 -0.364853
6 two A foo 0.240318 -0.536804
7 three B foo -1.067721 0.743653
8 one C foo -0.837986 0.775367
9 one A bar -0.622907 1.653944
10 two B bar 0.221569 -2.451701
11 three C bar -2.312243 0.502830
 

We can produce a pivot table from this data very easily:

In [157]:
pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'])
Out[157]:
C                bar       foo
A     B
one   A    -0.622907 -1.863191
      B     1.918195  1.308556
      C     0.121828 -0.837986
three A     0.308731       NaN
      B          NaN -1.067721
      C    -2.312243       NaN
two   A          NaN  0.240318
      B     0.221569       NaN
      C          NaN  0.969745
 

Time series

 

pandas has simple, powerful, and efficient functionality for performing resampling operations during frequency conversion:

In [158]:
rng = pd.date_range('1/1/2012', periods=100, freq='S')
ts = pd.Series(np.random.randint(0, 500, len(rng)), index=rng)
ts.resample('5Min').sum()
Out[158]:
2012-01-01    26325
Freq: 5T, dtype: int32
 

Time zone representation:

In [159]:
rng = pd.date_range('3/6/2012 00:00', periods=5, freq='D')
ts = pd.Series(np.random.randn(len(rng)), rng)
ts
Out[159]:
2012-03-06   -0.756171
2012-03-07   -1.818210
2012-03-08   -1.742229
2012-03-09   -0.666278
2012-03-10   -0.246013
Freq: D, dtype: float64
In [160]:
ts_utc = ts.tz_localize('UTC')
ts_utc
Out[160]:
2012-03-06 00:00:00+00:00   -0.756171
2012-03-07 00:00:00+00:00   -1.818210
2012-03-08 00:00:00+00:00   -1.742229
2012-03-09 00:00:00+00:00   -0.666278
2012-03-10 00:00:00+00:00   -0.246013
Freq: D, dtype: float64
 

Converting to another time zone:

In [161]:
ts_utc.tz_convert('US/Eastern')
Out[161]:
2012-03-05 19:00:00-05:00   -0.756171
2012-03-06 19:00:00-05:00   -1.818210
2012-03-07 19:00:00-05:00   -1.742229
2012-03-08 19:00:00-05:00   -0.666278
2012-03-09 19:00:00-05:00   -0.246013
Freq: D, dtype: float64
 

Converting between time span representations:

In [162]:
rng = pd.date_range('1/1/2012', periods=5, freq='M')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts
Out[162]:
2012-01-31   -2.280217
2012-02-29   -1.265666
2012-03-31   -0.693234
2012-04-30   -0.160583
2012-05-31    0.394237
Freq: M, dtype: float64
In [163]:
ps = ts.to_period()
ps
Out[163]:
2012-01   -2.280217
2012-02   -1.265666
2012-03   -0.693234
2012-04   -0.160583
2012-05    0.394237
Freq: M, dtype: float64
In [164]:
ps.to_timestamp()
Out[164]:
2012-01-01   -2.280217
2012-02-01   -1.265666
2012-03-01   -0.693234
2012-04-01   -0.160583
2012-05-01    0.394237
Freq: MS, dtype: float64
 

Converting between period and timestamp allows some convenient arithmetic functions to be used. In the following example, a quarterly frequency with year ending in November is converted to 9am of the end of the month following each quarter's end:

In [165]:
prng = pd.period_range('1990Q1', '2000Q4', freq='Q-NOV')
ts = pd.Series(np.random.randn(len(prng)), prng)
ts.index = (prng.asfreq('M', 'e') + 1).asfreq('H', 's') + 9
ts.head()
Out[165]:
1990-03-01 09:00   -0.116439
1990-06-01 09:00    0.689428
1990-09-01 09:00    1.603478
1990-12-01 09:00   -0.074146
1991-03-01 09:00   -0.642818
Freq: H, dtype: float64
 

Categorical Data

 

Since version 0.15, pandas has included categorical data in DataFrames:

In [166]:
df = pd.DataFrame({"id":[1,2,3,4,5,6], "raw_grade":['a', 'b', 'b', 'a', 'a', 'e']})
In [167]:
df["grade"] = df["raw_grade"].astype("category")
df["grade"]
Out[167]:
0    a
1    b
2    b
3    a
4    a
5    e
Name: grade, dtype: category
Categories (3, object): [a, b, e]
 

Renaming the categories to more meaningful names:

In [168]:
df["grade"].cat.categories = ["very good", "good", "very bad"]
In [169]:
df["grade"] = df["grade"].cat.set_categories(["very bad", "bad", "medium", "good", "very good"])
df["grade"]
Out[169]:
0    very good
1         good
2         good
3    very good
4    very good
5     very bad
Name: grade, dtype: category
Categories (5, object): [very bad, bad, medium, good, very good]
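In newer pandas releases, assigning directly to .cat.categories (as done above) is no longer allowed; Series.cat.rename_categories achieves the same renaming. A minimal sketch:

```python
import pandas as pd

s = pd.Series(['a', 'b', 'b', 'a']).astype('category')

# rename_categories returns a new Series with the category labels replaced
# in order: 'a' -> 'good', 'b' -> 'bad'. The data itself is unchanged.
s2 = s.cat.rename_categories(['good', 'bad'])
print(list(s2))  # ['good', 'bad', 'bad', 'good']
```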
 

Sorting is per order in the categories, not lexical order:

In [170]:
df.sort_values(by="grade")
Out[170]:
   id raw_grade      grade
5 6 e very bad
1 2 b good
2 3 b good
0 1 a very good
3 4 a very good
4 5 a very good
 

Grouping by a categorical column also shows empty categories:

In [171]:
df.groupby("grade").size()
Out[171]:
grade
very bad     1
bad          0
medium       0
good         2
very good    3
dtype: int64
 

Plotting

In [174]:
ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
ts = ts.cumsum()
ts.plot()
Out[174]:
<matplotlib.axes._subplots.AxesSubplot at 0x15444d8a518>
In [176]:
df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index,
                  columns=['A', 'B', 'C', 'D'])
df = df.cumsum()
plt.figure(); df.plot(); plt.legend(loc='best')
Out[176]:
<matplotlib.legend.Legend at 0x154451d2048>
 

Getting data in and out

 

CSV

In [177]:
df.to_csv('foo.csv')
In [178]:
pd.read_csv('foo.csv')
Out[178]:
    Unnamed: 0         A          B          C          D
0 2000-01-01 -1.393111 1.514788 0.280721 -0.294551
1 2000-01-02 -0.642525 2.195259 0.592064 -0.057531
2 2000-01-03 -1.001919 1.979959 2.552074 1.107618
3 2000-01-04 0.949372 1.834230 2.081279 1.344826
4 2000-01-05 1.031375 2.840040 1.999606 2.436098
5 2000-01-06 0.709656 3.388670 1.952937 1.355730
6 2000-01-07 1.723083 3.503932 0.290802 1.950817
7 2000-01-08 2.374127 5.203101 -0.330689 3.287472
8 2000-01-09 2.674692 8.018739 -0.036404 3.521545
9 2000-01-10 2.630701 8.189795 0.385022 4.931897
10 2000-01-11 3.126444 7.579697 0.679853 5.769419
11 2000-01-12 2.709425 8.329440 -1.069793 5.963032
12 2000-01-13 2.860447 7.253739 -1.309274 5.421635
13 2000-01-14 3.314667 8.721476 -1.965677 4.064228
14 2000-01-15 2.349331 8.863615 -2.761195 4.930499
15 2000-01-16 1.958310 7.119070 -2.625528 2.593476
16 2000-01-17 2.070067 9.004844 -2.052658 2.760979
17 2000-01-18 1.965295 8.512214 -2.359618 0.699228
18 2000-01-19 0.900034 8.133227 -1.982716 1.624482
19 2000-01-20 0.035752 6.445181 -2.893294 1.594081
20 2000-01-21 -2.014831 6.620557 -4.773599 1.002027
21 2000-01-22 -0.923967 8.693741 -3.695795 0.558935
22 2000-01-23 -1.668225 7.790933 -3.573539 0.506860
23 2000-01-24 -2.056015 6.824887 -2.564427 -0.662262
24 2000-01-25 -1.843934 8.955545 -2.683112 -0.809258
25 2000-01-26 -1.856735 8.607138 -2.440229 -2.196696
26 2000-01-27 -3.463581 8.966359 -3.873806 -1.524501
27 2000-01-28 -4.788617 7.752132 -4.553026 -2.399321
28 2000-01-29 -5.976016 7.036011 -4.460207 -1.605071
29 2000-01-30 -5.034437 6.138244 -4.451524 -1.628326
... ... ... ... ... ...
970 2002-08-28 7.087637 -10.915543 -58.482667 -74.821127
971 2002-08-29 6.120060 -10.997083 -57.452650 -73.742471
972 2002-08-30 7.691062 -11.176013 -57.091866 -73.281751
973 2002-08-31 8.452480 -11.293687 -56.313169 -74.869274
974 2002-09-01 7.553114 -10.510296 -56.900910 -75.732503
975 2002-09-02 8.275944 -9.958087 -56.768495 -75.445624
976 2002-09-03 10.713038 -10.430143 -56.750073 -76.824079
977 2002-09-04 12.299215 -11.939397 -58.104176 -77.024297
978 2002-09-05 13.160027 -12.323100 -58.055834 -79.144961
979 2002-09-06 14.413937 -11.905493 -57.145252 -78.776055
980 2002-09-07 14.514871 -11.271294 -58.854309 -78.735910
981 2002-09-08 14.828057 -10.749718 -59.836208 -77.881782
982 2002-09-09 14.525107 -9.981579 -59.973495 -77.019424
983 2002-09-10 13.248493 -10.430729 -60.947579 -76.936804
984 2002-09-11 13.866722 -9.581368 -61.176041 -76.366701
985 2002-09-12 13.711199 -9.163406 -60.199849 -76.076018
986 2002-09-13 12.278210 -9.744253 -61.488643 -77.830263
987 2002-09-14 13.872331 -9.769701 -60.596225 -79.301200
988 2002-09-15 13.742700 -11.035063 -61.132231 -79.930390
989 2002-09-16 13.117977 -10.444062 -61.367571 -79.468517
990 2002-09-17 12.547621 -13.403313 -61.542260 -77.685127
991 2002-09-18 12.156448 -14.764817 -62.347422 -77.060493
992 2002-09-19 12.356847 -14.440572 -62.154593 -75.846297
993 2002-09-20 12.967553 -13.593848 -62.368733 -76.248747
994 2002-09-21 12.188960 -13.960158 -64.801744 -77.560436
995 2002-09-22 10.930343 -14.180484 -61.880778 -78.704844
996 2002-09-23 11.546164 -14.640839 -60.011920 -79.081217
997 2002-09-24 12.010131 -13.948313 -58.988687 -79.707070
998 2002-09-25 10.203590 -14.170849 -59.165740 -78.678488
999 2002-09-26 9.985623 -13.509717 -57.876236 -78.605853

1000 rows × 5 columns
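The Unnamed: 0 column above is the old index that to_csv wrote out; passing index_col=0 to read_csv restores it as the index. A minimal sketch using an in-memory CSV (the column names here are made up):

```python
import io
import pandas as pd

csv_text = "date,A\n2000-01-01,1.5\n2000-01-02,2.5\n"

# index_col=0 turns the first CSV column back into the index instead of
# leaving it behind as an 'Unnamed: 0'-style data column.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(df.columns))  # ['A']
print(list(df.index))    # ['2000-01-01', '2000-01-02']
```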

 

HDF5

In [179]:
df.to_hdf('foo.h5','df')
In [180]:
pd.read_hdf('foo.h5','df')
Out[180]:
                    A          B          C          D
2000-01-01 -1.393111 1.514788 0.280721 -0.294551
2000-01-02 -0.642525 2.195259 0.592064 -0.057531
2000-01-03 -1.001919 1.979959 2.552074 1.107618
2000-01-04 0.949372 1.834230 2.081279 1.344826
2000-01-05 1.031375 2.840040 1.999606 2.436098
2000-01-06 0.709656 3.388670 1.952937 1.355730
2000-01-07 1.723083 3.503932 0.290802 1.950817
2000-01-08 2.374127 5.203101 -0.330689 3.287472
2000-01-09 2.674692 8.018739 -0.036404 3.521545
2000-01-10 2.630701 8.189795 0.385022 4.931897
2000-01-11 3.126444 7.579697 0.679853 5.769419
2000-01-12 2.709425 8.329440 -1.069793 5.963032
2000-01-13 2.860447 7.253739 -1.309274 5.421635
2000-01-14 3.314667 8.721476 -1.965677 4.064228
2000-01-15 2.349331 8.863615 -2.761195 4.930499
2000-01-16 1.958310 7.119070 -2.625528 2.593476
2000-01-17 2.070067 9.004844 -2.052658 2.760979
2000-01-18 1.965295 8.512214 -2.359618 0.699228
2000-01-19 0.900034 8.133227 -1.982716 1.624482
2000-01-20 0.035752 6.445181 -2.893294 1.594081
2000-01-21 -2.014831 6.620557 -4.773599 1.002027
2000-01-22 -0.923967 8.693741 -3.695795 0.558935
2000-01-23 -1.668225 7.790933 -3.573539 0.506860
2000-01-24 -2.056015 6.824887 -2.564427 -0.662262
2000-01-25 -1.843934 8.955545 -2.683112 -0.809258
2000-01-26 -1.856735 8.607138 -2.440229 -2.196696
2000-01-27 -3.463581 8.966359 -3.873806 -1.524501
2000-01-28 -4.788617 7.752132 -4.553026 -2.399321
2000-01-29 -5.976016 7.036011 -4.460207 -1.605071
2000-01-30 -5.034437 6.138244 -4.451524 -1.628326
... ... ... ... ...
2002-08-28 7.087637 -10.915543 -58.482667 -74.821127
2002-08-29 6.120060 -10.997083 -57.452650 -73.742471
2002-08-30 7.691062 -11.176013 -57.091866 -73.281751
2002-08-31 8.452480 -11.293687 -56.313169 -74.869274
2002-09-01 7.553114 -10.510296 -56.900910 -75.732503
2002-09-02 8.275944 -9.958087 -56.768495 -75.445624
2002-09-03 10.713038 -10.430143 -56.750073 -76.824079
2002-09-04 12.299215 -11.939397 -58.104176 -77.024297
2002-09-05 13.160027 -12.323100 -58.055834 -79.144961
2002-09-06 14.413937 -11.905493 -57.145252 -78.776055
2002-09-07 14.514871 -11.271294 -58.854309 -78.735910
2002-09-08 14.828057 -10.749718 -59.836208 -77.881782
2002-09-09 14.525107 -9.981579 -59.973495 -77.019424
2002-09-10 13.248493 -10.430729 -60.947579 -76.936804
2002-09-11 13.866722 -9.581368 -61.176041 -76.366701
2002-09-12 13.711199 -9.163406 -60.199849 -76.076018
2002-09-13 12.278210 -9.744253 -61.488643 -77.830263
2002-09-14 13.872331 -9.769701 -60.596225 -79.301200
2002-09-15 13.742700 -11.035063 -61.132231 -79.930390
2002-09-16 13.117977 -10.444062 -61.367571 -79.468517
2002-09-17 12.547621 -13.403313 -61.542260 -77.685127
2002-09-18 12.156448 -14.764817 -62.347422 -77.060493
2002-09-19 12.356847 -14.440572 -62.154593 -75.846297
2002-09-20 12.967553 -13.593848 -62.368733 -76.248747
2002-09-21 12.188960 -13.960158 -64.801744 -77.560436
2002-09-22 10.930343 -14.180484 -61.880778 -78.704844
2002-09-23 11.546164 -14.640839 -60.011920 -79.081217
2002-09-24 12.010131 -13.948313 -58.988687 -79.707070
2002-09-25 10.203590 -14.170849 -59.165740 -78.678488
2002-09-26 9.985623 -13.509717 -57.876236 -78.605853

1000 rows × 4 columns

 

Excel

In [181]:
df.to_excel('foo.xlsx', sheet_name='Sheet1')
In [182]:
pd.read_excel('foo.xlsx', 'Sheet1', index_col=None, na_values=['NA'])
Out[182]:
                    A          B          C          D
2000-01-01 -1.393111 1.514788 0.280721 -0.294551
2000-01-02 -0.642525 2.195259 0.592064 -0.057531
2000-01-03 -1.001919 1.979959 2.552074 1.107618
2000-01-04 0.949372 1.834230 2.081279 1.344826
2000-01-05 1.031375 2.840040 1.999606 2.436098
2000-01-06 0.709656 3.388670 1.952937 1.355730
2000-01-07 1.723083 3.503932 0.290802 1.950817
2000-01-08 2.374127 5.203101 -0.330689 3.287472
2000-01-09 2.674692 8.018739 -0.036404 3.521545
2000-01-10 2.630701 8.189795 0.385022 4.931897
2000-01-11 3.126444 7.579697 0.679853 5.769419
2000-01-12 2.709425 8.329440 -1.069793 5.963032
2000-01-13 2.860447 7.253739 -1.309274 5.421635
2000-01-14 3.314667 8.721476 -1.965677 4.064228
2000-01-15 2.349331 8.863615 -2.761195 4.930499
2000-01-16 1.958310 7.119070 -2.625528 2.593476
2000-01-17 2.070067 9.004844 -2.052658 2.760979
2000-01-18 1.965295 8.512214 -2.359618 0.699228
2000-01-19 0.900034 8.133227 -1.982716 1.624482
2000-01-20 0.035752 6.445181 -2.893294 1.594081
2000-01-21 -2.014831 6.620557 -4.773599 1.002027
2000-01-22 -0.923967 8.693741 -3.695795 0.558935
2000-01-23 -1.668225 7.790933 -3.573539 0.506860
2000-01-24 -2.056015 6.824887 -2.564427 -0.662262
2000-01-25 -1.843934 8.955545 -2.683112 -0.809258
2000-01-26 -1.856735 8.607138 -2.440229 -2.196696
2000-01-27 -3.463581 8.966359 -3.873806 -1.524501
2000-01-28 -4.788617 7.752132 -4.553026 -2.399321
2000-01-29 -5.976016 7.036011 -4.460207 -1.605071
2000-01-30 -5.034437 6.138244 -4.451524 -1.628326
... ... ... ... ...
2002-08-28 7.087637 -10.915543 -58.482667 -74.821127
2002-08-29 6.120060 -10.997083 -57.452650 -73.742471
2002-08-30 7.691062 -11.176013 -57.091866 -73.281751
2002-08-31 8.452480 -11.293687 -56.313169 -74.869274
2002-09-01 7.553114 -10.510296 -56.900910 -75.732503
2002-09-02 8.275944 -9.958087 -56.768495 -75.445624
2002-09-03 10.713038 -10.430143 -56.750073 -76.824079
2002-09-04 12.299215 -11.939397 -58.104176 -77.024297
2002-09-05 13.160027 -12.323100 -58.055834 -79.144961
2002-09-06 14.413937 -11.905493 -57.145252 -78.776055
2002-09-07 14.514871 -11.271294 -58.854309 -78.735910
2002-09-08 14.828057 -10.749718 -59.836208 -77.881782
2002-09-09 14.525107 -9.981579 -59.973495 -77.019424
2002-09-10 13.248493 -10.430729 -60.947579 -76.936804
2002-09-11 13.866722 -9.581368 -61.176041 -76.366701
2002-09-12 13.711199 -9.163406 -60.199849 -76.076018
2002-09-13 12.278210 -9.744253 -61.488643 -77.830263
2002-09-14 13.872331 -9.769701 -60.596225 -79.301200
2002-09-15 13.742700 -11.035063 -61.132231 -79.930390
2002-09-16 13.117977 -10.444062 -61.367571 -79.468517
2002-09-17 12.547621 -13.403313 -61.542260 -77.685127
2002-09-18 12.156448 -14.764817 -62.347422 -77.060493
2002-09-19 12.356847 -14.440572 -62.154593 -75.846297
2002-09-20 12.967553 -13.593848 -62.368733 -76.248747
2002-09-21 12.188960 -13.960158 -64.801744 -77.560436
2002-09-22 10.930343 -14.180484 -61.880778 -78.704844
2002-09-23 11.546164 -14.640839 -60.011920 -79.081217
2002-09-24 12.010131 -13.948313 -58.988687 -79.707070
2002-09-25 10.203590 -14.170849 -59.165740 -78.678488
2002-09-26 9.985623 -13.509717 -57.876236 -78.605853

1000 rows × 4 columns

 
Original article: https://www.cnblogs.com/hgl0417/p/7625928.html