3.5.2 Indexing

1. Import third-party libraries

import numpy as np
import pandas as pd
df = pd.read_csv('table.csv', index_col='ID')  # index_col sets the column used as the row index

df.head(2)

	School	Class	Gender	Address	Height	Weight	Math	Physics
ID
1101	S_1	C_1	M	street_1	173	63	34.0	A+
1102	S_1	C_1	F	street_2	192	73	32.5	B+
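
To make the effect of index_col easier to see, here is a minimal sketch that reads the same two rows with and without it. The in-memory CSV text is an assumption standing in for table.csv (whose exact layout is not shown in these notes), and the names csv_text, with_index and without_index are illustrative only.

```python
import io
import pandas as pd

# Assumed CSV layout: the two rows shown above, with ID as an ordinary first column.
csv_text = """ID,School,Class,Gender,Address,Height,Weight,Math,Physics
1101,S_1,C_1,M,street_1,173,63,34.0,A+
1102,S_1,C_1,F,street_2,192,73,32.5,B+
"""

with_index = pd.read_csv(io.StringIO(csv_text), index_col="ID")
without_index = pd.read_csv(io.StringIO(csv_text))

print(with_index.index.tolist())     # the ID values become the row labels
print(without_index.index.tolist())  # default positional labels 0, 1
print("ID" in with_index.columns)    # ID is no longer an ordinary column
```

With index_col='ID', later calls such as df.loc[1101] look rows up by ID rather than by position.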

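2. Indexing

1) loc: label-based indexing; slices include both endpoints (closed on the left and on the right)

a) Selecting a single row

df.loc[1103]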

School          S_1
Class           C_1
Gender            M
Address    street_2
Height          186
Weight           82
Math           87.2
Physics          B+
Name: 1103, dtype: object

b) Selecting multiple rows

df.loc[[1101,1105,1204,1301]]

	School	Class	Gender	Address	Height	Weight	Math	Physics
ID
1101	S_1	C_1	M	street_1	173	63	34.0	A+
1105	S_1	C_1	F	street_4	159	64	84.8	B+
1204	S_1	C_2	F	street_5	162	63	33.8	B
1301	S_1	C_3	M	street_4	161	68	31.5	B+

df.loc[1103:1203]

	School	Class	Gender	Address	Height	Weight	Math	Physics
ID
1103	S_1	C_1	M	street_2	186	82	87.2	B+
1104	S_1	C_1	F	street_2	167	81	80.4	B-
1105	S_1	C_1	F	street_4	159	64	84.8	B+
1201	S_1	C_2	M	street_5	188	68	97.0	A-
1202	S_1	C_2	F	street_4	176	94	63.5	B-
1203	S_1	C_2	M	street_6	160	53	58.8	A+
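
The three forms of row selection with loc can be compared on a small frame. This is a sketch only: the frame below is rebuilt by hand from the first five rows shown in these notes, keeping three columns, so that it runs without table.csv.

```python
import pandas as pd

# Hand-built stand-in for the first five rows of the table.
df = pd.DataFrame(
    {
        "Gender": ["M", "F", "M", "F", "F"],
        "Height": [173, 192, 186, 167, 159],
        "Weight": [63, 73, 82, 81, 64],
    },
    index=pd.Index([1101, 1102, 1103, 1104, 1105], name="ID"),
)

single = df.loc[1103]            # one label: a Series holding that row
by_list = df.loc[[1101, 1105]]   # list of labels: a DataFrame with exactly those rows
by_slice = df.loc[1102:1104]     # label slice: 1102 and 1104 are both included

print(type(single).__name__)     # Series
print(by_list.index.tolist())    # [1101, 1105]
print(by_slice.index.tolist())   # [1102, 1103, 1104]
```

Passing a one-element list, as in df.loc[[1103]], keeps the result as a one-row DataFrame instead of a Series.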

c) Selecting a single column

df.loc[:,'Weight'].head(3)

ID
1101    63
1102    73
1103    82
Name: Weight, dtype: int64

d) Selecting multiple columns

df.loc[:,['Address','Height','Math']].head()

	Address	Height	Math
ID
1101	street_1	173	34.0
1102	street_2	192	32.5
1103	street_2	186	87.2
1104	street_2	167	80.4
1105	street_4	159	84.8

e) Combined row and column selection

df.loc[1102:2301,['Address','Height','Math']].head()

	Address	Height	Math
ID
1102	street_2	192	32.5
1103	street_2	186	87.2
1104	street_2	167	80.4
1105	street_4	159	84.8
1201	street_5	188	97.0
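
Column labels can be sliced with loc as well, and that slice is also inclusive at both ends. The sketch below uses a hand-built five-row frame (values copied from the rows shown in these notes) rather than the full table.

```python
import pandas as pd

# Hand-built stand-in for the first five rows, four columns only.
df = pd.DataFrame(
    {
        "Address": ["street_1", "street_2", "street_2", "street_2", "street_4"],
        "Height": [173, 192, 186, 167, 159],
        "Weight": [63, 73, 82, 81, 64],
        "Math": [34.0, 32.5, 87.2, 80.4, 84.8],
    },
    index=pd.Index([1101, 1102, 1103, 1104, 1105], name="ID"),
)

block = df.loc[1102:1104, "Height":"Math"]  # row and column slices, both inclusive
cell = df.loc[1103, "Math"]                 # one label per axis returns a scalar

print(block.shape)             # (3, 3)
print(block.columns.tolist())  # ['Height', 'Weight', 'Math']
print(cell)                    # 87.2
```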

2) iloc: position-based indexing; slices are half-open (start included, end excluded)

a) Selecting a single row

df.head(9)

	School	Class	Gender	Address	Height	Weight	Math	Physics
ID
1101	S_1	C_1	M	street_1	173	63	34.0	A+
1102	S_1	C_1	F	street_2	192	73	32.5	B+
1103	S_1	C_1	M	street_2	186	82	87.2	B+
1104	S_1	C_1	F	street_2	167	81	80.4	B-
1105	S_1	C_1	F	street_4	159	64	84.8	B+
1201	S_1	C_2	M	street_5	188	68	97.0	A-
1202	S_1	C_2	F	street_4	176	94	63.5	B-
1203	S_1	C_2	M	street_6	160	53	58.8	A+
1204	S_1	C_2	F	street_5	162	63	33.8	B

df.iloc[2]

School          S_1
Class           C_1
Gender            M
Address    street_2
Height          186
Weight           82
Math           87.2
Physics          B+
Name: 1103, dtype: object

b) Selecting multiple rows

df.iloc[2:6]

	School	Class	Gender	Address	Height	Weight	Math	Physics
ID
1103	S_1	C_1	M	street_2	186	82	87.2	B+
1104	S_1	C_1	F	street_2	167	81	80.4	B-
1105	S_1	C_1	F	street_4	159	64	84.8	B+
1201	S_1	C_2	M	street_5	188	68	97.0	A-
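
The difference between the two slicing conventions is easiest to see side by side. A minimal sketch on a hand-built five-row frame (values taken from the rows shown in these notes):

```python
import pandas as pd

# Hand-built stand-in for the first five rows of the table.
df = pd.DataFrame(
    {
        "Gender": ["M", "F", "M", "F", "F"],
        "Height": [173, 192, 186, 167, 159],
    },
    index=pd.Index([1101, 1102, 1103, 1104, 1105], name="ID"),
)

third_row = df.iloc[2]                      # position 2 is the third row
print(third_row.name)                       # 1103
print(third_row.equals(df.loc[1103]))       # True: same row, reached by label

print(df.iloc[1:3].index.tolist())          # [1102, 1103]        (end position excluded)
print(df.loc[1102:1104].index.tolist())     # [1102, 1103, 1104]  (end label included)
```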

c) Selecting a single column

df.iloc[:,4].head(3)

ID
1101    173
1102    192
1103    186
Name: Height, dtype: int64

d) Selecting multiple columns

df.iloc[:,7::-2].head(3)

	Physics	Weight	Address	Class
ID
1101	A+	63	street_1	C_1
1102	B+	73	street_2	C_1
1103	B+	82	street_2	C_1

e) Combined row and column selection

df.iloc[2:6,7::-2].head(3)

	Physics	Weight	Address	Class
ID
1103	B+	82	street_2	C_1
1104	B-	81	street_2	C_1
1105	B+	64	street_4	C_1
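
The slice 7::-2 follows ordinary Python slicing rules: start at position 7 and step backwards by 2 until the start of the axis is reached. The sketch below checks which positions that selects, using a one-row frame built by hand with the same eight column names.

```python
import pandas as pd

columns = ["School", "Class", "Gender", "Address", "Height", "Weight", "Math", "Physics"]
row = ["S_1", "C_1", "M", "street_1", 173, 63, 34.0, "A+"]
df = pd.DataFrame([row], columns=columns, index=pd.Index([1101], name="ID"))

positions = list(range(len(columns)))[7::-2]
print(positions)                              # [7, 5, 3, 1]

by_slice = df.iloc[:, 7::-2]
by_list = df.iloc[:, [7, 5, 3, 1]]            # the same selection written out explicitly
print(by_slice.columns.tolist())              # ['Physics', 'Weight', 'Address', 'Class']
print(by_slice.equals(by_list))               # True
```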

3. Common indexing functions

a) where: fills the cells where the condition is False (with NaN by default)

df.head()

	School	Class	Gender	Address	Height	Weight	Math	Physics
ID
1101	S_1	C_1	M	street_1	173	63	34.0	A+
1102	S_1	C_1	F	street_2	192	73	32.5	B+
1103	S_1	C_1	M	street_2	186	82	87.2	B+
1104	S_1	C_1	F	street_2	167	81	80.4	B-
1105	S_1	C_1	F	street_4	159	64	84.8	B+

df['Gender'].unique()
# Output: array(['M', 'F'], dtype=object)
df.where(df['Gender']=='M').head()

	School	Class	Gender	Address	Height	Weight	Math	Physics
ID
1101	S_1	C_1	M	street_1	173.0	63.0	34.0	A+
1102	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1103	S_1	C_1	M	street_2	186.0	82.0	87.2	B+
1104	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1105	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

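aa = df.where(df['Gender']=='M').dropna().head()
# dropna() removes the rows that where() filled with NaN (the rows that fail the condition),
# so aa holds only the filtered rows
# mask is the opposite of where: it fills the cells where the condition is True
aa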

	School	Class	Gender	Address	Height	Weight	Math	Physics
ID
1101	S_1	C_1	M	street_1	173.0	63.0	34.0	A+
1103	S_1	C_1	M	street_2	186.0	82.0	87.2	B+
1201	S_1	C_2	M	street_5	188.0	68.0	97.0	A-
1203	S_1	C_2	M	street_6	160.0	53.0	58.8	A+
1301	S_1	C_3	M	street_4	161.0	68.0	31.5	B+
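
The relationship between where, mask and plain boolean indexing can be sketched on a small frame. The frame is hand-built from the first five rows shown in these notes, and the variable names (is_male, kept, masked, filled, filtered) are illustrative only.

```python
import pandas as pd

# Hand-built stand-in for the first five rows of the table.
df = pd.DataFrame(
    {
        "Gender": ["M", "F", "M", "F", "F"],
        "Height": [173, 192, 186, 167, 159],
        "Math": [34.0, 32.5, 87.2, 80.4, 84.8],
    },
    index=pd.Index([1101, 1102, 1103, 1104, 1105], name="ID"),
)
is_male = df["Gender"] == "M"

kept = df.where(is_male)                         # rows where the condition is False become NaN
masked = df.mask(is_male)                        # rows where the condition is True become NaN
filled = df["Height"].where(is_male, other=0)    # `other` replaces NaN as the fill value
filtered = df[is_male]                           # boolean indexing drops the rows directly

print(kept.dropna().index.tolist())              # [1101, 1103]
print(masked.dropna().index.tolist())            # [1102, 1104, 1105]
print(filled.tolist())                           # [173, 0, 186, 0, 0]
print(filtered["Height"].dtype, kept.dropna()["Height"].dtype)  # int64 float64
```

Two things are worth noting here. Filling with NaN converts integer columns to float, which is why Height shows as 173.0 after where, while boolean indexing keeps the original dtype. Also, dropna() removes any row containing NaN, so if the data already has missing values, where(...).dropna() will drop those rows too even when they satisfy the condition.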

Study notes by 小石小石摩西摩西. Questions and corrections are welcome!