3.5.2 Indexing

1. Import third-party libraries

import numpy as np
import pandas as pd
df = pd.read_csv('table.csv', index_col='ID')  # index_col sets which column becomes the row index
df.head(2)
 
      School  Class  Gender   Address  Height  Weight  Math  Physics
ID
1101     S_1    C_1       M  street_1     173      63  34.0       A+
1102     S_1    C_1       F  street_2     192      73  32.5       B+
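
To see what index_col changes, here is a minimal sketch that reads the same two rows from an in-memory string instead of table.csv. The csv_text content and the names df_with_index and df_default are made up for illustration; only the two data rows come from the output above.

```python
import io
import pandas as pd

# Hypothetical stand-in for table.csv: just the two rows shown above.
csv_text = """ID,School,Class,Gender,Address,Height,Weight,Math,Physics
1101,S_1,C_1,M,street_1,173,63,34.0,A+
1102,S_1,C_1,F,street_2,192,73,32.5,B+
"""

# With index_col='ID' the ID column becomes the row index.
df_with_index = pd.read_csv(io.StringIO(csv_text), index_col='ID')
# Without it, pandas numbers the rows 0, 1, ... and keeps ID as a normal column.
df_default = pd.read_csv(io.StringIO(csv_text))

print(list(df_with_index.index))   # [1101, 1102]
print(list(df_default.index))      # [0, 1]
print('ID' in df_default.columns)  # True
```

Because the IDs are the index, the label-based examples in these notes can look rows up directly by ID.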

 

2. Indexing

1) loc: label-based indexing; slices are closed on both ends (the stop label is included)

a) Selecting a single row

df.loc[1103]
School          S_1
Class           C_1
Gender            M
Address    street_2
Height          186
Weight           82
Math           87.2
Physics          B+
Name: 1103, dtype: object

 

b) Selecting multiple rows

df.loc[[1101,1105,1204,1301]]
      School  Class  Gender   Address  Height  Weight  Math  Physics
ID
1101     S_1    C_1       M  street_1     173      63  34.0       A+
1105     S_1    C_1       F  street_4     159      64  84.8       B+
1204     S_1    C_2       F  street_5     162      63  33.8        B
1301     S_1    C_3       M  street_4     161      68  31.5       B+

 

df.loc[1103:1203]
      School  Class  Gender   Address  Height  Weight  Math  Physics
ID
1103     S_1    C_1       M  street_2     186      82  87.2       B+
1104     S_1    C_1       F  street_2     167      81  80.4       B-
1105     S_1    C_1       F  street_4     159      64  84.8       B+
1201     S_1    C_2       M  street_5     188      68  97.0       A-
1202     S_1    C_2       F  street_4     176      94  63.5       B-
1203     S_1    C_2       M  street_6     160      53  58.8       A+
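
The closed-on-both-ends rule is easy to check on a small frame. The sketch below assumes a hypothetical sample DataFrame holding only the Height and Weight values of eight IDs that appear in the outputs of these notes; it is not the full table.csv.

```python
import pandas as pd

# Illustrative subset: IDs, Height and Weight copied from the outputs in this post.
sample = pd.DataFrame(
    {'Height': [173, 192, 186, 167, 159, 188, 176, 160],
     'Weight': [63, 73, 82, 81, 64, 68, 94, 53]},
    index=pd.Index([1101, 1102, 1103, 1104, 1105, 1201, 1202, 1203], name='ID'),
)

# Label slice: both 1103 and 1203 are part of the result.
closed = sample.loc[1103:1203]
print(list(closed.index))   # [1103, 1104, 1105, 1201, 1202, 1203]

# On a sorted index the endpoints do not have to exist as labels;
# every label that lies between them is returned.
between = sample.loc[1104:1200]
print(list(between.index))  # [1104, 1105]
```

The second slice only works this way because the index is sorted; on an unsorted index, slicing with a missing label raises a KeyError.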

 

c) Selecting a single column

df.loc[:,'Weight'].head(3)
ID
1101    63
1102    73
1103    82
Name: Weight, dtype: int64

d) Selecting multiple columns

df.loc[:,['Address','Height','Math']].head()
       Address  Height  Math
ID
1101  street_1     173  34.0
1102  street_2     192  32.5
1103  street_2     186  87.2
1104  street_2     167  80.4
1105  street_4     159  84.8
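
One detail worth noting from the two column examples: a single label returns a Series, while a list of labels returns a DataFrame, even if the list has only one element. A minimal sketch, assuming a hypothetical three-row sample frame built from values shown in these notes:

```python
import pandas as pd

# Illustrative three-row subset of the columns used above.
sample = pd.DataFrame(
    {'Address': ['street_1', 'street_2', 'street_2'],
     'Height': [173, 192, 186],
     'Math': [34.0, 32.5, 87.2]},
    index=pd.Index([1101, 1102, 1103], name='ID'),
)

one_column = sample.loc[:, 'Height']          # single label -> Series
one_column_frame = sample.loc[:, ['Height']]  # one-element list -> DataFrame

print(type(one_column).__name__)        # Series
print(type(one_column_frame).__name__)  # DataFrame
print(one_column_frame.shape)           # (3, 1)
```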

 

 

e) Selecting rows and columns together

df.loc[1102:2301,['Address','Height','Math']].head()
       Address  Height  Math
ID
1102  street_2     192  32.5
1103  street_2     186  87.2
1104  street_2     167  80.4
1105  street_4     159  84.8
1201  street_5     188  97.0

 

 

2) iloc: position-based indexing; slices are half-open (the start is included, the stop is excluded)

a) Selecting a single row

df.head(9)
      School  Class  Gender   Address  Height  Weight  Math  Physics
ID
1101     S_1    C_1       M  street_1     173      63  34.0       A+
1102     S_1    C_1       F  street_2     192      73  32.5       B+
1103     S_1    C_1       M  street_2     186      82  87.2       B+
1104     S_1    C_1       F  street_2     167      81  80.4       B-
1105     S_1    C_1       F  street_4     159      64  84.8       B+
1201     S_1    C_2       M  street_5     188      68  97.0       A-
1202     S_1    C_2       F  street_4     176      94  63.5       B-
1203     S_1    C_2       M  street_6     160      53  58.8       A+
1204     S_1    C_2       F  street_5     162      63  33.8        B

 

df.iloc[2]
School          S_1
Class           C_1
Gender            M
Address    street_2
Height          186
Weight           82
Math           87.2
Physics          B+
Name: 1103, dtype: object

 

b) Selecting multiple rows

df.iloc[2:6]
      School  Class  Gender   Address  Height  Weight  Math  Physics
ID
1103     S_1    C_1       M  street_2     186      82  87.2       B+
1104     S_1    C_1       F  street_2     167      81  80.4       B-
1105     S_1    C_1       F  street_4     159      64  84.8       B+
1201     S_1    C_2       M  street_5     188      68  97.0       A-
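
To make the difference from loc concrete, the sketch below selects the same four rows once by position and once by label. The sample frame is a hypothetical stand-in that keeps only the Height column of the nine rows listed by df.head(9); the flag out_of_bounds is just there to record what happens.

```python
import pandas as pd

# Illustrative stand-in: the nine IDs and Heights from the df.head(9) output.
ids = [1101, 1102, 1103, 1104, 1105, 1201, 1202, 1203, 1204]
sample = pd.DataFrame(
    {'Height': [173, 192, 186, 167, 159, 188, 176, 160, 162]},
    index=pd.Index(ids, name='ID'),
)

by_position = sample.iloc[2:6]    # positions 2, 3, 4, 5 (6 is excluded)
by_label = sample.loc[1103:1201]  # labels 1103 through 1201 (1201 is included)
print(list(by_position.index))       # [1103, 1104, 1105, 1201]
print(by_position.equals(by_label))  # True

# iloc never treats an integer as a label: 1103 is far beyond the 9 positions.
out_of_bounds = False
try:
    sample.iloc[1103]
except IndexError:
    out_of_bounds = True
print(out_of_bounds)                 # True
```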

 

c) Selecting a single column

df.iloc[:,4].head(3)

 

ID
1101    173
1102    192
1103    186
Name: Height, dtype: int64

 

d) Selecting multiple columns

df.iloc[:,7::-2].head(3)
      Physics  Weight   Address  Class
ID
1101       A+      63  street_1    C_1
1102       B+      73  street_2    C_1
1103       B+      82  street_2    C_1
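
The slice 7::-2 follows ordinary Python slice rules: start at position 7 and step backwards by 2 until the front of the sequence is passed, which picks positions 7, 5, 3 and 1. A small sketch with the column names of this table (the one-row frame is only an illustration):

```python
import pandas as pd

columns = ['School', 'Class', 'Gender', 'Address',
           'Height', 'Weight', 'Math', 'Physics']

# The same slice applied to a plain list of positions and of names.
positions = list(range(len(columns)))[7::-2]
names = columns[7::-2]
print(positions)  # [7, 5, 3, 1]
print(names)      # ['Physics', 'Weight', 'Address', 'Class']

# iloc applies the slice to the column axis in exactly the same way.
one_row = pd.DataFrame(
    [['S_1', 'C_1', 'M', 'street_1', 173, 63, 34.0, 'A+']],
    columns=columns,
    index=pd.Index([1101], name='ID'),
)
picked = list(one_row.iloc[:, 7::-2].columns)
print(picked == names)  # True
```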

 

e) Selecting rows and columns together

df.iloc[2:6,7::-2].head(3)
      Physics  Weight   Address  Class
ID
1103       B+      82  street_2    C_1
1104       B-      81  street_2    C_1
1105       B+      64  street_4    C_1

 

3. Common indexing functions

a) where: fills the cells where the condition is False (with NaN by default)

df.head()
      School  Class  Gender   Address  Height  Weight  Math  Physics
ID
1101     S_1    C_1       M  street_1     173      63  34.0       A+
1102     S_1    C_1       F  street_2     192      73  32.5       B+
1103     S_1    C_1       M  street_2     186      82  87.2       B+
1104     S_1    C_1       F  street_2     167      81  80.4       B-
1105     S_1    C_1       F  street_4     159      64  84.8       B+

df['Gender'].unique()
array(['M', 'F'], dtype=object)

df.where(df['Gender']=='M').head()
      School  Class  Gender   Address  Height  Weight  Math  Physics
ID
1101     S_1    C_1       M  street_1   173.0    63.0  34.0       A+
1102     NaN    NaN     NaN       NaN     NaN     NaN   NaN      NaN
1103     S_1    C_1       M  street_2   186.0    82.0  87.2       B+
1104     NaN    NaN     NaN       NaN     NaN     NaN   NaN      NaN
1105     NaN    NaN     NaN       NaN     NaN     NaN   NaN      NaN
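
Two things in this output are worth a closer look: Height and Weight now print as 173.0 and 63.0, because an integer column has to become float to hold NaN, and NaN is only the default fill value. The sketch below uses a hypothetical five-row sample frame (Gender and Height copied from the rows above) and the other parameter of where to fill with 0 instead.

```python
import pandas as pd

# Illustrative five-row subset: Gender and Height from the rows shown above.
sample = pd.DataFrame(
    {'Gender': ['M', 'F', 'M', 'F', 'F'],
     'Height': [173, 192, 186, 167, 159]},
    index=pd.Index([1101, 1102, 1103, 1104, 1105], name='ID'),
)
is_male = sample['Gender'] == 'M'

# Default fill: rows where the condition is False become NaN,
# so the integer Height column is converted to float.
blanked = sample.where(is_male)
print(blanked['Height'].isna().tolist())                   # [False, True, False, True, True]
print(pd.api.types.is_float_dtype(blanked['Height']))      # True

# Custom fill via `other`: no NaN is introduced.
zero_filled = sample['Height'].where(is_male, other=0)
print(zero_filled.tolist())                                # [173, 0, 186, 0, 0]
```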

 

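aa = df.where(df['Gender']=='M').dropna().head()
# where() fills the rows that fail the condition with NaN and dropna() then removes them,
# so what is left is the filtered table containing only the rows that satisfy the condition.
# mask() is the opposite of where(): it fills the cells where the condition is True.
aa
      School  Class  Gender   Address  Height  Weight  Math  Physics
ID
1101     S_1    C_1       M  street_1   173.0    63.0  34.0       A+
1103     S_1    C_1       M  street_2   186.0    82.0  87.2       B+
1201     S_1    C_2       M  street_5   188.0    68.0  97.0       A-
1203     S_1    C_2       M  street_6   160.0    53.0  58.8       A+
1301     S_1    C_3       M  street_4   161.0    68.0  31.5       B+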
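
For comparison, the sketch below puts the where plus dropna route next to plain boolean indexing and shows mask on a single column. It again assumes a hypothetical five-row sample frame rather than the full table; the names via_where, via_bool and masked are illustrative.

```python
import pandas as pd

# Illustrative five-row subset: Gender and Height from the rows shown earlier.
sample = pd.DataFrame(
    {'Gender': ['M', 'F', 'M', 'F', 'F'],
     'Height': [173, 192, 186, 167, 159]},
    index=pd.Index([1101, 1102, 1103, 1104, 1105], name='ID'),
)
is_male = sample['Gender'] == 'M'

# Route used above: blank out the non-matching rows, then drop them.
via_where = sample.where(is_male).dropna()
# Boolean indexing selects the matching rows directly.
via_bool = sample[is_male]

print(list(via_where.index))                             # [1101, 1103]
print(list(via_bool.index))                              # [1101, 1103]
print(pd.api.types.is_float_dtype(via_where['Height']))  # True
print(pd.api.types.is_integer_dtype(via_bool['Height'])) # True

# mask() mirrors where(): cells where the condition is True become NaN.
masked = sample['Height'].mask(is_male)
print(masked.isna().tolist())                            # [True, False, True, False, False]
```

Both routes keep the same rows here, but boolean indexing leaves the integer columns as integers. dropna() would also discard rows that already contained NaN before filtering, so the two are only interchangeable on data without missing values.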

 

 

Study notes by 小石小石摩西摩西. Questions and corrections are welcome!
Original post: https://www.cnblogs.com/shijingwen/p/13700635.html