Pandas Team Study: Final Season

EX1

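1. Use regular expressions to extract the required information:

  • df1: extract each model's state, precision, and model name
  • df2: extract the model name, state, and average time (in ms)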
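import pandas as pd

# Read the raw log into a single column, one log line per row
df = pd.read_table(r'C:\Users\lxh\Downloads/benchmark.txt', header=None)

# Raw strings keep the \w escapes intact for the regex engine
pat1 = r'Benchmarking (\w+) (\w+) precision type (\w+)'
pat2 = r'(\w+)  model average (\w+) time :  (.+) ms'

# Rows that do not match a pattern come back as NaN and are dropped
df1 = df[0].str.extract(pat1).rename(
    columns={0: 'state', 1: 'precision', 2: 'model'}).dropna().reset_index(drop=True)
df2 = df[0].str.extract(pat2).rename(
    columns={0: 'model', 1: 'state', 2: 'time'}).dropna().reset_index(drop=True)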

This gives df1:

   state     precision  model
0  Training  float      mnasnet0_5
1  Training  float      mnasnet0_75
2  Training  float      mnasnet1_0
3  Training  float      mnasnet1_3
4  Training  float      resnet18

df2:

   model        state  time
0  mnasnet0_5   train  28.5276
1  mnasnet0_75  train  34.1055
2  mnasnet1_0   train  34.3138
3  mnasnet1_3   train  35.5569
4  resnet18     train  18.6601
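
The extraction step can be tried without the benchmark file. The sketch below is a minimal variant that uses named groups, so the column names come from the patterns and no rename() is needed. The sample log lines are made up to fit the two patterns (the real benchmark.txt may differ in spacing and digits), and the names lines, pat_header, pat_timing, headers and timings are illustrative only.

```python
import pandas as pd

# Hypothetical log lines shaped like the two patterns above;
# the real file contents are an assumption here.
lines = pd.Series([
    "Benchmarking Training float precision type mnasnet0_5",
    "mnasnet0_5  model average train time :  28.5276 ms",
    "Benchmarking Training float precision type resnet18",
    "resnet18  model average train time :  18.6601 ms",
])

# Named groups become the column names of the extracted DataFrame.
pat_header = r"Benchmarking (?P<state>\w+) (?P<precision>\w+) precision type (?P<model>\w+)"
pat_timing = r"(?P<model>\w+)  model average (?P<state>\w+) time :  (?P<time>.+) ms"

# Lines that do not match give all-NaN rows, which dropna() removes.
headers = lines.str.extract(pat_header).dropna().reset_index(drop=True)
timings = lines.str.extract(pat_timing).dropna().reset_index(drop=True)

print(headers)
print(timings)
```

Note that the extracted time values are still strings at this point; they are converted to numbers in the next step.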

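2. Column operations:

  • Use cat to join df1's state and precision columns into a new type column, then drop the state and precision columns
  • Use apply to round df2's time to three decimal places, and add it to df1 as a time column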
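# Join state and precision with an underscore, e.g. Training_float
df1['type'] = df1['state'].str.cat(df1['precision'], sep='_')
df1 = df1.drop(labels=['state', 'precision'], axis=1)
# The extracted times are strings: convert to float, then round to 3 decimals
df1['time'] = df2['time'].apply(lambda x: round(float(x), 3))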
Resulting df1:

   model        type            time
0  mnasnet0_5   Training_float  28.528
1  mnasnet0_75  Training_float  34.105
2  mnasnet1_0   Training_float  34.314
3  mnasnet1_3   Training_float  35.557
4  resnet18     Training_float  18.66
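
As a small self-contained check of the column operations, the sketch below rebuilds two rows of df1 and df2 from the tables shown earlier. It swaps the apply call for astype(float).round(3), a vectorized alternative that should give the same rounded values here; this is a substitution, not the method used in the exercise.

```python
import pandas as pd

# Two rows copied from the df1 and df2 tables shown earlier.
df1 = pd.DataFrame({
    "state": ["Training", "Training"],
    "precision": ["float", "float"],
    "model": ["mnasnet0_5", "resnet18"],
})
df2 = pd.DataFrame({
    "model": ["mnasnet0_5", "resnet18"],
    "state": ["train", "train"],
    "time": ["28.5276", "18.6601"],  # str.extract returns strings
})

# Concatenate the two string columns element-wise with "_" between them.
df1["type"] = df1["state"].str.cat(df1["precision"], sep="_")
df1 = df1.drop(columns=["state", "precision"])

# Vectorized alternative to apply(lambda x: round(float(x), 3)).
df1["time"] = df2["time"].astype(float).round(3)

print(df1)
```

The assignment to df1["time"] aligns on the row index, so it relies on df1 and df2 listing the same benchmarks in the same order after reset_index(drop=True).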

3. Long-to-wide reshaping and column renaming:

  • Use the pivot function to turn df1 into a wide table
  • Reset the column names, then sort by model:
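# List arguments to pivot produce two-level (MultiIndex) columns
res = df1.pivot(index=['model'], columns=['type'], values=['time']).reset_index()
# Drop the outer column level; the model column is left with an empty name
res.columns = res.columns.droplevel()
res = res.rename(columns={'': 'model'})
res = res.sort_values('model', ascending=True)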
Resulting wide table (first five rows):

   model        Inference_double  Inference_float  Inference_half  Training_double  Training_float  Training_half
0  densenet121  144.111           15.637           19.772          417.207          93.357          88.976
1  densenet161  511.177           31.75            27.555          1290.29          136.624         144.319
2  densenet169  175.808           21.598           26.371          511.404          104.84          121.556
3  densenet201  223.96            26.169           33.394          654.365          129.334         118.94
4  mnasnet0_5   11.87             8.039            6.929           48.232           28.528          27.198

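
The reshaping can also be written with scalar arguments to pivot, which keeps the columns flat so that the droplevel and rename steps are not needed. The sketch below is one possible variant, not the exercise's own solution; long_df and wide are illustrative names, and the four rows are taken from the wide table above.

```python
import pandas as pd

# Long-format rows rebuilt from two models in the wide table above.
long_df = pd.DataFrame({
    "model": ["mnasnet0_5", "mnasnet0_5", "densenet121", "densenet121"],
    "type": ["Training_float", "Inference_float", "Training_float", "Inference_float"],
    "time": [28.528, 8.039, 93.357, 15.637],
})

# Scalar index/columns/values give a flat column Index, one column per type.
wide = long_df.pivot(index="model", columns="type", values="time").reset_index()
wide.columns.name = None  # remove the leftover "type" label on the column axis
wide = wide.sort_values("model").reset_index(drop=True)

print(wide)
```

pivot raises an error if a (model, type) pair appears more than once, so this assumes each benchmark was logged once per model and type.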
Original post: https://www.cnblogs.com/zwrAI/p/14274803.html