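The Python data analysis library pandas: GroupBy aggregation, hierarchical grouping, group iteration, chained transformations, and extracting values after aggregation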

Data aggregation (GroupBy)

import pandas as pd

frame10 = pd.DataFrame({
    'color': ['white', 'red', 'green', 'red', 'green'],
    'object': ['pen', 'pencil', 'pencil', 'ashtray', 'pen'],
    'price1': [5.56, 4.20, 1.30, 0.56, 2.75],
    'price2': [4.75, 4.12, 1.60, 0.75, 3.15]
})
# Group the price1 column by the values of the color column
group = frame10['price1'].groupby(frame10['color'])
print(group, "\n-----*group*-----\n")
# Mapping from each color to the row labels that belong to it
print(group.groups, "\n-----*group.groups*-----\n")
print(group.sum(), "\n-----*group.sum*-----\n")
print(group.mean(), "\n-----*group.mean*-----")
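
As a cross-check of what the grouped sum and mean compute, the sketch below rebuilds the same color/price1 data and compares the GroupBy result with a manual boolean-mask calculation. The names `frame`, `grouped`, and `manual_sum` are illustrative and do not come from the original post.

```python
import pandas as pd

# Same color/price1 data as frame10, rebuilt so the snippet runs on its own
frame = pd.DataFrame({
    'color': ['white', 'red', 'green', 'red', 'green'],
    'price1': [5.56, 4.20, 1.30, 0.56, 2.75],
})

grouped = frame['price1'].groupby(frame['color'])
sums = grouped.sum()
means = grouped.mean()

# Manual equivalent: pick the rows of each color with a boolean mask, then sum
manual_sum = {c: frame.loc[frame['color'] == c, 'price1'].sum()
              for c in frame['color'].unique()}

for color, total in manual_sum.items():
    # A small tolerance guards against floating-point rounding differences
    assert abs(sums[color] - total) < 1e-9

print(sums.round(2).to_dict())
print(means.round(3).to_dict())
```

The result is indexed by the group keys in sorted order (green, red, white), not in the order the colors first appear in the data.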


  Let x = group.sum(). The aggregated values can then be read out as an array with x.values:

  x.values
  Out[20]: array([4.05, 4.76, 5.56])
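
The aggregated Series also carries the group keys in its index, which `.values` discards. The following sketch shows a few other ways to get data out of the result. It uses the standard pandas methods `to_numpy()` and `reset_index()`, but applying them here is an addition, not something the original post shows, and the variable names are illustrative.

```python
import pandas as pd

frame = pd.DataFrame({
    'color': ['white', 'red', 'green', 'red', 'green'],
    'price1': [5.56, 4.20, 1.30, 0.56, 2.75],
})
x = frame['price1'].groupby(frame['color']).sum()

values = x.to_numpy()     # same data as x.values, as a NumPy array
labels = list(x.index)    # the group keys, in sorted order
red_total = x['red']      # look up one group's aggregate by its key

# Turn the result back into an ordinary DataFrame with color as a column
flat = x.reset_index()

print(labels)
print(values)
print(flat.columns.tolist())
```

Keeping the labels next to the values avoids having to remember which array position belongs to which color.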

 Hierarchical grouping

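# Passing a list of keys groups by every (color, object) combination
ggroup = frame10['price1'].groupby([frame10['color'], frame10['object']])
print(ggroup.groups, "\n-----*ggroup.groups*-----\n")
print(ggroup.sum(), "\n-----*ggroup.sum*-----\n")
# Several columns can be aggregated at once by selecting them before grouping
print(frame10[['price1', 'price2']].groupby(frame10['color']).mean())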
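
Grouping by two keys produces a result with a two-level index. The sketch below, with illustrative variable names, shows how individual entries can be selected from such a result and how `unstack()` pivots the inner level into columns. It is one way to work with the hierarchical result, not a step taken in the original post.

```python
import pandas as pd

frame = pd.DataFrame({
    'color': ['white', 'red', 'green', 'red', 'green'],
    'object': ['pen', 'pencil', 'pencil', 'ashtray', 'pen'],
    'price1': [5.56, 4.20, 1.30, 0.56, 2.75],
})

# Grouping by two keys yields a Series indexed by a two-level MultiIndex
totals = frame['price1'].groupby([frame['color'], frame['object']]).sum()

print(list(totals.index.names))             # level names come from the key Series
red_pencil = totals.loc[('red', 'pencil')]  # one (color, object) pair
red_only = totals.loc['red']                # every object under one color

# unstack() moves the inner level (object) into columns;
# combinations that never occur in the data become NaN
table = totals.unstack()
print(table)
```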


 Group iteration

# Each iteration yields the group key and the sub-DataFrame of matching rows
for name, group in frame10.groupby('color'):
    print(name)
    print(group)
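
Because iteration yields (key, sub-DataFrame) pairs, the groups can also be collected into a dictionary for later lookup. This is a minimal sketch with illustrative names; `get_group()` is the standard pandas call for fetching a single group without a loop.

```python
import pandas as pd

frame = pd.DataFrame({
    'color': ['white', 'red', 'green', 'red', 'green'],
    'object': ['pen', 'pencil', 'pencil', 'ashtray', 'pen'],
    'price1': [5.56, 4.20, 1.30, 0.56, 2.75],
})

# Collect every group into a dict keyed by color
pieces = {name: sub for name, sub in frame.groupby('color')}

for name, sub in pieces.items():
    # Each sub-frame keeps the original row labels and all columns
    print(name, len(sub), sub.index.tolist())

# get_group() fetches one group directly, without looping
red = frame.groupby('color').get_group('red')
print(red)
```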

  

 Chained transformations

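result1 = frame10['price1'].groupby(frame10['color']).mean()
# numeric_only=True skips the string columns; pandas 2.0 and later raise a TypeError without it
result2 = frame10.groupby(frame10['color']).mean(numeric_only=True)
print(result1, "\n-----*result1*-----\n")
print(result2, "\n-----*result2*-----\n")
# Select the column on the GroupBy object, then aggregate
print(frame10.groupby(frame10['color'])['price1'].mean(), "\n-----*frame10.groupby(frame10['color'])['price1'].mean()*-----\n")
# Aggregate first, then select the column from the result
print((frame10.groupby(frame10['color']).mean(numeric_only=True))['price1'], "\n-----*-----\n")
# add_prefix labels the aggregated columns as means
print(frame10.groupby('color').mean(numeric_only=True).add_prefix('mean_'))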
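
The two chaining orders above (select then aggregate, aggregate then select) give the same numbers. The sketch below checks this with `pd.testing.assert_series_equal` and shows the column names `add_prefix` produces. The variable names are illustrative, and `numeric_only=True` is assumed to be needed because the frame contains a string column.

```python
import pandas as pd

frame = pd.DataFrame({
    'color': ['white', 'red', 'green', 'red', 'green'],
    'object': ['pen', 'pencil', 'pencil', 'ashtray', 'pen'],
    'price1': [5.56, 4.20, 1.30, 0.56, 2.75],
    'price2': [4.75, 4.12, 1.60, 0.75, 3.15],
})

# Select the column first, then aggregate: only price1 is averaged
select_then_mean = frame.groupby('color')['price1'].mean()

# Aggregate every numeric column first, then select price1 from the result
mean_then_select = frame.groupby('color').mean(numeric_only=True)['price1']

# Both routes produce the same Series (compared with a float tolerance)
pd.testing.assert_series_equal(select_then_mean, mean_then_select)

# add_prefix() relabels the aggregated columns so they are not mistaken for raw data
prefixed = frame.groupby('color').mean(numeric_only=True).add_prefix('mean_')
print(prefixed.columns.tolist())
```

Selecting the column before aggregating avoids computing means for columns that are then thrown away, which matters on wide frames.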


Original article: https://www.cnblogs.com/dan-baishucaizi/p/9414840.html