python数据处理小函数集合

.pop

pop(index)方法是对可变序列中元素下标进行检索删除，返回删除值。list.pop(obj=list[-1]) //默认为 index=-1，删除最后一个列表值

remove(item)方法是直接对可变序中的元素进行检索删除，返回的是删除后的列表,不返回删除值（返回None）

del(list[index])方法是对可变序列中元素下标进行检索删除，不返回删除值

>>>list1=[1,3,6,7,8]
>>>del list[3]
>>>print list1

[1.3,6,8]

.shape

通过安装导入numpy库，矩阵（ndarray）的shape属性可以获取矩阵的形状（例如二维数组的行列），获取的结果是一个元组，因此相关代码如下：

import numpy as np
x = np.array([[1,2,5],[2,3,5],[3,4,5],[2,3,6]])
#输出数组的行和列数
print x.shape  #结果： (4, 3)
#只输出行数
print x.shape[0] #结果： 4
#只输出列数
print x.shape[1] #结果： 3

loc

loc函数主要通过行标签索引行数据

# 数据集为以下内容，所有操作均对df进行
       0   1     2       3
0  green   M  10.1  class1
1    red   L  13.5  class2
2   blue  XL  15.3  class1

loc[1] 选择行标签是1的（从0、1、2、3这几个行标签中）

In[1]:    df.loc[1]
Out[1]: 
0       red
1         L
2      13.5
3    class2


loc[0:1] 和 loc[0,1]的区别，其实最重要的是loc[0:1]和iloc[0:1]



In[10]: df.loc[0:1]  #取第一和第二行，loc[]中的数字其实是行索引，所以算是前闭+后闭
Out[10]: 
       0  1     2       3
0  green  M  10.1  class1
1    red  L  13.5  class2

In[12]:   df.iloc[0:1]
Out[12]: 
       0  1     2       3
0  green  M  10.1  class1

In[11]:   df.loc[0,1]
Out[11]: 'M'




iloc 主要是通过行号获取行数据
Python字符串截取的规则为-前闭后开

##.tolist()
将数组或者矩阵转换成列表
>>> from numpy import *
>>> a1 = [[1,2,3],[4,5,6]] #列表
>>> a2 = array(a1) #数组
>>> a2
array([[1, 2, 3],
       [4, 5, 6]])
>>> a3 = mat(a1) #矩阵
>>> a3
matrix([[1, 2, 3],
        [4, 5, 6]])
>>> a4 = a2.tolist()
>>> a4
[[1, 2, 3], [4, 5, 6]]
>>> a5 = a3.tolist()
>>> a5
[[1, 2, 3], [4, 5, 6]]
>>> a4 == a5
True