Summary of Commonly Used Python Functions

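1. strip() removes the specified characters from both ends of a string (whitespace by default)

str.strip([chars]): removes any of the characters in chars from the start and end of str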
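A minimal sketch of strip() and its one-sided variants lstrip() and rstrip(), written in Python 3 syntax. The sample strings are made up for illustration. Note that chars is treated as a set of characters to remove, not as a literal prefix or suffix.

```python
# Sample strings are hypothetical, chosen only to show the behaviour.
padded = "  hello world \n"
trimmed = padded.strip()             # no argument: strip whitespace on both ends

dashed = "--==title==--"
stripped_both = dashed.strip("-=")   # removes any '-' or '=' from both ends
stripped_left = dashed.lstrip("-=")  # left end only
stripped_right = dashed.rstrip("-=") # right end only

print(repr(trimmed))
print(stripped_both, stripped_left, stripped_right)
```

Characters in the middle of the string are never touched; stripping stops at the first character that is not in chars.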

2. split() splits a string on the given separator; if num is specified, at most num splits are made (producing at most num + 1 substrings)

str.split(str="", num=string.count(str))

str -- the separator; by default any whitespace, including spaces, newlines (\n), and tabs (\t).
num -- the maximum number of splits
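The following sketch, in Python 3 syntax, shows the default whitespace split, a split with a maximum count, and how an explicit separator keeps empty fields. The input strings are made up for illustration.

```python
# Default: split on any run of whitespace (spaces, tabs, newlines).
line = "young\tmyope  no\nreduced"
fields = line.split()

# With a split limit: at most 2 splits, so at most 3 pieces.
csv_row = "a,b,c,d"
limited = csv_row.split(",", 2)

# With an explicit separator, adjacent separators yield empty strings.
empty_kept = "a,,b".split(",")

print(fields)
print(limited)
print(empty_kept)
```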

3. append() and extend() on lists

>>> a=[1,2,3]
>>> b=[4,5,6]
>>> a.append(b)
>>> print a

[1, 2, 3, [4, 5, 6]]

The append() method adds a single new element to the end of the list; here the whole list b becomes one nested element.

>>> a=[1,2,3]
>>> b=[4,5,6]
>>> a.extend(b)
>>> print a
[1, 2, 3, 4, 5, 6]

The extend() method takes a single list (or any other iterable) as its argument and adds each of its elements to the original list.
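The difference can be sketched side by side, in Python 3 syntax. The variable names nested and flat are introduced here only for illustration.

```python
a = [1, 2, 3]
b = [4, 5, 6]

nested = list(a)      # copy so the two cases start from the same list
nested.append(b)      # b is added as ONE element

flat = list(a)
flat.extend(b)        # each element of b is added separately

print(nested)         # the list b sits inside as a single nested item
print(flat)

# extend() accepts any iterable, so a string contributes its characters.
chars = ["x"]
chars.extend("yz")
print(chars)
```

Both methods modify the list in place and return None, so writing a = a.append(b) would discard the list.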

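4. read(), readline(), and readlines() for reading files

read(): reads the entire file and stores its contents in a single string; it cannot be used when the file is larger than available memory.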

>>> fr=open('lenses.txt')
>>> a=fr.read()
>>> print a
young	myope	no	normal	soft
young	myope	yes	reduced	no lenses
young	myope	yes	normal	hard
......
>>> type(a)
<type 'str'>

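readline(): reads one line per call, which is typically much slower than readlines() when the whole file is needed; it returns a string holding the contents of the current line, including the trailing newline.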

>>> fr=open('lenses.txt')
>>> line=fr.readline()
>>> print line
young	myope	no	reduced	no lenses

>>> type(line)
<type 'str'>

readlines(): reads the entire file at once and splits the contents into a list of strings, one string per line.

>>> fr=open('lenses.txt')
>>> lines=fr.readlines()
>>> for line in lines:
...         print line

young myope no reduced no lenses

young myope no normal soft

young myope yes reduced no lenses

......

>>> type(lines)
<type 'list'>

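The three methods can be compared on the same data. The sketch below, in Python 3 syntax, writes two sample lines to a temporary file that stands in for lenses.txt (the temporary path is an assumption made so the example is self-contained) and uses with blocks so every file is closed automatically.

```python
import os
import tempfile

# Two tab-separated sample lines; the temp file is a stand-in for lenses.txt.
sample = "young\tmyope\tno\treduced\tno lenses\nyoung\tmyope\tno\tnormal\tsoft\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

with open(path) as fr:
    whole = fr.read()        # one string holding the entire file

with open(path) as fr:
    first = fr.readline()    # one line, trailing newline included

with open(path) as fr:
    lines = fr.readlines()   # list of strings, one per line

# Iterating over the file object reads line by line without loading
# everything into memory, which suits large files.
with open(path) as fr:
    rows = [line.rstrip("\n").split("\t") for line in fr]

os.remove(path)
print(type(whole), type(first), type(lines))
print(rows)
```

The blank lines seen when printing each line in a loop come from the newline that every line still carries; rstrip("\n") removes it.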
6. Splitting sentences with regular expressions and string.split()

Splitting a sentence with split():

>>> mysent='this book is the best book on python or m.l. i have ever laid eyes upon.'
>>> mysent.split()

['this', 'book', 'is', 'the', 'best', 'book', 'on', 'python', 'or', 'm.l.', 'i', 'have', 'ever', 'laid', 'eyes', 'upon.']

Here, punctuation is treated as part of the words.

Splitting a sentence with a regular expression:

>>> import re
>>> mysent='this book is the best book on python or m.l. i have ever laid eyes upon'
>>> regEx=re.compile(r'\W+')
>>> tokens=regEx.split(mysent)
>>> tokens
['this', 'book', 'is', 'the', 'best', 'book', 'on', 'python', 'or', 'm', 'l', 'i', 'have', 'ever', 'laid', 'eyes', 'upon']
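A small sketch of regex-based tokenization in Python 3 syntax. It assumes the pattern \W+ (one or more non-word characters); on Python 3.7 and later a pattern that can match the empty string, such as \W*, also splits between individual characters, so \W+ is the safer choice. The sentence is a variant of the one above with capitals and a final period added.

```python
import re

mysent = "This book is the best book on Python or M.L. I have ever laid eyes upon."

# Split on runs of non-word characters (spaces and punctuation).
tokens = re.split(r"\W+", mysent)

# A separator at the very end leaves an empty string; drop it and lowercase.
words = [tok.lower() for tok in tokens if tok]

print(tokens[-1] == "")
print(words)
```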

7. The pickle module

Using the pickle module to store a decision tree:

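import pickle

def storeTree(inputTree, filename):
    # Serialize the Python object and save it to a local file.
    # Binary mode is required by pickle on Python 3.
    with open(filename, 'wb') as fw:
        pickle.dump(inputTree, fw)

def grabTree(filename):
    # Load the local file and restore the Python object.
    with open(filename, 'rb') as fr:
        return pickle.load(fr)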
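A round trip can be sketched as follows, in Python 3 syntax. The small dictionary is a hypothetical stand-in for a decision tree, and a temporary file is used so the example cleans up after itself.

```python
import os
import pickle
import tempfile

# Hypothetical nested dict standing in for a decision tree.
tree = {"tear rate": {"reduced": "no lenses", "normal": "soft"}}

fd, path = tempfile.mkstemp(suffix=".pkl")
os.close(fd)

with open(path, "wb") as fw:      # pickle writes bytes, so use binary mode
    pickle.dump(tree, fw)

with open(path, "rb") as fr:
    restored = pickle.load(fr)

os.remove(path)
print(restored == tree, restored is tree)
```

The restored object is equal to the original but is a new object. pickle.load() can execute arbitrary code while unpickling, so it should only be used on files from a trusted source.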

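8. The feedparser module

feedparser bills itself as a universal feed parser; with it we can easily get the titles, links, and article entries from any RSS or Atom feed.

RSS is short for RDF Site Summary (RDF stands for Resource Description Framework); it describes a summary of a website in XML.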

Before installing feedparser, first install setuptools:

Download:

sudo wget https://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz#md5=7df2a529a074f613b509fb44feefe74e
Extract:
sudo tar zxvf setuptools-0.6c11.tar.gz
Build and install:
sudo wget https://bootstrap.pypa.io/ez_setup.py

Install feedparser:

Download:

sudo wget https://pypi.python.org/packages/source/f/feedparser/feedparser-5.1.3.tar.gz#md5=f2253de78085a1d5738f626fcc1d8f71 --no-check-certificate

Extract:

sudo tar zxvf feedparser-5.1.3.tar.gz

Change into the feedparser-5.1.3 directory: cd feedparser-5.1.3

Then, in that directory, run: python setup.py install

9. Summary of functions in the random module

Random integer:
>>> import random
>>> random.randint(0,99)
21

Random even number between 0 and 100:
>>> import random
>>> random.randrange(0, 101, 2)
42

Random floating-point number:
>>> import random
>>> random.random() 
0.85415370477785668
>>> random.uniform(1, 10)
5.4221167969800881

Random character:
>>> import random
>>> random.choice('abcdefg&#%^*f')
'd'

Pick a given number of characters from a set of characters:
>>> import random
>>> random.sample('abcdefghij',3) 
['a', 'd', 'b']

Pick a given number of characters and join them into a new string:
>>> import random
>>> ''.join(random.sample(['a','b','c','d','e','f','g','h','i','j'], 3))
'fih'

Random string from a list:
>>> import random
>>> random.choice ( ['apple', 'pear', 'peach', 'orange', 'lemon'] )
'lemon'

Shuffle:
>>> import random
>>> items = [1, 2, 3, 4, 5, 6]
>>> random.shuffle(items)
>>> items
[3, 2, 5, 6, 4, 1]
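The calls above return different values on every run. When repeatable results are needed (for tests, say), one option is a seeded random.Random instance, sketched below in Python 3 syntax. The seed value 42 and the variable names are arbitrary choices for illustration.

```python
import random

rng = random.Random(42)            # seeded generator: same sequence every run

n = rng.randint(0, 99)             # integer, both endpoints included
even = rng.randrange(0, 101, 2)    # even integer from 0 to 100
x = rng.uniform(1, 10)             # float between 1 and 10
picked = rng.sample("abcdefghij", 3)   # 3 distinct characters, as a list
word = "".join(picked)             # joined into a new string

deck = [1, 2, 3, 4, 5, 6]
rng.shuffle(deck)                  # shuffles in place, returns None

print(n, even, x, word, deck)
```

random.sample() picks without replacement, so the three characters are always distinct; random.choice() called three times could repeat a character.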

9. Summary of np.random functions

>>> import numpy as np
>>> np.random.rand(2,3)
array([[ 0.68950928,  0.03283966,  0.47028051],
       [ 0.37143456,  0.97159103,  0.59014986]])
#np.random.uniform(low,high,size)
>>> np.random.uniform(1,4,2)
array([ 1.8773977 ,  2.43266306])
>>> np.random.uniform(1,4,1)
array([ 1.84577045])
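The shapes and value ranges of these two calls can be checked with a short sketch, in Python 3 syntax and assuming NumPy is installed. The seed is an arbitrary choice that makes the run repeatable.

```python
import numpy as np

np.random.seed(0)                  # arbitrary seed for repeatable output

m = np.random.rand(2, 3)           # shape (2, 3), samples uniform on [0, 1)
u = np.random.uniform(1, 4, 2)     # 2 samples uniform on [1, 4)

print(m.shape, u.shape)
print(m.min() >= 0, m.max() < 1)
```

rand() takes the dimensions as separate arguments, while uniform() takes low, high, and then a size that may be an integer or a tuple.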

10. np.asarray() and np.array()

>>> import numpy as np
>>> arr1=np.ones((3,3))
>>> arr2=np.array(arr1) 
>>> arr3=np.asarray(arr1) 
>>> arr1[2]=3
>>> print arr1
[[ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 3.  3.  3.]]
>>> print arr2
[[ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]]
>>> print arr3
[[ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 3.  3.  3.]]

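np.array() copies its input into newly allocated memory by default, whereas np.asarray() makes no copy when the input is already an ndarray of the requested dtype, so arr3 still refers to arr1's data.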
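The sharing can be confirmed directly rather than inferred from printed values. The sketch below, in Python 3 syntax and assuming NumPy is installed, uses np.shares_memory(); the names copied and aliased are introduced here for illustration.

```python
import numpy as np

arr1 = np.ones((3, 3))
copied = np.array(arr1)      # new array with its own memory
aliased = np.asarray(arr1)   # input is already an ndarray: no copy is made

print(aliased is arr1)                      # same object
print(np.shares_memory(arr1, copied))       # separate buffers

arr1[2] = 3                  # modify the original in place
print(copied[2, 0], aliased[2, 0])          # only the alias sees the change

# For a non-array input such as a list, asarray() has to build a new array.
from_list = np.asarray([[1, 2], [3, 4]])
print(from_list.shape)
```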

11. np.dot() and *

>>> arr1=np.ones((3,3))
>>> arr2=np.array([[2,2,2],[2,2,2],[2,2,2]])
>>> arr1*arr2
array([[ 2.,  2.,  2.],
       [ 2.,  2.,  2.],
       [ 2.,  2.,  2.]])
>>> np.dot(arr1,arr2)
array([[ 6.,  6.,  6.],
       [ 6.,  6.,  6.],
       [ 6.,  6.,  6.]])

For 2-D arrays, np.dot() performs true matrix multiplication, whereas * multiplies corresponding elements.
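With all-ones and all-twos matrices the two results are easy to confuse, so here is a sketch with distinct entries, in Python 3 syntax and assuming NumPy is installed. The matrices a and b are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

elementwise = a * b      # [[1*5, 2*6], [3*7, 4*8]]
matrix = np.dot(a, b)    # rows of a times columns of b
same = a @ b             # the @ operator is matrix multiplication on 2-D arrays

print(elementwise)
print(matrix)
```

For example, the top-left entry of the matrix product is 1*5 + 2*7 = 19, while the element-wise product gives 1*5 = 5 there.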