Python Idioms and Efficiency

Part 1: Which idioms make a program easy to read

Read the Python Cookbook, especially the first few chapters. It is full of good, idiomatic Python code.

Build strings as a list and use ''.join at the end.

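join is a string method that is called on the separator, not on the list. To concatenate with no separator you call join on the empty string, which is one of Python's odder corners. The reason to do it anyway is that building a string with repeated + can take quadratic rather than linear time.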

Wrong: for s in strings: result += s
Right: result = ''.join(strings)
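
As a small illustration of the same pattern, the sketch below collects pieces in a list and joins them once at the end. The function name render_row and the sample values are made up for this example.

```python
def render_row(values, sep=", "):
    """Build one output line from arbitrary values."""
    parts = []                      # collect the pieces in a list
    for v in values:
        parts.append(str(v))        # appending to a list is cheap
    return sep.join(parts)          # join once; the separator calls join

row = render_row([1, "two", 3.0])
plain = "".join(["a", "b", "c"])    # the empty string joins with no separator
```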

Always use an object's capabilities instead of its type.

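Python is a dynamically typed language. You do not need to care about an object's type, only about whether it supports the operations you need. This gives you polymorphism almost for free. For example, in my code I use the following to check that a string contains only characters from a given alphabet: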

for char in string:
    if char not in alphabet:
        raise ValueError("Char %s not in alphabet %s" % (char, alphabet))

This works as long as alphabet supports the __contains__ method; it does not matter whether it is a string, a dictionary, or a list.
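
To make the point concrete, here is a minimal sketch in which the same check runs against three different container types. The function name check_chars and the DNA alphabet are illustrative assumptions, not part of the original code.

```python
def check_chars(string, alphabet):
    """Raise ValueError if string has a character that is not in alphabet."""
    for char in string:
        if char not in alphabet:
            raise ValueError("Char %s not in alphabet %s" % (char, alphabet))
    return True

dna_as_str = "ACGT"
dna_as_set = set("ACGT")
dna_as_dict = {"A": 1, "C": 2, "G": 3, "T": 4}

# The same function accepts all three, because each supports "in".
results = [check_chars("GATTACA", a)
           for a in (dna_as_str, dna_as_set, dna_as_dict)]
```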

Use in wherever possible.

In your own classes, define __contains__ to support x in y, and define __iter__ to support for x in y. This keeps your code general and polymorphic.

Better: for key in d: print(key)     #also works for arbitrary sequences
Worse:  for key in d.keys(): print(key) #limited to objects with keys()
Better: if key not in d: d[key] = []
Worse:  if not d.has_key(key): d[key] = []
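
A minimal sketch of a class that supports both forms of in might look like this. The class name Alphabet is hypothetical and exists only for the illustration.

```python
class Alphabet(object):
    """Hypothetical container used only for this illustration."""

    def __init__(self, chars):
        self._chars = list(chars)

    def __contains__(self, item):   # enables: item in alphabet
        return item in self._chars

    def __iter__(self):             # enables: for item in alphabet
        return iter(self._chars)

dna = Alphabet("ACGT")
has_g = "G" in dna
has_x = "X" in dna
letters = [c for c in dna]
```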

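Note: if you want to modify the dictionary while looping, you still need d.keys(). for key in d: del d[key] raises a RuntimeError, because the dictionary changes size while it is being iterated. Do this instead: for key in d.keys(): del d[key]. (In Python 3, keys() returns a view rather than a list, so copy the keys first with list(d).)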
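
The following sketch shows both the failure and a safe variant. It is written for Python 3, where copying the keys with list(d) plays the role that d.keys() played in Python 2; the sample dictionary is made up.

```python
d = {"a": 1, "b": 2, "c": 3}
try:
    for key in d:
        del d[key]                  # changes the dict's size during iteration
    failed = False
except RuntimeError:
    failed = True

d = {"a": 1, "b": 2, "c": 3}
for key in list(d):                 # list(d) copies the keys before the loop
    if d[key] > 1:
        del d[key]
```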

Use coercion if an object must be a particular type. If x must be a string for your code to work, call str(x) instead of checking isinstance(x, str), and use try/except to catch any error raised by the conversion.
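
A rough sketch of the idea follows. The helpers label and to_float are hypothetical; to_float applies the same principle to float, where a failed conversion is easier to demonstrate than with str.

```python
def label(x):
    """Coerce x to a string instead of checking its type first."""
    return "item-" + str(x)         # works for int, float, str, ...

def to_float(x):
    """Return x as a float, or None if it cannot be converted."""
    try:
        return float(x)
    except (TypeError, ValueError): # the errors float() raises on bad input
        return None
```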

Use if not x instead of if x == 0, if x == "", if x == None, or if x == False; likewise, use if x instead of if x != 0 or if x != None.
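
For illustration, the sketch below runs one truth test over several values. The function describe is made up for the example. Note that if not x cannot tell 0 from None, so when that difference matters an explicit x is None test is still the right tool.

```python
def describe(x):
    """Classify x using only its truth value."""
    if not x:
        return "empty or zero"
    return "has a value"

falsy = [describe(v) for v in (0, "", None, False, [], {})]
truthy = [describe(v) for v in (1, "a", [0])]
```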

Use string methods rather than the string module.

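Call methods on the string itself instead of functions from the string module. For example, write s.startswith('abc') rather than startswith(s, 'abc'). This avoids name clashes between functions imported from different modules.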

Use for line in infile, not for line in infile.readlines().

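xreadlines has been deprecated since Python 2.3 in favor of the new iteration protocol, and readlines reads the whole file into a list before the loop starts. Writing for line in infile lets infile be any object that behaves as a sequence of text lines. With for line in lines, you do not need to care whether lines comes from a file, a list of strings, some other iterator, the keys of a dictionary, and so on.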
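
As a sketch of why this matters, the function below accepts a list, a file-like object, or a generator without any changes. The name count_nonblank is an assumption made for the example, and io.StringIO stands in for a real file.

```python
import io

def count_nonblank(lines):
    """Count lines that contain something other than whitespace."""
    total = 0
    for line in lines:              # any iterable of strings will do
        if line.strip():
            total += 1
    return total

from_list = count_nonblank(["a\n", "\n", "b\n"])
from_file = count_nonblank(io.StringIO(u"a\n\nb\n"))
from_gen = count_nonblank(s for s in ("x", " ", "y", "z"))
```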

To reverse-sort a list

To sort a list in reverse order, sort it and then reverse it:

list.sort()

list.reverse()
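
On Python 2.4 and later, sort and the built-in sorted also accept a reverse argument, which does the same job in one step. A minimal sketch with made-up data:

```python
scores = [3, 1, 2]
scores.sort(reverse=True)           # sorts in place, largest first

# sorted() leaves its input alone and returns a new list
names = sorted(["bob", "alice", "carol"], reverse=True)
```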

Use 'while 1:' for infinite loops

Use while 1 for infinite loops; with a break inside, it also gives you the effect of a do-while loop.

while 1:
    curr_line = reader.next()
    if not curr_line:
        break
    curr_line.process()
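
A runnable variant of the loop above, under the assumption that reader is an ordinary iterator: next(reader, None) returns None once the iterator is exhausted, which triggers the same break. The function name process_all and the upper-casing step are placeholders for real processing.

```python
def process_all(reader):
    """Consume reader until it is exhausted or yields an empty item."""
    processed = []
    while 1:
        curr_line = next(reader, None)   # None once reader is exhausted
        if not curr_line:                # also stops on an empty string
            break
        processed.append(curr_line.upper())
    return processed
```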

EAFP ('easier to ask forgiveness than permission')

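Catch exceptions instead of checking up front for everything that might go wrong. That is, let the problem surface early and then deal with it in the exception handler.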

Worse:
#check whether int conversion will raise an error
if not isinstance(s, str) or not s.isdigit():
    return None
elif len(s) > 10:    #too many digits for int conversion
    return None
else:
    return int(s)

Better:
try:
    return int(s)
except (TypeError, ValueError, OverflowError): #int conversion failed
    return None

Catch only the appropriate errors.

Catch only the exceptions you expect. To handle errors correctly, different exceptions should get different handlers.
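
A minimal sketch of separate handlers, using a hypothetical function lookup_ratio. Each expected error gets its own response, and anything unexpected (here a TypeError) is left to propagate so the bug stays visible.

```python
def lookup_ratio(table, key, divisor):
    """Return table[key] / divisor, handling only the expected failures."""
    try:
        return table[key] / divisor
    except KeyError:                # the key is missing
        return None
    except ZeroDivisionError:       # the divisor is zero
        return float("inf")
    # any other exception, such as TypeError, is deliberately not caught
```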

Swap values without using temporary variables.

Use a, b = b, a to swap the values of a and b.
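
The same tuple assignment extends to more than two names. A short sketch with made-up values:

```python
a, b = 1, 2
a, b = b, a                 # the right side is evaluated first, then unpacked

x, y, z = "x", "y", "z"
x, y, z = z, x, y           # rotate three values in one statement
```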

Use zip to get a list's items with their indices

Use zip to pair each element of a list with its position.

from sys import maxint
indices = xrange(maxint)    #only need this once; mine is in Utils.py
for d, index in zip(data, indices):
    pass    #do something with d and index here
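
Python 2.3 and later also provide the built-in enumerate, which yields (index, item) pairs directly and needs no separate index sequence. A minimal sketch with sample data:

```python
data = ["a", "b", "c"]
pairs = []
for index, d in enumerate(data):    # index comes first, then the item
    pairs.append((d, index))
```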

Part 2: How to write faster programs

Always profile before you optimize for speed.

Before optimizing, use profile.py to find where the program's bottlenecks actually are.
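
As one possible way to do this, the sketch below uses the standard library's cProfile and pstats modules (cProfile offers the same interface as profile with less overhead). The function slow_concat is a made-up workload, and the report is captured in a string rather than printed.

```python
import cProfile
import io
import pstats

def slow_concat(n):
    """Deliberately naive workload: repeated string concatenation."""
    result = ""
    for i in range(n):
        result += str(i)
    return result

profiler = cProfile.Profile()
profiler.enable()
slow_concat(10000)
profiler.disable()

# Write the five most expensive entries, by cumulative time, to a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
```

Printing report shows a table of call counts and times per function, which is the place to look before changing any code.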

Always use a good algorithm when it is available.

The algorithm comes first.

Use the simplest option that could possibly work.

As long as it meets the requirements, the simpler the better.

Build strings as a list and use ''.join at the end.

Collect the pieces in a list, then join them once at the end.

Use tests for object identity when appropriate.

Where it fits, test object identity: for example, use if x is not None instead of if x != None. The identity test only compares the two objects' addresses.

Use dictionaries (or sets) for searching, not lists.

For membership tests and lookups, use a dictionary or a set rather than a list.
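
A small sketch of the pattern, with made-up data: build the set once, then run every membership test against it. A list would be scanned element by element on each test, while a set uses hashing.

```python
words = ["apple", "banana", "cherry"] * 1000
word_set = set(words)               # build the lookup structure once

queries = ["cherry", "durian"]
found = [q for q in queries if q in word_set]
```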

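Use the built-in sort wherever possible.

Use the built-in sort whenever you can instead of supplying your own sorting routine. To sort on several fields, build a list of tuples, sort it, then pull the items back out: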

aux_list = [(i.Count, i.Name, ... i) for i in items]
aux_list.sort()    #sorts by Count, then Name, ... , then by item itself
sorted_list = [i[-1] for i in aux_list] #extracts last item
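
On Python 2.5 and later, the same ordering can be sketched with the key argument and operator.attrgetter, which compute the sort key once per item. The Item record type and its values are assumptions made for the example.

```python
from collections import namedtuple
from operator import attrgetter

Item = namedtuple("Item", ["Count", "Name"])    # hypothetical record type
items = [Item(2, "b"), Item(1, "z"), Item(2, "a")]

# Sort by Count first, then by Name, without writing a comparison function.
sorted_list = sorted(items, key=attrgetter("Count", "Name"))
```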

Use map and/or filter to apply functions to lists.

map and filter apply a function to every element of a list or other sequence.

Worse:
strings = []
for d in data:
    strings.append(str(d))

Better:
strings = map(str, data)

Use list comprehensions where there are conditions attached, or where the functions are methods or take more than one parameter.

In some cases a list comprehension works better. For example:

Worse:
result = []
for d in data:
    if d.Count > 4:
        result.append(3*d.Count)

Better:
result = [3*d.Count for d in data if d.Count > 4]

Doing the same with map and filter would require:

def triple(x):
    """Returns 3 * x.Count: raises AttributeError if .Count missing."""
    return 3 * x.Count

def check_count(x):
    """Returns 1 if x.Count exists and is greater than 4, 0 otherwise."""
    try:
        return x.Count > 4
    except AttributeError:    #x has no .Count
        return 0

result = map(triple, filter(check_count, data))

Use function factories to create utility functions.

That is, write functions that build and return small helper functions. An example:

http://www.oschina.net/code/snippet_70218_2436
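
For readers who cannot reach the link, here is a minimal sketch of the general idea. It is not the linked snippet; the names make_multiplier, triple, and double are made up for the illustration.

```python
def make_multiplier(factor):
    """Return a new function that multiplies its argument by factor."""
    def multiply(x):
        return factor * x           # factor is remembered from the outer call
    return multiply

triple = make_multiplier(3)
double = make_multiplier(2)
tripled = [triple(v) for v in (1, 2, 3)]
```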

Use the operator module and reduce to get sums, products, etc.

Combine the operator module with reduce to compute sums, products, and similar aggregates.

Worse:
sum = 0
for d in data:
    sum += d
product = 1
for d in data:
    product *= d

Better:
from operator import add, mul
sum = reduce(add, data)
product = reduce(mul, data)
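
In Python 3, reduce is no longer a built-in and has to be imported from functools (available there since Python 2.6). The sketch below also passes an initial value, so that an empty list gives a result instead of raising TypeError, and shows the built-in sum for comparison. The data is made up.

```python
from functools import reduce
from operator import add, mul

data = [1, 2, 3, 4]
total = reduce(add, data, 0)        # third argument is the starting value
product = reduce(mul, data, 1)
builtin_total = sum(data)           # the built-in covers the common case
empty_total = reduce(add, [], 0)    # safe on empty input thanks to the 0
```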

Use zip and dict to map fields to names.

Use zip and dict together to build the mapping.

Bad:
line = 'Some GI data|Some Accession data|Some Description'  #These might come from a file
fields = line.split('|')
gi = fields[0]
accession = fields[1]
description = fields[2]
#etc.
lookup = {}
lookup['GI'] = gi
lookup['Accession'] = accession
lookup['Description'] = description
#etc.

Good:
fieldnames = ['GI', 'Accession', 'Description'] #etc.
fields = line.split('|')
lookup = dict(zip(fieldnames, fields))
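
The same idiom scales to many records. A short sketch with made-up input lines; note that zip stops at the shorter of its two arguments, so a line with missing fields simply yields a smaller dictionary.

```python
fieldnames = ["GI", "Accession", "Description"]
lines = ["1|A1|first", "2|B2|second", "3|C3"]   # the last line lacks a field

# One dictionary per line, keyed by field name.
records = [dict(zip(fieldnames, line.split("|"))) for line in lines]
```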