Python Data Types - [bisect, heapq]

bisect

>>> import bisect
>>> 
>>> b = [ 20, 34, 35, 65, 78 ]
>>> 
>>> bisect.bisect(b,25) # find the position where 25 should be inserted to keep the list sorted
1
>>> 
>>> b
[20, 34, 35, 65, 78]
>>> 
>>> bisect.bisect_left(b,35) # if the element is already in the list, return the insertion position to its left
2
>>> 
>>> bisect.bisect_right(b,35) # if the element is already in the list, return the insertion position to its right
3

Instead of only looking up the position, you can insert the element directly with insort_left():

>>> b
[20, 34, 35, 65, 78]
>>> bisect.insort_left(b,25)
>>> 
>>> bisect.insort_left(b,40)
>>> b
[20, 25, 34, 35, 40, 65, 78]

>>> def Sorted_list(items,*elements): # avoid shadowing the built-in name "list"
...     for e in elements:
...         bisect.insort_left(items,e) # insert e while keeping items sorted
...     return items
... 
>>> Sorted_list([],3,2,1)
[1, 2, 3]
>>> Sorted_list([],8,9,10,4,5,3,2,1)
[1, 2, 3, 4, 5, 8, 9, 10]
>>>

The session above uses bisect to implement a Sorted_list.

http://docs.python.org/2.7/library/bisect.html#module-bisect

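Food for thought: if we implement consistent hashing with bisect, is it enough to find the key's insertion position on the ring and take the next valid element (wrapping around to the start when the position falls past the end) as the target server?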
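One way to try that idea out is sketched below. It is only an illustration under stated assumptions: the HashRing class, the ring_hash helper, the server names, and the choice of MD5 are all made up for this example, and there are no virtual nodes or replication.

```python
import bisect
import hashlib


def ring_hash(key):
    """Map a string to an integer position on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing(object):
    """Hypothetical minimal ring: one point per server, no virtual nodes."""

    def __init__(self, servers):
        # Keep (position, server) pairs sorted by position so bisect can search them.
        self._points = sorted((ring_hash(s), s) for s in servers)
        self._positions = [pos for pos, _ in self._points]

    def lookup(self, key):
        # Index of the first server whose position is >= the key's position...
        i = bisect.bisect_left(self._positions, ring_hash(key))
        # ...wrapping around to the first server when the key falls past the end.
        return self._points[i % len(self._points)][1]


servers = ["cache-a", "cache-b", "cache-c"]   # hypothetical server names
ring = HashRing(servers)
owner = ring.lookup("user:42")                # one of the three servers
```

The modulo in lookup() is what makes the sorted list behave like a ring: an insertion position equal to len(self._positions) maps back to index 0.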

heapq

Min-heap: a complete binary tree in which every node is less than or equal to its children.

http://docs.python.org/2.7/library/heapq.html#module-heapq

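Why heaps matter: they are the fastest way to get at the minimum (or maximum) value. Inserting an element or removing the smallest (largest) one only requires restoring the heap structure, which takes O(log N), whereas keeping a plain list sorted on insertion, or scanning an unsorted one, costs O(N). In practice heaps are used more for scheduling than for sorting. For example, a priority scheduler takes the highest-priority item each time, and a time-driven scheduler takes the item with the earliest time or the longest wait.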
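As a rough illustration of the time-driven case, here is a minimal sketch. The task names and due times are invented for the example, and the sequence number is an assumed tie-breaker so that tasks with equal due times come out in insertion order.

```python
import heapq

# Each entry is (due_time, sequence_number, task_name).
pending = [(5.0, "flush-log"), (1.5, "send-heartbeat"),
           (3.0, "retry-upload"), (1.5, "poll-queue")]   # hypothetical tasks

events = []
for seq, (due, name) in enumerate(pending):
    heapq.heappush(events, (due, seq, name))   # O(log N) per push

run_order = []
while events:
    due, _, name = heapq.heappop(events)       # earliest due time comes out first
    run_order.append(name)

print(run_order)
```

Tuples compare element by element, so the heap orders entries by due time first and falls back to the sequence number only on ties.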

>>> from heapq import *
>>> from random import *
>>> 
>>> rand = sample(xrange(1000),10) # generate a sequence of random numbers
>>> rand
[65, 130, 964, 675, 422, 74, 93, 386, 213, 596]
>>> 
>>> heap = []
>>> for x in rand:
...     heappush(heap,x) # push each random number onto the heap
... 
>>> heap # a heap is a tree stored in a list, not a sorted list
[65, 130, 74, 213, 422, 964, 93, 675, 386, 596]

>>> while heap:
...     print heappop(heap) # always pops the smallest element
... 
65
74
93
130
213
386
422
596
675
964

Other functions:

Converting a list into a heap

>>> array = sample(xrange(1000),10) # generate a list
>>> array
[943, 536, 93, 400, 736, 184, 876, 854, 988, 345]
>>> heapify(array)                 # convert the list into a heap, in place
>>> array         # now satisfies the heap property (not fully sorted)
[93, 345, 184, 400, 536, 943, 876, 854, 988, 736]
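To see what heapify() guarantees, the sketch below checks the heap property directly on the same list. The is_min_heap helper is written for this example and is not part of heapq; it relies on the usual array layout, where the children of index k sit at 2*k+1 and 2*k+2.

```python
import heapq


def is_min_heap(seq):
    """Return True if every element is <= each of its children (at 2*k+1 and 2*k+2)."""
    n = len(seq)
    for k in range(n):
        for child in (2 * k + 1, 2 * k + 2):
            if child < n and seq[k] > seq[child]:
                return False
    return True


array = [943, 536, 93, 400, 736, 184, 876, 854, 988, 345]
before = is_min_heap(array)   # False: 943 is larger than its child 536
heapq.heapify(array)          # rearranges the list in place
after = is_min_heap(array)    # True: the heap property now holds
smallest = array[0]           # the minimum always sits at index 0
```

The list is not sorted afterwards. Only the parent/child ordering is guaranteed, which is enough to keep the smallest element at the front.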

Merging two sorted sequences

>>> a = range(1,10,2)
>>> a
[1, 3, 5, 7, 9]
>>> b = range(2,10,2) 
>>> b
[2, 4, 6, 8]
>>> 
>>> [x for x in merge(a,b)] # merge the sorted sequences
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'merge' is not defined
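# The error occurs because this session used "import heapq" rather than "from heapq import *", so the name must be qualified
>>> [x for x in heapq.merge(a,b)] # merge the sorted sequences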
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Taking the n largest (or n smallest) elements:

>>> d = sample(xrange(100),10)
>>> d
[9, 82, 95, 96, 11, 15, 29, 14, 53, 2]
>>> heapq.nlargest(5,d)       
[96, 95, 82, 53, 29]
>>> heapq.nsmallest(5,d)
[2, 9, 11, 14, 15]

Using tuple comparison (__cmp__), with a number standing for each object's priority, to implement a priority queue:

>>> from heapq import *
>>> from string import *                                      
>>> from random import *                                      
>>> 
>>> data = map(None,sample(xrange(100),10),sample(letters,10))
>>> 
>>> data
[(30, 'T'), (24, 'E'), (23, 'r'), (62, 'm'), (81, 'W'), (91, 'b'), (83, 'S'), (80, 'h'), (65, 'i'), (64, 'D')]
>>> 
>>> 
>>> heap=[]
>>> 
>>> for item in data:heappush(heap,item)
... 
>>> heap
[(23, 'r'), (30, 'T'), (24, 'E'), (62, 'm'), (64, 'D'), (91, 'b'), (83, 'S'), (80, 'h'), (65, 'i'), (81, 'W')]

Alternatively, overload the __cmp__ operator on a custom type.

So how do you write such an overload?

http://www.cnblogs.com/linyawen/archive/2012/04/11/2442424.html
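One possible answer to the question above is sketched here. The Job class and its fields are hypothetical. The sketch defines __lt__ rather than __cmp__ because Python 3 removed __cmp__, and heapq orders items using the < operator.

```python
import heapq


class Job(object):
    """Hypothetical task type ordered by its numeric priority (smaller comes out first)."""

    def __init__(self, priority, name):
        self.priority = priority
        self.name = name

    def __lt__(self, other):
        # Defining "<" is enough for heapq to order Job instances.
        return self.priority < other.priority

    def __repr__(self):
        return "Job(%r, %r)" % (self.priority, self.name)


jobs = []
for priority, name in [(30, "T"), (24, "E"), (23, "r"), (62, "m")]:
    heapq.heappush(jobs, Job(priority, name))

# Pop everything: jobs come out in ascending priority order.
popped = [heapq.heappop(jobs).name for _ in range(4)]
print(popped)
```

Compared with the (priority, value) tuples used earlier, this keeps the ordering rule inside the type, so the payload itself never has to be comparable.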