Dijkstra in Python

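Below is an implementation of Dijkstra's algorithm in Python. Several details are handled very neatly, and compared with a C version the code shrinks to about 60 lines, so this post briefly walks through those details. The full source comes first: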

# Written for Python 3
from sys import argv

def dijkstra_score(G, shortest_distances, v, w):
    # length of the known shortest path source -> v plus the edge v -> w
    return shortest_distances[v] + G[v][w]

def dijkstra(G, source):
    unprocessed = set(G.keys()) # vertices whose shortest paths from source have not yet been calculated
    unprocessed.remove(source)
    shortest_distances = {source: 0}

    for i in range(len(G) - 1):
        # find a vertex with the next shortest path, i.e. minimal Dijkstra score
        m, closest_head = float('inf'), 0
        for tail in shortest_distances:
            for head in G[tail]:
                if head in unprocessed:
                    d = dijkstra_score(G, shortest_distances, tail, head)
                    if d < m:
                        m, closest_head = d, head

        if m == float('inf'):
            # no unprocessed vertex can be reached from the processed ones
            break

        unprocessed.remove(closest_head)
        shortest_distances[closest_head] = m

    # in case G is not fully connected
    for vertex in unprocessed:
        shortest_distances[vertex] = float('inf')

    return shortest_distances

def get_graph():
    filename = argv[1]  # path of the graph file, given on the command line
    graph = {}
    with open(filename) as g:
        for line in g:
            l = line.split()
            vertex = int(l.pop(0))
            graph[vertex] = {}
            for x in l:
                adj_vert, distance = map(int, x.split(","))
                graph[vertex][adj_vert] = distance
    print("Got graph. Ex: line 1:", graph[1])
    return graph

def main():
    """ Input is adjacency list on vertices labelled 1 to n, including segment length.

    Example line of file:
    1   3,45    92,4

    This means that v. 1 is adjacent to v. 3 with edge length 45 and adjacent to v. 92 with edge length 4.
    """
    G = get_graph()
    source = int(input("Enter source vertex: "))
    destination_vertices = list(map(int, input("List destination vertices:\n").split()))

    distances = dijkstra(G, source)

    print("From vertex %d:" % source)
    for vertex in destination_vertices:
        # %s instead of %d for the distance: an unreachable vertex has distance float('inf')
        print("The distance to vertex %d is %s." % (vertex, distances[vertex]))

if __name__ == '__main__':
    main()

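Usage: the graph is defined in an external file whose name is passed as the first command-line argument. Each line starts with a vertex, followed by whitespace-separated pairs of the form adjacent_vertex,distance (for example: 1   3,45    92,4).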
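To make the format concrete, here is a small sketch of how one line is tokenized. The sample line is the one from the program's own docstring; the variable names tokens, pair and edges are only illustrative.

```python
# Sample line taken from the docstring in main(): vertex 1 is adjacent to
# vertex 3 (edge length 45) and to vertex 92 (edge length 4).
line = "1   3,45    92,4\n"

# split() with no arguments splits on any run of whitespace (spaces, tabs,
# the trailing newline), so the column alignment in the file does not matter.
tokens = line.split()
print(tokens)            # ['1', '3,45', '92,4']

vertex = int(tokens[0])  # the first token is the vertex itself
edges = {}
for pair in tokens[1:]:
    # each remaining token has the form "adjacent_vertex,distance"
    adj_vert, distance = map(int, pair.split(","))
    edges[adj_vert] = distance

print(vertex, edges)     # 1 {3: 45, 92: 4}
```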

The noteworthy parts of the code are analyzed below:

1. Building the graph

def get_graph():
    filename = argv[1]
    graph = {}
    with open(filename) as g:
        for line in g:
            l = line.split()
            vertex = int(l.pop(0))
            graph[vertex] = {}
            for x in l:
                adj_vert, distance = map(int, x.split(","))
                graph[vertex][adj_vert] = distance
    print("Got graph. Ex: line 1:", graph[1])
    return graph

The graph is stored as an adjacency list, implemented with Python dictionaries. It starts out empty: graph = {}.

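Next, the file holding the graph is opened. Note the with statement, which works like a combined try and finally: open returns the file object, which is bound to g, and the file is closed when the block exits, even if an error occurs. Each line of g is split on whitespace, giving a list l whose first item is the vertex that the line describes; the remaining items are the vertices it connects to, each paired with a distance. The vertex becomes a key in graph, and its value is itself a dictionary. Each remaining item is split at the comma into two parts, the adjacent vertex and the edge weight, and these are stored in that inner dictionary as key and value.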
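The relationship between with and try/finally can be checked directly. The following is only a sketch: it writes a throwaway temporary file (not part of the original program) so that it can run on its own.

```python
import os
import tempfile

# Create a throwaway one-line graph file so the example is self-contained.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as f:
    f.write("1 2,3\n")

# Form used in get_graph(): the file is closed when the block exits.
with open(path) as g:
    first = g.readline()
print(g.closed)    # True

# Roughly equivalent explicit form with try/finally.
g2 = open(path)
try:
    second = g2.readline()
finally:
    g2.close()     # runs whether or not readline() raised
print(g2.closed)   # True

os.remove(path)    # clean up the temporary file
```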

That completes the graph: a dictionary graph in which, for example, graph = {1: {2: 3}} means there is an edge from vertex 1 to vertex 2 with weight 3.
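The nested-dictionary layout can be tried without a file. The sketch below uses a hypothetical helper, parse_graph, that applies the same parsing as get_graph to a list of strings; the helper, the blank-line check and the three-vertex graph are all made up for illustration.

```python
def parse_graph(lines):
    """Hypothetical helper: same parsing as get_graph, but from a list of strings."""
    graph = {}
    for line in lines:
        tokens = line.split()
        if not tokens:          # skip blank lines (a safeguard added in this sketch)
            continue
        vertex = int(tokens[0])
        graph[vertex] = {}
        for pair in tokens[1:]:
            adj_vert, distance = map(int, pair.split(","))
            graph[vertex][adj_vert] = distance
    return graph


# Made-up example: 1 -> 2 (weight 3), 1 -> 3 (weight 7), 2 -> 3 (weight 1)
sample = ["1 2,3 3,7", "2 3,1", "3"]
G = parse_graph(sample)

print(G)            # {1: {2: 3, 3: 7}, 2: {3: 1}, 3: {}}
print(G[1][2])      # weight of edge 1 -> 2: 3
print(list(G[1]))   # vertices reachable from 1 in one step: [2, 3]
print(2 in G[3])    # is there an edge 3 -> 2? False
```

The outer key selects a vertex and the inner key selects one of its neighbors, so both the edge lookup G[v][w] and the neighbor iteration used by the shortest-path function are direct dictionary operations.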

2. Single-source shortest paths

Next, a second function computes the shortest paths:

def dijkstra(G, source):
    unprocessed = set(G.keys()) # vertices whose shortest paths from source have not yet been calculated
    unprocessed.remove(source)
    shortest_distances = {source: 0}
 
    for i in range(len(G) - 1):
        # find a vertex with the next shortest path, i.e. minimal Dijkstra score
        m, closest_head = float('inf'), 0
        for tail in shortest_distances:
            for head in G[tail]:
                if head in unprocessed:
                    d = dijkstra_score(G, shortest_distances, tail, head)
                    if d < m:
                        m, closest_head = d, head

        if m == float('inf'):
            # no unprocessed vertex can be reached from the processed ones
            break

        unprocessed.remove(closest_head)
        shortest_distances[closest_head] = m
 
    # in case G is not fully connected
    for vertex in unprocessed:
        shortest_distances[vertex] = float('inf')
 
    return shortest_distances

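First, unprocessed is a set holding the vertices whose shortest distances are not yet known. It starts as the set of all vertices of G; the start vertex is given by the source argument and is removed from unprocessed right away. The shortest distances are recorded in the dictionary shortest_distances, which initially contains only the source, at distance 0.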

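The rest follows the steps of Dijkstra's algorithm. Instead of maintaining tentative distances that are updated whenever a vertex is added, this implementation recomputes the candidates from scratch in every round. The outer loop over i runs len(G) - 1 times, once for each vertex other than the source, and each round finalizes one vertex. At the start of a round, m is set to infinity and closest_head to 0; m will hold the smallest distance found in this round, and closest_head the vertex that attains it.

It helps to picture the processed vertices as a connected subgraph. The next vertex to join it is the unprocessed vertex closest to the source. To find it, the code iterates over shortest_distances, whose entries record the final shortest distance from the source to each processed vertex. For every processed vertex (tail) it looks at each neighbor (head) that is still unprocessed and uses dijkstra_score to compute the length of the path from the source to head through tail. The smallest such value is kept in m. Once all processed vertices have been scanned, the vertex with the smallest score is removed from unprocessed and stored in shortest_distances with distance m. That vertex is now done, and the next round handles the remaining ones.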
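A single round of the outer loop can be isolated to see what the scan selects. The sketch below wraps the inner loops in a hypothetical function, pick_next, which is not part of the original program, and runs it on a made-up graph.

```python
def pick_next(G, shortest_distances, unprocessed):
    """Hypothetical helper: one round of the scan from dijkstra().

    Looks at every edge tail -> head whose tail is already processed and
    whose head is not, and returns (best_score, best_head).
    """
    # None (rather than the original's 0) marks "nothing found" in this sketch.
    m, closest_head = float('inf'), None
    for tail in shortest_distances:
        for head in G[tail]:
            if head in unprocessed:
                d = shortest_distances[tail] + G[tail][head]  # Dijkstra score
                if d < m:
                    m, closest_head = d, head
    return m, closest_head


# Made-up graph: 1 -> 2 (7), 1 -> 3 (2), 3 -> 2 (3), 2 -> 4 (1)
G = {1: {2: 7, 3: 2}, 2: {4: 1}, 3: {2: 3}, 4: {}}

# Round 1: only the source is processed.
print(pick_next(G, {1: 0}, {2, 3, 4}))       # (2, 3): vertex 3 at distance 2

# Round 2: vertex 3 has been added, so edge 3 -> 2 competes with 1 -> 2.
print(pick_next(G, {1: 0, 3: 2}, {2, 4}))    # (5, 2): 2 + 3 beats 7
```

Because every round rescans all edges that leave processed vertices, the running time grows roughly with the number of vertices times the number of edges. That is acceptable for small inputs, and it keeps the code short.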

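The code also covers vertices that are not connected to the source: their distance is set to infinity. All of this assumes that every edge weight is non-negative. With negative weights the greedy choice can finalize a vertex too early, and if the graph contains a negative cycle a shortest distance may not exist at all.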
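Both points can be seen on tiny inputs. The following is a compact sketch of the same scan-based approach, not the article's listing verbatim: it uses a while loop and stops as soon as nothing reachable is left, and both graphs are made up for illustration.

```python
def dijkstra(G, source):
    """Scan-based Dijkstra in the style of the article, stopping early when
    the remaining vertices are unreachable (a choice made for this sketch)."""
    unprocessed = set(G) - {source}
    shortest_distances = {source: 0}
    while unprocessed:
        m, closest_head = float('inf'), None
        for tail in shortest_distances:
            for head, weight in G[tail].items():
                score = shortest_distances[tail] + weight
                if head in unprocessed and score < m:
                    m, closest_head = score, head
        if closest_head is None:   # nothing left that the source can reach
            break
        unprocessed.remove(closest_head)
        shortest_distances[closest_head] = m
    for vertex in unprocessed:     # unreachable vertices get distance infinity
        shortest_distances[vertex] = float('inf')
    return shortest_distances


# Vertex 4 has an edge to 1, but no edge leads to it from the source.
disconnected = {1: {2: 5}, 2: {}, 4: {1: 1}}
print(dijkstra(disconnected, 1))   # {1: 0, 2: 5, 4: inf}

# Negative edge: the true distance to 2 is 5 + (-4) = 1, but vertex 2 is
# finalized at distance 2 before edge 3 -> 2 is ever considered.
negative = {1: {2: 2, 3: 5}, 2: {}, 3: {2: -4}}
print(dijkstra(negative, 1))       # {1: 0, 2: 2, 3: 5}
```

In the second graph the reported distance to vertex 2 is 2 although a path of length 1 exists, which is exactly the failure that the non-negativity assumption rules out.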