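Data Structures: Minimum Spanning Trees

I. Basic Concepts

This article is reposted from: http://lib.csdn.net/article/datastructure/9218

1. What is a spanning tree?

Given a graph G<V,E>, if a subgraph G'<V',E'> satisfies V'=V and G' is a tree, then G' is a spanning tree of G. Here V is the set of vertices and E is the set of edges (the notation comes from discrete mathematics). Because a spanning tree is a tree, every vertex can reach every other vertex. Informally, a spanning tree must contain every vertex of the original graph and must be connected.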

Some definitions about graphs:

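Connected graph: an undirected graph in which any two vertices are joined by a path.
Strongly connected graph: a directed graph in which, for any two vertices u and v, there is a path from u to v and a path from v to u.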
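Connected network: a connected graph whose edges carry meaning, each edge being associated with a number called its weight. A weight represents the cost of connecting two vertices, and such a connected graph is called a connected network.
Spanning tree: a spanning tree of a connected graph is a connected subgraph that contains all n vertices of the graph but only the n-1 edges needed to form a tree. A spanning tree on n vertices has exactly n-1 edges; adding one more edge to it necessarily creates a cycle.
Minimum spanning tree: among all spanning trees of a connected network, one whose total edge cost is smallest is called a minimum spanning tree.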
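
The spanning-tree definition above can be checked mechanically. The following is a minimal sketch, not part of the original article: the function name `isSpanningTree` and the 0-based vertex numbering are assumptions made for illustration. It tests the two properties from the definition, exactly n-1 edges and connectivity over all n vertices.

```cpp
#include <utility>
#include <vector>

// Returns true if `edges` (pairs of 0-based vertex ids) form a spanning tree
// of a graph with n vertices: exactly n-1 edges and all vertices connected.
bool isSpanningTree(int n, const std::vector<std::pair<int, int>>& edges) {
    if (n <= 0 || static_cast<int>(edges.size()) != n - 1) return false;

    // Build an adjacency list from the candidate edges.
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        if (e.first < 0 || e.first >= n || e.second < 0 || e.second >= n) return false;
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }

    // Depth-first search from vertex 0, counting the vertices reached.
    std::vector<bool> seen(n, false);
    std::vector<int> stack{0};
    seen[0] = true;
    int reached = 0;
    while (!stack.empty()) {
        int u = stack.back();
        stack.pop_back();
        ++reached;
        for (int v : adj[u]) {
            if (!seen[v]) {
                seen[v] = true;
                stack.push_back(v);
            }
        }
    }
    // n-1 edges plus connectivity implies there is no cycle, i.e. a tree.
    return reached == n;
}
```

For four vertices, the path 0-1, 1-2, 2-3 passes the check, while the triangle 0-1, 1-2, 0-2 fails it: it has the right number of edges but leaves vertex 3 unreachable.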

A minimum spanning tree can be found with Kruskal's algorithm or Prim's algorithm.

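Prim's Algorithm

There are two general approaches to the MST (Minimum Spanning Tree) problem, and Prim's algorithm is one of them. It builds the MST vertex by vertex. Let U be the vertex set of graph G, and let V denote the set of vertices already in the tree (not the vertex set of G used in the definition above). First pick an arbitrary vertex a of G as the starting point and add it to V. Then, among the vertices in U-V, find the vertex b whose cheapest edge to some vertex in V has the smallest weight, and add b to V, so that V={a,b}. Next, among U-V find the vertex c whose cheapest edge into V is lightest and add c to V. Continue in this way until every vertex has been added to V, at which point an MST has been built. With N vertices the MST has N-1 edges, and each vertex added to V after the first corresponds to one MST edge.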
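
The idea above can be sketched compactly with a priority queue. Note that this is a heap-based ("lazy") variant of Prim's algorithm, not the array-scanning version this article develops; the names `Graph`, `addEdge` and `primTotal`, the 0-based numbering and the -1 return value for a disconnected graph are all assumptions made for illustration.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (weight, neighbour) pairs; vertices are 0-based.
using Graph = std::vector<std::vector<std::pair<int, int>>>;

// Helper: add an undirected edge u-v with weight w.
void addEdge(Graph& g, int u, int v, int w) {
    g[u].push_back({w, v});
    g[v].push_back({w, u});
}

// Returns the total weight of an MST, or -1 if the graph is not connected.
long long primTotal(const Graph& adj) {
    const int n = static_cast<int>(adj.size());
    if (n == 0) return 0;

    std::vector<bool> inTree(n, false);
    // Min-heap of (weight, vertex): candidate edges leaving the tree.
    std::priority_queue<std::pair<int, int>,
                        std::vector<std::pair<int, int>>,
                        std::greater<std::pair<int, int>>> heap;
    heap.push({0, 0});  // start from vertex 0 at cost 0

    long long total = 0;
    int added = 0;
    while (!heap.empty() && added < n) {
        const std::pair<int, int> top = heap.top();
        heap.pop();
        const int w = top.first;
        const int u = top.second;
        if (inTree[u]) continue;  // stale entry: u joined the tree earlier
        inTree[u] = true;
        total += w;
        ++added;
        for (const auto& edge : adj[u]) {
            if (!inTree[edge.second]) heap.push(edge);
        }
    }
    return added == n ? total : -1;
}
```

For a triangle with edges 0-1 (weight 1), 1-2 (weight 2) and 0-2 (weight 3), `primTotal` returns 3. The heap may hold several entries for the same vertex; outdated ones are simply skipped when popped, which keeps the code short at the cost of a larger heap.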

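An illustration with figures and code:

Initial state:

We use two data structures:

lowcost[i]: the minimum weight of an edge that ends at vertex i and starts at a vertex already in the tree. lowcost[i]=0 marks vertex i as already added to the MST.

mst[i]: the start vertex of the edge recorded in lowcost[i], meaning that edge <mst[i],i> is the current candidate MST edge for vertex i. mst[i]=0 marks vertex i as already added to the MST.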

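Assume V1 is the starting vertex and initialise (* stands for infinity, i.e. no direct edge):

lowcost[2]=6, lowcost[3]=1, lowcost[4]=5, lowcost[5]=*, lowcost[6]=*

mst[2]=1, mst[3]=1, mst[4]=1, mst[5]=1, mst[6]=1 (every vertex starts with V1 as its default start point)

The edge ending at V3 has the smallest weight, 1, so edge <mst[3],3> with weight 1 is added to the MST.

Since V3 has joined the tree, the lowcost and mst arrays need updating:

lowcost[2]=5, lowcost[3]=0, lowcost[4]=5, lowcost[5]=6, lowcost[6]=4

mst[2]=3, mst[3]=0, mst[4]=1, mst[5]=3, mst[6]=3

The edge ending at V6 has the smallest weight, 4, so edge <mst[6],6> with weight 4 is added to the MST.

Since V6 has joined the tree, the lowcost and mst arrays need updating:

lowcost[2]=5, lowcost[3]=0, lowcost[4]=2, lowcost[5]=6, lowcost[6]=0

mst[2]=3, mst[3]=0, mst[4]=6, mst[5]=3, mst[6]=0

The edge ending at V4 has the smallest weight, 2, so edge <mst[4],4> with weight 2 is added to the MST.

Since V4 has joined the tree, the lowcost and mst arrays need updating:

lowcost[2]=5, lowcost[3]=0, lowcost[4]=0, lowcost[5]=6, lowcost[6]=0

mst[2]=3, mst[3]=0, mst[4]=0, mst[5]=3, mst[6]=0

The edge ending at V2 has the smallest weight, 5, so edge <mst[2],2> with weight 5 is added to the MST.

Since V2 has joined the tree, the lowcost and mst arrays need updating:

lowcost[2]=0, lowcost[3]=0, lowcost[4]=0, lowcost[5]=3, lowcost[6]=0

mst[2]=0, mst[3]=0, mst[4]=0, mst[5]=2, mst[6]=0

The edge ending at V5 has the smallest weight, 3, so edge <mst[5],5> with weight 3 is added to the MST. Every vertex is now in the tree:

lowcost[2]=0, lowcost[3]=0, lowcost[4]=0, lowcost[5]=0, lowcost[6]=0

mst[2]=0, mst[3]=0, mst[4]=0, mst[5]=0, mst[6]=0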

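At this point the MST is complete. Its edges are <1,3>, <3,6>, <6,4>, <3,2> and <2,5>, with a total weight of 1+4+2+5+3=15.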
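
The update rule used at each step of the trace can be isolated in a few lines. The figure of the graph is not available in the text, so the edge weights below are an assumption, chosen to be consistent with every lowcost and mst value listed in the trace. `PrimState`, `Matrix`, `makeGraph`, `init` and `addVertex` are illustrative names, not part of the original code.

```cpp
#include <array>
#include <climits>

constexpr int N = 6;          // vertices V1..V6, stored 1-based
constexpr int INF = INT_MAX;  // "*" in the text: no direct edge

using Matrix = std::array<std::array<int, N + 1>, N + 1>;

struct PrimState {
    std::array<int, N + 1> lowcost{};  // zero-initialised
    std::array<int, N + 1> mst{};      // zero-initialised
};

// ASSUMED edge weights (the figure is missing); consistent with the trace.
Matrix makeGraph() {
    Matrix g;
    for (auto& row : g) row.fill(INF);
    auto edge = [&g](int u, int v, int w) { g[u][v] = g[v][u] = w; };
    edge(1, 2, 6); edge(1, 3, 1); edge(1, 4, 5); edge(2, 3, 5); edge(2, 5, 3);
    edge(3, 4, 5); edge(3, 5, 6); edge(3, 6, 4); edge(4, 6, 2); edge(5, 6, 6);
    return g;
}

// Initial state with V1 as the start vertex.
PrimState init(const Matrix& g) {
    PrimState s;  // lowcost[1] = mst[1] = 0: V1 is already in the tree
    for (int i = 2; i <= N; ++i) {
        s.lowcost[i] = g[1][i];
        s.mst[i] = 1;
    }
    return s;
}

// Adds vertex v to the tree, then lowers lowcost[j] for every vertex j
// outside the tree that is cheaper to reach through v.
void addVertex(PrimState& s, const Matrix& g, int v) {
    s.lowcost[v] = 0;
    s.mst[v] = 0;
    for (int j = 2; j <= N; ++j) {
        if (s.lowcost[j] != 0 && g[v][j] < s.lowcost[j]) {
            s.lowcost[j] = g[v][j];
            s.mst[j] = v;
        }
    }
}
```

Calling `init` and then `addVertex` for V3, V6, V4, V2 and V5 in that order reproduces the arrays of the trace step by step. Only a strictly smaller weight triggers an update, which is why mst[4] stays 1 after V3 joins even though the assumed edge <3,4> also has weight 5.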

From the process above, a concrete implementation is easy to write (cpp):

#include <stdio.h>
#include <stdlib.h>

#define MAX 100
#define MAXCOST 0x7fffffff

/*
Test data:
7 11
A B 7
A D 5
B C 8
B D 9
B E 7
C E 5
D E 15
D F 6
E F 8
E G 9
F G 11

Output:
A - D : 5
D - F : 6
A - B : 7
B - E : 7
E - C : 5
E - G : 9
Total:39
*/

int graph[MAX][MAX];

int Prim(int graph[][MAX], int n)
{
    /* lowcost[i]: minimum weight of an edge ending at i; 0 means i is already in the tree */
    int lowcost[MAX];

    /* mst[i]: start vertex of the edge recorded in lowcost[i] */
    int mst[MAX];

    int i, j, min, minid, sum = 0;

    /* Vertex 1 is in the tree by default, so initialise from vertex 2 */
    for (i = 2; i <= n; i++)
    {
        /* The initial cost is the weight of the edge from vertex 1 */
        lowcost[i] = graph[1][i];

        /* Every vertex starts with vertex 1 as its start point */
        mst[i] = 1;
    }

    /* Mark vertex 1 as part of the tree */
    mst[1] = 0;

    /* A spanning tree on n vertices has n-1 edges, so add n-1 more vertices */
    for (i = 2; i <= n; i++)
    {
        min = MAXCOST;
        minid = 0;

        /* Find the vertex minid outside the tree with the cheapest edge */
        for (j = 2; j <= n; j++)
        {
            /* Smaller weight and not yet in the tree */
            if (lowcost[j] < min && lowcost[j] != 0)
            {
                min = lowcost[j];
                minid = j;
            }
        }

        /* No reachable vertex is left: the graph is not connected */
        if (minid == 0)
        {
            return -1;
        }

        /* Print the tree edge: start, end, weight */
        printf("%c - %c : %d\n", mst[minid] + 'A' - 1, minid + 'A' - 1, min);

        /* Accumulate the weight */
        sum += min;

        /* Mark minid as part of the tree */
        lowcost[minid] = 0;

        /* Update the costs of the other vertices through minid */
        for (j = 2; j <= n; j++)
        {
            /* A cheaper edge was found */
            if (graph[minid][j] < lowcost[j])
            {
                /* Update the weight */
                lowcost[j] = graph[minid][j];

                /* Update the start vertex of the cheapest edge
                   (this array is unnecessary if the edges need not be printed) */
                mst[j] = minid;
            }
        }
    }
    /* Return the total weight */
    return sum;
}

int main()
{
    int i, j, k, m, n;
    int cost;
    char chx, chy;

    /* Read the number of vertices (m) and edges (n) */
    scanf("%d%d", &m, &n);

    /* Initialise the graph: every distance is infinite */
    for (i = 1; i <= m; i++)
    {
        for (j = 1; j <= m; j++)
        {
            graph[i][j] = MAXCOST;
        }
    }

    /* Read the edges; the leading space in the format skips newlines and blanks */
    for (k = 0; k < n; k++)
    {
        scanf(" %c %c %d", &chx, &chy, &cost);
        i = chx - 'A' + 1;
        j = chy - 'A' + 1;
        graph[i][j] = cost;
        graph[j][i] = cost;
    }

    /* Compute the minimum spanning tree */
    cost = Prim(graph, m);

    /* Print the total weight */
    printf("Total:%d\n", cost);

    return 0;
}

Next, we try it on a template problem from POJ:

Building Roads

Time Limit: 1000MS		Memory Limit: 65536K

Description

Farmer John had just acquired several new farms! He wants to connect the farms with roads so that he can travel from any farm to any other farm via a sequence of roads; roads already connect some of the farms.

Each of the N (1 ≤ N ≤ 1,000) farms (conveniently numbered 1..N) is represented by a position (Xi, Yi) on the plane (0 ≤ Xi ≤ 1,000,000; 0 ≤ Yi ≤ 1,000,000). Given the preexisting M roads (1 ≤ M ≤ 1,000) as pairs of connected farms, help Farmer John determine the smallest length of additional roads he must build to connect all his farms.

Input

* Line 1: Two space-separated integers: N and M
* Lines 2..N+1: Two space-separated integers: Xi and Yi
* Lines N+2..N+M+1: Two space-separated integers: i and j, indicating that there is already a road connecting the farm i and farm j.

Output

* Line 1: Smallest length of additional roads required to connect all farms, printed without rounding to two decimal places. Be sure to calculate distances as 64-bit floating point numbers.

Sample Input

Sample Output

4.00

The implementation is as follows:

#include <iostream>
#include <cmath>
#include <cstring>
#include <cstdio>
using namespace std;

const int N = 1001;
const double INF = 1e18;   // larger than any possible distance
double graph[N][N];
bool visit[N];
int n, M;

// dian ("point"): coordinates of a farm
typedef struct
{
    double x;
    double y;
} dian;
dian m[N];

double prim()
{
    memset(visit, 0, sizeof(visit));

    double low[N];         // low[i]: cheapest known edge from the tree to vertex i
    int pos = 1;
    visit[1] = true;
    double result = 0;

    for (int i = 2; i <= n; i++)
    {
        low[i] = graph[pos][i];
    }

    for (int i = 0; i < n - 1; i++)
    {
        double Min = INF;

        // Pick the unvisited vertex closest to the tree
        for (int j = 1; j <= n; j++)
        {
            if (!visit[j] && Min > low[j])
            {
                Min = low[j];
                pos = j;
            }
        }
        visit[pos] = true;
        result += Min;

        // Update the remaining vertices through the newly added vertex
        for (int k = 1; k <= n; k++)
        {
            if (!visit[k] && low[k] > graph[pos][k])
            {
                low[k] = graph[pos][k];
            }
        }
    }
    return result;
}

double dis(dian a, dian b)
{
    return sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y));
}

int main()
{
    // freopen("in.txt","r",stdin);
    while (cin >> n >> M)
    {
        // memset works byte by byte and cannot set a double to INF, so fill explicitly
        for (int i = 1; i <= n; i++)
        {
            for (int j = 1; j <= n; j++)
            {
                graph[i][j] = INF;
            }
        }
        for (int i = 1; i <= n; i++)
        {
            cin >> m[i].x >> m[i].y;
        }
        for (int i = 1; i <= n; i++)
        {
            for (int j = i + 1; j <= n; j++)
            {
                graph[i][j] = graph[j][i] = dis(m[i], m[j]);
            }
        }
        // A road that already exists costs nothing
        for (int i = 0; i < M; i++)
        {
            int a, b;
            cin >> a >> b;
            graph[a][b] = graph[b][a] = 0;
        }
        printf("%.2f\n", prim());
    }
    return 0;
}
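
One pitfall is worth knowing when a matrix of `double` distances is initialised to "infinity": `memset` fills memory byte by byte, so passing it an integer constant such as 0x3f3f3f3f does not store that number in each element. The sketch below contrasts it with an element-wise fill; the function names are illustrative, and the stated magnitude assumes IEEE 754 doubles.

```cpp
#include <algorithm>
#include <cstring>
#include <limits>

// memset converts its value argument to unsigned char, so every BYTE becomes
// 0x3f. The resulting bit pattern 0x3F3F3F3F3F3F3F3F is a tiny positive
// double (below 0.001 under IEEE 754), not a large sentinel.
double memsetFilled() {
    double a[4];
    std::memset(a, 0x3f3f3f3f, sizeof(a));
    return a[0];
}

// std::fill assigns a real double value to every element.
double fillFilled() {
    double a[4];
    std::fill(a, a + 4, std::numeric_limits<double>::infinity());
    return a[0];
}
```

The byte trick is popular for `int` arrays, where four 0x3f bytes do form the large value 0x3f3f3f3f. With `double` it only goes unnoticed when the affected cells are never read; any code that does read them would treat such a cell as an almost free edge.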

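Kruskal's Algorithm

Definition:

Kruskal's algorithm finds a minimum spanning tree of a weighted connected graph. It selects n-1 edges in total, using the following greedy criterion: from the remaining edges, choose the cheapest edge that does not create a cycle and add it to the set of selected edges. An edge that would create a cycle cannot be part of a spanning tree. The algorithm runs in at most e steps, where e is the number of edges in the network: the e edges are considered one at a time in order of increasing cost. When an edge is considered, it is discarded if adding it to the selected set would create a cycle, and selected otherwise.

Idea:

Sort the edges by weight in ascending order and repeatedly take the lightest remaining edge. If adding that edge would create a cycle, discard it and move on to the next lightest; stop once n-1 edges have been chosen. Cycle detection can be implemented with a union-find (disjoint-set) structure.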
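
The steps above can be sketched on an integer-weighted edge list. This is a minimal illustration rather than the article's solution: `Edge`, `findRoot` and `kruskalTotal`, the 0-based numbering and the -1 return value for a disconnected graph are assumptions.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

struct Edge {
    int u, v, w;  // endpoints (0-based) and weight
};

// Follows parent links until the representative of x's set is reached.
int findRoot(const std::vector<int>& parent, int x) {
    while (parent[x] != x) x = parent[x];
    return x;
}

// Returns the total MST weight for vertices 0..n-1, or -1 if the edges
// do not connect all vertices.
int kruskalTotal(int n, std::vector<Edge> edges) {
    // Step 1: sort the edges by ascending weight.
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.w < b.w; });

    // Step 2: every vertex starts in its own set.
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);

    int total = 0;
    int picked = 0;
    for (const Edge& e : edges) {
        if (picked == n - 1) break;       // the tree is complete
        const int ru = findRoot(parent, e.u);
        const int rv = findRoot(parent, e.v);
        if (ru == rv) continue;           // same set: the edge would close a cycle
        parent[ru] = rv;                  // merge the two sets
        total += e.w;
        ++picked;
    }
    return picked == n - 1 ? total : -1;
}
```

With the seven-vertex test data from the Prim listing earlier in this article (A to G mapped to 0 to 6), `kruskalTotal` returns 39, the same total that listing reports.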

Demonstration:

Let us solve the POJ problem above again, this time with Kruskal's algorithm:

#include <iostream>
#include <cstdio>
#include <cmath>
#include <cstring>
#include <algorithm>
using namespace std;

const int N = 1001;
const int E = 1000000;
int n, M;
int cent;        // number of candidate edges
int a[N];        // a[i]: parent of vertex i in the union-find forest
int Count = 0;   // number of edges already in the spanning tree

// dian ("point"): despite the name, this stores an edge (x, y) and its length
typedef struct
{
    int x;
    int y;
    double value;
} dian;
dian m[E];

// situation: coordinates of a farm
typedef struct
{
    double x, y;
} situation;
situation p[N];

double dis(situation a, situation b)
{
    return sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y));
}

bool cmp(dian a, dian b)
{
    return a.value < b.value;
}

void init()
{
    // Initialise up to n (the number of vertices), not cent (the number of edges)
    for (int i = 1; i <= n; i++)
    {
        a[i] = i;
    }
}

int Find(int x)
{
    while (x != a[x])
    {
        x = a[x];
    }
    return x;
}

void Union(int x, int y)
{
    // Path compression in Find is recommended
    int fx = Find(x);
    int fy = Find(y);
    if (fx != fy)
    {
        a[fx] = fy;
    }
}

// Kruskal template
double Kruskal()
{
    // Do not call init() here: the pre-existing roads have already been merged
    sort(m, m + cent, cmp);
    double result = 0;
    for (int i = 0; i < cent && Count != n - 1; i++)
    {
        if (Find(m[i].x) != Find(m[i].y))
        {
            Union(m[i].x, m[i].y);
            result += m[i].value;
            Count++;
        }
    }
    return result;
}

int main()
{
    while (cin >> n >> M)
    {
        for (int i = 1; i <= n; i++)
        {
            cin >> p[i].x >> p[i].y;
        }
        cent = 0;
        Count = 0;
        // Every pair of farms is a candidate edge
        for (int i = 1; i <= n; i++)
        {
            for (int j = i + 1; j <= n; j++)
            {
                m[cent].x = i;
                m[cent].y = j;
                m[cent++].value = dis(p[i], p[j]);
            }
        }
        // init() belongs here, before the existing roads are merged, not inside Kruskal()
        init();
        for (int i = 1; i <= M; i++)
        {
            int u, v;
            cin >> u >> v;
            // Count the road only if it joins two different components;
            // otherwise Count would be wrong
            if (Find(u) != Find(v))
            {
                Union(u, v);
                Count++;
            }
        }

        printf("%.2f\n", Kruskal());
    }
    return 0;
}
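
The comments in the listing recommend path compression. A possible shape for it is sketched below, combined with union by size; the class `DisjointSet` and its member names are assumptions made for illustration, not part of the original solution.

```cpp
#include <numeric>
#include <utility>
#include <vector>

class DisjointSet {
public:
    // Creates n singleton sets {0}, {1}, ..., {n-1}.
    explicit DisjointSet(int n) : parent_(n), size_(n, 1) {
        std::iota(parent_.begin(), parent_.end(), 0);
    }

    // Returns the representative of x's set and compresses the path:
    // every vertex visited on the way is re-attached directly to the root.
    int find(int x) {
        int root = x;
        while (parent_[root] != root) root = parent_[root];
        while (parent_[x] != root) {
            const int next = parent_[x];
            parent_[x] = root;
            x = next;
        }
        return root;
    }

    // Merges the sets of x and y. Returns false if they were already in the
    // same set, i.e. the edge x-y would close a cycle.
    bool unite(int x, int y) {
        int rx = find(x);
        int ry = find(y);
        if (rx == ry) return false;
        if (size_[rx] < size_[ry]) std::swap(rx, ry);  // attach smaller under larger
        parent_[ry] = rx;
        size_[rx] += size_[ry];
        return true;
    }

private:
    std::vector<int> parent_;
    std::vector<int> size_;
};
```

Because `unite` reports whether a merge actually happened, a caller can write `if (ds.unite(u, v)) Count++;` and avoid a separate pair of `Find` calls before each `Union`.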