Sorting Algorithms, Part 8: Merge Sort

Note: when quoting, please credit the source: http://blog.csdn.net/lg1259156776/

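Introduction

In my post "A Brief Review of 10 Algorithms That 'Rule the World'" (《“主宰世界”的10种算法短评》), the first algorithm listed was efficient sorting. This series surveys sorting algorithms comprehensively, from the simplest bubble sort to efficient ones such as heapsort.

The first seven posts in the series covered insertion sort, exchange sort, and selection sort. This post covers the fourth major category of sorting algorithms: merge sort.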

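Basic Concepts of Sorting

Sorting: arranging a set of unordered data into sequence according to a given rule.
- Data list: the finite collection of data objects to be sorted.
- Sort key: a data object usually has several attribute fields (data members). The field that distinguishes one object from another and serves as the basis for ordering is the sort key. Which field is used as the sort key for a given data list depends on the needs of the application.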
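
To illustrate what a sort key is, here is a minimal sketch. The `Student` record, its fields, and the two helper functions are hypothetical names invented for this illustration. The same data list is ordered by two different keys, depending on what the application needs.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record type: each data object has several attribute fields.
struct Student {
    std::string name;
    int age;
    int score;
};

// Sort the data list using `age` as the sort key.
void sortByAge(std::vector<Student>& list) {
    std::sort(list.begin(), list.end(),
              [](const Student& a, const Student& b) { return a.age < b.age; });
}

// Sort the same data list using `score` as the sort key.
void sortByScore(std::vector<Student>& list) {
    std::sort(list.begin(), list.end(),
              [](const Student& a, const Student& b) { return a.score < b.score; });
}
```
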
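Classification
- Internal sorting: all data objects stay in main memory for the duration of the sort.
- External sorting: there are too many objects to hold in main memory at once, so they must be moved back and forth between main memory and external storage as the sorting process requires.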

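Analysis of Sorting Algorithms

Stability of Sorting Algorithms

Suppose two objects r[i] and r[j] in the sequence have equal sort keys, k[i] == k[j]. If the relative order of r[i] and r[j] is the same before and after sorting, the sorting algorithm is said to be stable; otherwise it is unstable.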
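
The stability property can be checked mechanically. The sketch below is an illustration only: the `Record` type and both function names are hypothetical, and it relies on `std::stable_sort` from the standard library, which guarantees stability. Each record carries an `id` that is assumed to be assigned in increasing order before sorting, so that equal keys can be told apart afterwards.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical record: `key` is the sort key, `id` records the original position.
struct Record {
    int key;
    int id;
};

// Sort by key only, using a stable algorithm (std::stable_sort).
void stableSortByKey(std::vector<Record>& r) {
    std::stable_sort(r.begin(), r.end(),
                     [](const Record& a, const Record& b) { return a.key < b.key; });
}

// Check the stability property: among adjacent records with equal keys,
// ids must still be in increasing order (assuming ids were assigned in
// increasing order before sorting).
bool keptRelativeOrder(const std::vector<Record>& r) {
    for (std::size_t i = 1; i < r.size(); ++i) {
        if (r[i - 1].key == r[i].key && r[i - 1].id > r[i].id) return false;
    }
    return true;
}
```

For the input keys 2, 1, 2, 1 with ids 0, 1, 2, 3, a stable sort yields ids 1, 3, 0, 2: the two records with key 1 and the two with key 2 each keep their original relative order.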

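Evaluating Sorting Algorithms

Time Cost

The time cost of a sort can be measured by the number of data comparisons and the number of data moves the algorithm performs.
Rough estimates of running time are usually made for the average case. For algorithms whose running time depends strongly on the initial order of the sort keys and on the number of objects, the best case and the worst case need to be estimated.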
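
To make the two cost measures concrete, here is a rough sketch that instruments straight insertion sort (chosen only because its cost depends heavily on the initial order). The `SortCost` struct and the function name are hypothetical, and the counting convention (one move per shift, one move for placing the saved element) is an assumption of this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical counters for the two cost measures named above.
struct SortCost {
    long comparisons = 0;
    long moves = 0;
};

// Straight insertion sort instrumented to count key comparisons and data moves.
SortCost countingInsertionSort(std::vector<int>& v) {
    SortCost cost;
    for (std::size_t i = 1; i < v.size(); ++i) {
        int temp = v[i];
        std::size_t j = i;
        while (j > 0) {
            ++cost.comparisons;            // compare temp with v[j-1]
            if (v[j - 1] <= temp) break;
            v[j] = v[j - 1];               // shift one element to the right
            ++cost.moves;
            --j;
        }
        if (j != i) {
            v[j] = temp;                   // place the saved element
            ++cost.moves;
        }
    }
    return cost;
}
```

Under this convention, an already sorted list of 4 elements costs 3 comparisons and 0 moves (best case), while the reversed list costs 6 comparisons and 9 moves (worst case).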

Space Cost

The additional storage the algorithm needs while it runs.

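Merge Sort

Merging combines two or more sorted lists into a single new sorted list.

Suppose the object sequence initList contains two sorted lists, V[1]…V[m] and V[m+1]…V[n]. They can be merged into one sorted list stored in V[1]…V[n] of another object sequence, mergedList. This method is called two-way merging.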
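
The two-way merge described above might be sketched as follows. The function name `twoWayMerge` is hypothetical, and the sketch uses 0-based indices, so the two sorted runs are initList[0..m-1] and initList[m..n-1] rather than the 1-based V[1]…V[n] of the text. It assumes m does not exceed the list length.

```cpp
#include <cstddef>
#include <vector>

// Two-way merge: initList[0..m-1] and initList[m..n-1] are each sorted;
// the merged result is written to a separate list, mergedList.
std::vector<int> twoWayMerge(const std::vector<int>& initList, std::size_t m) {
    const std::size_t n = initList.size();
    std::vector<int> mergedList;
    mergedList.reserve(n);
    std::size_t i = 0;  // cursor in the left run
    std::size_t j = m;  // cursor in the right run
    while (i < m && j < n) {
        // "<=" takes from the left run on ties, which keeps the merge stable
        if (initList[i] <= initList[j]) mergedList.push_back(initList[i++]);
        else mergedList.push_back(initList[j++]);
    }
    // At most one of the two runs still has elements left; append them
    while (i < m) mergedList.push_back(initList[i++]);
    while (j < n) mergedList.push_back(initList[j++]);
    return mergedList;
}
```

For example, merging the runs 3, 27, 38, 43 and 9, 10, 82 (m = 4) gives 3, 9, 10, 27, 38, 43, 82.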

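Characteristics and Ideas of Merge Sort

- It uses a divide-and-conquer strategy.
- Sorting a small data list is faster than sorting a large one.
- Building a sorted data list from two already sorted lists takes far fewer steps than building it from two unsorted lists.
- It is a stable sorting algorithm.
- It can be implemented either recursively or iteratively.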

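Iterative Merge Sort

Iterative merge sort sorts by repeated two-way merging. The basic idea: given an initial sequence of n objects, first treat it as n sorted subsequences of length 1 and merge them pairwise; then repeat the pairwise merging on the resulting subsequences until a single sequence remains.

- It uses a lot of auxiliary space.
- It is stable.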
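
The iterative idea can be sketched as a bottom-up merge sort. This is one possible sketch, not necessarily the implementation the original text has in mind. The names `mergeRuns` and `bottomUpMergeSort` are hypothetical. The run length starts at 1 and doubles after every pass, and a second list of n elements serves as the auxiliary space.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Merge the sorted runs src[lo..mid) and src[mid..hi) into dst[lo..hi).
static void mergeRuns(const std::vector<int>& src, std::vector<int>& dst,
                      std::size_t lo, std::size_t mid, std::size_t hi) {
    std::size_t i = lo, j = mid, k = lo;
    // Take from the right run only when it is strictly smaller (stable on ties)
    while (i < mid && j < hi) dst[k++] = (src[j] < src[i]) ? src[j++] : src[i++];
    while (i < mid) dst[k++] = src[i++];
    while (j < hi) dst[k++] = src[j++];
}

void bottomUpMergeSort(std::vector<int>& v) {
    const std::size_t n = v.size();
    std::vector<int> buffer(n);  // auxiliary list of n elements
    // len is the length of the sorted runs at the start of each pass
    for (std::size_t len = 1; len < n; len *= 2) {
        for (std::size_t lo = 0; lo < n; lo += 2 * len) {
            std::size_t mid = std::min(lo + len, n);
            std::size_t hi = std::min(lo + 2 * len, n);
            mergeRuns(v, buffer, lo, mid, hi);  // a lone trailing run is just copied
        }
        v.swap(buffer);  // the merged pass becomes the input of the next pass
    }
}
```

With 7 elements the passes use run lengths 1, 2, and 4, so the outer loop runs three times. The `buffer` vector is where the extra space goes.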

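Recursive Merge Sort

Like quicksort, merge sort can be implemented recursively by dividing the sequence into subsequences. The recursive algorithm first splits the whole sequence to be sorted into two parts of roughly equal length, a left sublist and a right sublist. Each sublist is sorted recursively, and the two sorted sublists are then merged.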

For the linked-list merge sort method, the recursion depth is O(log₂ n) and the number of sort key comparisons is O(n log₂ n).
It is stable.
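
A linked-list merge sort might look like the sketch below. It assumes an ordinary pointer-based singly linked list, which may differ from the list representation the original text has in mind. The `Node` type and both function names are hypothetical. Nodes are relinked rather than copied, and halving the list at each level keeps the recursion depth logarithmic.

```cpp
// Hypothetical singly linked node type.
struct Node {
    int key;
    Node* next;
};

// Merge two sorted lists by relinking nodes; no element is copied.
Node* mergeLists(Node* a, Node* b) {
    Node head{0, nullptr};  // dummy head node simplifies the linking
    Node* tail = &head;
    while (a && b) {
        // "<=" takes from list a on ties, which keeps the sort stable
        if (a->key <= b->key) { tail->next = a; a = a->next; }
        else { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = a ? a : b;  // append the remaining nodes
    return head.next;
}

Node* mergeSortList(Node* head) {
    if (!head || !head->next) return head;  // 0 or 1 node: already sorted
    // Find the middle with slow/fast pointers
    Node* slow = head;
    Node* fast = head->next;
    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
    }
    Node* right = slow->next;
    slow->next = nullptr;  // cut the list into two halves
    Node* left = mergeSortList(head);
    right = mergeSortList(right);
    return mergeLists(left, right);
}
```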

C++ Implementation of Recursive Merge Sort

The following code stores the data list in a vector from the C++ Standard Template Library (STL):

#include <iostream>
#include <vector>

using namespace std;

// Print all elements on one line, separated by spaces
void print(const vector<int>& v)
{
    for (size_t i = 0; i < v.size(); i++) cout << v[i] << " ";
    cout << endl;
}

// Merge two sorted vectors into one sorted vector.
// Index cursors are used instead of erase(begin()), which costs O(n) per call.
vector<int> merge(const vector<int>& left, const vector<int>& right)
{
   vector<int> result;
   result.reserve(left.size() + right.size());
   size_t i = 0, j = 0;
   while (i < left.size() && j < right.size()) {
      // "<=" takes from the left on ties, which keeps the sort stable
      if (left[i] <= right[j])
         result.push_back(left[i++]);
      else
         result.push_back(right[j++]);
   }
   // Append whatever remains in either half
   while (i < left.size()) result.push_back(left[i++]);
   while (j < right.size()) result.push_back(right[j++]);
   return result;
}

vector<int> mergeSort(vector<int> m)
{
   if (m.size() <= 1)
      return m;

   vector<int> left, right, result;
   int middle = ((int)m.size()+ 1) / 2;

   for (int i = 0; i < middle; i++) {
      left.push_back(m[i]);
   }

   for (int i = middle; i < (int)m.size(); i++) {
      right.push_back(m[i]);
   }

   left = mergeSort(left);
   right = mergeSort(right);
   result = merge(left, right);

   return result;
}

int main()
{
   vector<int> v;

   v.push_back(38);
   v.push_back(27);
   v.push_back(43);
   v.push_back(3);
   v.push_back(9);
   v.push_back(82);
   v.push_back(10);

   print(v);
   cout << "------------------" << endl;

   v = mergeSort(v);

   print(v);
}

Output:

38 27 43 3 9 82 10
------------------
3 9 10 27 38 43 82

The following code uses an array as the storage structure for the data list:

#include <iostream>
using namespace std;

void print(int a[], int sz)
{
    for (int i = 0; i < sz; i++) cout << a[i] << " ";
    cout << endl;
}

void merge(int a[], const int low, const int mid, const int high)
{
    int *temp = new int[high-low+1];

    int left = low;
    int right = mid+1;
    int current = 0;
    // Merges the two arrays into temp[] 
    while(left <= mid && right <= high) {
        if(a[left] <= a[right]) {
            temp[current] = a[left];
            left++;
        }
        else { // the right element is smaller than the left one
            temp[current] = a[right];  
            right++;
        }
        current++;
    }

    // Copy whatever remains in the half that was not exhausted

    // Example a = 1, 2, 3 || 4, 5, 6
    // temp has already been filled with 1, 2, 3,
    // so the rest of the right half is used to fill temp.
    if(left > mid) { 
        for(int i=right; i <= high;i++) {
            temp[current] = a[i];
            current++;
        }
    }
    // Example a = 4, 5, 6 || 1, 2, 3
    // temp has already been filled with 1, 2, 3,
    // so the rest of the left half is used to fill temp.
    else {  
        for(int i=left; i <= mid; i++) {
            temp[current] = a[i];
            current++;
        }
    }
    // Copy the merged result back into the original array
    for(int i=0; i<=high-low;i++) {
        a[i+low] = temp[i];
    }
    delete[] temp;
}

void merge_sort(int a[], const int low, const int high)
{
    if(low >= high) return;
    int mid = low + (high-low)/2;  // avoids overflow of low+high on very large arrays
    merge_sort(a, low, mid);  //left half
    merge_sort(a, mid+1, high);  //right half
    merge(a, low, mid, high);  //merge them
}

int main()
{        
    int a[] = {38, 27, 43, 3, 9, 82, 10};
    int arraySize = sizeof(a)/sizeof(a[0]);

    print(a, arraySize);

    merge_sort(a, 0, (arraySize-1) );   

    print(a, arraySize);    
    return 0;
}

2015-9-27 艺少