Shuffle Algorithms (Repost)

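At work I needed to rewrite a shuffle algorithm. I looked through the material available online and found that good summaries already exist, so this post simply collects and reposts them.

There are roughly three shuffle algorithms. In order of invention: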

1. Fisher–Yates Shuffle

The idea is to repeatedly draw a random element from the original array and move it into a new array. The original description of the algorithm is:
1. Write down the numbers from 1 through N.
2. Pick a random number k between one and the number of unstruck numbers remaining (inclusive).
3. Counting from the low end, strike out the kth number not yet struck out, and write it down elsewhere.
4. Repeat from step 2 until all the numbers have been struck out.
5. The sequence of numbers written down in step 3 is now a random permutation of the original numbers.

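# Python implementation
# Fisher–Yates Shuffle
'''
 1. From the k elements not yet processed, pick a random index p in [0, k - 1] (the list is 0-indexed);
 2. Remove the p-th of the remaining k elements and append it to the result;
 3. Repeat steps 1 and 2 until every element has been taken;
 4. The sequence of elements taken in step 2 is the shuffled sequence.
'''
import random

def shuffle(lis):
    remaining = list(lis)   # work on a copy so the caller's list is not emptied
    result = []
    while remaining:
        p = random.randrange(0, len(remaining))   # random index into what is left
        result.append(remaining.pop(p))           # pop(p) is O(n), so the shuffle is O(n^2)
    return result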

r = shuffle([1, 2, 2, 3, 3, 4, 5, 10])
print(r)

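Another approach is to draw a random card and check whether it has already been drawn. If it has, draw again until an undrawn card turns up, and repeat until every card has been drawn.
This matches how people intuitively think about shuffling, and it comes in two forms:

1. After each draw, remove the drawn card so that N-1 cards remain. This avoids repeated draws, but every draw requires deleting the chosen card from the original array (a Hashtable can be used for this).

2. After each draw, mark the card as drawn without deleting it. Compared with form 1, this saves the deletion but needs extra space for the flags, and a draw may land on a card that was already drawn, which forces a redraw.

Neither form has good time or space complexity.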
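
To make the cost of the mark-and-redraw form concrete, here is a minimal Python sketch. The function name `shuffle_by_redraw` and the `draws` counter are hypothetical, introduced only for illustration. It counts every random draw, including the wasted ones that hit an already-marked card.

```python
import random

def shuffle_by_redraw(cards, rng=random):
    """Draw random positions, redrawing whenever the position was already taken."""
    n = len(cards)
    taken = [False] * n   # extra O(n) storage for the "already drawn" flags
    result = []
    draws = 0             # total number of random draws, wasted ones included
    while len(result) < n:
        p = rng.randrange(n)
        draws += 1
        if taken[p]:
            continue      # this card was drawn before: wasted draw, try again
        taken[p] = True
        result.append(cards[p])
    return result, draws

shuffled, draws = shuffle_by_redraw(list(range(52)))
print(shuffled)
print("draws needed:", draws)
```

The number of draws is never below the number of cards and is usually several times larger, because the last few unmarked cards become harder and harder to hit.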

2. Knuth-Durstenfeld Shuffle

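Knuth and Durstenfeld improved on the algorithm of Fisher et al. Each step picks a random element from the unprocessed part of the array and swaps it to the end of that part, so the tail of the array holds the elements that have already been processed. The shuffle is done in place, and the time complexity improves from the O(n^2) of the Fisher–Yates version to O(n). A Python implementation follows: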


#Knuth-Durstenfeld Shuffle
def shuffle(lis):
    for i in range(len(lis) - 1, 0, -1):
        p = random.randrange(0, i + 1)
        lis[i], lis[p] = lis[p], lis[i]
    return lis

r = shuffle([1, 2, 2, 3, 3, 4, 5, 10])
print(r)
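
One way to gain some confidence that this shuffle is unbiased is to count how often each permutation of a small list appears. The sketch below is only an empirical check under stated assumptions, not a proof: `knuth_shuffle` is a hypothetical copy-based variant of the function above, and the seed and trial count are arbitrary choices.

```python
import random
from collections import Counter
from itertools import permutations

def knuth_shuffle(items, rng=random):
    """Knuth-Durstenfeld shuffle on a copy, so the input is left untouched."""
    a = list(items)
    for i in range(len(a) - 1, 0, -1):
        p = rng.randrange(0, i + 1)   # p in [0, i], inclusive
        a[i], a[p] = a[p], a[i]
    return a

rng = random.Random(12345)   # fixed seed so the experiment is repeatable
trials = 60000
counts = Counter(tuple(knuth_shuffle([1, 2, 3], rng)) for _ in range(trials))

# Each of the 3! = 6 permutations should appear roughly trials / 6 times.
for perm in permutations([1, 2, 3]):
    print(perm, counts[perm])
```

If the counts differed widely from one another, that would point to a biased range for `p`.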

3. Inside-Out Algorithm

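The Knuth-Durstenfeld Shuffle is an in-place algorithm: the original data is scrambled directly. Some applications need to keep the original data, so a new array must be allocated to hold the shuffled sequence. The basic idea of the Inside-Out Algorithm is to scan a copy of the original data from front to back with a cursor i, pick a random index j in [0, i], overwrite position i of the copy with the element at position j, and then write the original element at position i into position j of the copy. The effect is the same as swapping the values at positions i and j in the copy. A Python implementation follows: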


def shuffle(lis):
    result = lis[:]   # copy, so the original list is left untouched
    for i in range(1, len(lis)):
        j = random.randrange(0, i + 1)   # j in [0, i]; j == i must be possible
        result[i] = result[j]
        result[j] = lis[i]
    return result

r = shuffle([1, 2, 2, 3, 3, 4, 5, 10])
print(r)
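
The inclusive upper bound on j matters. As a hedged illustration, the sketch below uses a hypothetical helper `inside_out` with an `inclusive` flag (both names are assumptions made for this example). It compares drawing j from [0, i] with drawing j from [0, i - 1]. With the exclusive range, every step swaps position i with an earlier position, so no element can ever end up in its starting position and many permutations become unreachable.

```python
import random

def inside_out(source, inclusive=True, rng=random):
    """Inside-out shuffle into a new list; `inclusive` selects the range of j."""
    src = list(source)
    result = list(src)
    for i in range(1, len(src)):
        upper = i + 1 if inclusive else i   # randrange excludes its upper bound
        j = rng.randrange(0, upper)
        result[i] = result[j]
        result[j] = src[i]
    return result

def has_fixed_point(perm):
    """True if some value k is still at index k (the input below is 0..n-1)."""
    return any(value == index for index, value in enumerate(perm))

rng = random.Random(2024)   # fixed seed so the experiment is repeatable
data = [0, 1, 2, 3]
trials = 2000
fixed_inclusive = sum(has_fixed_point(inside_out(data, True, rng)) for _ in range(trials))
fixed_exclusive = sum(has_fixed_point(inside_out(data, False, rng)) for _ in range(trials))

print("runs with an element left in place, j in [0, i]:    ", fixed_inclusive)
print("runs with an element left in place, j in [0, i - 1]:", fixed_exclusive)
```

The second count is always zero, which shows that the exclusive range does not produce a uniform shuffle.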

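After looking at these three algorithms, I chose the second one. The C++ STL already provides a ready-made function, std::random_shuffle(). Note that std::random_shuffle() was deprecated in C++14 and removed in C++17, where std::shuffle() replaces it.

The STL function random_shuffle() randomly reorders a sequence of elements. Its prototype is:

template<class RandomAccessIterator>
   void random_shuffle(
      RandomAccessIterator _First, // iterator to the first element of the sequence
      RandomAccessIterator _Last   // iterator to one past the last element of the sequence
   );

An example: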

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main()
{
    vector<string> str;
    str.push_back("hello");
    str.push_back("world");
    str.push_back("welcome");
    str.push_back("to");
    str.push_back("Beijing");

    // Pass iterators to the first element and to one past the last element
    std::random_shuffle(str.begin(), str.end());

    for (size_t j = 0; j < str.size(); j++)
    {
        cout << str[j] << " ";
    }
    cout << endl;

    return 0;
}

// This is an incorrect usage
#include <algorithm>
#include <iostream>
using namespace std;

int main()
{
    char arr[] = {'a', 'b', 'c', 'd', 'e', 'f'};

    // Pointers into a plain array also work as random-access iterators
    std::random_shuffle(arr, arr + 6);

    for (int j = 0; j < 6; j++)
    {
        cout << arr[j] << " ";
    }
    cout << endl;

    return 0;
}

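Neither of the two examples above uses the function correctly. random_shuffle() needs a source of randomness, which common implementations take from rand(), and the examples never initialize the random seed, so every run produces the same order.
Here is a test example: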

#include <algorithm>
#include <iostream>
using namespace std;

int randomInt(int* pArr, int max)
{
    for (int nIndex = 0; nIndex < max; nIndex ++)
    {
        pArr[nIndex] = nIndex;
    }
    std::random_shuffle(pArr, pArr + max);
    for (int nIndex = 0; nIndex < max; nIndex ++)
    {
        cout <<  pArr[nIndex] << "   " ;
    }
    cout << endl;
    return 0;
}

int main()
{
    int arr[60];
    for (int nIndex = 0; nIndex < 1; ++nIndex)
    {
        randomInt(arr, 4);
    }
    return 0;
}

The output is:

0   1   3   2

Every run prints this same order. Adding random seed initialization, as in the following version, fixes the problem.

#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
using namespace std;

int randomInt(int* pArr, int max)
{
    for (int nIndex = 0; nIndex < max; nIndex++)
    {
        pArr[nIndex] = nIndex;
    }
    std::random_shuffle(pArr, pArr + max);
    for (int nIndex = 0; nIndex < max; nIndex++)
    {
        cout << pArr[nIndex] << "   ";
    }
    cout << endl;
    return 0;
}

int main()
{
    // Seed once per program; reseeding on every call within the same second
    // would restart the same sequence.
    srand((unsigned)time(NULL));

    int arr[60];
    for (int nIndex = 0; nIndex < 1; ++nIndex)
    {
        randomInt(arr, 4);
    }
    return 0;
}
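
The link between the seed and the resulting order can also be shown in Python, as a rough analogue of the C++ behaviour above rather than a description of how random_shuffle() is implemented. The helper name `shuffled_with_seed` and the seed value 7 are assumptions made for this sketch. It relies on the standard library's `random.Random` class and its `shuffle` method.

```python
import random

def shuffled_with_seed(items, seed):
    """Return a shuffled copy of items, using a private generator seeded with seed."""
    rng = random.Random(seed)   # private generator; the global one is not touched
    a = list(items)
    rng.shuffle(a)              # in-place shuffle of the copy
    return a

first = shuffled_with_seed(range(10), seed=7)
second = shuffled_with_seed(range(10), seed=7)

print(first)
print(second)
print(first == second)   # prints True: the same seed gives the same order
```

A fixed seed is useful for reproducible tests. For a different order on each run, the generator needs a seed that varies, such as the current time in the C++ example.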


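Author: 树林里的小怪兽
Link: https://www.jianshu.com/p/8daea2214fb6
Source: Jianshu (简书)
Copyright belongs to the author. For commercial reposts, contact the author for permission; for non-commercial reposts, credit the source.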