Array.Sort()实现细节和效率

Array.Sort有N个重载,都是调用同一个方法实现排序(Array.Sort<T>先不管)

public static void Sort(Array keys, Array items, int index, int length, IComparer comparer)
{
    // 省略一堆参数检查代码
    if ((length > 1) && (((comparer != Comparer.Default) && (comparer != null)) || !TrySZSort(keys, items, index, (index + length) - 1)))
    {
        object[] objArray = keys as object[];
        object[] objArray2 = null;
        if (objArray != null)
        {
            objArray2 = items as object[];
        }
        if ((objArray != null) && ((items == null) || (objArray2 != null)))
        {
            new SorterObjectArray(objArray, objArray2, comparer).QuickSort(index, (index + length) - 1);
        }
        else
        {
            new SorterGenericArray(keys, items, comparer).QuickSort(index, (index + length) - 1);
        }
    }
}

其中跟排序有关的代码有三处SorterObjectArray和SorterGenericArray都是Array的内部类,还有个是TrySZSort方法。

TrySZSort方法从签名来看应该是调用外部非托管的代码。

[MethodImpl(MethodImplOptions.InternalCall), ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
private static extern bool TrySZSort(Array keys, Array items, int left, int right);

也就是说,当没有自定义的Comparer对象时,Array优先尝试调用非托管代码对数组进行排序。

接下来看看SorterGenericArray和SorterObjectArray的唯一区别:
SorterObjectArray中直接使用Array.GetValue()/Array.SetValue()对数组进行读写;
SorterGenericArray中则通过Array[]索引器对数组进行读写。

Array[]索引器是怎样的呢,先看看Array里索引器的实现代码:

object IList.this[int index]
{
    get
    {
        return this.GetValue(index);
    }
    set
    {
        this.SetValue(value, index);
    }
}

Array索引器都是通过Array.GetValue()/Array.SetValue()对数组进行读写的,为什么SorterGenericArray还多此一举呢?

应当注意的是,这里的索引器是显式IList接口的索引器的实现方法,并不是Array[]索引器的实现方法,对于泛型,情况则不一样:

IList<T>泛型接口的定义如下:

public interface IList<T> : ICollection<T>, IEnumerable<T>, IEnumerable
{
    // ...
    T this[int index] { get; set; }
}

IList<T>并不实现IList接口,而是自定义了一个强类型的索引器。

再来看看List<T>泛型类的定义,List实现了List和IList<T>两种接口:
public class List<T> : IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection, IEnumerable

List<T>索引器的实现代码:

object IList.this[int index]
{
    get
    {
        return this[index];
    }
    set
    {
        List<T>.VerifyValueType(value);
        this[index] = (T) value;
    }
}
public T this[int index]
{
    get
    {
        if (index >= this._size)
        {
            ThrowHelper.ThrowArgumentOutOfRangeException();
        }
        return this._items[index];
    }
    set
    {
        if (index >= this._size)
        {
            ThrowHelper.ThrowArgumentOutOfRangeException();
        }
        this._items[index] = value;
        this._version++;
    }
}

注意List<T>的代码在实现IList的索引器时是用显示接口声明的,但IList<T>则没有显式写明接口,于是:

List<int> list = new List<int>(10);
list[0]; //这里调用的是IList<int>.this[int]索引器,返回的是int类型
((IList)list)[0]; //这里调用的是IList.this[int]索引器,返回的是object类型

值得注意的是Array.Sort()接受的是一个Array对象

List<int> myList = new List<int>();
Array.Sort(myList); // 编译错误:参数类型不匹配
Array.Sort<int>(myList); // 编译错误:注意Array.Sort<T>接受参数的是T[]数组

其实对于泛型,使用myList.Sort()就可以了(List<T>.Sort<T>实际还是调用了Array.Sort<T>方法,把内部管理的强类型数组传了过去)

现在回来看Array.Sort方法,既然Array只接受Array对象,为什么还要弄一个SorterGenericArray出来呢?
我们可以注意到只有当keys或者items(item为null时除外)不能转换成一个数组时(items),才会使用SorterGenericArray.QuickSort进行排序

有时候会给泛型类型增加一个显示/隐式类型转换,把一个泛型类型转化成一个Array数组(并不是调用ToArray哟),在这种情况下,还是可以调用Array.Sort方法的,如果这时候把这种数组交给SorterObjectArray排序,就可能有两个问题:一是泛型的强类型无法保证,二是IComparer.CompareTo调用时也会有类型问题,所以这里引入了SorterGenericArray类专门处理这类情况,毫无疑问,多了一重类型检查,排序效率相对会低点,可以在下面测试中看出来。于是有了下面的对比测试:

A.把SorterGenericArray代码提取出来,去除所有try等异常检查,作为MySorterWithComparer类主要代码进行排序
B.将A的所有GetValue/SetValue直接使用强类型索引器,并使用<和>操作符代替所有Comparer,作为MySorterWithIntArray类主要代码进行排序
C.Array.Sort()
D.Array.Sort<int>()
E.List<int>.Sort<int>()排序

对长度分别为100/500/1000/10000/10000的int型数组进行排序,发现以下情况:
1. 无论任何情况,使用Comparer和Array.GetValue()/Array.SetValue()效率都很低
2. Array.Sort()/Array.Sort<int>()/List<int>.Sort<int>()性能十分接近
3. 数组长度较小的情况下直接调用Array.Sort的效率比MySorterWithIntArray代码高很多
4. 数组长度超过1000后,MySorterWithIntArray排序比直接调用Array.Sort()效率高一点点

上面现象说明:
1. SorterGenericArray.QuickSort的实现是比较低效的
2. 或许是因为Array是定义在mscorlib中的,MySorterWithIntArray的代码则是在自己生成的程序集中,排序时使用了mscorlib程序集的类和方法。在数组长度较小的情况下,可能由于跨程序集调用的开销比实际排序的开销要大,所以没有直接调用Array.Sort高效。另外一个可能的原因是Array.Sort调用了非托管的代码。
3. 数组长度达到一定规模后,跨程序集调用的开销跟实际排序的开销比已经微乎其微,这时频繁的异常检查(Array.Sort内部有个try...catch)的开销逐渐显现。

其实就算数组比较大(>10w),MySorterWithIntArray花费的时间也只是比Array.Sort快了不到10%,总体上来说,使用Array.Sort还是一种高效方便的排序方法。

回想现在公司的笔试题,还考啥老掉牙的冒泡排序,事实就是.Net自带的快速排序函数已经非常高效,.Net程序员考排序这种事还是少来吧。

下面是测试记录(Release+代码优化版本,数据很接近,这里不重复了):
Array Size: 100
Measure Name: MySorterWithComparer.QuickSort
Time Elapsed: 3ms
CPU Cycles: 4,662,548
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: MySorterWithIntArray.QuickSort
Time Elapsed: 1ms
CPU Cycles: 1,856,063
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: Array.Sort
Time Elapsed: 0ms
CPU Cycles: 29,953
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: Array.Sort<int>
Time Elapsed: 0ms
CPU Cycles: 29,084
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: List.Sort
Time Elapsed: 0ms
CPU Cycles: 24,893
Gen 0:   0
Gen 1:   0
Gen 2:   0

Array Size: 500
Measure Name: MySorterWithComparer.QuickSort
Time Elapsed: 3ms
CPU Cycles: 6,358,979
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: MySorterWithIntArray.QuickSort
Time Elapsed: 0ms
CPU Cycles: 103,741
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: Array.Sort
Time Elapsed: 0ms
CPU Cycles: 119,614
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: Array.Sort<int>
Time Elapsed: 0ms
CPU Cycles: 122,639
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: List.Sort
Time Elapsed: 0ms
CPU Cycles: 118,514
Gen 0:   0
Gen 1:   0
Gen 2:   0

Array Size: 1000
Measure Name: MySorterWithComparer.QuickSort
Time Elapsed: 8ms
CPU Cycles: 14,291,893
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: MySorterWithIntArray.QuickSort
Time Elapsed: 0ms
CPU Cycles: 201,663
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: Array.Sort
Time Elapsed: 0ms
CPU Cycles: 251,647
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: Array.Sort<int>
Time Elapsed: 0ms
CPU Cycles: 248,611
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: List.Sort
Time Elapsed: 0ms
CPU Cycles: 236,148
Gen 0:   0
Gen 1:   0
Gen 2:   0

Array Size: 10000
Measure Name: MySorterWithComparer.QuickSort
Time Elapsed: 108ms
CPU Cycles: 180,057,306
Gen 0:   3
Gen 1:   0
Gen 2:   0

Measure Name: MySorterWithIntArray.QuickSort
Time Elapsed: 1ms
CPU Cycles: 2,456,267
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: Array.Sort
Time Elapsed: 1ms
CPU Cycles: 2,663,441
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: Array.Sort<int>
Time Elapsed: 1ms
CPU Cycles: 2,671,900
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: List.Sort
Time Elapsed: 1ms
CPU Cycles: 2,683,780
Gen 0:   0
Gen 1:   0
Gen 2:   0

Array Size: 100000
Measure Name: MySorterWithComparer.QuickSort
Time Elapsed: 1,224ms
CPU Cycles: 2,174,465,964
Gen 0:   39
Gen 1:   0
Gen 2:   0

Measure Name: MySorterWithIntArray.QuickSort
Time Elapsed: 15ms
CPU Cycles: 28,609,746
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: Array.Sort
Time Elapsed: 17ms
CPU Cycles: 30,748,663
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: Array.Sort<int>
Time Elapsed: 17ms
CPU Cycles: 30,883,336
Gen 0:   0
Gen 1:   0
Gen 2:   0

Measure Name: List.Sort
Time Elapsed: 17ms
CPU Cycles: 30,843,142
Gen 0:   0
Gen 1:   0
Gen 2:   0

测试代码:

namespace Just4Test.SortTest
{
    class SortTestCode
    {
        static void Main(string[] args)
        {
            TestSort(100);
            TestSort(500);
            TestSort(1000);
            TestSort(10000);
            TestSort(100000);
        }
        static int[] GenerateArray(int count)
        {
            Random rnd = new Random();
            int[] array = new int[count];
            for (int i = 0; i < count; i++)
            {
                array[i] = rnd.Next(int.MinValue, int.MaxValue);
            }
            return array;
        }
        static void TestSort(int count)
        {
            Console.WriteLine("Array Size: " + count.ToString());
            int[] src = GenerateArray(count);
            MySorterWithComparer arr1 = new MySorterWithComparer((int[])(src.Clone()));
            MySorterWithIntArray arr2 = new MySorterWithIntArray((int[])(src.Clone()));
            int[] arr3 = (int[])(src.Clone());
            int[] arr4 = (int[])(src.Clone());
            List<int> list = new List<int>((int[])(src.Clone()));
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            CodeTimer.BeginMeasure("MySorterWithComparer.QuickSort");
            arr1.QuickSort(0, count - 1);
            CodeTimer.EndMeasure();
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            CodeTimer.BeginMeasure("MySorterWithIntArray.QuickSort");
            arr2.QuickSort(0, count - 1);
            CodeTimer.EndMeasure();
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            CodeTimer.BeginMeasure("Array.Sort");
            Array.Sort(arr3);
            CodeTimer.EndMeasure();
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            CodeTimer.BeginMeasure("Array.Sort<int>");
            Array.Sort<int>(arr4);
            CodeTimer.EndMeasure();
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            CodeTimer.BeginMeasure("List.Sort");
            list.Sort();
            CodeTimer.EndMeasure();
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        }
    }
    class MySorterWithComparer
    {
        private Array array;
        private IComparer comparer;
        internal MySorterWithComparer(Array array)
        {
            this.array = array;
            this.comparer = Comparer.Default;
        }
        internal void SwapIfGreaterWithItems(int a, int b)
        {
            if (a != b)
            {
                if (this.comparer.Compare(this.array.GetValue(a), this.array.GetValue(b)) > 0)
                {
                    object obj2 = this.array.GetValue(a);
                    this.array.SetValue(this.array.GetValue(b), a);
                    this.array.SetValue(obj2, b);
                }
            }
        }
        internal void QuickSort(int left, int right)
        {
            do
            {
                int low = left;
                int hi = right;
                int median = (low + ((hi - low) >> 1));
                this.SwapIfGreaterWithItems(low, median);
                this.SwapIfGreaterWithItems(low, hi);
                this.SwapIfGreaterWithItems(median, hi);
                object y = this.array.GetValue(median);
                do
                {
                    while (this.comparer.Compare(this.array.GetValue(low), y) < 0)
                    {
                        low++;
                    }
                    while (this.comparer.Compare(y, this.array.GetValue(hi)) < 0)
                    {
                        hi--;
                    }
                    if (low > hi)
                    {
                        break;
                    }
                    if (low < hi)
                    {
                        object obj3 = this.array.GetValue(low);
                        this.array.SetValue(this.array.GetValue(hi), low);
                        this.array.SetValue(obj3, hi);
                    }
                    if (low != 0x7fffffff)
                    {
                        low++;
                    }
                    if (hi != -2147483648)
                    {
                        hi--;
                    }
                }
                while (low <= hi);
                if ((hi - left) <= (right - low))
                {
                    if (left < hi)
                    {
                        this.QuickSort(left, hi);
                    }
                    left = low;
                }
                else
                {
                    if (low < right)
                    {
                        this.QuickSort(low, right);
                    }
                    right = hi;
                }
            }
            while (left < right);
        }
    }
class MySorterWithIntArray
    {
        private int[] array;
        internal MySorterWithIntArray(int[] array)
        {
            this.array = array;
        }
        internal void SwapIfGreaterWithItems(int a, int b)
        {
            if (a != b && array[a] > array[b])
            {
                int c = array[a];
                array[a] = array[b];
                array[b] = c;
            }
        }
        internal void QuickSort(int left, int right)
        {
            do
            {
                int low = left;
                int high = right;
                int median = (low + ((high - low) >> 1));
                SwapIfGreaterWithItems(low, median);
                SwapIfGreaterWithItems(low, high);
                SwapIfGreaterWithItems(median, high);
                int y = array[median];
                do
                {
                    while (array[low] < y) low++;
                    while (y < array[high]) high--;
                    if (low > high) break;
                    if (low < high)
                    {
                        int tmp = array[low];
                        array[low] = array[low];
                        array[high] = tmp;
                    }
                    low++;
                    high--;
                }
                while (low <= high);
                if ((high - left) <= (right - low))
                {
                    if (left < high) QuickSort(left, high);
                    left = low;
                }
                else
                {
                    if (low < right) QuickSort(low, right);
                    right = high;
                }
            }
            while (left < right);
        }
    }
}
原文地址:https://www.cnblogs.com/neutra/p/1770246.html