Java 基础：数组

一、数组声明:

int[] x;
int x[];

在Java中一般使用前者，机把int[]看做一个类型，C++中只能后者

二、数组初始化:

直接提供值：

int[]   x = {1, 3, 4};
int[][] y = {
                   {1, 2, 3},
                   {4, 5}
            };

各个维度的长度信息直接根据提供的值得出。这种使用大括号包裹的值序列作为数组看待，仅仅在数组初始化时才成立，其他情况均认为语法错误。

赋值初始化：

1.默认数组对象：

int[] x = new int[5];
Integer[] y = new Integer[5];

基本类型数组元素为0/false，引用类型的数组元素初值全为null

2.匿名数组对象

int x[] = new int[] { 1, 2 };

匿名数组可以指定数组内的值，不仅可以用在数组类型变量的初始化（其实就是一个赋值）还可以作为函数参数：

    public static void main(final String[] args) {
        print(new String[] { "first line", "second line" });
    }

    public static void print(final String[] lines) {
        for (String line : lines) {
            System.out.println(line);
        }
    }

匿名数组再指定其内值后不能再指定其维度上的长度，也就是说默认数组对象和匿名数组对象只能二选一。

三、数组类型

数组是协变类型(covariant)，参见stackoverflow上的讨论：http://stackoverflow.com/questions/18666710/why-are-arrays-covariant-but-generics-are-invariant

简单来说就是如果Male是Man的子类那么，Male[]也是一个Man[]。由讨论可知早期的Java是没有泛型的，所以类似的任务需要把不同类型一些通用处理函数的形参定为Object[]来处理。

现在依旧可以直接使用如下代码：

        ArrayList x = new ArrayList();
        x.add(new Integer(100));
        System.out.println(x.get(0));

只不过会给出警告:

"ArrayList is a raw type. References to generic type ArrayList<E> should be parameterized" raw type就是Object类型。

当加入类型参数后就不会有警告了，即有了泛型后，可以保证容器在编译阶段的类型安全，但实际上编译后使用的其实还是raw type那份代码，不像C++模板真的会有一份独立的代码生成，这个过程称为类型擦除。

如果把数组和泛型结合起来会失败：

        List<String>[] x = new List<String>[] { new ArrayList<String>() };
        List<String>[] y = new List<String>[2];

以上两行均有错误提示不能创建对应的泛型list容器的数组。按照Thinking in Java上的说法是：类型擦除会移除数组的类型信息，而数组必须知道他们所持有的确切类型，以强制保证类型安全。不过我们知道，List<String>是List的子类型（这么说是否合适？），而数组又是协变的，那么可以这么做实现上述操作：

        List<String>[] z = new List[10];
        z[0] = new ArrayList<String>();
        z[0].add("haha");
        System.out.println(z[0].get(0));

只有第一行会出现警告，后面的操作其实都包含了泛型检查（信息来自List<String>数组定义）。

四、数组拷贝

Arrays实用类中给出了一些用于数组拷贝的方法：

copyOf(type[] a, int length)

copyOfRange(type[] a, int from, int to)

Java Core 中文版第八版第一卷中的P81中列出的Arrays API有错，将上述后者也写成了copyOf

其中type对应于不同的基本类型和Object类型，以int类型为例：

    public static int[] copyOf(int[] original, int newLength) {
        int[] copy = new int[newLength];
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }

如果newLength比原有的数组来的大那么在多出来的元素位置上其值被初始化为对应基本类型元素的零值（数值-0，布尔-false，char-null字符，引用-null，其实就是与数组初始化时的情况一样，因为方法内创建一个了长度较大的数组这些元素就是处于末尾的，未被赋值的）

另外还有各自的泛型版本：

    public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
        T[] copy = ((Object)newType == (Object)Object[].class)
            ? (T[]) new Object[newLength]
            : (T[]) Array.newInstance(newType.getComponentType(), newLength);
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }

由此我们可以看到通过Array类创建数组的方法：

        Integer[] ix = (Integer[]) Array.newInstance(Integer.class, 10);
        ix[0] = 123;
        System.out.println(Arrays.deepToString(ix));

实际上这些数组拷贝的代码最后都调用到了一个System.arraycopy，给出其API说明如下：

Copies an array from the specified source array, beginning at the specified position, to the specified position of the destination array. A subsequence of array components are copied from the source array referenced by src to the destination array referenced by dest. The number of components copied is equal to the length argument. The components at positions srcPos through srcPos+length-1 in the source array are copied into positions destPos through destPos+length-1, respectively, of the destination array.

If the src and dest arguments refer to the same array object, then the copying is performed as if the components at positions srcPos through srcPos+length-1 were first copied to a temporary array with length components and then the contents of the temporary array were copied into positions destPos through destPos+length-1 of the destination array.

If dest is null, then a NullPointerException is thrown.

If src is null, then a NullPointerException is thrown and the destination array is not modified.

Otherwise, if any of the following is true, an ArrayStoreException is thrown and the destination is not modified:

The src argument refers to an object that is not an array.
The dest argument refers to an object that is not an array.
The src argument and dest argument refer to arrays whose component types are different primitive types.
The src argument refers to an array with a primitive component type and the dest argument refers to an array with a reference component type.
The src argument refers to an array with a reference component type and the dest argument refers to an array with a primitive component type.
Otherwise, if any of the following is true, an IndexOutOfBoundsException is thrown and the destination is not modified:

The srcPos argument is negative.
The destPos argument is negative.
The length argument is negative.
srcPos+length is greater than src.length, the length of the source array.
destPos+length is greater than dest.length, the length of the destination array.
Otherwise, if any actual component of the source array from position srcPos through srcPos+length-1 cannot be converted to the component type of the destination array by assignment conversion, an ArrayStoreException is thrown. In this case, let k be the smallest nonnegative integer less than length such that src[srcPos+k] cannot be converted to the component type of the destination array; when the exception is thrown, source array components from positions srcPos through srcPos+k-1 will already have been copied to destination array positions destPos through destPos+k-1 and no other positions of the destination array will have been modified. (Because of the restrictions already itemized, this paragraph effectively applies only to the situation where both arrays have component types that are reference types.)

特别的对于原数组和目的数组都是同一个数组的情况，arraycopy会把范围内的拷贝到一个临时数组中在把值赋值到目标数组上，所以不会造成覆盖问题。

五、数组与列表转换

数组和列表(List)是非常常用的数据结构，有时候需要在他们之间进行转换

Array->List

使用Arrays实用类中的asList方法：

    public static <T> List<T> asList(T... a) {
        return new ArrayList<>(a);
    }

注意这里的ArrayList是在Arrays中的并不是一般我们用到的那个ArrayList，它继承了AbstractList但只是把参数中的a用作内部的数据存储，并没有提供像真正的ArrayList那样的添加操作的实现，只是提供了一个数组的List视图，提供有限的操作

        Integer[] x = new Integer[] { 1, 2, 3, 4, 5 };
        List<Integer> ls = Arrays.asList(x);
        
        System.out.println(ls);
        
        ls.add(100); // this line will throw UnsupportedOperationException

即那些能够改变数组a大小的AbstractList中的抽象方法都没有复写（父类直接抛异常），上述代码可以通过编译，但是在运行到ls.add(100)这一行时会抛出异常。整个类实现如下：

    private static class ArrayList<E> extends AbstractList<E>
        implements RandomAccess, java.io.Serializable
    {
        private static final long serialVersionUID = -2764017481108945198L;
        private final E[] a;

        ArrayList(E[] array) {
            if (array==null)
                throw new NullPointerException();
            a = array;
        }

        public int size() {
            return a.length;
        }

        public Object[] toArray() {
            return a.clone();
        }

        public <T> T[] toArray(T[] a) {
            int size = size();
            if (a.length < size)
                return Arrays.copyOf(this.a, size,
                                     (Class<? extends T[]>) a.getClass());
            System.arraycopy(this.a, 0, a, 0, size);
            if (a.length > size)
                a[size] = null;
            return a;
        }

        public E get(int index) {
            return a[index];
        }

        public E set(int index, E element) {
            E oldValue = a[index];
            a[index] = element;
            return oldValue;
        }

        public int indexOf(Object o) {
            if (o==null) {
                for (int i=0; i<a.length; i++)
                    if (a[i]==null)
                        return i;
            } else {
                for (int i=0; i<a.length; i++)
                    if (o.equals(a[i]))
                        return i;
            }
            return -1;
        }

        public boolean contains(Object o) {
            return indexOf(o) != -1;
        }
    }

由于set操作不改变原有数组大小，就是一般的赋值操作，所以是支持的。asList中使用基本类型数组时会与预期不符，如下：

        int[] x = new int[] { 1, 2, 3, 4, 5 };
        Arrays.asList(x);

虽然以上代码能够通过编译，但是其得到的List是List<int[]>类型，即int[].class整体成为一个泛型类型参数，基本类型没有对应的Class<?>对象，使用的时候要注意。

正如asList方法的注释所述，这个方法建立了基于数组的API和基于Collection集合框架的API之间的联系：

Returns a fixed-size list backed by the specified array. (Changes to the returned list "write through" to the array.) This method acts as bridge between array-based and collection-based APIs, in combination with Collection.toArray. The returned list is serializable and implements RandomAccess. 

This method also provides a convenient way to create a fixed-size list initialized to contain several elements: 

     List<String> stooges = Arrays.asList("Larry", "Moe", "Curly");

List->Array

List接口中提供了两个转换方法：

Object[] toArray();
<T> T[] toArray(T[] a);

前者直接返回一个Object数组，需要用户自己进行类型转换工作，后者为泛型方式直接通过提供的数组类型觉得返回数组类型。如果提供的数组长度大于实际需要的长度length，那边返回数组的index=length处被复制为null，如果不够则会返回一个由方法创建的同类型数组。

此处的转换方式依据不同List实现而不同，总体而言就是生成一个数组遍历列表中的值赋值为数组对应索引上的值。对于ArrayList这样的实现，因为其内部存储就是一个数组，就可以在创建返回数组后直接调用Arrays.copyOf或者System.arraycopy:

    @SuppressWarnings("unchecked")
    public <T> T[] toArray(T[] a) {
        if (a.length < size)
            // Make a new array of a's runtime type, but my contents:
            return (T[]) Arrays.copyOf(elementData, size, a.getClass());
        System.arraycopy(elementData, 0, a, 0, size);
        if (a.length > size)
            a[size] = null;
        return a;
    }

再给出LinkedList的对应实现：

    @SuppressWarnings("unchecked")
    public <T> T[] toArray(T[] a) {
        if (a.length < size)
            a = (T[])java.lang.reflect.Array.newInstance(
                                a.getClass().getComponentType(), size);
        int i = 0;
        Object[] result = a;
        for (Node<E> x = first; x != null; x = x.next)
            result[i++] = x.item;

        if (a.length > size)
            a[size] = null;

        return a;
    }

根据toArray的特性我们可以写如下代码，提供一个长度为0的匿名数组，始终让其帮我们创建存储数组：

        List<Integer> li = new ArrayList<Integer>();
        li.add(100);
        li.add(200);

        Integer[] ai = li.toArray(new Integer[] {});

        System.out.println(Arrays.deepToString(ai));
        // [100, 200]

六、数组比较

数组对象本身没有overwrite继承自Object的原始equals方法，但是Arrays.equals(type[] a, type[] b)提供一个比较两个数组是否相等：

首先数组长度相等，如果都为null也作为相等

然后比较各个位置上的元素对应相等，对于Object类型的元素执行equals相等（a[i].equals(b[i])）,如果都为null则作为相等

七、数组toString

与equals类似，数组对象本身并没有覆盖来自Object的toString方法，而是在Arrays实用类中添加了

toString(type[] a)

deepToString(Object[] a)

后者会递归其内的所有数组，并带有循环引用的检测：

        Object[] ref = new Object[] { new String[] { "a", "b" }, null };
        ref[1] = ref;
        System.out.println(Arrays.deepToString(ref));
        // [[a, b], [...]]

其中带有循环引用的部分会以'[...]'代替

八、排序与搜索

Arrays类提供排序与二分搜索函数，在指定的范围进行搜索之前先要将其排序好

Arrays.sort(type[] a)

根据API文档注释可知该排序是稳定的，叫TimSort，借鉴自Python源码，sort还有其他几个版本可以指定排序区间与comparator：

    public static <T> void sort(T[] a, int fromIndex, int toIndex,
                                Comparator<? super T> c) {
        if (LegacyMergeSort.userRequested)
            legacyMergeSort(a, fromIndex, toIndex, c);
        else
            TimSort.sort(a, fromIndex, toIndex, c);
    }

注意这里有一个userRequested开关变量，即用户可以通过java运行参数调整使用的排序方法如这篇blog（http://www.lifebackup.cn/timsort-java7.html）所述：

-Djava.util.Arrays.useLegacyMergeSort=true

这篇blog中提到了如果实现的Comparable/Comparator接口不正确的话会导致TimSort运行异常，这个和C++中比较函数要求是Strict weak ordering一样

给出一个使用实例：

        Integer[] xi = new Integer[] { 6, 5, 4, 1, 2, 3 };
        Arrays.sort(xi, 0, 2);
        System.out.println(Arrays.toString(xi));
        // [5, 6, 4, 3, 2, 1]

        Arrays.sort(xi, new Comparator<Integer>() {
            @Override
            public int compare(final Integer o1, final Integer o2) {
                return o2 - o1;
            }
        });
        System.out.println(Arrays.toString(xi));
        // [6, 5, 4, 3, 2, 1]

上述例子中的倒序排序也可以使用Collections类中的方法生成：

Arrays.sort(xi, Collections.reverseOrder());

sort是按照元素比较进行排序所以数组中的元素必须都能相互比较，除了基本类型外都要实现Comparable接口除非提供了Comparator

Arrays搜索在已经排序的范围内进行，使用二分搜索，泛型带范围的一份搜索代码如下：

    public static <T> int binarySearch(T[] a, int fromIndex, int toIndex,
                                       T key, Comparator<? super T> c) {
        rangeCheck(a.length, fromIndex, toIndex);
        return binarySearch0(a, fromIndex, toIndex, key, c);
    }

    // Like public version, but without range checks.
    private static <T> int binarySearch0(T[] a, int fromIndex, int toIndex,
                                         T key, Comparator<? super T> c) {
        if (c == null) {
            return binarySearch0(a, fromIndex, toIndex, key);
        }
        int low = fromIndex;
        int high = toIndex - 1;

        while (low <= high) {
            int mid = (low + high) >>> 1;
            T midVal = a[mid];
            int cmp = c.compare(midVal, key);
            if (cmp < 0)
                low = mid + 1;
            else if (cmp > 0)
                high = mid - 1;
            else
                return mid; // key found
        }
        return -(low + 1);  // key not found.
    }

如果值未在指定范围内找到会返回其应该插入的索引值index-1，这里为什么要-1，因为如果一个数组是[1,2,3,4]，此时搜索-100这个值那么它不在数组内，搜索方法应该返回它所在的位置减一，即-1，如果不减一就是0（是一个合法的索引值），无法和找到元素时的结果区分。