Learning the JCF (Java Collections Framework) with a Bit of Math

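Do you still remember the sets, mappings, and functions from high school? Math really is powerful: use it to look at the Java collection classes and the ideas fall into place with little effort. This article is aimed at beginners who are new to the Java collection classes, is also useful for filling gaps in what you already know, and can serve as reference material for interview preparation.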

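Math Background

Let's review what we learned before. Drawing on my years of teaching high school math, I believe you will find some new insight into a few of these ideas.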

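Set: in general, we call the objects under study elements, and the whole formed by a group of elements a set.

A given set has the following properties:

Definiteness: for any object, it is certain whether or not it belongs to the set.

Distinctness: the elements of a set are all different from one another.

Unorderedness: the elements of a set have no particular order.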
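The three properties can be observed directly on a Java set. The following is a minimal sketch, assuming java.util.HashSet as the set implementation; the class name SetPropertiesSketch and the helper distinct are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetPropertiesSketch {
    // Hypothetical helper: builds a set from values that may contain duplicates.
    static Set<Integer> distinct(Integer... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        Set<Integer> a = distinct(1, 2, 2, 3, 3, 3);
        Set<Integer> b = distinct(3, 1, 2);

        // Distinctness: the duplicates collapse into one element each.
        System.out.println(a.size());      // 3
        // Unorderedness: the order in which elements were listed does not matter.
        System.out.println(a.equals(b));   // true
        // Definiteness: membership is always a clear yes or no.
        System.out.println(a.contains(2)); // true
        System.out.println(a.contains(7)); // false
    }
}
```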

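Mapping: in general, we have:

Let A and B be two non-empty sets. If, under some definite correspondence f, every element x in A has exactly one element y in B corresponding to it, then the correspondence f: A -> B is called a mapping from set A to set B.

Put simply, a mapping is the kind of correspondence we talk about with functions. For example:

For the function f(x) = x^2, every x has exactly one y corresponding to it, which makes it a model of a mapping.

The equation x^2 + y^2 = 1, on the other hand, is clearly the circle of radius 1 centered at (0, 0). For any x with -1 < x < 1 there are two values of y, so as a correspondence from x to y it is not a mapping and therefore not a function. Only at the points (1, 0) and (-1, 0) does x have exactly one y.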
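To make the contrast concrete, here is a small sketch in Java. It assumes java.util.HashMap to model f(x) = x^2 on a few sample integers, and a hypothetical helper circleYs that returns every y satisfying x^2 + y^2 = 1 for a given x. All names are illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MappingSketch {
    // f(x) = x^2 on sample points: each key x is associated with exactly one value y.
    static Map<Integer, Integer> squares(int from, int to) {
        Map<Integer, Integer> f = new HashMap<>();
        for (int x = from; x <= to; x++) {
            f.put(x, x * x);
        }
        return f;
    }

    // All y with x^2 + y^2 = 1 for the given x: zero, one, or two results.
    static List<Double> circleYs(double x) {
        List<Double> ys = new ArrayList<>();
        double d = 1 - x * x;
        if (d < 0) {
            return ys;             // x lies outside [-1, 1]: no y at all
        }
        double root = Math.sqrt(d);
        ys.add(root);
        if (root != 0) {
            ys.add(-root);         // a second, distinct y: not a mapping
        }
        return ys;
    }

    public static void main(String[] args) {
        System.out.println(squares(-2, 2).get(-2)); // 4
        System.out.println(circleYs(0));            // [1.0, -1.0]
        System.out.println(circleYs(1));            // [0.0]
    }
}
```

A Map enforces the "exactly one y" rule by construction: putting a second value under an existing key replaces the first one.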

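Learning the Collection Classes

Why collection classes exist: in general, when writing a program we do not know in advance how many objects we will need, or whether we will need a more sophisticated way of storing them. A fixed-length array clearly cannot handle this, so the Java utility library provides a fairly complete set of container classes to solve the problem.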

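Basic Concepts

The purpose of the Java container library is to "hold objects", and it divides into two distinct concepts:

1) Collection: a sequence of individual elements. It mainly comprises List, Set, and Queue.

List: holds elements in the order in which they were inserted;

Set: cannot contain duplicate elements;

Queue: produces objects in the order determined by a queuing discipline (usually the same order in which they were inserted);

2) Map: a group of key-value pair objects, which lets us look up a value by its key.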
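The four kinds of container described above can be compared side by side. This is only a sketch: the concrete classes (ArrayList, LinkedHashSet, ArrayDeque, HashMap) are one possible choice among several, LinkedHashSet is picked so the printed order is predictable, and the class and method names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class ContainerKindsSketch {
    // Hypothetical sample data with one duplicate.
    static List<String> sample() {
        List<String> list = new ArrayList<>();
        list.add("b");
        list.add("a");
        list.add("b");
        return list;
    }

    public static void main(String[] args) {
        // List: keeps insertion order and duplicates.
        List<String> list = sample();
        System.out.println(list);         // [b, a, b]

        // Set: the duplicate "b" is dropped.
        Set<String> set = new LinkedHashSet<>(list);
        System.out.println(set);          // [b, a]

        // Queue: elements come out in the order they went in.
        Queue<String> queue = new ArrayDeque<>(list);
        System.out.println(queue.poll()); // b

        // Map: look up a value by its key.
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        System.out.println(map.get("b")); // 2
    }
}
```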

For the commonly used classes, we only list the inheritance relationships among List, Set, and Map:

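List

The List interface adds a large number of methods on top of Collection, making it possible to insert and remove elements in the middle of a List.

Three commonly used classes implement List: ArrayList, LinkedList, and Vector.

Features of a List:

1 An ordered Collection
2 Allows duplicate elements and allows null elements.
3 Can hold data like: {1,2,4,{5,2},1,3};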
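These three features can be checked with a few lines of Java. The sketch below assumes a List<Object> so that the nested element {5,2} can be stored next to plain integers; the class name and the sample helper are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListTraitsSketch {
    // Builds the data {1,2,4,{5,2},1,3}, with the inner {5,2} as a nested list.
    static List<Object> sample() {
        List<Object> list = new ArrayList<>();
        list.add(1);
        list.add(2);
        list.add(4);
        list.add(Arrays.asList(5, 2));
        list.add(1);
        list.add(3);
        return list;
    }

    public static void main(String[] args) {
        List<Object> list = sample();
        // Ordered: elements print in insertion order.
        System.out.println(list);                // [1, 2, 4, [5, 2], 1, 3]
        // Duplicates allowed: the value 1 appears at two different positions.
        System.out.println(list.indexOf(1));     // 0
        System.out.println(list.lastIndexOf(1)); // 4
        // null allowed: it counts as an element like any other.
        list.add(null);
        System.out.println(list.size());         // 7
    }
}
```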

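ArrayList (similar to a sequential list)

It is mainly used for lookups; insertion and deletion at arbitrary positions are expensive because the later elements must be shifted. ArrayList is a list backed by an array and is not synchronized.

Advantages: an index gives fast positional access.

It suits data that changes little and is mostly queried.

Compared with a Java array, its capacity can be adjusted dynamically.

Disadvantages: it is poorly suited to insertion and deletion at a given position.

--When the container is full, ArrayList automatically grows its capacity by 50%.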
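The 50% growth rule is easy to simulate. The sketch below is not the JDK source: it only reproduces the arithmetic new = old + old / 2, and it assumes a starting capacity of 10 (the default in the OpenJDK versions I am aware of). The real class also handles a requested minimum capacity and overflow, which are left out here. GrowthSketch, grow, and capacities are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthSketch {
    // 50% growth: the new capacity is the old one plus half of it (integer shift).
    static int grow(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    // Lists the capacities reached after each of the given number of growth steps.
    static List<Integer> capacities(int start, int steps) {
        List<Integer> result = new ArrayList<>();
        int capacity = start;
        result.add(capacity);
        for (int i = 0; i < steps; i++) {
            capacity = grow(capacity);
            result.add(capacity);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed starting capacity of 10.
        System.out.println(capacities(10, 5)); // [10, 15, 22, 33, 49, 73]
    }
}
```

Because each step multiplies the capacity by roughly 1.5, only a logarithmic number of reallocations is needed as the list grows, which is why appending at the end stays cheap on average.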

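ArrayListTest code analysis:

The add() method adds an element, by default at the end.

add(index, value) adds an element at the given index, which shifts the elements after it. The source is as follows: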

public void add(int index, E element) {
    rangeCheckForAdd(index);
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

remove(index) removes the element at the given position. The source is as follows:

public E remove(int index) {
    rangeCheck(index);
    modCount++;
    E oldValue = elementData(index);
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
    return oldValue;
}

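From the source we can see that insertion and deletion in an ArrayList work like the corresponding operations on a sequential list. Insertion shifts the later elements to open up a slot and then stores the new element; deletion shifts the later elements forward to overwrite the element at the given position. This copying is what slows down insertion and deletion in an ArrayList, especially near the front of a large list.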
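The shifting can be watched on a plain int array. The following is a simplified sketch and not the ArrayList implementation: it assumes the array always has a free slot (no resizing), and insertAt and removeAt are hypothetical helpers that use System.arraycopy in the same way as the source shown above.

```java
import java.util.Arrays;

class ShiftSketch {
    // Inserts value at index among the first `size` slots; returns the new size.
    // Assumes data has at least one unused slot (this sketch never resizes).
    static int insertAt(int[] data, int size, int index, int value) {
        // Move everything from index onward one slot to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
        return size + 1;
    }

    // Removes the element at index; returns the new size.
    static int removeAt(int[] data, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Move everything after index one slot to the left.
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[size - 1] = 0; // clear the slot that is no longer in use
        return size - 1;
    }

    public static void main(String[] args) {
        int[] data = new int[8];
        int size = 0;
        for (int v : new int[] {2, 3, 4, 5}) {
            size = insertAt(data, size, size, v); // append: nothing to shift
        }
        size = insertAt(data, size, 1, 9);        // shifts 3, 4, 5 to the right
        System.out.println(Arrays.toString(Arrays.copyOf(data, size))); // [2, 9, 3, 4, 5]
        size = removeAt(data, size, 0);           // shifts 9, 3, 4, 5 to the left
        System.out.println(Arrays.toString(Arrays.copyOf(data, size))); // [9, 3, 4, 5]
    }
}
```

The number of elements copied is size - index for an insert and size - index - 1 for a removal, so operations near the front of a long list cost the most.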

Here is an example to help understand ArrayList:

import java.util.ArrayList;
import java.util.Iterator;

public class ArrayListTest {
  public static void main(String[] args) {
    // Generics: only Integer elements may be added.
    ArrayList<Integer> arrayList = new ArrayList<Integer>();
    // Add elements
    arrayList.add(2);
    arrayList.add(3);
    arrayList.add(4);
    arrayList.add(5);
    arrayList.add(4);
    arrayList.add(null); // ArrayList allows null elements
    arrayList.add(Integer.valueOf(3));
    System.out.println(arrayList); // [2, 3, 4, 5, 4, null, 3]
    // Number of elements
    System.out.println(arrayList.size()); // 7
    arrayList.remove(0); // removes the element at index 0
    System.out.println(arrayList); // [3, 4, 5, 4, null, 3]
    arrayList.add(1, Integer.valueOf(9));
    System.out.println(arrayList); // [3, 9, 4, 5, 4, null, 3]
    System.out.println("-----------Traversal methods-------");
    ArrayList<Integer> as = new ArrayList<Integer>(100000);
    for (int i = 0; i < 100000; i++) {
      as.add(i);
    }
    traverseByIterator(as);
    traverseByFor(as);
    traverseByForEach(as);
  }

  public static void traverseByIterator(ArrayList<Integer> al) {
    System.out.println("---------Iterator traversal-------------");
    long startTime = System.nanoTime(); // start time
    Iterator<Integer> it = al.iterator();
    while (it.hasNext()) {
      it.next();
    }
    long endTime = System.nanoTime(); // end time
    System.out.println((endTime - startTime) + " ns");
  }

  public static void traverseByFor(ArrayList<Integer> al) {
    System.out.println("---------Index traversal-------------");
    long startTime = System.nanoTime(); // start time
    for (int i = 0; i < al.size(); i++) al.get(i);
    long endTime = System.nanoTime(); // end time
    System.out.println((endTime - startTime) + " ns");
  }

  public static void traverseByForEach(ArrayList<Integer> al) {
    System.out.println("---------Foreach traversal-------------");
    long startTime = System.nanoTime(); // start time
    for (Integer temp : al) { } // visit every element, do nothing with it
    long endTime = System.nanoTime(); // end time
    System.out.println((endTime - startTime) + " ns");
  }
}

Output of one run:

-----------Traversal methods-------
---------Iterator traversal-------------
10407039 ns
---------Index traversal-------------
7094470 ns
---------Foreach traversal-------------
9063813 ns

In this run, index-based traversal was somewhat faster. A single timing like this, taken without JIT warm-up, is only a rough indication.


Iterator

hasNext()  checks whether there is a next element
next()     returns the next element
remove()   removes the element most recently returned by next()
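The three methods are typically used together. Here is a minimal sketch that removes every even number from a list while iterating over it; IteratorRemoveSketch and dropEvens are illustrative names and the input is assumed to contain no null elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorRemoveSketch {
    // Returns a copy of input with all even numbers removed (input must not contain null).
    static List<Integer> dropEvens(List<Integer> input) {
        List<Integer> result = new ArrayList<>(input);
        Iterator<Integer> it = result.iterator();
        while (it.hasNext()) {          // is there another element?
            int value = it.next();      // fetch it and advance
            if (value % 2 == 0) {
                it.remove();            // removes the element just returned by next()
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(dropEvens(Arrays.asList(1, 2, 3, 4, 5, 6))); // [1, 3, 5]
    }
}
```

Removing through the iterator keeps the traversal consistent. Calling the list's own remove method inside a for-each loop over the same ArrayList would typically fail with a ConcurrentModificationException.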

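LinkedList: (mainly for insertion and deletion!)

--A list implemented as a doubly linked list; not synchronized.

--Can be used as a stack, a queue, or a double-ended queue

--Sequential access is efficient and random access is poor; insertion and deletion in the middle only relink neighboring nodes, with no element shifting

--Suits data that changes frequently

addFirst() adds an element at the head

add(3, 10) inserts 10 at the fourth position

remove(3) removes the element at the fourth position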
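The stack, queue, and double-ended queue roles mentioned above all come from methods LinkedList already has. The sketch below exercises one of each; the class name and the run helper are made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class LinkedListRolesSketch {
    // Uses one list as a stack, then a queue, then a deque; returns the values taken out.
    static List<Integer> run() {
        List<Integer> taken = new ArrayList<>();
        LinkedList<Integer> list = new LinkedList<>();

        // Stack: push and pop both work at the head.
        list.push(1);
        list.push(2);
        list.push(3);                 // list is now [3, 2, 1]
        taken.add(list.pop());        // takes 3, leaving [2, 1]

        // Queue: offer at the tail, poll from the head.
        list.offer(4);                // [2, 1, 4]
        taken.add(list.poll());       // takes 2, leaving [1, 4]

        // Deque: both ends are available.
        list.addFirst(0);             // [0, 1, 4]
        list.addLast(9);              // [0, 1, 4, 9]
        taken.add(list.pollLast());   // takes 9, leaving [0, 1, 4]
        return taken;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [3, 2, 9]
    }
}
```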

Code walk-through:

import java.util.Iterator;
import java.util.LinkedList;

public class LinkedListTest {
  public static void main(String[] args) {
    LinkedList<Integer> linkedList = new LinkedList<Integer>();
    linkedList.add(2);
    linkedList.add(3);
    linkedList.add(9);
    linkedList.add(6);
    linkedList.add(7);
    System.out.println(linkedList); // [2, 3, 9, 6, 7]
    //linkedList.addFirst(1);
    //linkedList.addLast(10);
    //System.out.println(linkedList);
    linkedList.add(3, 4); // insert 4 at index 3
    System.out.println(linkedList); // [2, 3, 9, 4, 6, 7]
    System.out.println(linkedList.get(4)); // 6
    LinkedList<Integer> as = new LinkedList<Integer>();
    for (int i = 0; i < 100000; i++) {
      as.add(i);
    }
    traverseByIterator(as);
    traverseByFor(as);
    traverseByForEach(as);
  }

  public static void traverseByIterator(LinkedList<Integer> al) {
    System.out.println("---------Iterator traversal-------------");
    long startTime = System.nanoTime(); // start time
    Iterator<Integer> it = al.iterator();
    while (it.hasNext()) {
      it.next();
    }
    long endTime = System.nanoTime(); // end time
    System.out.println((endTime - startTime) + " ns");
  }

  public static void traverseByFor(LinkedList<Integer> al) {
    System.out.println("---------Index traversal-------------");
    long startTime = System.nanoTime(); // start time
    for (int i = 0; i < al.size(); i++) al.get(i);
    long endTime = System.nanoTime(); // end time
    System.out.println((endTime - startTime) + " ns");
  }

  public static void traverseByForEach(LinkedList<Integer> al) {
    System.out.println("---------Foreach traversal-------------");
    long startTime = System.nanoTime(); // start time
    for (Integer temp : al) { } // visit every element, do nothing with it
    long endTime = System.nanoTime(); // end time
    System.out.println((endTime - startTime) + " ns");
  }
}

Traversal output of one run:

---------Iterator traversal-------------
6562423 ns
---------Index traversal-------------
4565606240 ns
---------Foreach traversal-------------
4594622 ns

As you can see, index-based traversal of a LinkedList is really slow! Every get(i) has to walk the links from one end of the list, so the loop as a whole takes quadratic time.

add(index, value) source analysis: as we can see, this is just a matter of reassigning the two references (pointers), prev and next.

void linkBefore(E e, Node<E> succ) {
    // assert succ != null;
    final Node<E> pred = succ.prev;
    final Node<E> newNode = new Node<>(pred, e, succ);
    succ.prev = newNode;
    if (pred == null)
        first = newNode;
    else
        pred.next = newNode;
    size++;
    modCount++;
}

remove(index) source analysis: likewise, this only updates references, which is much more convenient!

E unlink(Node<E> x) {
    // assert x != null;
    final E element = x.item;
    final Node<E> next = x.next;
    final Node<E> prev = x.prev;

    if (prev == null) {
        first = next;
    } else {
        prev.next = next;
        x.prev = null;
    }

    if (next == null) {
        last = prev;
    } else {
        next.prev = prev;
        x.next = null;
    }

    x.item = null;
    size--;
    modCount++;
    return element;
}

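get(index) source analysis: the list follows references node by node until it reaches the element at position index. It does use one shortcut, though. This is not a binary search: the lookup simply starts from whichever end (head or tail) is closer to index, so it walks at most half the list.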

Node<E> node(int index) {
    // assert isElementIndex(index);

    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}
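To see why calling get(i) in a loop is so costly on a LinkedList, one can count how many links node(index) follows. The sketch below only mirrors the two loops in the source above and does not touch a real list; NodeWalkSketch, hops, and totalHops are hypothetical names.

```java
class NodeWalkSketch {
    // Number of links followed to reach index in a list of the given size,
    // starting from whichever end is closer (same rule as node(index)).
    static int hops(int index, int size) {
        return index < (size >> 1) ? index : size - 1 - index;
    }

    // Total links followed when get(i) is called for every i from 0 to size - 1.
    static long totalHops(int size) {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += hops(i, size);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(hops(0, 100000));     // 0: the head itself
        System.out.println(hops(49999, 100000)); // 49999: the worst case, near the middle
        System.out.println(hops(99999, 100000)); // 0: the tail itself
        System.out.println(totalHops(100000));   // 2499950000
    }
}
```

For 100000 elements that is about 2.5 billion link traversals in total, compared with 100000 steps for a single pass with an iterator.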

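Vector

-Similar to ArrayList: a list implemented with a resizable array

-Vector is synchronized, so it can be used from multiple threads

-Originally not part of the JCF; it is one of Java's earliest data structures and performs relatively poorly

-Since JDK 1.2, Vector has been retrofitted to implement List and brought into the JCF

-The official documentation recommends ArrayList when synchronization is not needed

Vector is essentially similar to ArrayList, so in general we should prefer ArrayList, and consider Vector when synchronization is required.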
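What "synchronized" buys can be shown with a few threads appending to the same list. This is a sketch under simple assumptions (four threads, 10000 additions each); the class name and the fillConcurrently helper are invented for illustration.

```java
import java.util.List;
import java.util.Vector;

class VectorThreadsSketch {
    // Starts `threads` workers that each add `perThread` elements to target,
    // waits for all of them, and returns the final size.
    static int fillConcurrently(List<Integer> target, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    target.add(i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return target.size();
    }

    public static void main(String[] args) {
        // Vector.add is synchronized, so no addition is lost.
        System.out.println(fillConcurrently(new Vector<>(), 4, 10000)); // 40000
    }
}
```

Passing a plain ArrayList to the same helper is not safe: additions may be lost or an exception may be thrown, and the result can differ from run to run.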

Code example:

import java.util.Enumeration;
import java.util.Iterator;
import java.util.Vector;

public class VectorTest {
  public static void main(String[] args) {
    Vector<Integer> vs = new Vector<Integer>();
    vs.add(1);
    vs.add(4);
    vs.add(3);
    vs.add(5);
    vs.add(2);
    vs.add(6);
    vs.add(9);
    System.out.println(vs); // [1, 4, 3, 5, 2, 6, 9]
    System.out.println(vs.get(0)); // 1
    vs.remove(5); // removes the element at index 5 (the value 6)
    System.out.println(vs); // [1, 4, 3, 5, 2, 9]
    /*Integer []a=new Integer[vs.size()];
    vs.toArray(a);
    for(Integer m:a){
      System.out.print(m+" ");
    }*/
    Vector<Integer> as = new Vector<Integer>(100000);
    for (int i = 0; i < 1000000; i++) {
      as.add(i);
    }
    traverseByIterator(as);
    traverseByFor(as);
    traverseByForEach(as);
    traverseEm(as);
  }

  public static void traverseByIterator(Vector<Integer> al) {
    System.out.println("---------Iterator traversal-------------");
    long startTime = System.nanoTime(); // start time
    Iterator<Integer> it = al.iterator();
    while (it.hasNext()) {
      it.next();
    }
    long endTime = System.nanoTime(); // end time
    System.out.println((endTime - startTime) + " ns");
  }

  public static void traverseByFor(Vector<Integer> al) {
    System.out.println("---------Index traversal-------------");
    long startTime = System.nanoTime(); // start time
    for (int i = 0; i < al.size(); i++) al.get(i);
    long endTime = System.nanoTime(); // end time
    System.out.println((endTime - startTime) + " ns");
  }

  public static void traverseByForEach(Vector<Integer> al) {
    System.out.println("---------Foreach traversal-------------");
    long startTime = System.nanoTime(); // start time
    for (Integer temp : al) {
      temp.intValue();
    }
    long endTime = System.nanoTime(); // end time
    System.out.println((endTime - startTime) + " ns");
  }

  public static void traverseEm(Vector<Integer> al) {
    System.out.println("---------Enumeration traversal-------------");
    long startTime = System.nanoTime(); // start time
    for (Enumeration<Integer> ei = al.elements(); ei.hasMoreElements();) {
      ei.nextElement();
    }
    long endTime = System.nanoTime(); // end time
    System.out.println((endTime - startTime) + " ns");
  }
}

Traversal output of one run:

---------Iterator traversal-------------
28927404 ns
---------Index traversal-------------
32122768 ns
---------Foreach traversal-------------
25191768 ns
---------Enumeration traversal-------------
26901515 ns

In this run, for-each traversal was faster than the other traversal methods.

add(index, value) source analysis: like ArrayList, this has to copy elements, so it is slow.

public synchronized void insertElementAt(E obj, int index) {
    modCount++;
    if (index > elementCount) {
        throw new ArrayIndexOutOfBoundsException(index
                                                 + " > " + elementCount);
    }
    ensureCapacityHelper(elementCount + 1);
    System.arraycopy(elementData, index, elementData, index + 1, elementCount - index);
    elementData[index] = obj;
    elementCount++;
}

get(index) source analysis: as you can see, it returns the array element directly by its index. Very fast!

public synchronized E get(int index) {
    if (index >= elementCount)
        throw new ArrayIndexOutOfBoundsException(index);

    return elementData(index);
}

E elementData(int index) {
    return (E) elementData[index];
}

The List part does not actually use much math, but Set and Map really do resemble the mathematical models. Stay tuned for the follow-up on Set and Map.
