Java Collections Framework: What Is the fail-fast Mechanism of Containers?

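Preface: I have recently been reading the source code of the Java collections and learned that they use a fail-fast mechanism. This post records what the mechanism is, what it is for, and how it is implemented.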

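I. Introduction to fail-fast

The fail-fast mechanism is an error-detection mechanism in the Java collections (Collection). When a collection is structurally modified while it is being iterated, fail-fast may kick in, that is, a ConcurrentModificationException is thrown. The mechanism does not guarantee that an exception is thrown under unsynchronized modification; it only makes a best effort, so it should be used only to detect bugs and never relied on for program correctness.

Fail-fast iterators are found in ArrayList, HashMap and other java.util collections. The ConcurrentModificationException can occur in both multi-threaded and single-threaded code.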
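Since the tests later in this post only exercise ArrayList, here is a small single-threaded sketch of the same behaviour on HashMap. The class and method names (HashMapFailFastDemo, putDuringIterationThrows) are made up for this illustration.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

final class HashMapFailFastDemo {
    /** Returns true if adding a new key while iterating the key set triggers fail-fast. */
    static boolean putDuringIterationThrows() {
        Map<Integer, String> map = new HashMap<>();
        map.put(1, "a");
        map.put(2, "b");
        map.put(3, "c");
        try {
            for (Integer key : map.keySet()) {
                // Inserting a key that is not yet present is a structural modification
                map.put(key + 100, "new");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // The key-set iterator noticed the change on its next step
            return true;
        }
    }
}
```

No second thread is involved here: the modification comes from the same thread that is iterating.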

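II. What is fail-fast for?

It detects whether the collection was structurally modified (for example by add, remove or clear) while it was being iterated.

III. Tests

All of the unit tests below pass; the explanation follows the code.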

import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;

import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

/**
 * @author yule
 * @date 2018/9/13 17:05
 */
public class ArrayListTest {
    private List<Integer> list = null;

    @Before
    public void buildList(){
        list = new ArrayList<>();
        for(int i = 0; i < 10; i++){
            list.add(i);
        }
        Assert.assertEquals("[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]", list.toString());
    }

    @Test
    public void testFor(){
        for(int i = 0; i < list.size(); i++){
            if(i == 2){
                // This removal succeeds; there is no fail-fast check in a plain indexed loop
                list.remove(2);
            }
        }
        Assert.assertEquals("[0, 1, 3, 4, 5, 6, 7, 8, 9]", list.toString());
    }

    @Test(expected = ConcurrentModificationException.class)
    public void testForEach(){
        for(int x : list){
            if(x == 2){
                // This throws: fail-fast kicks in, because foreach uses an Iterator under the hood
                list.remove(2);
            }
        }
    }

    @Test(expected = ConcurrentModificationException.class)
    public void testIterator(){
        Iterator<Integer> iterator = list.iterator();
        while (iterator.hasNext()){
            int x = iterator.next();
            if(x == 2){
                // This throws: fail-fast kicks in
                list.remove(2);
            }
        }
    }

    @Test
    public void testIteratorSuccess(){
        Iterator<Integer> iterator = list.iterator();
        while (iterator.hasNext()){
            int x = iterator.next();
            if(x == 2){
                // This succeeds without an exception even though the iterator is still fail-fast; the reason is explained below
                iterator.remove();
            }
        }
        Assert.assertEquals("[0, 1, 3, 4, 5, 6, 7, 8, 9]", list.toString());
    }
}

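The first test, testFor(), passes because a plain indexed for loop does not use an iterator, and the fail-fast check applies only to iteration through an iterator.

The second test, testForEach(), throws because the foreach loop uses an iterator under the hood; the cause is the same as in testIterator().

The third test, testIterator(), throws because ArrayList.remove() increments modCount but does not update the expectedModCount of Itr. (See the principle section below.)

The fourth test, testIteratorSuccess(), passes because the iterator's own remove() method not only changes modCount (through ArrayList.remove()) but also resynchronizes the expectedModCount of Itr. (See the principle section below.)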
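One caveat about the indexed loop in testFor(): it does not throw, but removing inside it shifts the remaining elements left, so the element that slides into the current index is skipped. The sketch below compares three ways of removing every 2 from a list. The class and method names are hypothetical, and removeIf assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class RemoveWhileLoopingDemo {
    /** Indexed loop: no exception, but the element after each removal is skipped. */
    static List<Integer> removeTwosByIndex(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i) == 2) {
                list.remove(i); // remove(int index); later elements shift left by one
            }
        }
        return list;
    }

    /** Iterator.remove(): safe, because the iterator keeps its own bookkeeping in sync. */
    static List<Integer> removeTwosByIterator(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() == 2) {
                it.remove();
            }
        }
        return list;
    }

    /** Collection.removeIf(): the shortest safe form on Java 8+. */
    static List<Integer> removeTwosByRemoveIf(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        list.removeIf(x -> x == 2);
        return list;
    }
}
```

For the input [1, 2, 2, 3] the indexed loop leaves [1, 2, 3] (the second 2 is skipped), while the other two methods leave [1, 3].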

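IV. How is fail-fast implemented in the source code (ArrayList)? (Principle)

First, there are two key pieces to understand: the modCount field of ArrayList and the expectedModCount field of the inner class ArrayList.Itr.

modCount is a field of the abstract class AbstractList with an initial value of 0. ArrayList extends AbstractList, so it inherits the field. modCount records the number of structural modifications made to the list; it is not the size. Every add or remove increments it.

expectedModCount is a field of the ArrayList inner class Itr, initialized to modCount when the iterator is created. The iterator's remove() method first calls ArrayList's remove() (which does modCount++) and then executes expectedModCount = modCount. The add() method of the list iterator returned by listIterator() (the inner class ListItr) does the same after calling ArrayList's add().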
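To make the interplay of the two counters concrete, here is a stripped-down model of the idea. It is only a sketch under my own assumptions, not the JDK code: TinyIntList and removeAt are hypothetical names, and the class supports nothing beyond append, remove by index and iteration.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** A hypothetical int list that mimics the modCount / expectedModCount bookkeeping. */
final class TinyIntList implements Iterable<Integer> {
    private int[] data = new int[8];
    private int size;
    private int modCount; // number of structural modifications so far

    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        data[size++] = value;
        modCount++;
    }

    void removeAt(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException();
        }
        System.arraycopy(data, index + 1, data, index, size - index - 1);
        size--;
        modCount++;
    }

    int size() {
        return size;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            int cursor;                      // index of the next element to return
            int lastRet = -1;                // index of the last element returned
            int expectedModCount = modCount; // snapshot taken when the iterator is created

            @Override
            public boolean hasNext() {
                return cursor != size;
            }

            @Override
            public Integer next() {
                checkForComodification();
                if (cursor >= size) {
                    throw new NoSuchElementException();
                }
                lastRet = cursor;
                return data[cursor++];
            }

            @Override
            public void remove() {
                if (lastRet < 0) {
                    throw new IllegalStateException();
                }
                checkForComodification();
                removeAt(lastRet);           // bumps modCount ...
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount; // ... so resynchronize the snapshot
            }

            private void checkForComodification() {
                if (modCount != expectedModCount) {
                    throw new ConcurrentModificationException();
                }
            }
        };
    }
}
```

Calling removeAt() directly while an iterator is in use leaves the iterator's snapshot stale, so its next call to next() throws; going through the iterator's remove() keeps the two counters equal.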

As noted above, fail-fast applies only to iterators, so we need to look at the source of ArrayList's iterator:

    /**
     * Returns an iterator over the elements in this list in proper sequence.
     *
     * <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>.
     *
     * @return an iterator over the elements in this list in proper sequence
     */
    public Iterator<E> iterator() {
        return new Itr();
    }

The inner class Itr of ArrayList implements the Iterator interface. Its source is as follows:

    /**
     * An optimized version of AbstractList.Itr
     */
    private class Itr implements Iterator<E> {
        // Index of the next element to be returned during iteration
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        // The key field for the fail-fast check; initialized to the modCount of the ArrayList
        int expectedModCount = modCount;

        public boolean hasNext() {
            return cursor != size;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();

            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        @Override
        @SuppressWarnings("unchecked")
        public void forEachRemaining(Consumer<? super E> consumer) {
            Objects.requireNonNull(consumer);
            final int size = ArrayList.this.size;
            int i = cursor;
            if (i >= size) {
                return;
            }
            final Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length) {
                throw new ConcurrentModificationException();
            }
            while (i != size && modCount == expectedModCount) {
                consumer.accept((E) elementData[i++]);
            }
            // update once at end of iteration to reduce heap write traffic
            cursor = i;
            lastRet = i - 1;
            checkForComodification();
        }

        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }

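Here, cursor is the index of the next element to be returned during iteration.

lastRet records the index of the element most recently returned by next(), which is cursor - 1 right after a call to next(). Its initial value is -1, and it is reset to -1 after remove(), meaning there is no last-returned element.

Iteration ends when hasNext() returns false. That method simply compares cursor with size (the number of elements in the list); when cursor equals size, the traversal is complete.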
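Because hasNext() only compares cursor with size and performs no modCount check, there is a well-known case where a modification goes undetected, which illustrates the "best effort" nature mentioned in section I. The sketch below removes one element inside a foreach loop over the list 0..9; BestEffortDemo and removeInForeachThrows are hypothetical names.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

final class BestEffortDemo {
    /**
     * Removes the element equal to victim from the list 0..9 inside a foreach loop.
     * Returns true if a ConcurrentModificationException was thrown.
     */
    static boolean removeInForeachThrows(int victim) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            list.add(i);
        }
        try {
            for (Integer x : list) {
                if (x == victim) {
                    list.remove(x); // remove(Object): removes the element equal to x
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

With victim = 2 the method returns true, as in testForEach(). With victim = 8 (the second-to-last element) it returns false: after next() returns 8, cursor is 9, the removal shrinks size to 9, hasNext() reports false, and the loop ends before next() gets a chance to run the check.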

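V. Questions and answers

1. What does it mean for a collection to be structurally modified?

It means an operation that changes the size of the collection, such as ArrayList's add, remove and clear. Merely replacing the value of an existing element with set() is not a structural modification.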
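A quick way to see the difference is to call set() and add() inside a foreach loop. This is a small sketch with hypothetical names (StructuralChangeDemo and its two methods):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

final class StructuralChangeDemo {
    /** set() replaces a value but leaves size and modCount alone. */
    static boolean setDuringIterationThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        try {
            for (Integer x : list) {
                list.set(0, x * 10); // not a structural modification
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** add() changes the size, so the iterator detects it on its next step. */
    static boolean addDuringIterationThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        try {
            for (Integer x : list) {
                list.add(x * 10); // structural modification
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

The first method returns false and the second returns true.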

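2. What does iterating a collection mean? (This touches on how foreach works.)

Iterating a collection means traversing it with an Iterator or with the foreach syntax. For a class to be usable in a foreach loop it must implement the Iterable interface and provide its iterator() method, so foreach is really just Iterator underneath: at compile time the compiler translates the foreach loop into calls on the target's iterator.

So we can conclude:

1. ArrayList can be traversed with a foreach loop because ArrayList implements List, List is a sub-interface of Collection, and Collection is a sub-interface of Iterable. ArrayList's parent class AbstractList implements the iterator() method of Iterable, and ArrayList overrides it with the optimized Itr shown earlier.

2. Any collection, whether provided by the JDK or written by yourself, must correctly implement the Iterable interface if it is to be traversed with a foreach loop.

In fact, this is the Iterator pattern, one of the 23 classic design patterns.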
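As an illustration of conclusion 2, here is a minimal hand-written class that becomes usable in a foreach loop simply by implementing Iterable. IntRange is a hypothetical class invented for this example; it is not part of the JDK.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** A hypothetical half-open integer range [start, end) that supports foreach. */
final class IntRange implements Iterable<Integer> {
    private final int start;
    private final int end;

    IntRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int current = start; // next value to hand out

            @Override
            public boolean hasNext() {
                return current < end;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current++;
            }
        };
    }

    /** Sums the range with a foreach loop, which the compiler turns into iterator calls. */
    static int sum(IntRange range) {
        int total = 0;
        for (int value : range) {
            total += value;
        }
        return total;
    }
}
```

For example, IntRange.sum(new IntRange(1, 5)) adds 1 + 2 + 3 + 4 and returns 10.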

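3. That raises another question: arrays do not implement Iterable, so why can they also be traversed with a foreach loop?

The answer: for an array, Java translates the foreach loop into an ordinary indexed loop over each element of the array.

So the conclusion is: for a collection, the compiler turns foreach into calls on the collection's iterator; for an array, it turns foreach into an indexed for loop.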

Test: Java source file [figure]

After compilation: class file [figure]
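The two translations can also be written out by hand. The sketch below puts each foreach loop next to a hand-written loop of roughly the shape the compiler generates. The class, method and variable names are my own, and actual decompiler output will differ in detail.

```java
import java.util.Iterator;
import java.util.List;

final class ForeachDesugarDemo {
    /** foreach over a collection. */
    static int sumListForeach(List<Integer> list) {
        int total = 0;
        for (int x : list) {
            total += x;
        }
        return total;
    }

    /** Roughly what the compiler generates for the loop above. */
    static int sumListDesugared(List<Integer> list) {
        int total = 0;
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            int x = it.next(); // next() returns Integer; assignment unboxes it
            total += x;
        }
        return total;
    }

    /** foreach over an array. */
    static int sumArrayForeach(int[] values) {
        int total = 0;
        for (int x : values) {
            total += x;
        }
        return total;
    }

    /** Roughly what the compiler generates for the array loop: a plain indexed loop. */
    static int sumArrayDesugared(int[] values) {
        int total = 0;
        int[] copy = values;   // the array reference is evaluated once
        int length = copy.length;
        for (int i = 0; i < length; i++) {
            int x = copy[i];
            total += x;
        }
        return total;
    }
}
```

Only the collection version involves an Iterator, which is why only collections can hit the fail-fast check in a foreach loop.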

Reference: https://blog.csdn.net/zymx14/article/details/78394464