迭代器Iterator与ConcurrentModificationException详解

背景：一直以来对迭代器的问题理解不是很透彻，特别是迭代器和异常ConcurrentModificationException之间的联系。通过debug，详细了解其底层的具体实现过程。

简介

Iterator必须依附于Collection对象，若有一个Iterator对象，则必然有一个与之关联的Collection对象。

Iterator提供了两个方法来迭代访问Collection集合里的元素，并可通过remove()来删除集合中上一次next()方法返回的集合元素。

当使用Iterator迭代访问Collection集合元素时，Collection集合里的元素不能被改变，只有通过Iterator的remove()方法删除上一次next()方法返回的集合元素才可以；否则会引发java.util.ConcurrentModificationException异常。

之所以会出现这样的异常，是因为Iterator迭代器采用的是快速失败（fast-fail）机制，一旦在迭代过程中检测到该集合已经被修改（通常是程序中其它线程修改），程序立即引发ConcurrentModificationException，

而不是显示修改后结果，这样可以避免共享资源而引发的潜在问题。

ConcurrentModificationException发生在Iterator#next()方法实现中，每次调用都会检查容器的结构是否发生变化，目的是为了避免共享资源而引发的潜在问题。

观察HashMap和ArrayList底层Iterator#next(), 可以看到fast-fail只会增加或者删除（非Iterator#remove()）抛出异常；改变容器中元素的内容不存在这个问题（主要是modCount没发生变化）。

在单线程中使用迭代器，对非线程安全的容器，但是只能用Iterator#remove；否则会抛出异常。

在多线程中使用迭代器，可以使用线程安全的容器来避免异常。

使用普通的for循环遍历，效率虽然比较低下，但是不存在ConcurrentModificationException异常问题。用的也比较少。

ps：java在设计工具类时候，分别设计出线程安全和非安全的工具类，也是致力于解决这些多线程操作问题。所以无须纠结，直接使用就行。

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient int modCount;

在观察底层实现时，可以看到容器对象的modCount值在改变容器结构时才发生改变。

集合Collection Map改变对迭代器遍历的影响

这里说的容器结构的改变是指增加或者删除元素，导致集合的大小发生改变。

观察源码发现，不论Collection或Map，对于Iterator来说：

异常是在next方法中抛出的，我们在使用迭代器的时候，一般会先进行hasNext方法的判断，再调用next方法来使用元素。

以下是对于ArrayList、 HashMap、ConcurrentHashMap三个容器的迭代器测试用例：

/**
 * Project Name:Spring0725
 * File Name:Test5.java
 * Package Name:work1201.basic
 * Date:2017年12月1日下午4:16:25
 * Copyright (c) 2017, 深圳金融电子结算中心 All Rights Reserved.
 *
*/

import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map.Entry;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import org.junit.Test;

/**
 * ClassName:TestIterator <br/>
 * Function: 测试Collection Map 改变集合结构对迭代器遍历有无影响
 * 在使用Iterator时候，对于Collection集合，改变集合的结构会触发ConcurrentModificationException异常；改变集合中元素的内容不会触发异常
 * 对于Map集合，线程安全map——ConcurrentHashMap改变集合Map的结构，无异常发生
 * 对于非线程安全的HashMap，使用Iterator遍历的时候，改变集合的结构，会触发ConcurrentModificationException
 * 
 * 非线程安全类的Collection或者Map，在使用Iterator过程中，有时候改变容器结构，并不会发生异常，这主要和底层的实现有关。
 * 如：List删除倒数第二个元素无异常；Map删除set集合中的最后一个元素，无异常
 * Date:     2017年12月1日 下午4:16:25 <br/>
 * @author   prd-lxw
 * @version   1.0
 * @since    JDK 1.7
 * @see      
 */
public class TestIterator {

    /**
     * 方法1
     * 迭代器更改Collection集合结构，非倒数第二个元素，会发生异常
     */
    @Test(expected = ConcurrentModificationException.class)
    public void testDeletCollection() {
        ArrayList<String> list = new ArrayList<String>();
        for (int i = 0; i < 20; i++) {
            list.add(i + "");
        }

        Iterator<String> it = list.iterator();
        String ss = 19 + "";
        while (it.hasNext()) {
            if (it.next().equals(ss)) {
                System.out.println("找到元素：" + ss);
                list.remove(ss); //集合大小发生改变 ConcurrentModificationException
                //                list.add(102+""); //集合大小发生改变 ConcurrentModificationException
                //                list.set(19, 211+"");  //不会触发异常，因为没有改变Collection集合的大小
                //                it.remove();//正常
            }
        }
    }

    /**
     * 方法2
     * 删除Collection集合的倒数第二个元素，不会发生异常
     */
    @Test
    public void testDeletBack2Collection() {
        ArrayList<String> list = new ArrayList<String>();
        for (int i = 0; i < 20; i++) {
            list.add(i + "");
        }

        Iterator<String> it = list.iterator();
        String ss = 18 + "";//倒数第2个元素
        while (it.hasNext()) {
            if (it.next().equals(ss)) {
                System.out.println("找到元素：" + ss);
                list.remove(ss); //集合大小发生改变 ConcurrentModificationException
                //              list.add(102+""); //集合大小发生改变 ConcurrentModificationException
                //              it.remove();//正常
            }
        }
    }

    /**
     * 方法3
     * 普通for方法遍历Collection，改变集合结构无异常
     */
    @Test
    public void testDeletCollectionFor() {
        ArrayList<String> list = new ArrayList<String>();
        for (int i = 0; i < 20; i++) {
            list.add(i + "");
        }
        String ss = 15 + "";
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i).equals(ss)) {
                list.remove(i);
            }
        }

    }

    /**
     * 方法4
     * 使用增强型的foreach遍历Collection，改变集合结构，会发生异常
     * 起底层实现和使用Iterator一致
     */
    @Test(expected = ConcurrentModificationException.class)
    public void testDeletCollectionForEach() {
        ArrayList<String> list = new ArrayList<String>();
        for (int i = 0; i < 20; i++) {
            list.add(i + "");
        }
        String ss = 18 + "";
        for (String str : list) {
            if (str.equals(ss)) {
                list.remove(str);
            }
        }

    }

    /**
     * 方法5
     * 使用迭代器的过程中，改变map的结构，会触发ConcurrentModificationException异常
     * 但是如果删除的key在entrySet的结尾，比如key=10+""  就不会发生这个异常
     */
    @Test(expected = ConcurrentModificationException.class)
    public void testIteratorMapEntry() {
        HashMap<String, String> map = new HashMap<String, String>();
        for (int i = 0; i < 20; i++) {
            map.put(i + "", i + "");
        }
        Set<Entry<String, String>> entrySet = map.entrySet(); //打印entrySet集合可以发现，key=10是集合的最后一个元素
        Iterator<Entry<String, String>> it = entrySet.iterator();
        String key = 10 + "";
        while (it.hasNext()) {
            if (it.next().getKey().equals(key)) {
                System.out.println("testIteratorMapEntry找到元素：" + key);
                //改变Map
                //                map.remove(key); //ConcurrentModificationException
                map.put(21 + "", 21 + ""); //ConcurrentModificationException
                //                map.replace(key, 30 + "");//正常
                //                it.remove(); //正常
                System.out.println(map.size() + ":" + map.get(key));
            }
        }
    }

    /**
     * 方法6
     * 使用迭代器的过程中，改变map的结构，会触发ConcurrentModificationException异常
     * 但是如果删除的key在keySet的结尾，比如key=10+""  就不会发生这个异常
     * 
     * 对于非线程安全的map 使用Iterator遍历 keySet valueSet entrySet实验结果都一致
     */
    @Test(expected = ConcurrentModificationException.class)
    public void testIteratorMapKey() {
        HashMap<String, String> map = new HashMap<String, String>();
        for (int i = 0; i < 20; i++) {
            map.put(i + "", i + "");
        }

        Set<String> mapKeySet = map.keySet();//打印keySet集合可以发现，key=10是集合的最后一个元素
        Iterator<String> it = mapKeySet.iterator();
        String key = 11 + "";
        while (it.hasNext()) {
            if (it.next().equals(key)) {
                System.out.println("testIteratorMapKey找到元素：" + key);
                //改变Map
                map.remove(key); //ConcurrentModificationException
                //                map.put(21 + "", 21 + ""); //ConcurrentModificationException
                //                map.replace(key, 30 + "");//正常
                //                it.remove(); //正常
                System.out.println(map.size() + ":" + map.get(key));
            }
        }
    }

    //    ################# 线程安全类ConcurrentHashMap不存在ConcurrentModificationException问题
    /**
     * 方法7
     * 线程安全类Map entrySet操作，无异常发生
     */
    @Test
    public void testConIteratorMapEntry() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<String, String>();
        for (int i = 0; i < 20; i++) {
            map.put(i + "", i + "");
        }

        Iterator<Entry<String, String>> it = map.entrySet().iterator();
        String key = 12 + "";
        while (it.hasNext()) {
            if (it.next().getKey().equals(key)) {
                System.out.println("testConIteratorMapEntry找到元素：" + key);
                //改变Map
                map.remove(key); //正常
                //                map.put(21 + "", 21 + ""); //正常
                //                map.replace(key, 30 + "");//正常
                //                it.remove(); //正常
                System.out.println(map.size() + ":" + map.get(key));
            }
        }
    }

    /**
     * 方法8
     * 无异常
     */
    @Test
    public void testConIteratorMapKey() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<String, String>();
        for (int i = 0; i < 20; i++) {
            map.put(i + "", i + "");
        }

        Iterator<String> it = map.keySet().iterator();
        String key = 10 + "";
        while (it.hasNext()) {
            if (it.next().equals(key)) {

                System.out.println("testConIteratorMapKey找到元素：" + key);
                //改变Map
                //                map.remove(key); //正常
                map.put(21 + "", 21 + ""); //正常
                map.replace(key, 30 + "");//正常
                //                it.remove(); //正常
                System.out.println(map.size() + ":" + map.get(key));
            }
        }
    }

}

View Code

ArryList的迭代器底层实现

问题

观察上述方法1 和方法2 ，可以发现一个问题——

在使用迭代器遍历ArrayList时候，如果删除的是链表中倒数第二个元素，不会发生ConcurrentModificationException异常，否则就会发生异常。

方法1 和2实现一样，只不过删除的元素下标不一样。

分析

分析ArrayList中Iterator中next() 和hasNext() 的具体实现

从底层实现来理解上述差异产生的原因：

因为异常都是发生在Iterator#next()方法中，所以可以打开iterator()方法的实现。

 Iterator<String> it = list.iterator();

将ArrayList.class中关于迭代器的代码摘录如下

 /**
     * Returns an iterator over the elements in this list in proper sequence.
     * 通过这个方法获取到list的迭代器   内部类Itr()实现了Iterator接口
     * <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>.
     *
     * @return an iterator over the elements in this list in proper sequence
     */
    public Iterator<E> iterator() {
        return new Itr();
    }

    /**
     * An optimized version of AbstractList.Itr
     */
    private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount; //迭代器初始化时候设置的，一旦容器结构发生变化，会改变modCount的值，进而引发后面的异常

        public boolean hasNext() {
            return cursor != size;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();

            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        @Override
        @SuppressWarnings("unchecked")
        public void forEachRemaining(Consumer<? super E> consumer) {
            Objects.requireNonNull(consumer);
            final int size = ArrayList.this.size;
            int i = cursor;
            if (i >= size) {
                return;
            }
            final Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length) {
                throw new ConcurrentModificationException();
            }
            while (i != size && modCount == expectedModCount) {
                consumer.accept((E) elementData[i++]);
            }
            // update once at end of iteration to reduce heap write traffic
            cursor = i;
            lastRet = i - 1;
            checkForComodification();
        }
　　　　　　
        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }

从源码可以看到，ArrayList#iterator()调用的是Itr类，Itr实现了Iterator接口，该类中实现了接口中的next()和hasNext()方法，所以

ArrayList#iterator()#hasNext() 就是调用Itr#hasNext()

ArrayList#iterator()#next() 就是调用Itr#next()

原因

执行hasNext（）会判断下一个元素下标与集合大小是否相等（元素存在与否），相等返回false；不等 返回true；
ps:对于一个list的，lastIndex = size-1

在执行next（）方法前会进行结构是否变化的检查，modCount != expectedModCount，返回true就抛异常，返回false就不抛出异常。
ps：一旦容器结构发生变化，modCount的值会发生变化，每次累加1；expectedModCount在生成迭代器时候进行初始化，代表初始化时候容器的modCount。.modCount是属于ArrayList对象的，expectedModCount是属于迭代器对象的。
每次next()方法执行完之后，cursor都代表当前元素的下一个元素的下标。

而对于方法1和方法2：

hasNext（）的判断是根据下一个元素的下标是否与容器的大小相等做判断

删除倒数第二个元素没有异常； cursor == size

删除非倒数第二个元素有异常：cursor ！= size，所以会进入next（），而容器结构的变化导致modCount != expectedModCount，从而抛出异常。

Map与迭代器之间的底层实现

问题

这里讨论的是非线程安全类的Map，对Map的keySet、 valueSet、 entrySet三个集合可以使用Iterator进行遍历。

这里举例解释方法6中存在的问题：

map中放入0-19个元素，删除key=10+“”会抛出异常；其它的则不会。

分析

分析HashMap中Iterator中next() 和hasNext() 的具体实现

打开方法6中的 HashMap#keySet()实现

 Set<String> mapKeSet = map.keySet();//打印keySet集合可以发现，key=10是集合的最后一个元素
        Iterator<String> it = mapKeSet.iterator();

在HashMap.class中可以看到如下的实现

  public Set<K> keySet() {
        Set<K> ks = keySet;
        if (ks == null) {
            ks = new KeySet();
            keySet = ks;
        }
        return ks;
    }

    final class KeySet extends AbstractSet<K> {
        public final int size()                 { return size; }
        public final void clear()               { HashMap.this.clear(); }
        public final Iterator<K> iterator()     { return new KeyIterator(); } //Iterator接口的具体实现类
        public final boolean contains(Object o) { return containsKey(o); }
        public final boolean remove(Object key) {
            return removeNode(hash(key), key, null, false, true) != null;
        }
        public final Spliterator<K> spliterator() {
            return new KeySpliterator<>(HashMap.this, 0, -1, 0, 0);
        }
        public final void forEach(Consumer<? super K> action) {
            Node<K,V>[] tab;
            if (action == null)
                throw new NullPointerException();
            if (size > 0 && (tab = table) != null) {
                int mc = modCount;
                for (int i = 0; i < tab.length; ++i) {
                    for (Node<K,V> e = tab[i]; e != null; e = e.next)
                        action.accept(e.key);
                }
                if (modCount != mc)
                    throw new ConcurrentModificationException();
            }
        }
    }

分析： HashMap#keySet()覆写了父类AbstractMap#keySet()方法，跟踪变量keySet的初始化过程，可以发现其初始值默认为null。

HashMap#keySet()方法中接口Set的实现类为KeySet，进一步跟踪可以发现，mapKeSet.iterator()返回的对象是实例化KeyIterator生成的对象，跟踪改类的具体实现——

下面是Map与Iterator的核心代码

    // iterators

    abstract class HashIterator {
        Node<K,V> next;        // next entry to return
        Node<K,V> current;     // current entry
        int expectedModCount;  // for fast-fail
        int index;             // current slot

        HashIterator() {
            expectedModCount = modCount;
            Node<K,V>[] t = table;
            current = next = null;
            index = 0;
            if (t != null && size > 0) { // advance to first entry
                do {} while (index < t.length && (next = t[index++]) == null);
            }
        }

        public final boolean hasNext() {
            return next != null;
        }

        final Node<K,V> nextNode() {
            Node<K,V>[] t;
            Node<K,V> e = next;
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            if (e == null)
                throw new NoSuchElementException();
            if ((next = (current = e).next) == null && (t = table) != null) {
                do {} while (index < t.length && (next = t[index++]) == null);
            }
            return e;
        }

        public final void remove() {
            Node<K,V> p = current;
            if (p == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            current = null;
            K key = p.key;
            removeNode(hash(key), key, null, false, false);
            expectedModCount = modCount;
        }
    }

    final class KeyIterator extends HashIterator
        implements Iterator<K> {
        public final K next() { return nextNode().key; }
    }

    final class ValueIterator extends HashIterator
        implements Iterator<V> {
        public final V next() { return nextNode().value; }
    }

    final class EntryIterator extends HashIterator
        implements Iterator<Map.Entry<K,V>> {
        public final Map.Entry<K,V> next() { return nextNode(); }
    }

我们从源码可以看到 KeyIterator实现了Iterator接口，并且继承了抽象类HashIterator，在该类中实现了接口Iterator#next()方法；而KeyIterator#next()又是调用其父类HashIterator#nextNode()方法。

所以其底层实现：

HashMap#keySet()#iterator()#hasNext()  调用的是HashIterator#hasNext()
HashMap#keySet()#iterator()#next()  调用的就是HashIterator#nextNode().key

原因

对于Map：

进行hasNext 是判断next是否为null，true-后面有元素。false-后面没元素。

执行next的时候，会进行map结构的检查，modCount != expectedModCount，返回true就抛异常，返回false就不抛出异常。

当我们删除keySet最后一个元素时候，hasNext返回false，不会进入Next，自然不发生异常

当删除非最后一个元素的时候，执行next的时候触发结构检查，发生异常。

是否是最后一个可以通过观察set的输出结果。

在方法6中，通过debug可以观察到mapKeySet中的key值排列如下

[11, 12, 13, 14, 15, 16, 17, 18, 19, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

当删除key=10+“”元素的时候，迭代器循环结束，不会在进入hasNext方法，所以就没有异常产生了。

多线程环境下，程序示例

多线程环境下的结果和单测中的结果一致，这里只是为了模拟观察

/**
 * Project Name:Spring0725
 * File Name:IteratorListDelete.java
 * Package Name:work12.day05
 * Date:2017年12月5日下午9:58:53
 * Copyright (c) 2017, 深圳金融电子结算中心 All Rights Reserved.
 *
 */

package work12.day05;

import java.util.ArrayList;
import java.util.Iterator;

/**
 * ClassName:IteratorListDelete <br/>
 * Function: 测试多线程下 改变Collection集合 对迭代器遍历造成的影响 Date: 2017年12月5日 下午9:58:53 <br/>
 * 
 * @author prd-lxw
 * @version 1.0
 * @since JDK 1.7
 * @see
 */
public class IteratorListDelete {
    private final ArrayList<String> list;

    public IteratorListDelete(ArrayList<String> list) {
        super();
        this.list = list;
    }

    public ArrayList<String> getList() {
        return list;
    }

    public static void main(String[] args) {
        ArrayList<String> aList = new ArrayList<String>();
        IteratorListDelete tt = new IteratorListDelete(aList);
        for (int i = 0; i < 100; i++) {
            tt.getList().add(i + "");
        }
        new Thread(new TraverseList(tt), "子线程").start();
        // System.out.println(tt.getList().get(28));
        try {
            Thread.sleep(10);
            // tt.getList().remove(28+""); // 集合大小发生改变 ConcurrentModificationException
            tt.getList().add(101 + ""); // 集合大小发生改变 ConcurrentModificationException
            // tt.getList().set(28, 201+""); //改变集合内容，不会触发异常，因为没有改变Collection集合的大小
        } catch (InterruptedException e) {
            e.printStackTrace();

        }

    }

}

/**
 * ClassName: TraverseList <br/>
 * Function: 线程一直循环遍历collection集合 date: 2017年12月5日 下午10:35:06 <br/>
 *
 * @author prd-lxw
 * @version 1.0
 * @since JDK 1.7
 */
class TraverseList implements Runnable {
    private final IteratorListDelete tt;

    public TraverseList(IteratorListDelete tt) {
        super();
        this.tt = tt;
    }

    public void run() {
        try {
            Thread.sleep(5);
        } catch (Exception e) {
            // TODO: handle exception
        }
        while (true) {
            Iterator<String> it = tt.getList().iterator();
            while (it.hasNext()) {
                System.out.println(Thread.currentThread().getName() + "循环遍历："
                        + it.next());
            }
        }

    }

}

View Code

java中为什么要使用迭代器

ps:阿里面试时候问到这个问题，当时是一脸的懵逼。

迭代模式是访问集合类的通用方法，只要集合类实现了Iterator接口，就可以用迭代的方式来访问集合类内部的数据，Iterator访问方式把对不同集合类的访问逻辑抽象出来，使得不用暴露集合内部的结构而达到循环遍历集合的效果。
例如，如果没有使用Iterator，遍历一个数组的方法是使用索引：

Ruby代码

for(int i=0; i<array.length; i++) { ... get(i) ... }

   这种方法的缺点就是事先必须知道集合的数据结构，而且当我换了一种集合的话代码不可重用，要修改，比如我用set，就不能通过索引来遍历了。访问代码和集合是紧耦合，无法将访问逻辑从集合类和客户端代码中剥离出来，每一种集合类对应一种访问方式，代码不可重用。
   为解决以上问题，Iterator模式总是用同一种逻辑来遍历集合。
   每一种集合类返回的Iterator具体类型可能不同，Array可能返回ArrayIterator，Set可能返回SetIterator，Tree 可能返回TreeIterator，但是它们都实现了Iterator接口，因此，客户端不关心到底是哪种Iterator，它只需要获得这个 Iterator接口即可，这就是面向对象的威力。
这就是针对抽象编程的原则：对具体类的依赖性最小。