Why HashMap Is Not Thread-Safe, While ConcurrentHashMap Is

While studying multithreaded programming recently, I learned that HashMap is not thread-safe while ConcurrentHashMap is, so under concurrent access you should use ConcurrentHashMap. That said, hearing is one thing and seeing is another: without reproducing in code a scenario where HashMap actually breaks under multiple threads, it is hard to simply take this conclusion on faith. I searched online for examples that demonstrate HashMap's thread-safety problems, but none of them reproduced the failure reliably. After some thought, I decided to try to reproduce the problem in code myself.

1. Prepare an entity class to use as the map key, overriding its equals() and hashCode() methods

import java.util.Objects;

public class Entity {

    private int a;

    private int b;

    private int c;

    public Entity(int a, int b, int c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }
    
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Entity entity = (Entity) o;
        return a == entity.a &&
                b == entity.b && c == entity.c;
    }

    // Override hashCode() using only the first two fields
    @Override
    public int hashCode() {
        return Objects.hash(a, b);
    }
}


 public static void main(String[] args) throws InterruptedException {
        Entity entity1 = new Entity(1, 1, 2);
        Entity entity2 = new Entity(1, 1, 3);

        System.out.println(entity1.equals(entity2)); // false
        System.out.println(entity1.hashCode() == entity2.hashCode()); // true
    }


By designing a simple entity class and overriding its equals() and hashCode() methods, we get two objects of the same class that are not equal to each other yet share the same hashCode value. This sets the stage for creating hash collisions when we later call put() on a HashMap.
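To see why identical hashCode values matter, here is a small standalone sketch (not HashMap's actual source; the `spread` helper is a simplified copy of the bit-mixing in `HashMap.hash()`, and the class name is made up for illustration) showing that two keys with the same hashCode always map to the same bucket index:

```java
import java.util.Objects;

public class BucketIndexDemo {

    // Simplified version of HashMap.hash(): XOR the high 16 bits into the low 16
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int capacity = 16; // HashMap's default table size, always a power of two
        int h1 = Objects.hash(1, 1); // same (a, b) as our Entity objects
        int h2 = Objects.hash(1, 1);
        // HashMap picks a bucket with (n - 1) & hash, i.e. hash modulo capacity
        int index1 = (capacity - 1) & spread(h1);
        int index2 = (capacity - 1) & spread(h2);
        System.out.println(index1 == index2); // true: both keys share one bucket
    }
}
```

Because `(n - 1) & hash` depends only on the hash, every Entity(1, 1, j) key in the experiment below lands in the same bucket, which is exactly what forces the collision-handling code paths to run.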

2. Call put() on HashMap and ConcurrentHashMap concurrently from multiple threads

public static void main(String[] args) throws InterruptedException {
        HashMap<Entity, Integer> map = new HashMap<>();

        ForkJoinPool forkJoinPool = new ForkJoinPool(10);
        forkJoinPool.execute(() -> IntStream.rangeClosed(0, 10).parallel().forEach(i -> {
            for (int j = 0; j < 10000; j++) {
                Entity entity = new Entity(1, 1, j);
                map.put(entity, i);
            }
        }));

        // Wait for all tasks to finish
        forkJoinPool.shutdown();
        forkJoinPool.awaitTermination(1, TimeUnit.HOURS);

        System.out.println(map.size());
    }

Output:

Exception in thread "ForkJoinPool-1-worker-9" java.lang.ClassCastException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
	at java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:677)
	at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:735)
	at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
	at java.util.stream.ForEachOps$ForEachOp$OfInt.evaluateParallel(ForEachOps.java:189)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
	at java.util.stream.IntPipeline.forEach(IntPipeline.java:405)
	at java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:562)
	at thread.ch3.case01.HashMapTest.lambda$main$1(HashMapTest.java:21)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
	at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
	at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:2014)
	at java.util.HashMap.putVal(HashMap.java:638)
	at java.util.HashMap.put(HashMap.java:612)
	at thread.ch3.case01.HashMapTest.lambda$null$0(HashMapTest.java:24)
	at java.util.stream.ForEachOps$ForEachOp$OfInt.accept(ForEachOps.java:205)
	at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110)
	at java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:693)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
	at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
	... 4 more
18833

As the output shows, the expectation was that multiple threads would concurrently put 10000 entity objects with identical hashCode values into the HashMap as keys, so the final size should be exactly 10000. Instead, HashMap threw an exception, and the number of keys it ended up holding was not 10000 either. This is clearly not thread-safe. Next, let's run the same experiment with ConcurrentHashMap.

public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<Entity, Integer> map = new ConcurrentHashMap<>();

        ForkJoinPool forkJoinPool = new ForkJoinPool(10);
        forkJoinPool.execute(() -> IntStream.rangeClosed(0, 10).parallel().forEach(i -> {
            for (int j = 0; j < 10000; j++) {
                Entity entity = new Entity(1, 1, j);
                map.put(entity, i);
            }
        }));

        // Wait for all tasks to finish
        forkJoinPool.shutdown();
        forkJoinPool.awaitTermination(1, TimeUnit.HOURS);

        System.out.println(map.size());
    }

Output:

10000

Process finished with exit code 0

With ConcurrentHashMap the run completes safely, and its size matches our expectation of 10000. This confirms the conclusion that HashMap is not thread-safe while ConcurrentHashMap is. So why is that?

3. Analyzing the source code of HashMap and ConcurrentHashMap

  1. The put() method of HashMap
public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
    
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // Compute the node's slot in the table from the hash value and assign that slot's element to p; if the slot is empty, create a new Node there
        if ((p = tab[i = (n - 1) & hash]) == null) 
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

In this snippet, focus on the commented line. In our example, every key has the same hashCode value, so every node ends up in the same slot of the hash table. Suppose that slot is initially empty and thread 1 and thread 2 reach this line at the same time: both see null, so each creates its own new node, node1 and node2, each with a next pointer of null. If thread 1 writes first, thread 2's node2 then overwrites node1, and the HashMap silently loses an entry.
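The lost update described above can be reproduced without HashMap at all. The following is a hypothetical standalone sketch (the class name and the sleep are mine, not HashMap code): a plain array stands in for the hash table, and two threads perform the same unsynchronized check-then-act that putVal() performs on an empty bucket, with the sleep only widening the race window:

```java
// A plain Object[] stands in for HashMap's internal table.
public class LostUpdateSketch {

    static Object[] table = new Object[16];

    public static void main(String[] args) throws InterruptedException {
        Runnable put = () -> {
            if (table[0] == null) { // both threads can observe the slot as empty...
                try {
                    Thread.sleep(10); // widen the window between check and write
                } catch (InterruptedException ignored) {
                }
                // ...and then both write, so the later write overwrites the earlier one
                table[0] = Thread.currentThread().getName();
            }
        };
        Thread t1 = new Thread(put, "node1");
        Thread t2 = new Thread(put, "node2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(table[0]); // only one of node1/node2 survives
    }
}
```

Which name survives depends on scheduling, but only one write wins; the other insertion is lost, which is one reason the HashMap experiment ends with fewer keys than expected.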

  2. The put() method of ConcurrentHashMap
final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                // Use CAS to install the new Node safely under concurrency
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }

Again, look at the commented line. Unlike HashMap, when ConcurrentHashMap inserts a node into an empty bin, it installs it with a CAS operation, and when the bin is already occupied it synchronizes on the bin's head node (the `synchronized (f)` block above), so concurrent put() calls do not lose updates.
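The CAS pattern behind `casTabAt()` can be illustrated with `AtomicReferenceArray` from the standard library (a standalone sketch, not ConcurrentHashMap's internals, which go through `Unsafe` directly): compareAndSet only succeeds if the slot still holds the expected value, so a second writer cannot silently overwrite the first and must retry instead:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

public class CasSketch {
    public static void main(String[] args) {
        AtomicReferenceArray<String> table = new AtomicReferenceArray<>(16);

        // Both writers attempt compareAndSet(index, expected = null, newValue),
        // mirroring casTabAt(tab, i, null, new Node<>(...)).
        boolean first = table.compareAndSet(0, null, "node1");
        boolean second = table.compareAndSet(0, null, "node2");

        System.out.println(first);        // true: the slot was null, node1 installed
        System.out.println(second);       // false: the slot is no longer null
        System.out.println(table.get(0)); // node1 is kept; in putVal() the loser loops and retries
    }
}
```

This is exactly why the `break` after `casTabAt()` in putVal() is safe: the thread only exits the retry loop once its node is known to have been installed, whereas HashMap's plain `tab[i] = newNode(...)` can clobber a concurrent write.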

This post only tries to reproduce HashMap's thread-safety problem with a simple example; a deeper analysis of the underlying mechanisms may follow in a later post.

Original post: https://www.cnblogs.com/cy1995/p/13267886.html