HashMap 1.7

概述

HashMap基于Map接口实现，元素以键值对的方式存储，并且允许使用null 建和null值，因为key不允许重复，因此只能有一个键为null,另外HashMap不能保证放入元素的顺序，它是无序的，和放入的顺序并不能相同。HashMap是线程不安全的。

继承关系

public class HashMap<K,V>extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable

基本属性

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; //默认初始化大小 16 
static final float DEFAULT_LOAD_FACTOR = 0.75f;     //负载因子0.75
static final Entry<?,?>[] EMPTY_TABLE = {};         //初始化的默认数组
transient int size;     //HashMap中元素的数量
int threshold;          //判断是否需要调整HashMap的容量
//threshold在inflateTable函数中会置为
//threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);

HashMap由数组和链表来实现对数据的存储

HashMap采用Entry数组来存储key-value对，每一个键值对组成了一个Entry实体，Entry类实际上是一个单向的链表结构，它具有Next指针，可以连接下一个Entry实体，以此来解决Hash冲突的问题。

数组存储区间是连续的，占用内存严重，故空间复杂的很大。但数组的二分查找时间复杂度小，为O(1)；数组的特点是：寻址容易，插入和删除困难；

链表存储区间离散，占用内存比较宽松，故空间复杂度很小，但时间复杂度很大，达O（N）。链表的特点是：寻址困难，插入和删除容易

计算hash方式

    final int hash(Object k) {
        //这一部分是为了减少设计不好的字符串hash，多个字符串得到相同hash值
        int h = hashSeed;//初始是0
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }
    
        h ^= k.hashCode();
    
        //下面这一部分是为了减少冲突，最多冲突(链表长度)约为8的意思
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    //下面是为了让这个hash值转为数组下标，length是2的幂，减一之后位数0之后的都是1，与操作之后刚好在0到length-1下面
    /**
     * Returns index for hash code h.
     */
    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
    }

插入元素是头插法

插入的Entry的next指向数组中当前位置，然后下移这一部分，让插入的Entry为数组当前位置

    public V put(K key, V value) {
        //数组为空的时候
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key);
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            ////如果key在链表中已存在，则替换为新value
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        //快速失败机制？，比如在遍历的时候中途add就会报错
        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

    void addEntry(int hash, K key, V value, int bucketIndex) {
        //如果size超过threshold以及当前这个bucket桶非空，则扩充table大小。再散列，然后用新的table.length求下标,新的table.length会在首部多一个1，那么求下标函数的时候就会用到hash值的多一位，然后进行&运算，这个时候就可能把一个链表的有些元素移到后面新增的bucket上面去
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }

        createEntry(hash, key, value, bucketIndex);
    }

    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }

扩容resize函数内部的transfer函数时候会倒序，从而在多线程环境下有成环的问题
见https://www.bilibili.com/video/BV1vE411v7cR?p=4
已经下载存在E盘，视频是很乱的，只需看前几节

    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        table = newTable;
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }


    void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) {
            while(null != e) {
                Entry<K,V> next = e.next;
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                int i = indexFor(e.hash, newCapacity);
                //就是下面这几行会导致逆序
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
    }

资料

种一棵树最好的时间是十年前，其次是现在。