Java技术

发布日期: 2018-11-27

用代码说HashMap in JDK8

本文章所有源码均来自jdk8 源码

HashMap设计思想：以时间换取空间，尽量减少hash冲突，增加查找效率。

其底层数据结构是数组称之为哈希桶，每个桶里面放的是链表，链表中的每个节点，就是哈希表中的每个元素。
在JDK8中，当链表长度达到8，会转化成红黑树，以提升它的查询、插入效率，它实现了Map<K,V>, Cloneable, Serializable接口。

常量

//hashmap的初始容量
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
    
//hashmap的最大容量
static final int MAXIMUM_CAPACITY = 1 << 30;

//默认的加载因子
static final float DEFAULT_LOAD_FACTOR = 0.75f;

//jdk1.8的优化！   当一个Entry后挂载的节点过多（>8个）后 将entry链表转为红黑树
static final int TREEIFY_THRESHOLD = 8;

/**
 * The bin count threshold for untreeifying a (split) bin during a
 * resize operation. Should be less than TREEIFY_THRESHOLD, and at
 * most 6 to mesh with shrinkage detection under removal.
 */
static final int UNTREEIFY_THRESHOLD = 6;

//一颗红黑树最小的大小为65
static final int MIN_TREEIFY_CAPACITY = 64;

变量

int threshold; //threshold表示当HashMap的size大于threshold时会执行resize操作。 

transient int size;//size是HashMap的大小，它是HashMap保存的键值对的数量。 

final float loadFactor;

HashMap部分经典代码

Node节点的数据结构。

从这里我们就可以看出 key和value时存在一起的！

static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }

tableSizeFor方法

牛逼的算法。通过该方法可以找到一个大于cap的最小整数（我们说的整数是=2^n）

static final int tableSizeFor(int cap) {
            int n = cap - 1;
            n |= n >>> 1;
            n |= n >>> 2;
            n |= n >>> 4;
            n |= n >>> 8;
            n |= n >>> 16;
            return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

该算法每一次 |= n >>> 1会使得前面多一个1 ，整数个1出来～～～ +1就变为了整数。

用于计算hashcode的工具hash方法

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

直接使用hashCode 假设在桶的容量为16的时候真正生效的只有低四位，为了减少hash冲突，将前16位与后16位进行异或运算便可以减少部分hash冲突。

算法的小优化

1.高效计算取余 用位运算代替取余数 a&b = a&(b-1) 条件b = 2^n

扩容

HashMap的扩容机制就是通过threshold = length * Load factor来做是否进行扩容的决策。也就是说，在数组定义好长度之后，负载因子越大，所能容纳的键值对个数越多。当然，负载因子也不是越大越好，JDK设计者给出了一个相对来说比较均衡的方案，Load factor为负载因子(默认值是0.75f)，一般我们不对这个参数做修改。

when?

当

f (++size > threshold)
   resize();

桶内Node的个数超过了threshold的时候就会扩容

How?

步骤：

1.第一次初始化的时候（put函数被调用）调用resize（）函数进行初始化，若无参数构造，则默认初始化，若有会根据用户的输入计算出容量（tableSizeFor（）方法）进行初始化。

2.需要扩容的时候，先判断当前的容量是否已经达到最大2<<30 (之所以是这个数因为int 只能放这么多啊 -2^31~2^31-1),若达到最大将threshold设置为最大的int值（2^31-1）,否则容量扩容到当前容量的两倍

3.rehash 这一步是最费时间的，首先需要便利整个位桶，计算key的hash ，若hash在上一次容量的范围内则位置不动，否则移动到【扩容前容量+扩容前位置】的桶中（原理：(e.hash & oldCap) == 0 这样就能判断是否在上次容量的范围内，【扩容前容量+扩容前位置】%当前容量就能在扩容后得到和扩容前相同位置的Node（重新映射））

final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length; //Cap代表了位桶的数量。也就是table的长度
        //Threshold代表阀值当size>Threshold时扩容
        int oldThr = threshold; 
        int newCap, newThr = 0;

        //696
        if (oldCap > 0) {
            // 老表的容量已经大于 HashMap的最大容量 2^ 30 则将容量修改为最大Integer即2^31-1
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // 还未到达HashMap定义的上限 则每次乘二
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        //这种情况 只有在初始化的时候能遇到，因为只有未初始化的时候oldCap>0 才未成立
        //有参初始化
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        //无参初始化
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }

        //对应696点情况 调整扩容后的 Threshold
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }



        //设置临界值为新值
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})


        //开始扩容
            //建立一个新的容量的newTab
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];

        //将老表table转化为新的table  table被覆盖了没有关系。还有oldTab保存着
        table = newTab;

        //老表转为新表
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

关于扩容的线程安全

分析jdk1.8 hashmap的源码我们可以发现，多线程put操作，发生resize的时候不会和1.7那样链表被反转发生环形数据结构导致死循环的情况了。然而HashMap仍然是线程不安全的！仍然会发生数据不一致的问题！

解决方法:

1.使用Map m = Collections.synchronizeMap(hashMap); 同步

2.synchronized

3.用lock

4.读写锁

5.用java.util.concurrent.ConcurrentHashMap类

put方法详解

public V put(K key, V value) {
       return putVal(hash(key), key, value, false, true);
   }

put会调用putVal方法

确定哈希桶数组索引位置算法

名称：final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict)
核心代码

1.索引处无节点（table[i]的位置下没有节点）它也是头节点

if ((p = tab[i = (n - 1) & hash]) == null)
           tab[i] = newNode(hash, key, value, null);

2.索引处存在节点，用p变量暂存该头节点

1。如果该头节点的key值与待存key相同，直接覆盖value（这部是小优化，因为大部分时候可能不存在hash冲突）

2。如果该根据该头节点判断是一颗红黑树，调用红黑树的putVal方法

3。其他情况（是个链表），需要遍历链表

- 找到节点的key值与待存key相同，直接覆盖value

- 没有找到节点的key值与待存key相同，在尾部新建一个节点

get方法详解

get方法和put方法代码有部分相似之处（key已存在取覆盖value的时候）

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}