How hashCode and identityHashCode are generated under the hood

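Preface: at the end of my post "The pitfalls of using == at work", two questions occurred to me. One of them was this: given int int1=99; int int2=99; why do int1 and int2 have the same identityHashCode? And given float float1=99; float float2=99; why do float1 and float2 have different identityHashCodes? (identityHashCode takes an Object, so the primitive values are autoboxed when they are passed in.) Answering this requires understanding how identityHashCode is generated and how Java allocates objects in memory.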
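The question can be reproduced with a small sketch. The class and method names here are hypothetical. It assumes the values reach System.identityHashCode(Object) through autoboxing, which javac compiles to Integer.valueOf and Float.valueOf. The Java Language Specification requires boxed int values from -128 to 127 to be cached, so both boxes of 99 are the same object. Float has no such cache in current OpenJDK releases, which is an implementation detail rather than a guarantee.

```java
final class BoxingIdentityDemo {
    // Boxes the same int twice and reports whether both boxes are one object.
    static boolean sameBoxedInt(int value) {
        Object a = value; // autoboxing, compiled to Integer.valueOf(value)
        Object b = value;
        return a == b;
    }

    // Boxes the same float twice and reports whether both boxes are one object.
    static boolean sameBoxedFloat(float value) {
        Object a = value; // autoboxing, compiled to Float.valueOf(value)
        Object b = value;
        return a == b;
    }

    public static void main(String[] args) {
        int int1 = 99, int2 = 99;
        float float1 = 99, float2 = 99;

        // Same cached Integer object, therefore the same identity hash.
        System.out.println(System.identityHashCode(int1) == System.identityHashCode(int2));

        // Two separate Float objects on current OpenJDK, so the hashes usually differ.
        System.out.println(System.identityHashCode(float1) == System.identityHashCode(float2));

        System.out.println(sameBoxedInt(99));
        System.out.println(sameBoxedFloat(99f));
    }
}
```

Calling sameBoxedInt with a value outside the cached range, for example 1000, is a quick way to see the int case behave like the float case on a default JVM configuration.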

I did not have much on my plate today, so I dug through the sources and found the underlying implementation. Along the way I also verified some of the points made in my post "The relationship between hashCode and identityHashCode".

This article uses the OpenJDK 6 source code to show, step by step, which functions Object.hashCode() and System.identityHashCode() end up calling. To follow along, you can download the OpenJDK 6 source and trace the calls yourself:

OpenJDK 6 download: http://download.java.net/openjdk/jdk6/

1: The identityHashCode() method of java.lang.System (System.java, in the openjdk-6-src-b27-26_oct_2012/jdk/src/share/classes/java/lang directory) is shown below. It is a static native method.

    /**
     * Returns the same hash code for the given object as
     * would be returned by the default method hashCode(),
     * whether or not the given object's class overrides
     * hashCode().
     * The hash code for the null reference is zero.
     *
     * @param x object for which the hashCode is to be calculated
     * @return  the hashCode
     * @since   JDK1.1
     */
    public static native int identityHashCode(Object x);
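The two statements in this Javadoc can be checked with a minimal sketch. The class name IdentityHashDemo is hypothetical. String is used only because it is a familiar class that overrides hashCode().

```java
final class IdentityHashDemo {
    public static void main(String[] args) {
        // Per the Javadoc, the null reference hashes to zero.
        System.out.println(System.identityHashCode(null)); // prints 0

        // String overrides hashCode(), so equal contents give equal hash codes.
        String s1 = new String("abc");
        String s2 = new String("abc");
        System.out.println(s1.hashCode() == s2.hashCode()); // prints true

        // identityHashCode() ignores the override and works per object, so the
        // two distinct String objects will usually (not always) get different values.
        System.out.println(System.identityHashCode(s1) == System.identityHashCode(s2));
    }
}
```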

2: The native C implementation of System.identityHashCode() is in System.c, in the openjdk-6-src-b27-26_oct_2012/jdk/src/share/native/java/lang directory. It calls JVM_IHashCode().

JNIEXPORT jint JNICALL
Java_java_lang_System_identityHashCode(JNIEnv *env, jobject this, jobject x)
{
    return JVM_IHashCode(env, x);
}

3: JVM_IHashCode() is defined in openjdk-6-src-b27-26_oct_2012/hotspot/src/share/vm/prims/jvm.cpp. It returns 0 for a NULL handle and otherwise calls ObjectSynchronizer::FastHashCode().

// java.lang.Object ///////////////////////////////////////////////


JVM_ENTRY(jint, JVM_IHashCode(JNIEnv* env, jobject handle))
  JVMWrapper("JVM_IHashCode");
  // as implemented in the classic virtual machine; return 0 if object is NULL
  return handle == NULL ? 0 : ObjectSynchronizer::FastHashCode (THREAD, JNIHandles::resolve_non_null(handle)) ;
JVM_END

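4: ObjectSynchronizer::FastHashCode() is defined in openjdk-6-src-b27-26_oct_2012/hotspot/src/share/vm/runtime/synchronizer.cpp. It is the function that ultimately implements both hashCode() and identityHashCode(). The core code is shown below. As you can see, it is fairly involved and does not simply return the object's reference address: if a hash is already stored in the object header it is returned, and otherwise a new one is obtained from get_next_hash() and installed into the header.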

//
intptr_t ObjectSynchronizer::FastHashCode (Thread * Self, oop obj) {
  if (UseBiasedLocking) {
    // NOTE: many places throughout the JVM do not expect a safepoint
    // to be taken here, in particular most operations on perm gen
    // objects. However, we only ever bias Java instances and all of
    // the call sites of identity_hash that might revoke biases have
    // been checked to make sure they can handle a safepoint. The
    // added check of the bias pattern is to avoid useless calls to
    // thread-local storage.
    if (obj->mark()->has_bias_pattern()) {
      // Box and unbox the raw reference just in case we cause a STW safepoint.
      Handle hobj (Self, obj) ;
      // Relaxing assertion for bug 6320749.
      assert (Universe::verify_in_progress() ||
              !SafepointSynchronize::is_at_safepoint(),
             "biases should not be seen by VM thread here");
      BiasedLocking::revoke_and_rebias(hobj, false, JavaThread::current());
      obj = hobj() ;
      assert(!obj->mark()->has_bias_pattern(), "biases should be revoked by now");
    }
  }

  // hashCode() is a heap mutator ...
  // Relaxing assertion for bug 6320749.
  assert (Universe::verify_in_progress() ||
          !SafepointSynchronize::is_at_safepoint(), "invariant") ;
  assert (Universe::verify_in_progress() ||
          Self->is_Java_thread() , "invariant") ;
  assert (Universe::verify_in_progress() ||
         ((JavaThread *)Self)->thread_state() != _thread_blocked, "invariant") ;

  ObjectMonitor* monitor = NULL;
  markOop temp, test;
  intptr_t hash;
  markOop mark = ReadStableMark (obj);

  // object should remain ineligible for biased locking
  assert (!mark->has_bias_pattern(), "invariant") ;

  if (mark->is_neutral()) {
    hash = mark->hash();              // this is a normal header
    if (hash) {                       // if it has hash, just return it
      return hash;
    }
    hash = get_next_hash(Self, obj);  // allocate a new hash code
    temp = mark->copy_set_hash(hash); // merge the hash code into header
    // use (machine word version) atomic operation to install the hash
    test = (markOop) Atomic::cmpxchg_ptr(temp, obj->mark_addr(), mark);
    if (test == mark) {
      return hash;
    }
    // If atomic operation failed, we must inflate the header
    // into heavy weight monitor. We could add more code here
    // for fast path, but it does not worth the complexity.
  } else if (mark->has_monitor()) {
    monitor = mark->monitor();
    temp = monitor->header();
    assert (temp->is_neutral(), "invariant") ;
    hash = temp->hash();
    if (hash) {
      return hash;
    }
    // Skip to the following code to reduce code size
  } else if (Self->is_lock_owned((address)mark->locker())) {
    temp = mark->displaced_mark_helper(); // this is a lightweight monitor owned
    assert (temp->is_neutral(), "invariant") ;
    hash = temp->hash();              // by current thread, check if the displaced
    if (hash) {                       // header contains hash code
      return hash;
    }
    // WARNING:
    //   The displaced header is strictly immutable.
    // It can NOT be changed in ANY cases. So we have
    // to inflate the header into heavyweight monitor
    // even the current thread owns the lock. The reason
    // is the BasicLock (stack slot) will be asynchronously
    // read by other threads during the inflate() function.
    // Any change to stack may not propagate to other threads
    // correctly.
  }

  // Inflate the monitor to set hash code
  monitor = ObjectSynchronizer::inflate(Self, obj);
  // Load displaced header and check it has hash code
  mark = monitor->header();
  assert (mark->is_neutral(), "invariant") ;
  hash = mark->hash();
  if (hash == 0) {
    hash = get_next_hash(Self, obj);
    temp = mark->copy_set_hash(hash); // merge hash code into header
    assert (temp->is_neutral(), "invariant") ;
    test = (markOop) Atomic::cmpxchg_ptr(temp, monitor, mark);
    if (test != mark) {
      // The only update to the header in the monitor (outside GC)
      // is install the hash code. If someone add new usage of
      // displaced header, please update this code
      hash = test->hash();
      assert (test->is_neutral(), "invariant") ;
      assert (hash != 0, "Trivial unexpected object/monitor header usage.");
    }
  }
  // We finally get the hash
  return hash;
}
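The fast path of the function above (return the stored hash if there is one, otherwise generate a value and install it atomically) can be sketched in Java. This is only an illustration under stated assumptions: the class LazyIdentityHash is hypothetical, an AtomicInteger stands in for the hash bits of the object header, and nextHash() is a random stand-in for get_next_hash(), whose body is not shown in this article. Where HotSpot inflates the monitor after a failed compare-and-swap, the sketch simply rereads the value installed by the winning thread.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

final class LazyIdentityHash {
    // Stand-in for the header's hash bits; 0 means "no hash assigned yet".
    private final AtomicInteger header = new AtomicInteger(0);

    int identityHash() {
        int hash = header.get();
        if (hash != 0) {
            return hash; // a hash was already installed, just return it
        }
        int candidate = nextHash();
        // Install the hash only if no other thread has done so in the meantime.
        if (header.compareAndSet(0, candidate)) {
            return candidate;
        }
        // Another thread won the race; its value is now the object's hash.
        return header.get();
    }

    // Hypothetical generator: any non-zero value works for this sketch.
    private static int nextHash() {
        return ThreadLocalRandom.current().nextInt(1, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        LazyIdentityHash object = new LazyIdentityHash();
        int first = object.identityHash();
        int second = object.identityHash();
        System.out.println(first == second); // prints true
    }
}
```

The point of the design is that the value is chosen once, on first request, and then stays fixed for the lifetime of the object, which is what the hashCode contract needs.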

5: The hashCode() method of java.lang.Object (Object.java, in the openjdk-6-src-b27-26_oct_2012/jdk/src/share/classes/java/lang directory) is shown below. It is a native method.

    /**
     * Returns a hash code value for the object. This method is
     * supported for the benefit of hashtables such as those provided by
     * <code>java.util.Hashtable</code>.
     * <p>
     * The general contract of <code>hashCode</code> is:
     * <ul>
     * <li>Whenever it is invoked on the same object more than once during
     *     an execution of a Java application, the <tt>hashCode</tt> method
     *     must consistently return the same integer, provided no information
     *     used in <tt>equals</tt> comparisons on the object is modified.
     *     This integer need not remain consistent from one execution of an
     *     application to another execution of the same application.
     * <li>If two objects are equal according to the <tt>equals(Object)</tt>
     *     method, then calling the <code>hashCode</code> method on each of
     *     the two objects must produce the same integer result.
     * <li>It is <em>not</em> required that if two objects are unequal
     *     according to the {@link java.lang.Object#equals(java.lang.Object)}
     *     method, then calling the <tt>hashCode</tt> method on each of the
     *     two objects must produce distinct integer results.  However, the
     *     programmer should be aware that producing distinct integer results
     *     for unequal objects may improve the performance of hashtables.
     * </ul>
     * <p>
     * As much as is reasonably practical, the hashCode method defined by
     * class <tt>Object</tt> does return distinct integers for distinct
     * objects. (This is typically implemented by converting the internal
     * address of the object into an integer, but this implementation
     * technique is not required by the
     * Java<font size="-2"><sup>TM</sup></font> programming language.)
     *
     * @return  a hash code value for this object.
     * @see     java.lang.Object#equals(java.lang.Object)
     * @see     java.util.Hashtable
     */
    public native int hashCode();
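The contract in this Javadoc matters as soon as a class overrides equals(). A minimal sketch, with a hypothetical GridPoint class: equal objects must report equal hash codes, so hashCode() is derived from the same fields that equals() compares.

```java
import java.util.Objects;

final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof GridPoint)) {
            return false;
        }
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    // Built from the same fields as equals(), so equal points share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        GridPoint a = new GridPoint(1, 2);
        GridPoint b = new GridPoint(1, 2);
        System.out.println(a.equals(b));                  // prints true
        System.out.println(a.hashCode() == b.hashCode()); // prints true
    }
}
```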

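6: The native C implementation of Object.hashCode() is in Object.c, in the openjdk-6-src-b27-26_oct_2012/jdk/src/share/native/java/lang directory. Its method table binds hashCode to the same JVM_IHashCode() function, which confirms once more the point made in "The relationship between hashCode and identityHashCode".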

static JNINativeMethod methods[] = {
    {"hashCode",    "()I",                    (void *)&JVM_IHashCode},
    {"wait",        "(J)V",                   (void *)&JVM_MonitorWait},
    {"notify",      "()V",                    (void *)&JVM_MonitorNotify},
    {"notifyAll",   "()V",                    (void *)&JVM_MonitorNotifyAll},
    {"clone",       "()Ljava/lang/Object;",   (void *)&JVM_Clone},
};
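Because both entry points end in JVM_IHashCode(), the two values should agree for any object whose class does not override hashCode(). A minimal sketch with hypothetical class names:

```java
final class DefaultHashDemo {
    // Does not override hashCode(), so Object's native implementation is used.
    static final class Plain { }

    // Overrides hashCode(), but can still reach Object's version through super.
    static final class Custom {
        @Override
        public int hashCode() {
            return 42;
        }

        int objectHashCode() {
            return super.hashCode();
        }
    }

    public static void main(String[] args) {
        Plain plain = new Plain();
        System.out.println(plain.hashCode() == System.identityHashCode(plain)); // prints true

        Custom custom = new Custom();
        System.out.println(custom.hashCode()); // prints 42
        System.out.println(custom.objectHashCode() == System.identityHashCode(custom)); // prints true
    }
}
```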

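7: The step-by-step analysis above shows how hashCode and identityHashCode are generated under the hood. A few supplementary points follow.

7-1: What is a native method?

A native method is a special method written in a native programming language such as C or C++. In Java it is declared with the native keyword, and the Java Native Interface (JNI) is what allows a Java application to call it.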
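Whether a method is declared native can be checked from Java itself through reflection. The sketch below uses a hypothetical helper class, NativeCheckDemo. The results noted in the comments are what OpenJDK reports; other runtimes may declare these methods differently.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class NativeCheckDemo {
    // Returns true if the named public method is declared with the native keyword.
    static boolean isNative(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            Method method = owner.getMethod(name, parameterTypes);
            return Modifier.isNative(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + name, e);
        }
    }

    public static void main(String[] args) {
        // Both methods discussed in this article are native on OpenJDK.
        System.out.println(isNative(Object.class, "hashCode"));
        System.out.println(isNative(System.class, "identityHashCode", Object.class));

        // String.hashCode() is written in Java on OpenJDK, so it is not native.
        System.out.println(isNative(String.class, "hashCode"));
    }
}
```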

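7-2: What characterizes a native method?

A native method can perform arbitrary computation in the native language and then return to the Java program.

7-3: What are native methods used for?

Historically, native methods have had three main uses.

1) Providing access to platform-specific facilities, such as the registry and file locks.

2) Providing access to libraries of legacy code, which in turn give access to legacy data.

3) Writing the performance-critical parts of an application in a native language to improve performance.

7-4: What are the advantages and disadvantages of native methods?

In general, native methods should be used with great care, because a single bug in the native code can corrupt the entire application.

The advantages of native code are better performance and access to platform-specific facilities.

The disadvantages of native code are:

1) Native languages are not safe, so an application that uses native methods is no longer immune to memory corruption errors.

2) Native languages are platform dependent, so an application that uses native methods is no longer freely portable.

3) Applications that use native methods are harder to debug.

4) Entering and leaving native code has a fixed cost, so a native method that does only a small amount of work may actually reduce performance.

5) Native methods that need "glue code" are tedious to write and hard to read.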

8: References

"How is the underlying algorithm for the hashCode() value of an Object implemented in Java?"

Effective Java, Second Edition (Chinese translation)