MethodHandle vs. Reflection

Method Handles and Reflection

Unless stated otherwise, all code in this article was run on JDK 1.8.0_221.

Getting Started with Method Handles

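As we all know, reflection lets us access the members of a class at run time, which greatly increases the dynamism of the Java language. Reflection also has a reputation for being slow, yet before JDK 7 there was no alternative if we wanted that dynamism. JDK 7 added a new set of APIs that gives us another option: Method Handles.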

What is a Method Handle?

Quoting the JDK documentation:

A method handle is a typed, directly executable reference to an underlying method, constructor, field, or similar low-level operation, with optional transformations of arguments or return values.

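A method handle is a typed, directly executable reference to an underlying method, constructor, field, or similar operation. You can think of it roughly as a function pointer: it is a lower-level mechanism for looking up, adapting, and invoking methods.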

How do you use a Method Handle?

First we need a Lookup. A lookup is a factory for creating method handles. Again quoting the JDK documentation:

A lookup object is a factory for creating method handles, when the creation requires access checking. Method handles do not perform access checks when they are called, but rather when they are created.
Therefore, method handle access restrictions must be enforced when a method handle is created.

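This already points to one way method handles differ from reflection: the access check is done once, when the handle is created, whereas reflection performs it at call time. We will come back to this when comparing the two.

Choose a lookup according to the access modifier of the target method: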

// Access public methods (of public classes)
MethodHandles.Lookup lookup1 = MethodHandles.publicLookup();
// Access private and protected methods that the calling class itself is allowed to access
MethodHandles.Lookup lookup2 = MethodHandles.lookup();
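To make the "checked at creation time" point concrete, here is a minimal sketch. The class LookupAccessDemo and its private method secret() are hypothetical names invented for this illustration. It tries to create a handle for the same private method with both kinds of lookup: the public lookup is rejected before any call is made, while the class's own lookup succeeds.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class LookupAccessDemo {
    private static final MethodType NO_ARGS_TO_STRING = MethodType.methodType(String.class);

    // Hypothetical private method, present only for this illustration.
    private static String secret() {
        return "secret";
    }

    /** Tries to create a handle for secret() and reports whether the lookup allowed it. */
    private static boolean canFind(MethodHandles.Lookup lookup) {
        try {
            lookup.findStatic(LookupAccessDemo.class, "secret", NO_ARGS_TO_STRING);
            return true;
        } catch (IllegalAccessException e) {
            // Rejected at creation time; no call was ever attempted.
            return false;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static boolean publicLookupCanFindSecret() {
        return canFind(MethodHandles.publicLookup());
    }

    public static boolean ownLookupCanFindSecret() {
        // MethodHandles.lookup() captures the access rights of the class that calls it.
        return canFind(MethodHandles.lookup());
    }

    public static String callSecret() {
        try {
            MethodHandle mh = MethodHandles.lookup()
                    .findStatic(LookupAccessDemo.class, "secret", NO_ARGS_TO_STRING);
            return (String) mh.invokeExact(); // no access check happens at this point
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(publicLookupCanFindSecret()); // false
        System.out.println(ownLookupCanFindSecret());    // true
        System.out.println(callSecret());                // secret
    }
}
```

Because the check belongs to the lookup rather than to the call, a handle created by a privileged lookup can be handed to other code, which can then invoke it without any further check.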

Next we need a MethodType, which describes the parameter types and return type of the target method. Quoting the Javadoc of the MethodType class:

A method type represents the arguments and return type accepted and returned by a method handle, or the arguments and return type passed and expected by a method handle caller. Method types must be properly matched between a method handle and all its callers, and the JVM's operations enforce this matching at, specifically during calls to MethodHandle.invokeExact and MethodHandle.invoke, and during execution of invokedynamic instructions.

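As you can see, the JVM enforces that the method type used at the call site matches the method type of the handle.

A MethodType is easy to create with the static factory methods of MethodType: pass the return type first, followed by the parameter types.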

// Using String.length() as the example (a void method works the same way: pass void.class as the return type)
MethodType mt = MethodType.methodType(int.class);
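As a small illustration of how a MethodType is put together and what it looks like when printed, the sketch below builds the types for String.length() and Integer.sum(int, int). The class name MethodTypeDemo is made up for this example.

```java
import java.lang.invoke.MethodType;

public class MethodTypeDemo {
    /** Type of String.length(): no parameters, returns int. */
    public static MethodType lengthType() {
        return MethodType.methodType(int.class);
    }

    /** Type of Integer.sum(int, int): the return type comes first, then the parameters. */
    public static MethodType sumType() {
        return MethodType.methodType(int.class, int.class, int.class);
    }

    public static void main(String[] args) {
        MethodType sum = sumType();
        System.out.println(sum);                            // (int,int)int
        System.out.println(sum.returnType());               // int
        System.out.println(sum.parameterCount());           // 2
        System.out.println(sum.toMethodDescriptorString()); // (II)I
        System.out.println(lengthType());                   // ()int
    }
}
```

The printed form, (int,int)int, is the same notation that appears in WrongMethodTypeException messages, so it is worth getting used to.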

Then we create the MethodHandle through the lookup.

Accessing an instance method

// Continuing with the length method from above
MethodHandle methodHandle = lookup1.findVirtual(String.class, "length", mt);
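A self-contained version of the snippet above might look like the following sketch (the class name FindVirtualDemo is hypothetical). Note that the MethodType passed to findVirtual does not mention the receiver, but the resulting handle takes the receiver as its first argument.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class FindVirtualDemo {
    private static MethodHandle lengthHandle() throws ReflectiveOperationException {
        return MethodHandles.publicLookup()
                .findVirtual(String.class, "length", MethodType.methodType(int.class));
    }

    /** The receiver is prepended as the first parameter, so the type is (String)int. */
    public static String handleType() {
        try {
            return lengthHandle().type().toString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static int lengthViaHandle(String s) {
        try {
            // invokeExact needs the call site to be exactly (String)int, hence the (int) cast.
            return (int) lengthHandle().invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(handleType());             // (String)int
        System.out.println(lengthViaHandle("hello")); // 5
    }
}
```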

Accessing a static method

// Using String.valueOf(Object) as the example
public static String valueOf(Object obj) {
    return (obj == null) ? "null" : obj.toString();
}

MethodType mt2 = MethodType.methodType(String.class,Object.class);
MethodHandle valueOf = lookup1.findStatic(String.class, "valueOf", mt2);
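For completeness, here is a runnable sketch of the static case (FindStaticDemo is a made-up class name). With invokeExact the static type of the argument must be Object, because that is the declared parameter type of String.valueOf(Object).

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class FindStaticDemo {
    public static String valueOfViaHandle(Object obj) {
        try {
            MethodHandle valueOf = MethodHandles.publicLookup().findStatic(
                    String.class, "valueOf", MethodType.methodType(String.class, Object.class));
            // Call site type is (Object)String, which matches the handle exactly.
            return (String) valueOf.invokeExact(obj);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(valueOfViaHandle(42));   // 42
        System.out.println(valueOfViaHandle(null)); // null
    }
}
```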

Accessing a constructor

MethodType mt3= MethodType.methodType(void.class,String.class);
MethodHandle string = lookup1.findConstructor(String.class, mt3);
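The constructor case has one surprise: the MethodType passed to findConstructor uses void.class as its return type, yet the handle you get back returns the newly created object. A minimal sketch, with the hypothetical class name FindConstructorDemo:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class FindConstructorDemo {
    public static String copyViaHandle(String original) {
        try {
            // Describe the constructor itself: it takes a String and "returns" void.
            MethodHandle ctor = MethodHandles.publicLookup().findConstructor(
                    String.class, MethodType.methodType(void.class, String.class));
            // The handle's own type is (String)String: it returns the new instance.
            return (String) ctor.invokeExact(original);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        String original = "abc";
        String copy = copyViaHandle(original);
        System.out.println(copy.equals(original)); // true
        System.out.println(copy == original);      // false, new String(...) creates a new object
    }
}
```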

Accessing a private method

// Using String.checkBounds as the example
private static void checkBounds(byte[] bytes, int offset, int length) {
    if (length < 0)
        throw new StringIndexOutOfBoundsException(length);
    if (offset < 0)
        throw new StringIndexOutOfBoundsException(offset);
    if (offset > bytes.length - length)
        throw new StringIndexOutOfBoundsException(offset + length);
}

Method checkBounds = String.class.getDeclaredMethod("checkBounds", byte[].class, int.class, int.class);
checkBounds.setAccessible(true);
MethodHandle unreflect = lookup2.unreflect(checkBounds);
unreflect.invoke(new byte[]{},-1,-1);
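The checkBounds example depends on JDK 8 internals, so here is a sketch of the same technique on a user-defined class. Validator and inBounds are hypothetical names. The public lookup is used on purpose, to show that it is setAccessible(true) on the Method, not the lookup's own rights, that lets unreflect succeed.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;

public class UnreflectDemo {
    /** Hypothetical class whose private method we want to call. */
    static class Validator {
        private static boolean inBounds(int length, int offset, int count) {
            return offset >= 0 && count >= 0 && offset <= length - count;
        }
    }

    public static boolean callInBounds(int length, int offset, int count) {
        try {
            Method m = Validator.class.getDeclaredMethod(
                    "inBounds", int.class, int.class, int.class);
            m.setAccessible(true); // suppresses the access check for this Method object
            // unreflect skips its own access check when the accessible flag is set.
            MethodHandle mh = MethodHandles.publicLookup().unreflect(m);
            return (boolean) mh.invokeExact(length, offset, count);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callInBounds(10, 2, 3)); // true
        System.out.println(callInBounds(10, 8, 5)); // false
    }
}
```

On newer JDKs the original String example is likely to be blocked by module encapsulation, which is another reason to try the technique on your own classes first.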

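Accessing fields

I did not find a way to read another class's private field with findGetter alone in JDK 8. As with the private method above, calling Field.setAccessible(true) and then Lookup.unreflectGetter should work.

// Read a field of a user-defined class (value is package-private, so the caller's own lookup is used)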
MethodHandle value = lookup2.findGetter(A.class, "value", int.class);
int val = (int)value.invoke(new A(2));
System.out.println(val);

static class A{
    int value;
    A(int value){
        this.value=value;
    }
}
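The following sketch shows field access in both directions, plus one way to reach a private field. Box, visible and hidden are hypothetical names. The private case reuses the setAccessible approach from the private-method section, this time with unreflectGetter.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;

public class FieldHandleDemo {
    /** Hypothetical holder with one public and one private field. */
    public static class Box {
        public int visible;
        private int hidden;

        public Box(int visible, int hidden) {
            this.visible = visible;
            this.hidden = hidden;
        }
    }

    public static int writeThenReadVisible(Box box, int newValue) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.publicLookup();
            MethodHandle setter = lookup.findSetter(Box.class, "visible", int.class); // (Box,int)void
            MethodHandle getter = lookup.findGetter(Box.class, "visible", int.class); // (Box)int
            setter.invokeExact(box, newValue);
            return (int) getter.invokeExact(box);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static int readHidden(Box box) {
        try {
            Field f = Box.class.getDeclaredField("hidden");
            f.setAccessible(true); // without this, unreflectGetter would check access
            MethodHandle getter = MethodHandles.publicLookup().unreflectGetter(f);
            return (int) getter.invokeExact(box);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Box box = new Box(1, 2);
        System.out.println(writeThenReadVisible(box, 5)); // 5
        System.out.println(readHidden(box));              // 2
    }
}
```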

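The last step is invoking the Method Handle

There are three ways to invoke a handle, which differ in how strictly they constrain the number and types of the arguments: invokeWithArguments(), invoke(), and invokeExact().

invokeWithArguments is the most lenient: it takes a variable-length argument list and allows boxing, unboxing, and type conversion.
invoke comes second: the argument list is fixed at the call site, but boxing, unboxing, and type conversion are still allowed.
invokeExact is the strictest: it allows no conversion at all, and any mismatch between the call-site types and the handle's type throws WrongMethodTypeException.

Examples: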

// invokeWithArguments: the arguments are collected into the varargs array
MethodType mt5 = MethodType.methodType(List.class, Object[].class);
MethodHandle asList = lookup1.findStatic(Arrays.class, "asList", mt5);
List<Integer> integers = (List<Integer>) asList.invokeWithArguments(1, 2);
System.out.println(integers);

// invokeExact: fails immediately because the second argument is a long
MethodType mt = MethodType.methodType(int.class, int.class, int.class);
MethodHandle sumMH = lookup1.findStatic(Integer.class, "sum", mt);
int sum = (int) sumMH.invokeExact(1, 1L);
System.out.println(sum);

//Exception in thread "main" java.lang.invoke.WrongMethodTypeException: expected (int,int)int but found (int,long)int
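The three invocation styles can be compared side by side on one handle. This is a sketch with made-up names (InvokeKindsDemo and its methods), all operating on a handle for Integer.sum(int, int).

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

public class InvokeKindsDemo {
    private static final MethodHandle SUM = findSum();

    private static MethodHandle findSum() {
        try {
            return MethodHandles.publicLookup().findStatic(Integer.class, "sum",
                    MethodType.methodType(int.class, int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** invokeExact: the call site is exactly (int,int)int. */
    public static int exact() {
        try {
            return (int) SUM.invokeExact(1, 2);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** invoke: the call site (Integer,short)long is adapted by unboxing and widening. */
    public static long lenient() {
        try {
            return (long) SUM.invoke(Integer.valueOf(1), (short) 2);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** invokeWithArguments: Object... in, Object out. */
    public static Object withArguments() {
        try {
            return SUM.invokeWithArguments(1, 2);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** invokeExact with a long argument: the call site (int,long)int is rejected. */
    public static boolean exactRejectsLong() {
        try {
            int ignored = (int) SUM.invokeExact(1, 2L);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(exact());            // 3
        System.out.println(lenient());          // 3
        System.out.println(withArguments());    // 3
        System.out.println(exactRejectsLong()); // true
    }
}
```

Keep in mind that for invokeExact the cast on the result is part of the call-site type: dropping the (int) cast changes the call site to (int,int)Object, which no longer matches.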

Method Handle vs. reflection: performance comparison

The test program:

package com.hustdj.jdkStudy.methodHandle;

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class MethodHandleVsReflect {
    public static void main(String[] args)  {
        MethodHandleVsReflect methodHandleVsReflect = new MethodHandleVsReflect();
        try {
            methodHandleVsReflect.testDirect();
            methodHandleVsReflect.testReflect();
            methodHandleVsReflect.testMethodHandle();
        } catch (Throwable throwable) {
            throwable.printStackTrace();
        }
    }

    public void testDirect(){
        A a = new A();
        B b = new B();
        // Warm-up
        for (long i = 0; i < 100_0000_0000L; i++) {
            if ((i&1)==0){
                a.count(1);
            }else{
                b.count(1);
            }
        }
        // Timed run
        long startNano=System.nanoTime();
        for (long i = 0; i < 100_0000_0000L; i++) {
            if ((i&1)==0){
                a.count(1);
            }else{
                b.count(1);
            }
        }
        System.out.format("Result: a: %d b: %d",a.i,b.i);
        double average = (System.nanoTime() - startNano) / 100_0000_0000.0;
        System.out.println("Direct call average time (ns): "+average);
    }

    public void testReflect() throws NoSuchMethodException, InvocationTargetException, IllegalAccessException {
        // Look up the methods via reflection
        Method countA = A.class.getMethod("count", Integer.class);
        Method countB = B.class.getMethod("count",Integer.class);
        A a = new A();
        B b = new B();
        // Warm-up
        for (long i = 0; i < 100_0000_0000L; i++) {
            if ((i&1)==1){
                countA.invoke(a,1);
            }else{
                countB.invoke(b,1);
            }
        }
        // Timed run
        long startNano=System.nanoTime();
        for (long i = 0; i < 100_0000_0000L; i++) {
            if ((i&1)==1){
                countA.invoke(a,1);
            }else{
                countB.invoke(b,1);
            }
        }
        System.out.format("Result: a: %d b: %d",a.i,b.i);
        double average = (System.nanoTime() - startNano) / 100_0000_0000.0;
        System.out.println("Reflection average time (ns): "+average);
    }

    public void testMethodHandle() throws Throwable {
        A a = new A();
        B b = new B();
        MethodHandles.Lookup publicLookup = MethodHandles.publicLookup();
        MethodType mt = MethodType.methodType(void.class,Integer.class);
        MethodHandle countA = publicLookup.findVirtual(A.class, "count", mt);
        MethodHandle countB = publicLookup.findVirtual(B.class, "count", mt);
        Integer int_1 = new Integer(1);
        // Warm-up
        for (long i = 0; i < 100_0000_0000L; i++) {
            if ((i&1)==1){
                countA.invoke(a,1);
            }else{
                countB.invoke(b,1);
            }
        }
        // Timed run
        long startNano=System.nanoTime();
        for (long i = 0; i < 100_0000_0000L; i++) {
            if ((i&1)==1){
                countA.invoke(a,1);
            }else{
                countB.invoke(b,1);
            }
        }
        System.out.format("Result: a: %d b: %d",a.i,b.i);
        double average = (System.nanoTime() - startNano) / 100_0000_0000.0;
        System.out.println("methodHandle average time (ns): "+average);
    }

    public class A{
        long i=0;
        public void count(Integer a){
            i++;
        }
    }

    public class B{
        long i=0;
        public void count(Integer a){
            i++;
        }
    }
}

Test results:

//1
Result: a: 10000000000 b: 10000000000Direct call average time (ns): 1.20156184
Result: a: 10000000000 b: 10000000000Reflection average time (ns): 3.05123386
Result: a: 10000000000 b: 10000000000methodHandle average time (ns): 7.09741082

//2
Result: a: 10000000000 b: 10000000000Direct call average time (ns): 1.02183322
Result: a: 10000000000 b: 10000000000Reflection average time (ns): 3.44056289
Result: a: 10000000000 b: 10000000000methodHandle average time (ns): 6.08221384

//3
Result: a: 10000000000 b: 10000000000Direct call average time (ns): 1.22246659
Result: a: 10000000000 b: 10000000000Reflection average time (ns): 3.41759047
Result: a: 10000000000 b: 10000000000methodHandle average time (ns): 5.90614517

Running the JVM with the flags

-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining

shows that the reflective Method.invoke calls have been inlined:

@ 147 java.lang.reflect.Method::invoke (62 bytes) inline (hot)
@ 15 sun.reflect.Reflection::quickCheckMemberAccess (10 bytes) inline (hot)
@ 1 sun.reflect.Reflection::getClassAccessFlags (0 bytes) (intrinsic)
@ 6 java.lang.reflect.Modifier::isPublic (12 bytes) inline (hot)
@ 56 sun.reflect.DelegatingMethodAccessorImpl::invoke (10 bytes) inline (hot)
-> TypeProfile (18366/18366 counts) = sun/reflect/DelegatingMethodAccessorImpl
! @ 6 sun.reflect.GeneratedMethodAccessor2::invoke (66 bytes) inline (hot)
! @ 6 sun.reflect.GeneratedMethodAccessor1::invoke (66 bytes) inline (hot)
-> TypeProfile (5296/10593 counts) = sun/reflect/GeneratedMethodAccessor1
-> TypeProfile (5297/10593 counts) = sun/reflect/GeneratedMethodAccessor2
@ 40 com.hustdj.jdkStudy.methodHandle.MethodHandleVsReflect$A::count (11 bytes) inline (hot)

@ 168 java.lang.reflect.Method::invoke (62 bytes) inline (hot)
@ 15 sun.reflect.Reflection::quickCheckMemberAccess (10 bytes) inline (hot)
@ 1 sun.reflect.Reflection::getClassAccessFlags (0 bytes) (intrinsic)
@ 6 java.lang.reflect.Modifier::isPublic (12 bytes) inline (hot)
@ 56 sun.reflect.DelegatingMethodAccessorImpl::invoke (10 bytes) inline (hot)
-> TypeProfile (18366/18366 counts) = sun/reflect/DelegatingMethodAccessorImpl
! @ 6 sun.reflect.GeneratedMethodAccessor2::invoke (66 bytes) inline (hot)
! @ 6 sun.reflect.GeneratedMethodAccessor1::invoke (66 bytes) inline (hot)
-> TypeProfile (5296/10593 counts) = sun/reflect/GeneratedMethodAccessor1
-> TypeProfile (5297/10593 counts) = sun/reflect/GeneratedMethodAccessor2
@ 40 com.hustdj.jdkStudy.methodHandle.MethodHandleVsReflect$B::count (11 bytes) inline (hot)

Is this as fast as method handles get?

No. By storing the method handles in static final fields, we can even match the speed of a direct call:

static final MethodHandles.Lookup publicLookup = MethodHandles.publicLookup();
static final MethodType mt = MethodType.methodType(void.class,Integer.class);
private static final MethodHandle countA=getCountA();
private static final MethodHandle countB=getCountB();
private static MethodHandle getCountA(){
    try {
        return publicLookup.findVirtual(A.class, "count", mt);
    } catch (Throwable e) {
        e.printStackTrace();
        return null;
    }
}

private static MethodHandle getCountB(){
    try {
        return publicLookup.findVirtual(B.class, "count", mt);
    } catch (Throwable e) {
        e.printStackTrace();
        return null;
    }
}

Results:

//1
Result: a: 10000000000 b: 10000000000Direct call average time (ns): 0.98582866
Result: a: 10000000000 b: 10000000000Reflection average time (ns): 3.34687653
Result: a: 10000000000 b: 10000000000methodHandle average time (ns): 1.17300033
//2
Result: a: 10000000000 b: 10000000000Direct call average time (ns): 0.97544322
Result: a: 10000000000 b: 10000000000Reflection average time (ns): 3.0777855
Result: a: 10000000000 b: 10000000000methodHandle average time (ns): 0.95403012
//3
Result: a: 10000000000 b: 10000000000Direct call average time (ns): 1.24535073
Result: a: 10000000000 b: 10000000000Reflection average time (ns): 3.53959802
Result: a: 10000000000 b: 10000000000methodHandle average time (ns): 0.97747171

Amazing! Simply turning the local variables into static final fields brings the speed right up to that of a direct call. Why?

Let's compare the inlining logs with and without static final.

Log with static final

com.hustdj.jdkStudy.methodHandle.MethodHandleVsReflect::testMethodHandle @ 33 (200 bytes)
@ 47 java.lang.invoke.LambdaForm$MH/1554547125::invokeExact_MT (21 bytes) force inline by annotation
...
@ 17 com.hustdj.jdkStudy.methodHandle.MethodHandleVsReflect$A::count (11 bytes) inline (hot)
...
@ 17 com.hustdj.jdkStudy.methodHandle.MethodHandleVsReflect$B::count (11 bytes) inline (hot)

Here the count methods are inlined, and they are inlined directly into the outermost testMethodHandle method, unlike the inlining we saw with Method.invoke().

Log without static final

com.hustdj.jdkStudy.methodHandle.MethodHandleVsReflect::testMethodHandle @ 126 (239 bytes)
@ 140 java.lang.invoke.LambdaForm$MH/1554547125::invokeExact_MT (21 bytes) force inline by annotation
@ 2 java.lang.invoke.Invokers::checkExactType (30 bytes) force inline by annotation
@ 11 java.lang.invoke.MethodHandle::type (5 bytes) accessor
@ 6 java.lang.invoke.Invokers::checkCustomized (20 bytes) force inline by annotation
@ 17 java.lang.invoke.MethodHandle::invokeBasic(LL)V (0 bytes) receiver not constant
@ 151 java.lang.invoke.LambdaForm$MH/1554547125::invokeExact_MT (21 bytes) force inline by annotation
@ 2 java.lang.invoke.Invokers::checkExactType (30 bytes) force inline by annotation
@ 11 java.lang.invoke.MethodHandle::type (5 bytes) accessor
@ 6 java.lang.invoke.Invokers::checkCustomized (20 bytes) force inline by annotation
@ 17 java.lang.invoke.MethodHandle::invokeBasic(LL)V (0 bytes) receiver not constant

No inlining of count appears here. So we have found the reason for the big jump in Method Handle performance: the two count methods get inlined.
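The pattern that produced the fast numbers can be reduced to a self-contained sketch. ConstantHandleDemo and Counter are hypothetical names, and this is not a benchmark: it only shows the shape of the code. Unlike the listing above, the initializer here fails fast rather than storing null in the final field, which is a design choice of this sketch and not something the original test did.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ConstantHandleDemo {
    /** Hypothetical target class, mirroring A and B from the test program. */
    public static class Counter {
        long i = 0;

        public void count(Integer a) {
            i++;
        }
    }

    // static final: the JIT can treat the handle as a constant and inline through it.
    private static final MethodHandle COUNT = findCount();

    private static MethodHandle findCount() {
        try {
            return MethodHandles.publicLookup().findVirtual(
                    Counter.class, "count", MethodType.methodType(void.class, Integer.class));
        } catch (ReflectiveOperationException e) {
            // Fail fast instead of leaving a null handle behind.
            throw new ExceptionInInitializerError(e);
        }
    }

    public static long run(int iterations) {
        Counter counter = new Counter();
        Integer one = 1;
        try {
            for (int k = 0; k < iterations; k++) {
                COUNT.invokeExact(counter, one); // call site type is (Counter,Integer)void
            }
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
        return counter.i;
    }

    public static void main(String[] args) {
        System.out.println(run(1_000_000)); // 1000000
    }
}
```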

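As for why this happens, I consulted this blog post: https://shipilev.net/jvm/anatomy-quarks/17-trust-nonstatic-final-fields/

The post is about non-static final fields, whereas ours are static final fields, which can be constant-folded directly without those complications. Still, one sentence from the post is worth quoting: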

Constant folding through these final fields is the corner-stone for performance story for MethodHandle-s, VarHandle-s, Atomic*FieldUpdaters and other high-performance implementations from the core library.

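Constant folding is the cornerstone of performance for MethodHandle, VarHandle, Atomic*FieldUpdater, and similar high-performance implementations. It works for method handles, but does it work for reflection? Let's test.

Modify the code as follows: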

static final Method countAReflect=getReflectA();
static final Method countBReflect=getReflectB();
private static Method getReflectA(){
    try {
        return A.class.getMethod("count", Integer.class);
    } catch (NoSuchMethodException e) {
        e.printStackTrace();
        return null;
    }
}
private static Method getReflectB(){
    try {
        return B.class.getMethod("count",Integer.class);
    } catch (NoSuchMethodException e) {
        e.printStackTrace();
        return null;
    }
}

Results:

//1
Result: a: 10000000000 b: 10000000000Direct call average time (ns): 1.22609891
Result: a: 10000000000 b: 10000000000Reflection average time (ns): 3.43359246
Result: a: 10000000000 b: 10000000000methodHandle average time (ns): 0.92919298
//2
Result: a: 10000000000 b: 10000000000Direct call average time (ns): 1.00782253
Result: a: 10000000000 b: 10000000000Reflection average time (ns): 3.60654
Result: a: 10000000000 b: 10000000000methodHandle average time (ns): 0.99133146

Reflection does not benefit! It probably takes a deeper look into the JVM to understand why Method Handle can reach direct-call performance.

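First clues

In theory:

A method handle is type-checked when it is created, whereas Method.invoke has to perform its checks on every call.
Method.invoke wraps its arguments in an array, so a new array has to be created for each call.
A method handle is fixed once it has been created, so MH.invoke() itself can be inlined. Every reflective call in the program goes through Method.invoke(), which makes it hard to inline into any particular caller.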
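The difference in call shape between the two APIs is visible in a few lines. In this sketch (CallShapeDemo is a made-up name), both paths call Integer.sum(int, int): the reflective path goes through Object... and returns a boxed Object, while the handle path keeps primitive types end to end. Whether the JIT later removes the boxing and the array is a separate question that this example does not measure.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

public class CallShapeDemo {
    public static Object viaReflection() {
        try {
            Method sum = Integer.class.getMethod("sum", int.class, int.class);
            // Method.invoke(Object, Object...): the ints are boxed and packed into
            // an Object[], and the int result comes back boxed as an Integer.
            return sum.invoke(null, 1, 2);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static int viaHandle() {
        try {
            MethodHandle sum = MethodHandles.publicLookup().findStatic(Integer.class, "sum",
                    MethodType.methodType(int.class, int.class, int.class));
            // The call site is typed (int,int)int: no array and no boxing in the signature.
            return (int) sum.invokeExact(1, 2);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Object boxed = viaReflection();
        System.out.println(boxed + " " + (boxed instanceof Integer)); // 3 true
        System.out.println(viaHandle());                              // 3
    }
}
```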

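In practice, however, method handle performance is disappointing: in the measurements above it was slower than reflection, leaving aside the static final approach. Why?

Because JDK 8 optimized reflection heavily. Running the same code on JDK 7 gives: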

//1
Result: a: 10000000000 b: 10000000000Direct call average time (ns): 0.97767317
Result: a: 10000000000 b: 10000000000Reflection average time (ns): 14.81999815
Result: a: 10000000000 b: 10000000000methodHandle average time (ns): 10.21145029
//2
Result: a: 10000000000 b: 10000000000Direct call average time (ns): 0.97137786
Result: a: 10000000000 b: 10000000000Reflection average time (ns): 14.79272622
Result: a: 10000000000 b: 10000000000methodHandle average time (ns): 9.30254589

The familiar slow reflection is back. Under JDK 1.7, Method Handle is indeed somewhat faster than reflection, although it is still slower than under JDK 8.

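Method.invoke() and MethodHandle.invoke() both end up in native code, so why can reflection be inlined?

The JDK designers use two strategies for Method.invoke(). The first is a native (C++) implementation, which is hard for the JIT to inline. The second kicks in once a method has been invoked reflectively more than a threshold number of times: bytecode generation is used to create a class in memory that calls the target method directly (I have not yet found a way to save this class to disk), and that class is loaded into the VM. These are the GeneratedMethodAccessor classes in the log above, and from that point on the call can be inlined. MethodHandle.invoke is declared native and has no such fallback, so in the local-variable test it was not inlined. As for why static final makes inlining possible, the log without static final gives a hint: inlining stops at MethodHandle::invokeBasic with "receiver not constant", so the JIT apparently needs the handle to be a constant before it can inline through it. Beyond that, I do not have a full explanation.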
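One way to watch the switch between the two strategies is to record, from inside the target method, which class is calling it. The sketch below does that; AccessorFrameDemo and its methods are hypothetical names. On JDK 8 the threshold is controlled by the sun.reflect.inflationThreshold system property, with a default of 15. The class names it prints depend on the JDK version, and newer JDKs implement reflection differently, so treat the output as something to explore rather than a fixed result.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashSet;
import java.util.Set;

public class AccessorFrameDemo {
    private static final Set<String> CALLERS = new LinkedHashSet<>();

    /** Target of the reflective call: records the class of the frame that invoked it. */
    public static void target() {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] is target() itself; stack[1] is whatever called it.
        CALLERS.add(stack[1].getClassName());
    }

    /** Calls target() reflectively and returns the distinct caller classes, in order. */
    public static Set<String> observeCallers(int calls) {
        CALLERS.clear();
        try {
            Method m = AccessorFrameDemo.class.getMethod("target");
            for (int k = 0; k < calls; k++) {
                m.invoke(null);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return new LinkedHashSet<>(CALLERS);
    }

    public static void main(String[] args) {
        // With enough calls, more than one accessor class may show up in the set.
        System.out.println(observeCallers(100));
    }
}
```

If two different class names are printed, the first is the accessor used before the threshold was crossed and the second is the one that replaced it.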

References

Further reading, including a method handle performance test done by someone else:
http://chriskirk.blogspot.com/2014/05/which-is-faster-in-java-reflection-or.html
https://www.iteye.com/blog/rednaxelafx-548536

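Aside

If you reduce the loop count to 10000, something odd happens: the direct call turns out to be the slowest. Try it yourself if you are interested. The compilation log shows "made not entrant", which means the JIT is deoptimizing. Here is another link on this topic: https://zhuanlan.zhihu.com/p/82118137
Discussion welcome!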