How Groovy Implements Code Hot-Reloading: Mechanism and Principles

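Preface:
It has been a long time since I last updated this blog. While I have some free time, I want to summarize some of the engineering knowledge I have picked up. I'll start with how to use Groovy to hot-reload code, the principles involved, and the points to watch out for.
Broadly speaking, Groovy is a dynamically compiled language whose role on the JVM resembles that of Lua in the C/C++ ecosystem, and it is very valuable in scenarios where business logic changes frequently.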

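Quick start:
The topic of this article is hot-reloading code with Groovy. The wider setting is that Java implements the core code while Groovy implements the logic that changes often. First, let's see how Java calls a Groovy script.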

import groovy.lang.Binding;
import groovy.lang.GroovyShell;

public class GroovyTest {

    public static void main(String[] args) {
        // *) Groovy source
        String script = "println 'hello'; 'name = ' + name;";

        // *) Pass in parameters
        Binding binding = new Binding();
        binding.setVariable("name", "lilei");

        // *) Run the script
        GroovyShell shell = new GroovyShell(binding);
        Object res = shell.evaluate(script);
        System.out.println(res);
    }

}

The output of this code is:

hello
name = lilei

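The Binding class is mainly used to pass a set of parameters, while GroovyShell compiles and executes the Groovy code. Simpler than you expected, isn't it? ^_^
Of course there are other ways for Java to call Groovy, which come up later in this article.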

How it works:
There is more going on in the following two lines than meets the eye.

GroovyShell shell = new GroovyShell(binding);
Object res = shell.evaluate(script);

Tracing into the evaluate method changes how you look at it.

    public Object evaluate(GroovyCodeSource codeSource) throws CompilationFailedException {
        Script script = this.parse(codeSource);
        return script.run();
    }

    public Script parse(GroovyCodeSource codeSource) throws CompilationFailedException {
        return InvokerHelper.createScript(this.parseClass(codeSource), this.context);
    }

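The general idea is to wrap the Groovy script code into a generated class, create an instance of that class, and then execute the logic it wraps.
But there is one thing to watch out for here: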

    public Class parseClass(String text) throws CompilationFailedException {
        return this.parseClass(text, "script" + System.currentTimeMillis() + Math.abs(text.hashCode()) + ".groovy");
    }

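For a Groovy script, it generates by default a class named script + System.currentTimeMillis() + Math.abs(text.hashCode()). In other words, every script passed in produces a new class; even for the same piece of Groovy script, every call generates a new class.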
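To make the naming scheme concrete, here is a minimal JDK-only sketch that copies the expression from the parseClass(String) overload quoted above. The class ScriptNameDemo and the method generatedName are hypothetical names used only for illustration, and the clock value is passed in as a parameter (an assumption made here so the effect is easy to see); this is not Groovy's own code.

```java
class ScriptNameDemo {

    // Same string expression as in the quoted parseClass(String) overload,
    // except that the timestamp is a parameter instead of System.currentTimeMillis().
    static String generatedName(String text, long nowMillis) {
        return "script" + nowMillis + Math.abs(text.hashCode()) + ".groovy";
    }

    public static void main(String[] args) {
        String script = "println 'hello'; 'name = ' + name;";
        // Identical script text, two different timestamps: the names differ
        // only in the timestamp part, because the hash part is the same.
        System.out.println(generatedName(script, 1000L));
        System.out.println(generatedName(script, 1001L));
    }
}
```

Because the timestamp is part of the name, the script text alone never determines the name, so the name cannot serve as a cache key for "have I compiled this script before".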

Assessing the pitfalls:
We now understand the basic principle, so let's construct a piece of code and see what pitfalls it hides.

import groovy.lang.Binding;
import groovy.lang.GroovyShell;
import groovy.lang.Script;

import java.util.Map;
import java.util.TreeMap;

public class GroovyTest2 {

    private static GroovyShell shell = new GroovyShell();

    public static Object handle(String script, Map<String, Object> params) {
        Binding binding = new Binding();
        for ( Map.Entry<String, Object> ent : params.entrySet() ) {
            binding.setVariable(ent.getKey(), ent.getValue());
        }
        Script sci = shell.parse(script);
        sci.setBinding(binding);
        return sci.run();
    }

    public static void main(String[] args) {
        String script = "println 'hello'; 'name = ' + name;";
        Map<String, Object> params = new TreeMap<String, Object>();
        params.put("name", "lilei");
        while(true) {
            handle(script, params);
        }
    }

}

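Run long enough, this code ends up triggering full GC frequently, and the root cause is that the PermGen area fills up (on Java 8 and later, class metadata lives in Metaspace instead). Why is that?
As analyzed above, even though the script text is the same, every call indirectly generates a new class. A full GC cleans the permanent generation (PermGen) along with the old generation, so why are these one-off classes not cleaned up? The answer is that the conditions for collecting them are not met.
For reference, a class can be collected only when these three conditions hold:
1). All instances of the class have been collected
2). The ClassLoader that loaded the class has been collected
3). The class's java.lang.Class object is not referenced anywhere
The ClassLoader instance that loads these classes is held by the GroovyShell, which is stored in a static variable (a GC root), so condition 2 does not hold. GroovyClassLoader also has a map member that caches compiled classes, so condition 3 does not hold either.
Some may ask: why not make the GroovyShell object a local variable?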

    public static Object handle(String script, Map<String, Object> params) {
        Binding binding = new Binding();
        for ( Map.Entry<String, Object> ent : params.entrySet() ) {
            binding.setVariable(ent.getKey(), ent.getValue());
        }
        GroovyShell shell = new GroovyShell();
        Script sci = shell.parse(script);
        sci.setBinding(binding);
        return sci.run();
    }

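In practice this treats the symptom rather than the cause. The classes can now be collected, but cleanup may not keep up with the rate at which they are produced, so full GC is still triggered frequently.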

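Recommended approach:
Solving the problem above is simple: introduce a cache. The cached object is not the Script instance (it is stateful, so in a multithreaded environment it would run into data corruption) but the Script class itself. The key is a fingerprint of the script source.
The code looks roughly like this: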

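    // Imports needed by this fragment:
    //   groovy.lang.Binding, groovy.lang.GroovyClassLoader, groovy.lang.Script,
    //   org.codehaus.groovy.runtime.InvokerHelper,
    //   java.security.MessageDigest, java.util.Map, java.util.concurrent.ConcurrentHashMap

    // Cache of compiled script classes, keyed by the fingerprint of the script source
    private static ConcurrentHashMap<String, Class<Script>> zlassMaps
            = new ConcurrentHashMap<String, Class<Script>>();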

    public static Object invoke(String scriptText, Map<String, Object> params) {
        String key = fingerKey(scriptText);
        Class<Script> script = zlassMaps.get(key);
        if ( script == null ) {
            synchronized (key.intern()) {
                // Double check: another thread may have compiled it while we waited for the lock
                script = zlassMaps.get(key);
                if ( script == null ) {
                    GroovyClassLoader classLoader = new GroovyClassLoader();
                    script = classLoader.parseClass(scriptText);
                    zlassMaps.put(key, script);
                }
            }
        }

        Binding binding = new Binding();
        for ( Map.Entry<String, Object> ent : params.entrySet() ) {
            binding.setVariable(ent.getKey(), ent.getValue());
        }
        Script scriptObj = InvokerHelper.createScript(script, binding);
        return scriptObj.run();

    }

    // *) Generate an MD5 fingerprint of the script source
    public static String fingerKey(String scriptText) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] bytes = md.digest(scriptText.getBytes("utf-8"));

            final char[] HEX_DIGITS = "0123456789ABCDEF".toCharArray();
            StringBuilder ret = new StringBuilder(bytes.length * 2);
            for (int i=0; i<bytes.length; i++) {
                ret.append(HEX_DIGITS[(bytes[i] >> 4) & 0x0f]);
                ret.append(HEX_DIGITS[bytes[i] & 0x0f]);
            }
            return ret.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

Here a separate GroovyClassLoader is created for each new class, which also neatly avoids the earlier pitfall.
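The compile-once-per-fingerprint idea can also be sketched with ConcurrentHashMap.computeIfAbsent instead of locking on an interned string. The sketch below is JDK-only so it runs without Groovy on the classpath: ScriptClassCache, classFor, and compileCount are hypothetical names, and the compiler is passed in as a Function, which is an assumption of this sketch. In the setting of this article, that function would presumably be something like text -> new GroovyClassLoader().parseClass(text).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class ScriptClassCache {

    private final ConcurrentHashMap<String, Class<?>> classes = new ConcurrentHashMap<>();
    private final Function<String, Class<?>> compiler; // stand-in for the real script compiler
    private final AtomicInteger compileCount = new AtomicInteger();

    ScriptClassCache(Function<String, Class<?>> compiler) {
        this.compiler = compiler;
    }

    // Returns the cached class for this script text, compiling it on first use only.
    Class<?> classFor(String scriptText) {
        return classes.computeIfAbsent(fingerprint(scriptText), key -> {
            compileCount.incrementAndGet();
            return compiler.apply(scriptText);
        });
    }

    // Number of times the compiler function has actually been invoked.
    int compileCount() {
        return compileCount.get();
    }

    // Upper-case hex MD5 of the script text, matching the fingerprint idea above.
    static String fingerprint(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02X", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

computeIfAbsent runs the mapping function at most once per key, so the double check is no longer written by hand. One trade-off to keep in mind is that the function runs while the map holds a lock on part of its table, so a slow compilation can briefly delay updates to other keys that land in the same bin.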

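Summary:
This article does not dig deeply into Java's class loading mechanism; it only touches on the preconditions for a class to be collected, and offers one approach to hot-reloading code with Groovy while avoiding the pitfalls involved.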