Notes on 自制编译器 (青木峰郎), Ch. 7: JavaCC Actions and the AST

7.1 Actions in JavaCC

A JavaCC rule can declare variables, assign to them, compute with them, and return a value.

type-of-returned-semantic-value nonterminal-name(parameter-list) :
{
      temporary variable declarations
}
{
      rule {action}
}

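Note that an action can appear in the middle of a symbol sequence. After the action has run, the parser carries on matching whatever symbols remain in the rule.
For example: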

// #@@range/defstruct{
// The return type is StructNode
StructNode defstruct():
{
// Declare temporary variables
    Token t;
    String n;
    List<Slot> membs;
}
{
    t=<STRUCT> n=name() membs=member_list() ";"
        {
// Return the semantic value
            return new StructNode(location(t), new StructTypeRef(n), n, membs);
        }
}

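Note that a token written after an action may well have been scanned already, because of an earlier lookahead, by the time the action runs.
The semantic value of a nonterminal is obtained with n=name(): this invokes the parsing method for name and assigns the semantic value it returns to n.
Likewise, t= assigns the Token matched by the terminal to t.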
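
To make the "n=name() is just a method call" point concrete, here is a hand-written sketch of roughly what such a rule amounts to in plain Java. It is not the code JavaCC generates: the class DefStructSketch, the StructInfo type, the consume helper, and the simplified member-list syntax are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DefStructSketch {
    /** Hypothetical, simplified result type standing in for StructNode. */
    static class StructInfo {
        final String name;
        final List<String> members;
        StructInfo(String name, List<String> members) {
            this.name = name;
            this.members = members;
        }
    }

    private final List<String> tokens;
    private int pos = 0;

    DefStructSketch(List<String> tokens) { this.tokens = tokens; }

    /** Matches a terminal: the next token must equal expected. */
    private String consume(String expected) {
        String t = tokens.get(pos);
        if (!t.equals(expected)) {
            throw new IllegalStateException("expected " + expected + " but found " + t);
        }
        pos++;
        return t;
    }

    /** name(): returns the identifier text as its semantic value. */
    private String name() { return tokens.get(pos++); }

    /** memberList(): "{" (type name ";")* "}", returning the member names. */
    private List<String> memberList() {
        List<String> members = new ArrayList<>();
        consume("{");
        while (!tokens.get(pos).equals("}")) {
            pos++;                     // skip the member's type (kept trivial here)
            members.add(name());
            consume(";");
        }
        consume("}");
        return members;
    }

    /** Mirrors: t=<STRUCT> n=name() membs=member_list() ";" { return ...; } */
    StructInfo defstruct() {
        consume("struct");                  // t=<STRUCT>
        String n = name();                  // n=name()
        List<String> membs = memberList();  // membs=member_list()
        consume(";");                       // ";"
        return new StructInfo(n, membs);    // the action at the end of the rule
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList(
            "struct", "point", "{", "int", "x", ";", "int", "y", ";", "}", ";");
        StructInfo s = new DefStructSketch(src).defstruct();
        System.out.println(s.name + " " + s.members);
    }
}
```

Each nonterminal becomes a method, its semantic value becomes the method's return value, and the action is ordinary Java that runs at the point where it is written in the rule.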

Terminals: Token


public class Token implements java.io.Serializable {

  /**
   * The version identifier for this Serializable class.
   * Increment only if the <i>serialized</i> form of the
   * class changes.
   */
  private static final long serialVersionUID = 1L;

  /**
   * An integer that describes the kind of this token.  This numbering
   * system is determined by JavaCCParser, and a table of these numbers is
   * stored in the file ...Constants.java.
   */
  public int kind;

  /** The line number of the first character of this Token. */
  public int beginLine;
  /** The column number of the first character of this Token. */
  public int beginColumn;
  /** The line number of the last character of this Token. */
  public int endLine;
  /** The column number of the last character of this Token. */
  public int endColumn;

  /**
   * The string image of the token.
   */
  public String image;

  /**
   * A reference to the next regular (non-special) token from the input
   * stream.  If this is the last token from the input stream, or if the
   * token manager has not read tokens beyond this one, this field is
   * set to null.  This is true only if this token is also a regular
   * token.  Otherwise, see below for a description of the contents of
   * this field.
   */
  public Token next;

  /**
   * This field is used to access special tokens that occur prior to this
   * token, but after the immediately preceding regular (non-special) token.
   * If there are no such special tokens, this field is set to null.
   * When there are more than one such special token, this field refers
   * to the last of these special tokens, which in turn refers to the next
   * previous special token through its specialToken field, and so on
   * until the first special token (whose specialToken field is null).
   * The next fields of special tokens refer to other special tokens that
   * immediately follow it (without an intervening regular token).  If there
   * is no such token, this field is null.
   */
  public Token specialToken;

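Note that for a line such as
int /*comment*/ main()
the " ", "/*comment*/", and " " that follow int are all special tokens (this assumes whitespace and comments are declared as SPECIAL_TOKEN). The tokens are then linked as follows:
Keyword_int.next = Identifier_main
Identifier_main.specialToken = the " " after the comment
(the " " after the comment).specialToken = Special_token_"/*comment*/"
Special_token_"/*comment*/".specialToken = the " " before the comment
(the " " before the comment).specialToken = null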
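
The chain can be walked backwards through specialToken to recover the skipped text in source order. The following is a minimal sketch that uses a hypothetical stand-in class Tok with only the three fields involved, rather than the generated Token class; the helper specialsBefore is likewise an illustrative name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SpecialTokenDemo {
    /** Hypothetical stand-in for the generated Token class (only the fields used here). */
    static class Tok {
        final String image;
        Tok next;
        Tok specialToken;
        Tok(String image) { this.image = image; }
    }

    /** Collects the images of the special tokens preceding t, in source order. */
    static List<String> specialsBefore(Tok t) {
        List<String> out = new ArrayList<>();
        // specialToken points at the LAST special token; follow the chain backwards.
        for (Tok s = t.specialToken; s != null; s = s.specialToken) {
            out.add(s.image);
        }
        Collections.reverse(out);
        return out;
    }

    /** Builds the token links for: int, space, comment, space, main. */
    static Tok buildMainToken() {
        Tok kwInt = new Tok("int");
        Tok space1 = new Tok(" ");
        Tok comment = new Tok("/*comment*/");
        Tok space2 = new Tok(" ");
        Tok mainId = new Tok("main");
        kwInt.next = mainId;              // regular tokens are linked via next
        mainId.specialToken = space2;     // last special token before main
        space2.specialToken = comment;
        comment.specialToken = space1;    // space1.specialToken stays null
        space1.next = comment;            // special tokens link forward via next
        comment.next = space2;
        return mainId;
    }

    public static void main(String[] args) {
        for (String image : specialsBefore(buildMainToken())) {
            System.out.println("[" + image + "]");
        }
    }
}
```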

Repetition and actions

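If the action is written before the * (or a similar repetition operator), that is, inside the repeated group, it runs once every time X is recognized.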

(X
{action}
)*

To run the action only once, after the whole repetition has finished, write it after the repetition operator:

(X)*{action}
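
In plain Java terms, the difference is whether the action sits inside the loop body or after the loop. The sketch below is a hand-written analogy, not generated code; the class and method names are made up for illustration, and the "action" is reduced to incrementing a counter.

```java
import java.util.Arrays;
import java.util.List;

public class RepetitionActionDemo {
    /** Mimics ( X {action} )* : the action runs once per matched X. */
    static int actionInsideLoop(List<String> tokens) {
        int actionRuns = 0;
        int pos = 0;
        while (pos < tokens.size() && tokens.get(pos).equals("X")) {
            pos++;          // match X
            actionRuns++;   // {action}
        }
        return actionRuns;
    }

    /** Mimics ( X )* {action} : the action runs once, after the loop ends. */
    static int actionAfterLoop(List<String> tokens) {
        int actionRuns = 0;
        int pos = 0;
        while (pos < tokens.size() && tokens.get(pos).equals("X")) {
            pos++;          // match X
        }
        actionRuns++;       // {action}
        return actionRuns;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("X", "X", "X");
        System.out.println(actionInsideLoop(input));
        System.out.println(actionAfterLoop(input));
    }
}
```

Note that in the second form the action still runs once even when X is matched zero times.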

7.2 The AST and its nodes

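Every AST node is a subclass of Node, and AST is the root node. The classes in the ast package are listed below; note that the list also contains supporting types that are not nodes themselves, such as ASTVisitor, Dumpable, Dumper, and Location.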

AbstractAssignNode
AddressNode
ArefNode
AssignNode
AST
ASTVisitor
BinaryOpNode
BlockNode
BreakNode
CaseNode
CastNode
CflatToken
CompositeTypeDefinition
CondExprNode
ContinueNode
Declarations
DeclarationVisitor
DereferenceNode
DoWhileNode
Dumpable
Dumper
ExprNode
ExprStmtNode
ForNode
FuncallNode
GotoNode
IfNode
IntegerLiteralNode
LabelNode
LHSNode
LiteralNode
Location
LogicalAndNode
LogicalOrNode
MemberNode
Node
OpAssignNode
PrefixOpNode
PtrMemberNode
ReturnNode
SizeofExprNode
SizeofTypeNode
Slot
StmtNode
StringLiteralNode
StructNode
SuffixOpNode
SwitchNode
TypeDefinition
TypedefNode
TypeNode
UnaryArithmeticOpNode
UnaryOpNode
UnionNode
VariableNode
WhileNode

The base class Node is declared as follows:

import java.io.PrintStream;

abstract public class Node implements Dumpable {
    public Node() {
    }

    abstract public Location location();

    public void dump() {
        dump(System.out);
    }

    public void dump(PrintStream s) {
        dump(new Dumper(s));
    }

    public void dump(Dumper d) {
        d.printClass(this, location());
        _dump(d);
    }

    abstract protected void _dump(Dumper d);
}

The dump methods are what allow cbc --dump-ast xxx.cb to print the AST.
For example:

root@cf43f429204e:/# cbc --dump-ast cbc-ubuntu-64bit/test/if1.cb 
<<AST>> (cbc-ubuntu-64bit/test/if1.cb:1)
variables:
functions:
    <<DefinedFunction>> (cbc-ubuntu-64bit/test/if1.cb:3)
    name: "main"
    isPrivate: false
    params:
        parameters:
            <<CBCParameter>> (cbc-ubuntu-64bit/test/if1.cb:4)
            name: "argc"
            typeNode: int
            <<CBCParameter>> (cbc-ubuntu-64bit/test/if1.cb:4)
            name: "argv"
            typeNode: char**
    body:
        <<BlockNode>> (cbc-ubuntu-64bit/test/if1.cb:5)
        variables:
        stmts:
            <<IfNode>> (cbc-ubuntu-64bit/test/if1.cb:6)
            cond:
                <<IntegerLiteralNode>> (cbc-ubuntu-64bit/test/if1.cb:6)
                typeNode: int
                value: 2
            thenBody:
                <<BlockNode>> (cbc-ubuntu-64bit/test/if1.cb:6)
                variables:
                stmts:
                    <<ExprStmtNode>> (cbc-ubuntu-64bit/test/if1.cb:7)
                    expr:
                        <<FuncallNode>> (cbc-ubuntu-64bit/test/if1.cb:7)
                        expr:
                            <<VariableNode>> (cbc-ubuntu-64bit/test/if1.cb:7)
                            name: "puts"
                        args:
                            <<StringLiteralNode>> (cbc-ubuntu-64bit/test/if1.cb:7)
                            value: "OK"
            elseBody:
                <<BlockNode>> (cbc-ubuntu-64bit/test/if1.cb:9)
                variables:
                stmts:
                    <<ExprStmtNode>> (cbc-ubuntu-64bit/test/if1.cb:10)
                    expr:
                        <<FuncallNode>> (cbc-ubuntu-64bit/test/if1.cb:10)
                        expr:
                            <<VariableNode>> (cbc-ubuntu-64bit/test/if1.cb:10)
                            name: "puts"
                        args:
                            <<StringLiteralNode>> (cbc-ubuntu-64bit/test/if1.cb:10)
                            value: "NG"
            <<ReturnNode>> (cbc-ubuntu-64bit/test/if1.cb:12)
            expr:
                <<IntegerLiteralNode>> (cbc-ubuntu-64bit/test/if1.cb:12)
                typeNode: int
                value: 0
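
The indented output comes from a template-method arrangement: dump(Dumper) prints the node header and then delegates to the subclass's _dump for the members. Below is a minimal sketch of that pattern. MiniDumper, MiniNode, IntLit, and Block are simplified stand-ins invented for this example, and their method signatures are assumptions rather than the real Dumper API.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Arrays;
import java.util.List;

public class DumpSketch {
    /** Simplified stand-in for Dumper: tracks indentation while printing. */
    static class MiniDumper {
        private final PrintStream out;
        private int indent = 0;
        MiniDumper(PrintStream out) { this.out = out; }

        void printClass(Object node) {
            line("<<" + node.getClass().getSimpleName() + ">>");
        }
        void printMember(String name, Object value) {
            line(name + ": " + value);
        }
        void printNodeList(String name, List<? extends MiniNode> nodes) {
            line(name + ":");
            indent++;                       // children are printed one level deeper
            for (MiniNode n : nodes) n.dump(this);
            indent--;
        }
        private void line(String s) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < indent; i++) sb.append("    ");
            out.println(sb.append(s));
        }
    }

    /** Template method: dump() prints the header, _dump() prints the members. */
    static abstract class MiniNode {
        void dump(MiniDumper d) {
            d.printClass(this);
            _dump(d);
        }
        abstract void _dump(MiniDumper d);
    }

    static class IntLit extends MiniNode {
        final long value;
        IntLit(long value) { this.value = value; }
        void _dump(MiniDumper d) { d.printMember("value", value); }
    }

    static class Block extends MiniNode {
        final List<MiniNode> stmts;
        Block(List<MiniNode> stmts) { this.stmts = stmts; }
        void _dump(MiniDumper d) { d.printNodeList("stmts", stmts); }
    }

    /** Dumps a tree into a string instead of System.out. */
    static String render(MiniNode n) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream ps = new PrintStream(buf);
        n.dump(new MiniDumper(ps));
        ps.flush();
        return buf.toString();
    }

    public static void main(String[] args) {
        MiniNode tree = new Block(Arrays.<MiniNode>asList(new IntLit(2), new IntLit(0)));
        System.out.print(render(tree));
    }
}
```

Each node class only has to describe its own members; the recursion and indentation live in the dumper.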

A concrete node example:

package net.loveruby.cflat.ast;
import net.loveruby.cflat.asm.Label;
import java.util.*;

public class CaseNode extends StmtNode {
    protected Label label;
    protected List<ExprNode> values;
    protected BlockNode body;

    public CaseNode(Location loc, List<ExprNode> values, BlockNode body) {
        super(loc);
        this.values = values;
        this.body = body;
        this.label = new Label();
    }

    public List<ExprNode> values() {
        return values;
    }

    public boolean isDefault() {
        return values.isEmpty();
    }

    public BlockNode body() {
        return body;
    }

    public Label label() {
        return label;
    }

    protected void _dump(Dumper d) {
        d.printNodeList("values", values);
        d.printMember("body", body);
    }

    public <S,E> S accept(ASTVisitor<S,E> visitor) {
        return visitor.visit(this);
    }
}
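
The accept method at the end of CaseNode is the node's half of the visitor pattern: the node calls back visitor.visit(this), so the overload is chosen by the node's own class. Here is a minimal, self-contained sketch of that double dispatch. The Visitor interface and the Lit, Add, and Evaluator classes are invented for this example, and the visitor has a single result type, whereas ASTVisitor<S,E> in the source has two type parameters.

```java
public class VisitorSketch {
    /** Simplified visitor with one result type R. */
    interface Visitor<R> {
        R visit(Lit n);
        R visit(Add n);
    }

    static abstract class Expr {
        abstract <R> R accept(Visitor<R> v);
    }

    static class Lit extends Expr {
        final int value;
        Lit(int value) { this.value = value; }
        // "this" has static type Lit here, so visit(Lit) is selected.
        @Override <R> R accept(Visitor<R> v) { return v.visit(this); }
    }

    static class Add extends Expr {
        final Expr left;
        final Expr right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
        @Override <R> R accept(Visitor<R> v) { return v.visit(this); }
    }

    /** One possible visitor: evaluates the expression tree. */
    static class Evaluator implements Visitor<Integer> {
        public Integer visit(Lit n) { return n.value; }
        public Integer visit(Add n) {
            return n.left.accept(this) + n.right.accept(this);
        }
    }

    static int eval(Expr e) { return e.accept(new Evaluator()); }

    public static void main(String[] args) {
        Expr e = new Add(new Lit(1), new Add(new Lit(2), new Lit(3)));
        System.out.println(eval(e));
    }
}
```

New passes over the tree can then be added as new visitor classes without touching the node classes.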

JJTree

JJTree is a tool bundled with JavaCC that semi-automatically generates the actions and the node classes. A .jjt file is largely the same as a .jj file.

PARSER_BEGIN(Parser)

PARSER_END(Parser)


SKIP :
{
  " "
| "\t"
| "\n"
| "\r"
| <"//" (~["\n","\r"])* ("\n"|"\r"|"\r\n")>
| <"/*" (~["*"])* "*" (~["/"] (~["*"])* "*")* "/">
}

TOKEN : /* LITERALS */
{
  < INTEGER_LITERAL:
        <DECIMAL_LITERAL> (["l","L"])?
      | <HEX_LITERAL> (["l","L"])?
      | <OCTAL_LITERAL> (["l","L"])?
  >
|
  < #DECIMAL_LITERAL: ["1"-"9"] (["0"-"9"])* >
|
  < #HEX_LITERAL: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+ >
|
  < #OCTAL_LITERAL: "0" (["0"-"7"])* >
}

TOKEN : /* IDENTIFIERS */
{
  < IDENTIFIER: <LETTER> (<LETTER>|<DIGIT>)* >
|
  < #LETTER: ["_","a"-"z","A"-"Z"] >
|
  < #DIGIT: ["0"-"9"] >
}

/** Main production. */
SimpleNode Start() : {}
{
  Expression() ";"
  { return jjtThis; }
}

/** An Expression. */
void Expression() : {}
{
  AdditiveExpression()
}

/** An Additive Expression. */
void AdditiveExpression() : {}
{
  MultiplicativeExpression() ( ( "+" | "-" ) MultiplicativeExpression() )*
}

/** A Multiplicative Expression. */
void MultiplicativeExpression() : {}
{
  UnaryExpression() ( ( "*" | "/" | "%" ) UnaryExpression() )*
}

/** A Unary Expression. */
void UnaryExpression() : {}
{
  "(" Expression() ")" | Identifier() | Integer()
}

/** An Identifier. */
void Identifier() : {}
{
  <IDENTIFIER>
}

/** An Integer. */
void Integer() : {}
{
  <INTEGER_LITERAL>
}

The generated .jj is shown below; it already takes care of the basic tree-building logic as well as the error handling.

/*@bgen(jjtree) Generated By:JJTree: Do not edit this line. test.jj */
/*@egen*/PARSER_BEGIN(Eg1)

/** An Arithmetic Grammar. */
public class Eg1/*@bgen(jjtree)*/implements Eg1TreeConstants/*@egen*/ {/*@bgen(jjtree)*/
  protected JJTEg1State jjtree = new JJTEg1State();

/*@egen*/

  /** Main entry point. */
  public static void main(String args[]) {
    System.out.println("Reading from standard input...");
    Eg1 t = new Eg1(System.in);
    try {
      SimpleNode n = t.Start();
      n.dump("");
      System.out.println("Thank you.");
    } catch (Exception e) {
      System.out.println("Oops.");
      System.out.println(e.getMessage());
      e.printStackTrace();
    }
  }
}

PARSER_END(Eg1)


SKIP :
{
  " "
| "\t"
| "\n"
| "\r"
| <"//" (~["\n","\r"])* ("\n"|"\r"|"\r\n")>
| <"/*" (~["*"])* "*" (~["/"] (~["*"])* "*")* "/">
}

TOKEN : /* LITERALS */
{
  < INTEGER_LITERAL:
        <DECIMAL_LITERAL> (["l","L"])?
      | <HEX_LITERAL> (["l","L"])?
      | <OCTAL_LITERAL> (["l","L"])?
  >
|
  < #DECIMAL_LITERAL: ["1"-"9"] (["0"-"9"])* >
|
  < #HEX_LITERAL: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+ >
|
  < #OCTAL_LITERAL: "0" (["0"-"7"])* >
}

TOKEN : /* IDENTIFIERS */
{
  < IDENTIFIER: <LETTER> (<LETTER>|<DIGIT>)* >
|
  < #LETTER: ["_","a"-"z","A"-"Z"] >
|
  < #DIGIT: ["0"-"9"] >
}

/** Main production. */
SimpleNode Start() : {/*@bgen(jjtree) Start */
  SimpleNode jjtn000 = new SimpleNode(JJTSTART);
  boolean jjtc000 = true;
  jjtree.openNodeScope(jjtn000);
/*@egen*/}
{/*@bgen(jjtree) Start */
  try {
/*@egen*/
  Expression() ";"/*@bgen(jjtree)*/
  {
    jjtree.closeNodeScope(jjtn000, true);
    jjtc000 = false;
  }
/*@egen*/
  { return jjtn000; }/*@bgen(jjtree)*/
  } catch (Throwable jjte000) {
    if (jjtc000) {
      jjtree.clearNodeScope(jjtn000);
      jjtc000 = false;
    } else {
      jjtree.popNode();
    }
    if (jjte000 instanceof RuntimeException) {
      throw (RuntimeException)jjte000;
    }
    if (jjte000 instanceof ParseException) {
      throw (ParseException)jjte000;
    }
    throw (Error)jjte000;
  } finally {
    if (jjtc000) {
      jjtree.closeNodeScope(jjtn000, true);
    }
  }
/*@egen*/
}

/** An Expression. */
void Expression() : {/*@bgen(jjtree) Expression */
  SimpleNode jjtn000 = new SimpleNode(JJTEXPRESSION);
  boolean jjtc000 = true;
  jjtree.openNodeScope(jjtn000);
/*@egen*/}
{/*@bgen(jjtree) Expression */
  try {
/*@egen*/
  AdditiveExpression()/*@bgen(jjtree)*/
  } catch (Throwable jjte000) {
    if (jjtc000) {
      jjtree.clearNodeScope(jjtn000);
      jjtc000 = false;
    } else {
      jjtree.popNode();
    }
    if (jjte000 instanceof RuntimeException) {
      throw (RuntimeException)jjte000;
    }
    if (jjte000 instanceof ParseException) {
      throw (ParseException)jjte000;
    }
    throw (Error)jjte000;
  } finally {
    if (jjtc000) {
      jjtree.closeNodeScope(jjtn000, true);
    }
  }
/*@egen*/
}

/** An Additive Expression. */
void AdditiveExpression() : {/*@bgen(jjtree) AdditiveExpression */
  SimpleNode jjtn000 = new SimpleNode(JJTADDITIVEEXPRESSION);
  boolean jjtc000 = true;
  jjtree.openNodeScope(jjtn000);
/*@egen*/}
{/*@bgen(jjtree) AdditiveExpression */
  try {
/*@egen*/
  MultiplicativeExpression() ( ( "+" | "-" ) MultiplicativeExpression() )*/*@bgen(jjtree)*/
  } catch (Throwable jjte000) {
    if (jjtc000) {
      jjtree.clearNodeScope(jjtn000);
      jjtc000 = false;
    } else {
      jjtree.popNode();
    }
    if (jjte000 instanceof RuntimeException) {
      throw (RuntimeException)jjte000;
    }
    if (jjte000 instanceof ParseException) {
      throw (ParseException)jjte000;
    }
    throw (Error)jjte000;
  } finally {
    if (jjtc000) {
      jjtree.closeNodeScope(jjtn000, true);
    }
  }
/*@egen*/
}

/** A Multiplicative Expression. */
void MultiplicativeExpression() : {/*@bgen(jjtree) MultiplicativeExpression */
  SimpleNode jjtn000 = new SimpleNode(JJTMULTIPLICATIVEEXPRESSION);
  boolean jjtc000 = true;
  jjtree.openNodeScope(jjtn000);
/*@egen*/}
{/*@bgen(jjtree) MultiplicativeExpression */
  try {
/*@egen*/
  UnaryExpression() ( ( "*" | "/" | "%" ) UnaryExpression() )*/*@bgen(jjtree)*/
  } catch (Throwable jjte000) {
    if (jjtc000) {
      jjtree.clearNodeScope(jjtn000);
      jjtc000 = false;
    } else {
      jjtree.popNode();
    }
    if (jjte000 instanceof RuntimeException) {
      throw (RuntimeException)jjte000;
    }
    if (jjte000 instanceof ParseException) {
      throw (ParseException)jjte000;
    }
    throw (Error)jjte000;
  } finally {
    if (jjtc000) {
      jjtree.closeNodeScope(jjtn000, true);
    }
  }
/*@egen*/
}

/** A Unary Expression. */
void UnaryExpression() : {/*@bgen(jjtree) UnaryExpression */
  SimpleNode jjtn000 = new SimpleNode(JJTUNARYEXPRESSION);
  boolean jjtc000 = true;
  jjtree.openNodeScope(jjtn000);
/*@egen*/}
{/*@bgen(jjtree) UnaryExpression */
  try {
/*@egen*/
  "(" Expression() ")" | Identifier() | Integer()/*@bgen(jjtree)*/
  } catch (Throwable jjte000) {
    if (jjtc000) {
      jjtree.clearNodeScope(jjtn000);
      jjtc000 = false;
    } else {
      jjtree.popNode();
    }
    if (jjte000 instanceof RuntimeException) {
      throw (RuntimeException)jjte000;
    }
    if (jjte000 instanceof ParseException) {
      throw (ParseException)jjte000;
    }
    throw (Error)jjte000;
  } finally {
    if (jjtc000) {
      jjtree.closeNodeScope(jjtn000, true);
    }
  }
/*@egen*/
}

/** An Identifier. */
void Identifier() : {/*@bgen(jjtree) Identifier */
  SimpleNode jjtn000 = new SimpleNode(JJTIDENTIFIER);
  boolean jjtc000 = true;
  jjtree.openNodeScope(jjtn000);
/*@egen*/}
{/*@bgen(jjtree) Identifier */
  try {
/*@egen*/
  <IDENTIFIER>/*@bgen(jjtree)*/
  } finally {
    if (jjtc000) {
      jjtree.closeNodeScope(jjtn000, true);
    }
  }
/*@egen*/
}

/** An Integer. */
void Integer() : {/*@bgen(jjtree) Integer */
  SimpleNode jjtn000 = new SimpleNode(JJTINTEGER);
  boolean jjtc000 = true;
  jjtree.openNodeScope(jjtn000);
/*@egen*/}
{/*@bgen(jjtree) Integer */
  try {
/*@egen*/
  <INTEGER_LITERAL>/*@bgen(jjtree)*/
  } finally {
    if (jjtc000) {
      jjtree.closeNodeScope(jjtn000, true);
    }
  }
/*@egen*/
}
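
The openNodeScope / closeNodeScope pairs in the generated code build the tree with a stack: closing a scope gathers the nodes pushed since it was opened and makes them children of the scope's node. The sketch below illustrates that idea only. MiniState and MiniNode are simplified stand-ins written for this example, not the generated JJTEg1State or SimpleNode, and the sketch ignores the conditional and error-recovery paths (clearNodeScope, popNode).

```java
import java.util.ArrayList;
import java.util.List;

public class NodeScopeSketch {
    static class MiniNode {
        final String label;
        final List<MiniNode> children = new ArrayList<>();
        MiniNode(String label) { this.label = label; }
        @Override public String toString() {
            return children.isEmpty() ? label : label + children;
        }
    }

    /** Simplified stand-in for the generated tree-state class. */
    static class MiniState {
        private final List<MiniNode> nodes = new ArrayList<>();  // node stack
        private final List<Integer> marks = new ArrayList<>();   // where each scope began

        /** Remembers the stack height at the start of n's scope. */
        void openNodeScope(MiniNode n) {
            marks.add(nodes.size());
        }

        /** Pops every node pushed since the scope opened, adopts them, pushes n. */
        void closeNodeScope(MiniNode n) {
            int mark = marks.remove(marks.size() - 1);
            List<MiniNode> scope = nodes.subList(mark, nodes.size());
            n.children.addAll(scope);
            scope.clear();
            nodes.add(n);
        }

        MiniNode root() { return nodes.get(0); }
    }

    /** Replays the scope calls a parse of two leaf operands might make. */
    static MiniNode build() {
        MiniState st = new MiniState();
        MiniNode start = new MiniNode("Start");
        st.openNodeScope(start);
        MiniNode expr = new MiniNode("Expression");
        st.openNodeScope(expr);
        for (String leaf : new String[] {"Identifier", "Integer"}) {
            MiniNode n = new MiniNode(leaf);
            st.openNodeScope(n);
            st.closeNodeScope(n);   // a leaf: nothing was pushed inside its scope
        }
        st.closeNodeScope(expr);    // adopts Identifier and Integer
        st.closeNodeScope(start);   // adopts Expression
        return st.root();
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```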

Original post: https://www.cnblogs.com/xuesu/p/14379293.html