Reading notes on "Zero-shot Relation Classification as Textual Entailment" (Abiola Obamuyide, Andreas Vlachos, 2018): Model

ESIM (Enhanced Sequential Inference Model):

ESIM uses bidirectional Long Short-Term Memory (BiLSTM) units as its building block and takes two text sequences as input. It then processes them in three steps:

Input encoding;

Local inference modeling;

Inference composition.

Finally, it returns whichever of entailment, contradiction, and neutral has the highest score.
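The final decision step can be sketched as below. This is only an illustration: it assumes the model exposes one score per label, and the `predict_label` helper and the example scores are made up for the sketch rather than taken from the paper.

```python
LABELS = ("entailment", "contradiction", "neutral")


def predict_label(scores):
    """Return the label with the highest score (hypothetical helper)."""
    if len(scores) != len(LABELS):
        raise ValueError("expected one score per label")
    # Index of the largest score; ties resolve to the first such index.
    best = max(range(len(LABELS)), key=lambda i: scores[i])
    return LABELS[best]


example_scores = [2.1, -0.3, 0.4]  # made-up values, in the order of LABELS
print(predict_label(example_scores))
```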

Modifications to the ESIM model:

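The changes mainly affect two stages, input encoding and inference composition, which are switched to conditional encoding.

Input encoding and inference composition operate analogously. Each takes two sequences of vectors, $\{p_i\}$ and $\{h_j\}$, or more compactly two matrices $P \in \mathbb{R}^{I \times d}$ and $H \in \mathbb{R}^{J \times d}$, where P is the premise, H is the hypothesis, and I and J are the numbers of words in the two text sequences.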

Note: the premise is the input text sequence, and the hypothesis is the template text sequence (the relation description). For example:

P: Obama was born in a hospital in Hawaii in 1965.

H: X is born in Y.
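To make the setup concrete, the sketch below pairs one input sentence with several relation templates, so that each pair could be handed to an entailment model as (premise, hypothesis). The `build_pairs` helper, the relation names, and the second template are hypothetical and only serve the illustration.

```python
def build_pairs(sentence, relation_templates):
    """Pair one premise with every relation template (hypothetical helper).

    Returns a list of (relation_name, premise, hypothesis) tuples.
    """
    return [
        (name, sentence, template)
        for name, template in relation_templates.items()
    ]


premise = "Obama was born in a hospital in Hawaii in 1965."
templates = {
    "place_of_birth": "X is born in Y.",  # template from the note above
    "employer": "X works for Y.",         # made-up second relation
}

for name, p, h in build_pairs(premise, templates):
    print(f"{name}: P = {p!r} | H = {h!r}")
```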

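In the input encoding stage, P and H are the word embeddings of the premise and the hypothesis.

In the inference composition stage, P and H are internal representations of the premise and the hypothesis, produced by the model's earlier steps.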

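In either stage, the two input sequences are passed through BiLSTM units, producing $\bar{P} \in \mathbb{R}^{I \times 2d}$ and $\bar{H} \in \mathbb{R}^{J \times 2d}$:

$\bar{P}, \overrightarrow{c}_p, \overleftarrow{c}_p = \mathrm{BiLSTM}(P)$

$\bar{H}, \overrightarrow{c}_h, \overleftarrow{c}_h = \mathrm{BiLSTM}(H)$

Here the c terms are the final cell states of the forward and backward LSTMs run over P and H.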

Conditional encoding for ESIM:

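In zero-shot relation classification, ESIM encodes the input sentence completely independently of the relation description. When a new target relation description is given, however, the sentence representation ought to take that description's representation into account.

The BiLSTM is therefore replaced with a conditional BiLSTM (cBiLSTM):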

$\bar{P} = \mathrm{cBiLSTM}(P, \overrightarrow{c}_h, \overleftarrow{c}_h)$
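The difference between the plain and the conditional encoder can be sketched in NumPy as below. This is a minimal sketch, not the authors' implementation. It assumes a standard LSTM cell with untrained random weights, a zero initial hidden state, separate parameters for the two encoders, and that the forward (backward) final cell state of H initialises the forward (backward) LSTM over P. The names `lstm`, `bilstm`, `cbilstm`, and `init_params` and the toy sizes are illustrative.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm(X, W, b, c0=None):
    """Single-direction LSTM over X (T x d_in).

    Returns the hidden states (T x d) and the final cell state (d,).
    """
    d = b.shape[0] // 4
    h = np.zeros(d)
    c = np.zeros(d) if c0 is None else c0
    outputs = []
    for x in X:
        z = W @ np.concatenate([x, h]) + b  # all four gates at once
        i, f, o = sigmoid(z[:d]), sigmoid(z[d:2 * d]), sigmoid(z[2 * d:3 * d])
        g = np.tanh(z[3 * d:])
        c = f * c + i * g
        h = o * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs), c


def bilstm(X, params, c_fwd0=None, c_bwd0=None):
    """BiLSTM: returns (T x 2d) outputs plus forward/backward final cell states."""
    W_f, b_f, W_b, b_b = params
    out_f, c_f = lstm(X, W_f, b_f, c_fwd0)
    out_b, c_b = lstm(X[::-1], W_b, b_b, c_bwd0)  # run over the reversed sequence
    return np.concatenate([out_f, out_b[::-1]], axis=1), c_f, c_b


def cbilstm(P, params, c_h_fwd, c_h_bwd):
    """Conditional BiLSTM: same cell, but started from the hypothesis cell states."""
    return bilstm(P, params, c_fwd0=c_h_fwd, c_bwd0=c_h_bwd)


def init_params(rng, d_in, d):
    """Random, untrained weights for one BiLSTM (illustration only)."""
    shape = (4 * d, d_in + d)
    return (rng.normal(scale=0.1, size=shape), np.zeros(4 * d),
            rng.normal(scale=0.1, size=shape), np.zeros(4 * d))


rng = np.random.default_rng(0)
I, J, d = 11, 5, 4  # toy sizes: premise length, hypothesis length, dimension
P, H = rng.normal(size=(I, d)), rng.normal(size=(J, d))
params_p, params_h = init_params(rng, d, d), init_params(rng, d, d)

H_bar, c_h_fwd, c_h_bwd = bilstm(H, params_h)            # encode H first
P_bar, _, _ = cbilstm(P, params_p, c_h_fwd, c_h_bwd)     # then P, conditioned on H
print(P_bar.shape, H_bar.shape)  # shapes (I, 2d) and (J, 2d)
```

With zero initial cell states `cbilstm` reduces to the plain `bilstm`, so in this sketch the conditioning comes entirely from the hypothesis cell states passed in.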

Implementing this change requires modifying both the input encoding and the inference composition stages.

The modified ESIM is called the Conditioned Inference Model (CIM).