Deep Knowledge Tracing paper reading notes: L@S 2017 Deep Knowledge Tracing On Programming Exercises

L@S 2017 Deep Knowledge Tracing On Programming Exercises (Stanford)

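Main goal of the paper: feed embedded program submissions into a recurrent neural network (LSTM), train the model, and predict whether a student will pass the subsequent programming exercise.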

Task definition:

Based on a student’s sequence of code submission attempts over time (hereby, their "trajectory") on a programming exercise, predict whether the student will successfully complete the next programming exercise within the same course.

Dataset:

Code.org
The Hour of Code course
Exercise 18

This Exercise 18 data set contains 1,263,360 code submissions, of which 79,553 are unique, made by 263,569 students. 81.0% of these students arrived at the correct solution in their last submission.

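A student's learning trajectory on a single exercise:
Each code submission is represented as an abstract syntax tree (AST).
Trajectory length is controlled: trajectories of different lengths are trained separately, which gives nine datasets (lengths 2 through 10), each trained on its own.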
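
The split by trajectory length can be sketched as follows. This is only a minimal illustration, not the authors' preprocessing code: the function name `group_by_length`, the dictionary layout, and the placeholder AST ids are hypothetical, and dropping trajectories outside lengths 2 through 10 is an assumption.

```python
from collections import defaultdict

def group_by_length(trajectories, min_len=2, max_len=10):
    """Bucket trajectories by their number of submissions.

    Assumption: trajectories shorter than min_len or longer than
    max_len are simply dropped.
    """
    buckets = defaultdict(list)
    for student_id, submissions in trajectories.items():
        n = len(submissions)
        if min_len <= n <= max_len:
            buckets[n].append((student_id, submissions))
    return dict(buckets)

# Toy data: each submission is a placeholder AST id (hypothetical).
toy = {
    "s1": ["ast_a", "ast_b"],
    "s2": ["ast_a", "ast_c", "ast_b"],
    "s3": ["ast_d", "ast_b"],
    "s4": ["ast_b"],  # length 1, outside 2..10, so it is dropped
}

buckets = group_by_length(toy)
for length in sorted(buckets):
    print(length, [sid for sid, _ in buckets[length]])
```

Each bucket would then be used as its own training set, one model per trajectory length.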

Model

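  1. Process the student's learning trajectory with a recurrent neural network (LSTM)

    Suppose a student's trajectory contains k submissions. These are converted into program embeddings, forming a sequence of k embeddings. The k embeddings are fed into the RNN, whose final hidden state is passed through a fully connected layer followed by a softmax layer. The softmax output y is a binary classification that reflects whether the student will successfully solve the next problem.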

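  2. Produce the program embeddings with a recursive neural network

    A recursive neural network is trained so that the AST representation of a student's program can be vectorized.
    The idea follows prior work [9]; the implementation builds on [11].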

  3. The authors define their own baseline model
    pathScore(T) = sum of the reciprocals of the submission counts
    A simple logistic regression model is trained
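
To make the data flow of the LSTM model and the baseline feature concrete, here is a small pure-Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the weights are random and untrained, the dimensions and all function names are hypothetical, the program embeddings are stand-in random vectors, and class index 1 is assumed to mean "succeeds". For the baseline, the note only says the score is a sum of reciprocals of submission counts; the sketch assumes this means summing, over the programs in a trajectory, one over the number of times that program was submitted across the data set.

```python
import math
import random
from collections import Counter

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, b, v):
    """Compute W @ v + b with plain lists."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(x - m) for x in z]
    total = sum(e)
    return [x / total for x in e]

def init_params(embed_dim, hidden_dim, seed=0):
    """Random, untrained weights (illustrative only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for gate in "ifog":  # input, forget, output gates and candidate cell
        p["W" + gate] = mat(hidden_dim, embed_dim + hidden_dim)
        p["b" + gate] = [0.0] * hidden_dim
    p["Wy"] = mat(2, hidden_dim)  # fully connected layer onto 2 classes
    p["by"] = [0.0, 0.0]
    return p

def predict(embeddings, p, hidden_dim):
    """Run the LSTM over k embeddings, then FC + softmax on the final state."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in embeddings:
        z = list(x) + h  # concatenate current input and previous hidden state
        i = [sigmoid(a) for a in affine(p["Wi"], p["bi"], z)]
        f = [sigmoid(a) for a in affine(p["Wf"], p["bf"], z)]
        o = [sigmoid(a) for a in affine(p["Wo"], p["bo"], z)]
        g = [math.tanh(a) for a in affine(p["Wg"], p["bg"], z)]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return softmax(affine(p["Wy"], p["by"], h))

def path_score(trajectory, counts):
    """Assumed baseline feature: sum of 1 / (times each program was submitted)."""
    return sum(1.0 / counts[program] for program in trajectory)

# LSTM usage: a trajectory of k = 3 stand-in embeddings of dimension 4.
params = init_params(embed_dim=4, hidden_dim=3)
rng = random.Random(1)
trajectory = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
probs = predict(trajectory, params, hidden_dim=3)
print("P(fail), P(succeed):", probs)

# Baseline usage: counts are taken over every submission in a toy data set.
all_trajectories = [["a", "b"], ["a", "c", "b"]]
counts = Counter(program for t in all_trajectories for program in t)
print("pathScore:", [path_score(t, counts) for t in all_trajectories])
```

In practice the LSTM weights would be learned with a cross-entropy loss in a deep learning framework, and the path score would serve as the input feature of the logistic regression baseline; neither training loop is shown here.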

Results

Original post: https://www.cnblogs.com/advLuo/p/10820829.html