A Movie Recommender System Based on a Convolutional Neural Network (CNN)

This project uses a text convolutional neural network and the MovieLens dataset to build a movie recommender.

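Recommender systems are everywhere in everyday web applications: online shopping, online bookstores, news apps, social networks, music sites, movie sites, and so on. Wherever there are people, there are recommendations. These systems deliver personalized content based on an individual's preferences and on the habits of people with similar tastes. Open a news app, for example, and the front page differs from user to user because the content is personalized.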

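This is genuinely useful. In today's information explosion there are countless ways to obtain information, and people no longer spend most of their time figuring out where to get it; they spend it picking out what interests them from an overwhelming supply. This is the information overload problem, and recommender systems emerged to address it.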

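Collaborative filtering is one of the most widely used techniques in recommender systems. It collects users' histories and preferences, computes each user's similarity to other users, and uses the ratings of similar users to predict how much the target user will like a given item. Its strength is that it can recommend items the user has never browsed. Its weakness is the cold-start problem: a new user has no interactions with any item and no recorded preferences, so the model cannot find similar users or items.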
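To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. It is not the method used later in this project; the toy users, the ratings, and the choice of cosine similarity over co-rated items are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict_rating(target, others, item):
    """Similarity-weighted average of the other users' ratings for `item`."""
    num = den = 0.0
    for ratings in others:
        if item in ratings:
            sim = cosine_similarity(target, ratings)
            num += sim * ratings[item]
            den += abs(sim)
    # None means no similar user could be found, which is the cold-start case
    return num / den if den else None

# Hypothetical toy ratings: {movie: stars}
alice = {"A": 5, "B": 3}
bob = {"A": 5, "B": 3, "C": 4}
carol = {"A": 1, "B": 5, "C": 2}

print(predict_rating(alice, [bob, carol], "C"))  # leans toward bob's rating
print(predict_rating({}, [bob, carol], "C"))     # new user with no history
```

The second call shows the cold-start problem directly: with an empty history every similarity is zero, so no prediction can be made.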

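A common remedy for cold start is to ask newly registered users to pick the topics, groups, products, personality traits, favorite music genres and so on that interest them, as Douban FM does.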

Downloading the dataset

Run the code below to download the dataset.

import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np
from collections import Counter
import tensorflow as tf

import os
import pickle
import re
import shutil
from tensorflow.python.ops import math_ops

from urllib.request import urlretrieve
from os.path import isfile, isdir
from tqdm import tqdm
import zipfile
import hashlib

def _unzip(save_path, _, database_name, data_path):
    """
    Unzip the archive
    :param save_path: The path of the zip file
    :param database_name: Name of database
    :param data_path: Path to extract to
    :param _: HACK - Used to have the same interface as _ungzip
    """
    print('Extracting {}...'.format(database_name))
    with zipfile.ZipFile(save_path) as zf:
        zf.extractall(data_path)

def download_extract(database_name, data_path):
    """
    Download and extract the data
    :param database_name: Database name
    :param data_path: Directory to download into and extract to
    """
    DATASET_ML1M = 'ml-1m'

    if database_name == DATASET_ML1M:
        url = 'http://files.grouplens.org/datasets/movielens/ml-1m.zip'
        hash_code = 'c4d9eecfca2ab87c1945afe126590906'
        extract_path = os.path.join(data_path, 'ml-1m')
        save_path = os.path.join(data_path, 'ml-1m.zip')
        extract_fn = _unzip

    if os.path.exists(extract_path):
        print('Found {} Data'.format(database_name))
        return

    if not os.path.exists(data_path):
        os.makedirs(data_path)

    if not os.path.exists(save_path):
        with DLProgress(unit='B', unit_scale=True, miniters=1, desc='Downloading {}'.format(database_name)) as pbar:
            urlretrieve(
                url,
                save_path,
                pbar.hook)

    assert hashlib.md5(open(save_path, 'rb').read()).hexdigest() == hash_code, \
        '{} file is corrupted.  Remove the file and try again.'.format(save_path)

    os.makedirs(extract_path)
    try:
        extract_fn(save_path, extract_path, database_name, data_path)
    except Exception as err:
        shutil.rmtree(extract_path)  # Remove extraction folder if there is an error
        raise err

    print('Done.')
    # Remove compressed data
#     os.remove(save_path)

class DLProgress(tqdm):
    """
    Progress bar shown while downloading
    """
    last_block = 0

    def hook(self, block_num=1, block_size=1, total_size=None):
        """
        A hook function that will be called once on establishment of the network connection and
        once after each block read thereafter.
        :param block_num: A count of blocks transferred so far
        :param block_size: Block size in bytes
        :param total_size: The total size of the file. This may be -1 on older FTP servers which do not return
                            a file size in response to a retrieval request.
        """
        self.total = total_size
        self.update((block_num - self.last_block) * block_size)
        self.last_block = block_num

data_dir = './'
download_extract('ml-1m', data_dir)

Extracting ml-1m...
Done.
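
The integrity check in download_extract reads the whole archive into memory before hashing it. That is fine for ml-1m.zip, but for larger files a chunked variant may be preferable. The helper below is a sketch only; the name md5_of_file and the 1 MiB chunk size are my own assumptions, not part of the original code.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read until it returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

# Possible use inside download_extract (hypothetical replacement for the assert):
# assert md5_of_file(save_path) == hash_code, '{} file is corrupted.'.format(save_path)
```

Using a with block also makes sure the file handle is closed, which the one-line open(...).read() form leaves to the garbage collector.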

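A first look at the data

This project uses the MovieLens 1M dataset, which contains about 1 million ratings from roughly 6,000 users on nearly 4,000 movies.

The dataset is split into three files:

User data: users.dat
Movie data: movies.dat
Rating data: ratings.dat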

User data

User ID
Gender
Age
Occupation ID
Zip code

Format in the data file: UserID::Gender::Age::Occupation::Zip-code

Gender is denoted by a "M" for male and "F" for female
Age is chosen from the following ranges:
- 1: "Under 18"
- 18: "18-24"
- 25: "25-34"
- 35: "35-44"
- 45: "45-49"
- 50: "50-55"
- 56: "56+"
Occupation is chosen from the following choices:
- 0: "other" or not specified
- 1: "academic/educator"
- 2: "artist"
- 3: "clerical/admin"
- 4: "college/grad student"
- 5: "customer service"
- 6: "doctor/health care"
- 7: "executive/managerial"
- 8: "farmer"
- 9: "homemaker"
- 10: "K-12 student"
- 11: "lawyer"
- 12: "programmer"
- 13: "retired"
- 14: "sales/marketing"
- 15: "scientist"
- 16: "self-employed"
- 17: "technician/engineer"
- 18: "tradesman/craftsman"
- 19: "unemployed"
- 20: "writer"

users_title = ['UserID', 'Gender', 'Age', 'OccupationID', 'Zip-code']
users = pd.read_table('./ml-1m/users.dat', sep='::', header=None, names=users_title, engine = 'python')
users.head()

	UserID	Gender	Age	OccupationID	Zip-code
0	1	F	1	10	48067
1	2	M	56	16	70072
2	3	M	25	15	55117
3	4	M	45	7	02460
4	5	M	25	20	55455

UserID, Gender, Age and Occupation are all categorical fields; the Zip-code field is not used.

Movie data

Movie ID
Title
Genres

Format in the data file: MovieID::Title::Genres

Titles are identical to titles provided by the IMDB (including
year of release)
Genres are pipe-separated and are selected from the following genres:
- Action
- Adventure
- Animation
- Children's
- Comedy
- Crime
- Documentary
- Drama
- Fantasy
- Film-Noir
- Horror
- Musical
- Mystery
- Romance
- Sci-Fi
- Thriller
- War
- Western

movies_title = ['MovieID', 'Title', 'Genres']
movies = pd.read_table('./ml-1m/movies.dat', sep='::', header=None, names=movies_title, engine = 'python')
movies.head()

	MovieID	Title	Genres
0	1	Toy Story (1995)	Animation|Children's|Comedy
1	2	Jumanji (1995)	Adventure|Children's|Fantasy
2	3	Grumpier Old Men (1995)	Comedy|Romance
3	4	Waiting to Exhale (1995)	Comedy|Drama
4	5	Father of the Bride Part II (1995)	Comedy

MovieID is a categorical field, Title is text, and Genres is also a categorical field.

Rating data

User ID
Movie ID
Rating
Timestamp

Format in the data file: UserID::MovieID::Rating::Timestamp

UserIDs range between 1 and 6040
MovieIDs range between 1 and 3952
Ratings are made on a 5-star scale (whole-star ratings only)
Timestamp is represented in seconds since the epoch as returned by time(2)
Each user has at least 20 ratings

ratings_title = ['UserID','MovieID', 'Rating', 'timestamps']
ratings = pd.read_table('./ml-1m/ratings.dat', sep='::', header=None, names=ratings_title, engine = 'python')
ratings.head()

	UserID	MovieID	Rating	timestamps
0	1	1193	5	978300760
1	1	661	3	978302109
2	1	914	3	978301968
3	1	3408	4	978300275
4	1	2355	5	978824291

The Rating field is the target we want to learn; the timestamp field is not used.

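Data preprocessing

UserID, Occupation and MovieID are left unchanged.
Gender: 'F' and 'M' are converted to 0 and 1.
Age: converted to 7 consecutive integers, 0 to 6.
Genres: a categorical field that has to be converted to numbers. First build a dictionary mapping each genre string to an integer, then convert each movie's Genres field to a list of integers, since some movies combine several genres.
Title: handled the same way as Genres. First build a dictionary from words to integers, then convert each title to a list of integers. The release year in the title is also removed.
Genres and Title are padded to a uniform length so that the neural network can process them easily. Empty positions are filled with the integer that corresponds to '<PAD>'.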
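
The encode-then-pad step described above can be sketched in a few lines. The function name encode_and_pad and the tiny vocabulary are hypothetical, and the truncation of over-long inputs is my own addition; the project code only pads.

```python
def encode_and_pad(text, vocab, length, pad_token="<PAD>"):
    """Map each word to its integer id, then pad on the right to `length`."""
    ids = [vocab[word] for word in text.split()][:length]  # truncation is an assumption
    return ids + [vocab[pad_token]] * (length - len(ids))

# Hypothetical vocabulary; the real one is built from all movie titles
vocab = {"<PAD>": 0, "Toy": 1, "Story": 2}

print(encode_and_pad("Toy Story", vocab, 5))  # [1, 2, 0, 0, 0]
```

Every encoded title then has the same length, which is what lets a whole batch be fed to the network as one rectangular integer array.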

Implementing the preprocessing

def load_data():
    """
    Load the dataset from files
    """
    # Read the User data
    users_title = ['UserID', 'Gender', 'Age', 'JobID', 'Zip-code']
    users = pd.read_table('./ml-1m/users.dat', sep='::', header=None, names=users_title, engine = 'python')
    users = users.filter(regex='UserID|Gender|Age|JobID')
    users_orig = users.values
    
    # Recode gender and age in the User data
    gender_map = {'F':0, 'M':1}
    users['Gender'] = users['Gender'].map(gender_map)

    age_map = {val:ii for ii,val in enumerate(set(users['Age']))}
    users['Age'] = users['Age'].map(age_map)

    # Read the Movie data
    movies_title = ['MovieID', 'Title', 'Genres']
    movies = pd.read_table('./ml-1m/movies.dat', sep='::', header=None, names=movies_title, engine = 'python')
    movies_orig = movies.values
    # Strip the release year, e.g. "(1995)", from Title
    pattern = re.compile(r'^(.*)\((\d+)\)$')

    title_map = {val:pattern.match(val).group(1) for ii,val in enumerate(set(movies['Title']))}
    movies['Title'] = movies['Title'].map(title_map)

    # Dictionary from genre to integer
    genres_set = set()
    for val in movies['Genres'].str.split('|'):
        genres_set.update(val)

    genres_set.add('<PAD>')
    genres2int = {val:ii for ii, val in enumerate(genres_set)}

    # Convert each movie's genres to an integer list of equal length (18)
    genres_map = {val:[genres2int[row] for row in val.split('|')] for ii,val in enumerate(set(movies['Genres']))}

    for key in genres_map:
        for cnt in range(max(genres2int.values()) - len(genres_map[key])):
            genres_map[key].insert(len(genres_map[key]) + cnt,genres2int['<PAD>'])
    
    movies['Genres'] = movies['Genres'].map(genres_map)

    # Dictionary from title word to integer
    title_set = set()
    for val in movies['Title'].str.split():
        title_set.update(val)
    
    title_set.add('<PAD>')
    title2int = {val:ii for ii, val in enumerate(title_set)}

    # Convert each movie title to an integer list of equal length (15)
    title_count = 15
    title_map = {val:[title2int[row] for row in val.split()] for ii,val in enumerate(set(movies['Title']))}
    
    for key in title_map:
        for cnt in range(title_count - len(title_map[key])):
            title_map[key].insert(len(title_map[key]) + cnt,title2int['<PAD>'])
    
    movies['Title'] = movies['Title'].map(title_map)

    # Read the rating data
    ratings_title = ['UserID','MovieID', 'ratings', 'timestamps']
    ratings = pd.read_table('./ml-1m/ratings.dat', sep='::', header=None, names=ratings_title, engine = 'python')
    ratings = ratings.filter(regex='UserID|MovieID|ratings')

    # Merge the three tables
    data = pd.merge(pd.merge(ratings, users), movies)
    
    # Split the data into two tables, X and y
    target_fields = ['ratings']
    features_pd, targets_pd = data.drop(target_fields, axis=1), data[target_fields]
    
    features = features_pd.values
    targets_values = targets_pd.values
    
    return title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig

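Loading the data and saving it locally

title_count: length of the Title field (15)
title_set: the set of words in the titles
genres2int: dictionary from genre to integer
features: the input X
targets_values: the learning target y
ratings: the rating data as a pandas object
users: the user data as a pandas object
movies: the movie data as a pandas object
data: the three datasets merged into one pandas object
movies_orig: the raw movie data, before preprocessing
users_orig: the raw user data, before preprocessing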

# Load the data
title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig = load_data()

# Save it to a file
pickle.dump((title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig), open('preprocess.p', 'wb'))

The preprocessed data

users.head()

	UserID	Gender	Age	JobID
0	1	0	0	10
1	2	1	5	16
2	3	1	6	15
3	4	1	2	7
4	5	1	6	20

movies.head()

	MovieID	Title	Genres
0	1	[310, 2184, 634, 634, 634, 634, 634, 634, 634,...	[0, 18, 7, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
1	2	[1182, 634, 634, 634, 634, 634, 634, 634, 634,...	[3, 18, 8, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
2	3	[5011, 4744, 2629, 634, 634, 634, 634, 634, 63...	[7, 9, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
3	4	[4095, 1535, 1886, 634, 634, 634, 634, 634, 63...	[7, 5, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
4	5	[3563, 1725, 3790, 3727, 838, 343, 634, 634, 6...	[7, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17...

movies.values[0]

array([1,
       list([310, 2184, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634]),
       list([0, 18, 7, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17])],
      dtype=object)

Reading the data back from disk

title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig = pickle.load(open('preprocess.p', mode='rb'))

Model design

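Looking at the field types in the dataset, we find that several are categorical. The usual treatment is one-hot encoding, but fields such as UserID and MovieID would become extremely sparse and the input dimensionality would balloon. That is something we want to avoid; my little laptop, unlike the big companies' clusters, cannot casually handle inputs with hundreds of millions of dimensions :)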

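So during preprocessing these fields are converted to integers, and each integer is used as an index into an embedding matrix. The first layer of the network is an embedding layer, with embedding matrices of shape (N, 32) and (N, 16).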
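
Indexing into an embedding matrix gives the same result as multiplying a one-hot vector by that matrix, without ever building the sparse vector. The small pure-Python check below illustrates this; the 3x2 matrix and the helper names are made up for the example.

```python
def one_hot(index, size):
    """Dense one-hot vector of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def vec_times_matrix(vec, matrix):
    """Multiply a (size,) vector by a (size, dim) matrix, giving a (dim,) vector."""
    dim = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vec, matrix)) for j in range(dim)]

# Hypothetical embedding matrix: 3 categories, embedding dimension 2
embedding = [[0.1, 0.2],
             [0.3, 0.4],
             [0.5, 0.6]]

via_one_hot = vec_times_matrix(one_hot(2, 3), embedding)
via_lookup = embedding[2]
print(via_one_hot == via_lookup)  # True
```

With about 6,000 users the one-hot route would multiply by a vector that is almost entirely zeros, while the lookup touches a single row.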

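Genres need one extra step. A movie can have several genres, so the lookup in the embedding matrix returns an (n, 32) matrix, one row per genre. We sum the rows of this matrix to get a single (1, 32) vector.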
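
The lookup-and-sum step for genres can be sketched without TensorFlow. The numbers below are assumptions chosen for readability (an embedding dimension of 4 instead of 32, random initial values), and genre_feature is a hypothetical helper name.

```python
import random

random.seed(0)
embed_dim = 4      # the project uses 32; 4 keeps the output readable
num_genres = 19    # 18 genres plus <PAD>
embed_matrix = [[random.uniform(-1, 1) for _ in range(embed_dim)]
                for _ in range(num_genres)]

def genre_feature(genre_ids):
    """Look up one row per genre id and sum the rows into a single vector."""
    rows = [embed_matrix[i] for i in genre_ids]   # shape (n, embed_dim)
    return [sum(column) for column in zip(*rows)]  # shape (embed_dim,)

feature = genre_feature([0, 7, 18])  # a movie with three genre ids
print(len(feature))
```

One detail worth knowing: because the genre lists in this project are padded to length 18 before the lookup, the '<PAD>' embedding is added into the sum once for every padded position.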

Movie titles are handled differently: instead of a recurrent neural network, a text convolutional network is used, as described below.

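After the features are looked up in the embedding layers, each one is passed through a fully connected layer, and the outputs are passed through another fully connected layer, finally giving two (1, 200) feature vectors: one for the user and one for the movie.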

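Our goal is to train these user and movie features so they can be used when making recommendations. Once we have the two features, any method can be used to fit the rating. I tried two. The first, the one drawn in the architecture diagram, takes the dot product of the two feature vectors and regresses the result onto the true rating with an MSE loss, since this is essentially a regression problem. The second feeds the two features into another fully connected layer that outputs a single value, which is likewise regressed onto the true rating with an MSE loss.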
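
The first approach, a dot product followed by an MSE loss, is small enough to write out by hand. The 2-dimensional feature vectors and ratings below are invented for illustration; in the project the vectors have 200 dimensions and are produced by the network.

```python
def predict(user_vec, movie_vec):
    """Predicted rating: dot product of the user and movie feature vectors."""
    return sum(u * m for u, m in zip(user_vec, movie_vec))

def mse(predictions, targets):
    """Mean squared error between predicted and true ratings."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)

# Hypothetical batch of two (user, movie) pairs with their true ratings
user_feats = [[1.0, 0.5], [0.2, 1.0]]
movie_feats = [[2.0, 2.0], [1.0, 3.0]]
true_ratings = [4.0, 3.0]

preds = [predict(u, m) for u, m in zip(user_feats, movie_feats)]
print(preds)
print(mse(preds, true_ratings))  # about 0.52
```

Training adjusts the embeddings and dense layers so that this error shrinks, which pushes the two feature vectors toward encoding what the user likes and what the movie offers.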

In practice, after 5 epochs the MSE loss of the second approach is around 0.8, while that of the first is around 1.

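Text convolutional network

The network looks like the following.

Figure from Yoon Kim's paper: Convolutional Neural Networks for Sentence Classification

For background on applying convolutional neural networks to text, the article Understanding Convolutional Neural Networks for NLP is recommended reading.

The first layer of the network is the word embedding layer, an embedding matrix made up of the embedding vector of each word. The next layer convolves over the embedding matrix with several kernels of different sizes (window sizes), where the window size is the number of words covered by each convolution. This differs from convolution on images, which typically uses kernels of 2x2, 3x3, 5x5 and so on; a text kernel must span the whole embedding vector of each word, so its size is (number of words, embedding dimension), sliding over, say, 3, 4 or 5 words at a time. The third layer is max pooling, which yields one long vector, and finally dropout is applied for regularization, giving the feature of the movie title.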
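
To see the shapes involved, here is a plain-Python sketch of one text convolution filter followed by ReLU and max-over-time pooling. It mirrors the description above but is not the TensorFlow code used in this project; the toy embedding values and the constant filter weights are assumptions.

```python
def text_conv_maxpool(embedded, filt, bias=0.0):
    """embedded: (sentence_len, embed_dim) list; filt: (window, embed_dim) list."""
    window = len(filt)
    feature_map = []
    # VALID padding: the window never runs past the end of the sentence
    for start in range(len(embedded) - window + 1):
        acc = bias
        for k in range(window):
            acc += sum(e * w for e, w in zip(embedded[start + k], filt[k]))
        feature_map.append(max(acc, 0.0))  # ReLU
    return feature_map, max(feature_map)   # max-over-time pooling

sentence_len, embed_dim = 15, 4  # titles have 15 tokens; 4 stands in for 32
embedded = [[float((i + j) % 3) for j in range(embed_dim)]
            for i in range(sentence_len)]

for window in (2, 3, 4, 5):
    filt = [[0.1] * embed_dim for _ in range(window)]
    fmap, pooled = text_conv_maxpool(embedded, filt)
    print(window, len(fmap), pooled)
```

Each filter produces sentence_len - window + 1 activations and pooling keeps only the largest, so 4 window sizes with 8 filters each give a title feature of 4 x 8 = 32 numbers.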

Helper functions

import tensorflow as tf
import os
import pickle

def save_params(params):
    """
    Save the parameters to a file
    """
    pickle.dump(params, open('params.p', 'wb'))


def load_params():
    """
    Load the parameters from a file
    """
    return pickle.load(open('params.p', mode='rb'))

Implementation

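# Dimension of the embedding vectors
embed_dim = 32
# Number of user IDs
uid_max = max(features.take(0,1)) + 1 # 6040 + 1 = 6041
# Number of genders
gender_max = max(features.take(2,1)) + 1 # 1 + 1 = 2
# Number of age categories
age_max = max(features.take(3,1)) + 1 # 6 + 1 = 7
# Number of occupations
job_max = max(features.take(4,1)) + 1 # 20 + 1 = 21

# Number of movie IDs
movie_id_max = max(features.take(1,1)) + 1 # 3952 + 1 = 3953
# Number of movie genres
movie_categories_max = max(genres2int.values()) + 1 # 18 + 1 = 19
# Number of distinct words in movie titles
movie_title_max = len(title_set) # 5216

# How the genre embedding vectors are combined; averaging ("mean") was considered but is not implemented
combiner = "sum"

# Length of a movie title
sentences_size = title_count # = 15
# Text convolution window sizes: slide over 2, 3, 4 and 5 words
window_sizes = {2, 3, 4, 5}
# Number of text convolution kernels
filter_num = 8

# Dictionary from movie ID to row index; the two differ in the dataset, e.g. the movie in row 5 does not necessarily have ID 5
movieid2idx = {val[0]:i for i, val in enumerate(movies.values)}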

Hyperparameters

# Number of Epochs
num_epochs = 5
# Batch Size
batch_size = 256

dropout_keep = 0.5
# Learning Rate
learning_rate = 0.0001
# Show stats for every n number of batches
show_every_n_batches = 20

save_dir = './save'

Inputs

Define the input placeholders

def get_inputs():
    uid = tf.placeholder(tf.int32, [None, 1], name="uid")
    user_gender = tf.placeholder(tf.int32, [None, 1], name="user_gender")
    user_age = tf.placeholder(tf.int32, [None, 1], name="user_age")
    user_job = tf.placeholder(tf.int32, [None, 1], name="user_job")
    
    movie_id = tf.placeholder(tf.int32, [None, 1], name="movie_id")
    movie_categories = tf.placeholder(tf.int32, [None, 18], name="movie_categories")
    movie_titles = tf.placeholder(tf.int32, [None, 15], name="movie_titles")
    targets = tf.placeholder(tf.int32, [None, 1], name="targets")
    LearningRate = tf.placeholder(tf.float32, name = "LearningRate")
    dropout_keep_prob = tf.placeholder(tf.float32, name = "dropout_keep_prob")
    return uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, LearningRate, dropout_keep_prob

Building the neural network

Define the embedding matrices for User

def get_user_embedding(uid, user_gender, user_age, user_job):
    with tf.name_scope("user_embedding"):
        uid_embed_matrix = tf.Variable(tf.random_uniform([uid_max, embed_dim], -1, 1), name = "uid_embed_matrix")
        uid_embed_layer = tf.nn.embedding_lookup(uid_embed_matrix, uid, name = "uid_embed_layer")
    
        gender_embed_matrix = tf.Variable(tf.random_uniform([gender_max, embed_dim // 2], -1, 1), name= "gender_embed_matrix")
        gender_embed_layer = tf.nn.embedding_lookup(gender_embed_matrix, user_gender, name = "gender_embed_layer")
        
        age_embed_matrix = tf.Variable(tf.random_uniform([age_max, embed_dim // 2], -1, 1), name="age_embed_matrix")
        age_embed_layer = tf.nn.embedding_lookup(age_embed_matrix, user_age, name="age_embed_layer")
        
        job_embed_matrix = tf.Variable(tf.random_uniform([job_max, embed_dim // 2], -1, 1), name = "job_embed_matrix")
        job_embed_layer = tf.nn.embedding_lookup(job_embed_matrix, user_job, name = "job_embed_layer")
    return uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer

Pass the User embeddings through fully connected layers to produce the User feature

def get_user_feature_layer(uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer):
    with tf.name_scope("user_fc"):
        # First fully connected layer
        uid_fc_layer = tf.layers.dense(uid_embed_layer, embed_dim, name = "uid_fc_layer", activation=tf.nn.relu)
        gender_fc_layer = tf.layers.dense(gender_embed_layer, embed_dim, name = "gender_fc_layer", activation=tf.nn.relu)
        age_fc_layer = tf.layers.dense(age_embed_layer, embed_dim, name ="age_fc_layer", activation=tf.nn.relu)
        job_fc_layer = tf.layers.dense(job_embed_layer, embed_dim, name = "job_fc_layer", activation=tf.nn.relu)
        
        # Second fully connected layer
        user_combine_layer = tf.concat([uid_fc_layer, gender_fc_layer, age_fc_layer, job_fc_layer], 2)  #(?, 1, 128)
        user_combine_layer = tf.contrib.layers.fully_connected(user_combine_layer, 200, tf.tanh)  #(?, 1, 200)
    
        user_combine_layer_flat = tf.reshape(user_combine_layer, [-1, 200])
    return user_combine_layer, user_combine_layer_flat

Define the embedding matrix for Movie ID

def get_movie_id_embed_layer(movie_id):
    with tf.name_scope("movie_embedding"):
        movie_id_embed_matrix = tf.Variable(tf.random_uniform([movie_id_max, embed_dim], -1, 1), name = "movie_id_embed_matrix")
        movie_id_embed_layer = tf.nn.embedding_lookup(movie_id_embed_matrix, movie_id, name = "movie_id_embed_layer")
    return movie_id_embed_layer

Sum the embedding vectors of a movie's genres

def get_movie_categories_layers(movie_categories):
    with tf.name_scope("movie_categories_layers"):
        movie_categories_embed_matrix = tf.Variable(tf.random_uniform([movie_categories_max, embed_dim], -1, 1), name = "movie_categories_embed_matrix")
        movie_categories_embed_layer = tf.nn.embedding_lookup(movie_categories_embed_matrix, movie_categories, name = "movie_categories_embed_layer")
        if combiner == "sum":
            movie_categories_embed_layer = tf.reduce_sum(movie_categories_embed_layer, axis=1, keep_dims=True)
    #     elif combiner == "mean":

    return movie_categories_embed_layer

Text convolutional network for Movie Title

def get_movie_cnn_layer(movie_titles):
    # Look up the embedding vector of each word in the movie title
    with tf.name_scope("movie_embedding"):
        movie_title_embed_matrix = tf.Variable(tf.random_uniform([movie_title_max, embed_dim], -1, 1), name = "movie_title_embed_matrix")
        movie_title_embed_layer = tf.nn.embedding_lookup(movie_title_embed_matrix, movie_titles, name = "movie_title_embed_layer")
        movie_title_embed_layer_expand = tf.expand_dims(movie_title_embed_layer, -1)
    
    # Convolve the text embedding layer with kernels of different sizes, then max-pool
    pool_layer_lst = []
    for window_size in window_sizes:
        with tf.name_scope("movie_txt_conv_maxpool_{}".format(window_size)):
            filter_weights = tf.Variable(tf.truncated_normal([window_size, embed_dim, 1, filter_num],stddev=0.1),name = "filter_weights")
            filter_bias = tf.Variable(tf.constant(0.1, shape=[filter_num]), name="filter_bias")
            
            conv_layer = tf.nn.conv2d(movie_title_embed_layer_expand, filter_weights, [1,1,1,1], padding="VALID", name="conv_layer")
            relu_layer = tf.nn.relu(tf.nn.bias_add(conv_layer,filter_bias), name ="relu_layer")
            
            maxpool_layer = tf.nn.max_pool(relu_layer, [1,sentences_size - window_size + 1 ,1,1], [1,1,1,1], padding="VALID", name="maxpool_layer")
            pool_layer_lst.append(maxpool_layer)

    # Dropout layer
    with tf.name_scope("pool_dropout"):
        pool_layer = tf.concat(pool_layer_lst, 3, name ="pool_layer")
        max_num = len(window_sizes) * filter_num
        pool_layer_flat = tf.reshape(pool_layer , [-1, 1, max_num], name = "pool_layer_flat")
    
        dropout_layer = tf.nn.dropout(pool_layer_flat, dropout_keep_prob, name = "dropout_layer")
    return pool_layer_flat, dropout_layer

Fully connect the Movie layers together

def get_movie_feature_layer(movie_id_embed_layer, movie_categories_embed_layer, dropout_layer):
    with tf.name_scope("movie_fc"):
        # First fully connected layer
        movie_id_fc_layer = tf.layers.dense(movie_id_embed_layer, embed_dim, name = "movie_id_fc_layer", activation=tf.nn.relu)
        movie_categories_fc_layer = tf.layers.dense(movie_categories_embed_layer, embed_dim, name = "movie_categories_fc_layer", activation=tf.nn.relu)
    
        # Second fully connected layer
        movie_combine_layer = tf.concat([movie_id_fc_layer, movie_categories_fc_layer, dropout_layer], 2)  #(?, 1, 96)
        movie_combine_layer = tf.contrib.layers.fully_connected(movie_combine_layer, 200, tf.tanh)  #(?, 1, 200)
    
        movie_combine_layer_flat = tf.reshape(movie_combine_layer, [-1, 200])
    return movie_combine_layer, movie_combine_layer_flat

Building the computation graph

tf.reset_default_graph()
train_graph = tf.Graph()
with train_graph.as_default():
    # Get the input placeholders
    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob = get_inputs()
    # Get the 4 embedding vectors for User
    uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer = get_user_embedding(uid, user_gender, user_age, user_job)
    # Get the user feature
    user_combine_layer, user_combine_layer_flat = get_user_feature_layer(uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer)
    # Get the embedding vector of the movie ID
    movie_id_embed_layer = get_movie_id_embed_layer(movie_id)
    # Get the embedding vector of the movie genres
    movie_categories_embed_layer = get_movie_categories_layers(movie_categories)
    # Get the feature vector of the movie title
    pool_layer_flat, dropout_layer = get_movie_cnn_layer(movie_titles)
    # Get the movie feature
    movie_combine_layer, movie_combine_layer_flat = get_movie_feature_layer(movie_id_embed_layer, 
                                                                            movie_categories_embed_layer, 
                                                                            dropout_layer)
    # Compute the predicted rating. Note that the two approaches give the inference tensor different names; the recommendation code later fetches the tensor by name
    with tf.name_scope("inference"):
        # Approach 2: feed the user and movie features into a fully connected layer that outputs a single value
#         inference_layer = tf.concat([user_combine_layer_flat, movie_combine_layer_flat], 1)  #(?, 400)
#         inference = tf.layers.dense(inference_layer, 1,
#                                     kernel_initializer=tf.truncated_normal_initializer(stddev=0.01), 
#                                     kernel_regularizer=tf.nn.l2_loss, name="inference")
        # Approach 1: simply multiply the user and movie features to get a predicted rating (here a per-row dot product)
#        inference = tf.matmul(user_combine_layer_flat, tf.transpose(movie_combine_layer_flat))
        inference = tf.reduce_sum(user_combine_layer_flat * movie_combine_layer_flat, axis=1)
        inference = tf.expand_dims(inference, axis=1)

    with tf.name_scope("loss"):
        # MSE loss: regress the predicted value onto the rating
        cost = tf.losses.mean_squared_error(targets, inference )
        loss = tf.reduce_mean(cost)
    # Minimize the loss
#     train_op = tf.train.AdamOptimizer(lr).minimize(loss)  #cost
    global_step = tf.Variable(0, name="global_step", trainable=False)
    optimizer = tf.train.AdamOptimizer(lr)
    gradients = optimizer.compute_gradients(loss)  #cost
    train_op = optimizer.apply_gradients(gradients, global_step=global_step)

WARNING:tensorflow:From <ipython-input-20-559a1ee9ce9e>:6: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead

inference

<tf.Tensor 'inference/ExpandDims:0' shape=(?, 1) dtype=float32>

Getting batches

def get_batches(Xs, ys, batch_size):
    for start in range(0, len(Xs), batch_size):
        end = min(start + batch_size, len(Xs))
        yield Xs[start:end], ys[start:end]
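
A quick check of how this generator behaves when the data size is not a multiple of the batch size. The function is repeated so the snippet runs on its own; the toy lists are made up.

```python
def get_batches(Xs, ys, batch_size):
    for start in range(0, len(Xs), batch_size):
        end = min(start + batch_size, len(Xs))
        yield Xs[start:end], ys[start:end]

# Hypothetical toy data: 10 samples, batch size 4
Xs = list(range(10))
ys = [x * 2 for x in Xs]

sizes = [len(batch_x) for batch_x, _ in get_batches(Xs, ys, 4)]
print(sizes)         # [4, 4, 2]
print(len(Xs) // 4)  # 2 full batches
```

The last batch is shorter than the rest. The training loop in this project calls next() only len(train_X) // batch_size times, so that partial batch is never consumed, which is consistent with its reshapes to a fixed [batch_size, 1].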

Training the network

%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
import time
import datetime

losses = {'train':[], 'test':[]}

with tf.Session(graph=train_graph) as sess:
    
    # Collect summaries for TensorBoard
    # Keep track of gradient values and sparsity
    grad_summaries = []
    for g, v in gradients:
        if g is not None:
            grad_hist_summary = tf.summary.histogram("{}/grad/hist".format(v.name.replace(':', '_')), g)
            sparsity_summary = tf.summary.scalar("{}/grad/sparsity".format(v.name.replace(':', '_')), tf.nn.zero_fraction(g))
            grad_summaries.append(grad_hist_summary)
            grad_summaries.append(sparsity_summary)
    grad_summaries_merged = tf.summary.merge(grad_summaries)
        
    # Output directory for models and summaries
    timestamp = str(int(time.time()))
    out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", timestamp))
    print("Writing to {}\n".format(out_dir))
     
    # Summaries for loss and accuracy
    loss_summary = tf.summary.scalar("loss", loss)

    # Train Summaries
    train_summary_op = tf.summary.merge([loss_summary, grad_summaries_merged])
    train_summary_dir = os.path.join(out_dir, "summaries", "train")
    train_summary_writer = tf.summary.FileWriter(train_summary_dir, sess.graph)

    # Inference summaries
    inference_summary_op = tf.summary.merge([loss_summary])
    inference_summary_dir = os.path.join(out_dir, "summaries", "inference")
    inference_summary_writer = tf.summary.FileWriter(inference_summary_dir, sess.graph)

    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver()
    for epoch_i in range(num_epochs):
        
        # Split the data into training and test sets; random_state=0 fixes the seed, so every epoch uses the same split
        train_X,test_X, train_y, test_y = train_test_split(features,  
                                                           targets_values,  
                                                           test_size = 0.2,  
                                                           random_state = 0)  
        
        train_batches = get_batches(train_X, train_y, batch_size)
        test_batches = get_batches(test_X, test_y, batch_size)
    
        # Training iterations; record the training loss
        for batch_i in range(len(train_X) // batch_size):
            x, y = next(train_batches)

            categories = np.zeros([batch_size, 18])
            for i in range(batch_size):
                categories[i] = x.take(6,1)[i]

            titles = np.zeros([batch_size, sentences_size])
            for i in range(batch_size):
                titles[i] = x.take(5,1)[i]

            feed = {
                uid: np.reshape(x.take(0,1), [batch_size, 1]),
                user_gender: np.reshape(x.take(2,1), [batch_size, 1]),
                user_age: np.reshape(x.take(3,1), [batch_size, 1]),
                user_job: np.reshape(x.take(4,1), [batch_size, 1]),
                movie_id: np.reshape(x.take(1,1), [batch_size, 1]),
                movie_categories: categories,  #x.take(6,1)
                movie_titles: titles,  #x.take(5,1)
                targets: np.reshape(y, [batch_size, 1]),
                dropout_keep_prob: dropout_keep, #dropout_keep
                lr: learning_rate}

            step, train_loss, summaries, _ = sess.run([global_step, loss, train_summary_op, train_op], feed)  #cost
            losses['train'].append(train_loss)
            train_summary_writer.add_summary(summaries, step)  #
            
            # Show every <show_every_n_batches> batches
            if (epoch_i * (len(train_X) // batch_size) + batch_i) % show_every_n_batches == 0:
                time_str = datetime.datetime.now().isoformat()
                print('{}: Epoch {:>3} Batch {:>4}/{}   train_loss = {:.3f}'.format(
                    time_str,
                    epoch_i,
                    batch_i,
                    (len(train_X) // batch_size),
                    train_loss))
                
        # Iterations over the test data
        for batch_i  in range(len(test_X) // batch_size):
            x, y = next(test_batches)
            
            categories = np.zeros([batch_size, 18])
            for i in range(batch_size):
                categories[i] = x.take(6,1)[i]

            titles = np.zeros([batch_size, sentences_size])
            for i in range(batch_size):
                titles[i] = x.take(5,1)[i]

            feed = {
                uid: np.reshape(x.take(0,1), [batch_size, 1]),
                user_gender: np.reshape(x.take(2,1), [batch_size, 1]),
                user_age: np.reshape(x.take(3,1), [batch_size, 1]),
                user_job: np.reshape(x.take(4,1), [batch_size, 1]),
                movie_id: np.reshape(x.take(1,1), [batch_size, 1]),
                movie_categories: categories,  #x.take(6,1)
                movie_titles: titles,  #x.take(5,1)
                targets: np.reshape(y, [batch_size, 1]),
                dropout_keep_prob: 1,
                lr: learning_rate}
            
            step, test_loss, summaries = sess.run([global_step, loss, inference_summary_op], feed)  #cost

            # Record the test loss
            losses['test'].append(test_loss)
            inference_summary_writer.add_summary(summaries, step)  #

            time_str = datetime.datetime.now().isoformat()
            if (epoch_i * (len(test_X) // batch_size) + batch_i) % show_every_n_batches == 0:
                print('{}: Epoch {:>3} Batch {:>4}/{}   test_loss = {:.3f}'.format(
                    time_str,
                    epoch_i,
                    batch_i,
                    (len(test_X) // batch_size),
                    test_loss))

    # Save Model
    saver.save(sess, save_dir)  #, global_step=epoch_i
    print('Model Trained and Saved')

Writing to F:jupyterworkmovie_recommender-master\runs\1554780412

2019-04-09T11:26:53.633627: Epoch   0 Batch    0/3125   train_loss = 8.810
2019-04-09T11:26:54.052240: Epoch   0 Batch   20/3125   train_loss = 3.457
2019-04-09T11:26:54.466181: Epoch   0 Batch   40/3125   train_loss = 2.563
2019-04-09T11:26:54.890814: Epoch   0 Batch   60/3125   train_loss = 1.962
2019-04-09T11:26:55.315803: Epoch   0 Batch   80/3125   train_loss = 1.852
2019-04-09T11:26:55.730125: Epoch   0 Batch  100/3125   train_loss = 1.826
2019-04-09T11:26:56.146734: Epoch   0 Batch  120/3125   train_loss = 1.781
2019-04-09T11:26:56.559145: Epoch   0 Batch  140/3125   train_loss = 1.630
2019-04-09T11:26:56.971689: Epoch   0 Batch  160/3125   train_loss = 1.652
2019-04-09T11:26:57.394125: Epoch   0 Batch  180/3125   train_loss = 1.361
2019-04-09T11:26:57.810824: Epoch   0 Batch  200/3125   train_loss = 1.715
2019-04-09T11:26:58.227455: Epoch   0 Batch  220/3125   train_loss = 1.430
2019-04-09T11:26:58.643714: Epoch   0 Batch  240/3125   train_loss = 1.342
2019-04-09T11:26:59.056816: Epoch   0 Batch  260/3125   train_loss = 1.512
2019-04-09T11:26:59.468409: Epoch   0 Batch  280/3125   train_loss = 1.678
2019-04-09T11:26:59.882126: Epoch   0 Batch  300/3125   train_loss = 1.482
2019-04-09T11:27:00.294685: Epoch   0 Batch  320/3125   train_loss = 1.463
2019-04-09T11:27:00.826546: Epoch   0 Batch  340/3125   train_loss = 1.333
2019-04-09T11:27:01.239302: Epoch   0 Batch  360/3125   train_loss = 1.318
2019-04-09T11:27:01.652219: Epoch   0 Batch  380/3125   train_loss = 1.253
2019-04-09T11:27:02.067588: Epoch   0 Batch  400/3125   train_loss = 1.155
2019-04-09T11:27:02.483490: Epoch   0 Batch  420/3125   train_loss = 1.341
2019-04-09T11:27:02.892079: Epoch   0 Batch  440/3125   train_loss = 1.429
2019-04-09T11:27:03.305331: Epoch   0 Batch  460/3125   train_loss = 1.315
2019-04-09T11:27:03.721028: Epoch   0 Batch  480/3125   train_loss = 1.351
2019-04-09T11:27:04.130622: Epoch   0 Batch  500/3125   train_loss = 1.043
2019-04-09T11:27:04.549775: Epoch   0 Batch  520/3125   train_loss = 1.340
2019-04-09T11:27:04.963936: Epoch   0 Batch  540/3125   train_loss = 1.258
2019-04-09T11:27:05.378772: Epoch   0 Batch  560/3125   train_loss = 1.474
2019-04-09T11:27:05.790245: Epoch   0 Batch  580/3125   train_loss = 1.399
2019-04-09T11:27:06.202342: Epoch   0 Batch  600/3125   train_loss = 1.374
2019-04-09T11:27:06.616239: Epoch   0 Batch  620/3125   train_loss = 1.429
2019-04-09T11:27:07.027259: Epoch   0 Batch  640/3125   train_loss = 1.346
2019-04-09T11:27:07.443480: Epoch   0 Batch  660/3125   train_loss = 1.377
2019-04-09T11:27:07.857450: Epoch   0 Batch  680/3125   train_loss = 1.191
2019-04-09T11:27:08.269326: Epoch   0 Batch  700/3125   train_loss = 1.302
2019-04-09T11:27:08.685203: Epoch   0 Batch  720/3125   train_loss = 1.171
2019-04-09T11:27:09.098769: Epoch   0 Batch  740/3125   train_loss = 1.403
2019-04-09T11:27:09.519383: Epoch   0 Batch  760/3125   train_loss = 1.369
2019-04-09T11:27:09.931100: Epoch   0 Batch  780/3125   train_loss = 1.402
2019-04-09T11:27:10.343018: Epoch   0 Batch  800/3125   train_loss = 1.250
2019-04-09T11:27:10.755994: Epoch   0 Batch  820/3125   train_loss = 1.292
2019-04-09T11:27:11.169596: Epoch   0 Batch  840/3125   train_loss = 1.215
2019-04-09T11:27:11.583017: Epoch   0 Batch  860/3125   train_loss = 1.201
2019-04-09T11:27:11.997121: Epoch   0 Batch  880/3125   train_loss = 1.189
2019-04-09T11:27:12.411392: Epoch   0 Batch  900/3125   train_loss = 1.240
2019-04-09T11:27:12.824492: Epoch   0 Batch  920/3125   train_loss = 1.220
2019-04-09T11:27:13.238173: Epoch   0 Batch  940/3125   train_loss = 1.414
2019-04-09T11:27:13.649014: Epoch   0 Batch  960/3125   train_loss = 1.332
2019-04-09T11:27:14.058947: Epoch   0 Batch  980/3125   train_loss = 1.345
2019-04-09T11:27:14.491861: Epoch   0 Batch 1000/3125   train_loss = 1.275
2019-04-09T11:27:14.920000: Epoch   0 Batch 1020/3125   train_loss = 1.341
2019-04-09T11:27:15.337096: Epoch   0 Batch 1040/3125   train_loss = 1.281
2019-04-09T11:27:15.760618: Epoch   0 Batch 1060/3125   train_loss = 1.478
2019-04-09T11:27:16.174406: Epoch   0 Batch 1080/3125   train_loss = 1.158
2019-04-09T11:27:16.591839: Epoch   0 Batch 1100/3125   train_loss = 1.268
2019-04-09T11:27:17.013498: Epoch   0 Batch 1120/3125   train_loss = 1.270
2019-04-09T11:27:17.438626: Epoch   0 Batch 1140/3125   train_loss = 1.280
2019-04-09T11:27:17.852226: Epoch   0 Batch 1160/3125   train_loss = 1.205
2019-04-09T11:27:18.273478: Epoch   0 Batch 1180/3125   train_loss = 1.274
2019-04-09T11:27:18.696339: Epoch   0 Batch 1200/3125   train_loss = 1.284
2019-04-09T11:27:19.117179: Epoch   0 Batch 1220/3125   train_loss = 1.155
2019-04-09T11:27:19.524543: Epoch   0 Batch 1240/3125   train_loss = 1.143
2019-04-09T11:27:19.938738: Epoch   0 Batch 1260/3125   train_loss = 1.247
2019-04-09T11:27:20.350656: Epoch   0 Batch 1280/3125   train_loss = 1.223
2019-04-09T11:27:20.761388: Epoch   0 Batch 1300/3125   train_loss = 1.267
2019-04-09T11:27:21.177496: Epoch   0 Batch 1320/3125   train_loss = 1.183
2019-04-09T11:27:21.590091: Epoch   0 Batch 1340/3125   train_loss = 1.047
2019-04-09T11:27:22.004788: Epoch   0 Batch 1360/3125   train_loss = 1.149
2019-04-09T11:27:22.414416: Epoch   0 Batch 1380/3125   train_loss = 1.114
2019-04-09T11:27:22.827015: Epoch   0 Batch 1400/3125   train_loss = 1.282
2019-04-09T11:27:23.236719: Epoch   0 Batch 1420/3125   train_loss = 1.256
2019-04-09T11:27:23.645758: Epoch   0 Batch 1440/3125   train_loss = 1.174
2019-04-09T11:27:24.063386: Epoch   0 Batch 1460/3125   train_loss = 1.251
2019-04-09T11:27:24.477184: Epoch   0 Batch 1480/3125   train_loss = 1.180
2019-04-09T11:27:24.890286: Epoch   0 Batch 1500/3125   train_loss = 1.322
2019-04-09T11:27:25.300422: Epoch   0 Batch 1520/3125   train_loss = 1.277
2019-04-09T11:27:25.709640: Epoch   0 Batch 1540/3125   train_loss = 1.270
2019-04-09T11:27:26.122241: Epoch   0 Batch 1560/3125   train_loss = 1.122
2019-04-09T11:27:26.534862: Epoch   0 Batch 1580/3125   train_loss = 1.138
2019-04-09T11:27:26.947461: Epoch   0 Batch 1600/3125   train_loss = 1.274
2019-04-09T11:27:27.359900: Epoch   0 Batch 1620/3125   train_loss = 1.169
2019-04-09T11:27:27.769969: Epoch   0 Batch 1640/3125   train_loss = 1.235
2019-04-09T11:27:28.180519: Epoch   0 Batch 1660/3125   train_loss = 1.282
2019-04-09T11:27:28.592653: Epoch   0 Batch 1680/3125   train_loss = 1.174
2019-04-09T11:27:29.003519: Epoch   0 Batch 1700/3125   train_loss = 1.009
2019-04-09T11:27:29.414262: Epoch   0 Batch 1720/3125   train_loss = 1.149
2019-04-09T11:27:29.828869: Epoch   0 Batch 1740/3125   train_loss = 1.221
2019-04-09T11:27:30.238773: Epoch   0 Batch 1760/3125   train_loss = 1.288
2019-04-09T11:27:30.648342: Epoch   0 Batch 1780/3125   train_loss = 1.067
2019-04-09T11:27:31.188925: Epoch   0 Batch 1800/3125   train_loss = 1.196
2019-04-09T11:27:31.603231: Epoch   0 Batch 1820/3125   train_loss = 1.142
2019-04-09T11:27:32.010926: Epoch   0 Batch 1840/3125   train_loss = 1.256
2019-04-09T11:27:32.425741: Epoch   0 Batch 1860/3125   train_loss = 1.345
2019-04-09T11:27:32.839345: Epoch   0 Batch 1880/3125   train_loss = 1.215
2019-04-09T11:27:33.248900: Epoch   0 Batch 1900/3125   train_loss = 1.048
2019-04-09T11:27:33.663116: Epoch   0 Batch 1920/3125   train_loss = 1.211
2019-04-09T11:27:34.074400: Epoch   0 Batch 1940/3125   train_loss = 1.070
2019-04-09T11:27:34.484302: Epoch   0 Batch 1960/3125   train_loss = 1.131
2019-04-09T11:27:34.894396: Epoch   0 Batch 1980/3125   train_loss = 1.196
2019-04-09T11:27:35.306864: Epoch   0 Batch 2000/3125   train_loss = 1.347
2019-04-09T11:27:35.722043: Epoch   0 Batch 2020/3125   train_loss = 1.297
2019-04-09T11:27:36.135143: Epoch   0 Batch 2040/3125   train_loss = 1.180
2019-04-09T11:27:36.543475: Epoch   0 Batch 2060/3125   train_loss = 1.025
2019-04-09T11:27:36.953066: Epoch   0 Batch 2080/3125   train_loss = 1.265
2019-04-09T11:27:37.370478: Epoch   0 Batch 2100/3125   train_loss = 1.094
2019-04-09T11:27:37.782974: Epoch   0 Batch 2120/3125   train_loss = 1.069
2019-04-09T11:27:38.190560: Epoch   0 Batch 2140/3125   train_loss = 1.132
2019-04-09T11:27:38.604746: Epoch   0 Batch 2160/3125   train_loss = 1.122
2019-04-09T11:27:39.019245: Epoch   0 Batch 2180/3125   train_loss = 1.166
2019-04-09T11:27:39.431946: Epoch   0 Batch 2200/3125   train_loss = 1.137
2019-04-09T11:27:39.847258: Epoch   0 Batch 2220/3125   train_loss = 1.118
2019-04-09T11:27:40.256398: Epoch   0 Batch 2240/3125   train_loss = 1.011
2019-04-09T11:27:40.665478: Epoch   0 Batch 2260/3125   train_loss = 1.160
2019-04-09T11:27:41.078758: Epoch   0 Batch 2280/3125   train_loss = 1.164
2019-04-09T11:27:41.489744: Epoch   0 Batch 2300/3125   train_loss = 1.163
2019-04-09T11:27:41.901845: Epoch   0 Batch 2320/3125   train_loss = 1.288
2019-04-09T11:27:42.312713: Epoch   0 Batch 2340/3125   train_loss = 1.177
2019-04-09T11:27:42.725320: Epoch   0 Batch 2360/3125   train_loss = 1.130
2019-04-09T11:27:43.132848: Epoch   0 Batch 2380/3125   train_loss = 1.163
2019-04-09T11:27:43.541373: Epoch   0 Batch 2400/3125   train_loss = 1.231
2019-04-09T11:27:43.947189: Epoch   0 Batch 2420/3125   train_loss = 1.133
2019-04-09T11:27:44.355782: Epoch   0 Batch 2440/3125   train_loss = 1.272
2019-04-09T11:27:44.768420: Epoch   0 Batch 2460/3125   train_loss = 1.128
2019-04-09T11:27:45.177740: Epoch   0 Batch 2480/3125   train_loss = 1.184
2019-04-09T11:27:45.584471: Epoch   0 Batch 2500/3125   train_loss = 1.161
2019-04-09T11:27:45.993960: Epoch   0 Batch 2520/3125   train_loss = 1.055
2019-04-09T11:27:46.402164: Epoch   0 Batch 2540/3125   train_loss = 1.108
2019-04-09T11:27:46.812056: Epoch   0 Batch 2560/3125   train_loss = 0.977
2019-04-09T11:27:47.230169: Epoch   0 Batch 2580/3125   train_loss = 1.101
2019-04-09T11:27:47.639261: Epoch   0 Batch 2600/3125   train_loss = 1.141
2019-04-09T11:27:48.047294: Epoch   0 Batch 2620/3125   train_loss = 1.098
2019-04-09T11:27:48.457188: Epoch   0 Batch 2640/3125   train_loss = 1.096
2019-04-09T11:27:48.870683: Epoch   0 Batch 2660/3125   train_loss = 1.241
2019-04-09T11:27:49.282413: Epoch   0 Batch 2680/3125   train_loss = 1.001
2019-04-09T11:27:49.690957: Epoch   0 Batch 2700/3125   train_loss = 1.266
2019-04-09T11:27:50.103555: Epoch   0 Batch 2720/3125   train_loss = 1.158
2019-04-09T11:27:50.514897: Epoch   0 Batch 2740/3125   train_loss = 1.210
2019-04-09T11:27:50.924909: Epoch   0 Batch 2760/3125   train_loss = 1.234
2019-04-09T11:27:51.336251: Epoch   0 Batch 2780/3125   train_loss = 1.121
2019-04-09T11:27:51.748175: Epoch   0 Batch 2800/3125   train_loss = 1.377
2019-04-09T11:27:52.164028: Epoch   0 Batch 2820/3125   train_loss = 1.417
2019-04-09T11:27:52.583020: Epoch   0 Batch 2840/3125   train_loss = 1.146
2019-04-09T11:27:53.001214: Epoch   0 Batch 2860/3125   train_loss = 1.067
2019-04-09T11:27:53.413084: Epoch   0 Batch 2880/3125   train_loss = 1.160
2019-04-09T11:27:53.830194: Epoch   0 Batch 2900/3125   train_loss = 1.134
2019-04-09T11:27:54.242290: Epoch   0 Batch 2920/3125   train_loss = 1.188
2019-04-09T11:27:54.657395: Epoch   0 Batch 2940/3125   train_loss = 1.103
2019-04-09T11:27:55.066253: Epoch   0 Batch 2960/3125   train_loss = 1.222
2019-04-09T11:27:55.476481: Epoch   0 Batch 2980/3125   train_loss = 1.197
2019-04-09T11:27:55.891054: Epoch   0 Batch 3000/3125   train_loss = 1.123
2019-04-09T11:27:56.299092: Epoch   0 Batch 3020/3125   train_loss = 1.213
2019-04-09T11:27:56.709737: Epoch   0 Batch 3040/3125   train_loss = 1.128
2019-04-09T11:27:57.121834: Epoch   0 Batch 3060/3125   train_loss = 1.174
2019-04-09T11:27:57.537893: Epoch   0 Batch 3080/3125   train_loss = 1.253
2019-04-09T11:27:57.945981: Epoch   0 Batch 3100/3125   train_loss = 1.169
2019-04-09T11:27:58.355315: Epoch   0 Batch 3120/3125   train_loss = 1.011
2019-04-09T11:27:58.525868: Epoch   0 Batch    0/781   test_loss = 1.003
2019-04-09T11:27:58.655211: Epoch   0 Batch   20/781   test_loss = 1.118
2019-04-09T11:27:58.785057: Epoch   0 Batch   40/781   test_loss = 0.975
2019-04-09T11:27:58.914903: Epoch   0 Batch   60/781   test_loss = 1.317
2019-04-09T11:27:59.043746: Epoch   0 Batch   80/781   test_loss = 1.261
2019-04-09T11:27:59.172589: Epoch   0 Batch  100/781   test_loss = 1.333
2019-04-09T11:27:59.301431: Epoch   0 Batch  120/781   test_loss = 1.186
2019-04-09T11:27:59.429434: Epoch   0 Batch  140/781   test_loss = 1.192
2019-04-09T11:27:59.557775: Epoch   0 Batch  160/781   test_loss = 1.259
2019-04-09T11:27:59.685114: Epoch   0 Batch  180/781   test_loss = 1.189
2019-04-09T11:27:59.813455: Epoch   0 Batch  200/781   test_loss = 1.093
2019-04-09T11:27:59.939791: Epoch   0 Batch  220/781   test_loss = 0.963
2019-04-09T11:28:00.066629: Epoch   0 Batch  240/781   test_loss = 1.173
2019-04-09T11:28:00.194468: Epoch   0 Batch  260/781   test_loss = 1.160
2019-04-09T11:28:00.321306: Epoch   0 Batch  280/781   test_loss = 1.354
2019-04-09T11:28:00.448551: Epoch   0 Batch  300/781   test_loss = 1.140
2019-04-09T11:28:00.576892: Epoch   0 Batch  320/781   test_loss = 1.270
2019-04-09T11:28:00.705735: Epoch   0 Batch  340/781   test_loss = 0.836
2019-04-09T11:28:00.832572: Epoch   0 Batch  360/781   test_loss = 1.297
2019-04-09T11:28:00.961415: Epoch   0 Batch  380/781   test_loss = 1.141
2019-04-09T11:28:01.090257: Epoch   0 Batch  400/781   test_loss = 1.135
2019-04-09T11:28:01.217095: Epoch   0 Batch  420/781   test_loss = 0.986
2019-04-09T11:28:01.344936: Epoch   0 Batch  440/781   test_loss = 1.153
2019-04-09T11:28:01.472184: Epoch   0 Batch  460/781   test_loss = 1.084
2019-04-09T11:28:01.599021: Epoch   0 Batch  480/781   test_loss = 1.101
2019-04-09T11:28:01.726862: Epoch   0 Batch  500/781   test_loss = 0.917
2019-04-09T11:28:01.854702: Epoch   0 Batch  520/781   test_loss = 1.127
2019-04-09T11:28:01.980536: Epoch   0 Batch  540/781   test_loss = 1.025
2019-04-09T11:28:02.108377: Epoch   0 Batch  560/781   test_loss = 1.267
2019-04-09T11:28:02.235214: Epoch   0 Batch  580/781   test_loss = 1.131
2019-04-09T11:28:02.362552: Epoch   0 Batch  600/781   test_loss = 1.179
2019-04-09T11:28:02.490387: Epoch   0 Batch  620/781   test_loss = 1.140
2019-04-09T11:28:02.617224: Epoch   0 Batch  640/781   test_loss = 1.194
2019-04-09T11:28:02.744563: Epoch   0 Batch  660/781   test_loss = 1.135
2019-04-09T11:28:02.875411: Epoch   0 Batch  680/781   test_loss = 1.403
2019-04-09T11:28:03.002248: Epoch   0 Batch  700/781   test_loss = 1.109
2019-04-09T11:28:03.130089: Epoch   0 Batch  720/781   test_loss = 1.243
2019-04-09T11:28:03.256926: Epoch   0 Batch  740/781   test_loss = 1.118
2019-04-09T11:28:03.383769: Epoch   0 Batch  760/781   test_loss = 1.098
2019-04-09T11:28:03.510695: Epoch   0 Batch  780/781   test_loss = 1.155
2019-04-09T11:28:04.289124: Epoch   1 Batch   15/3125   train_loss = 1.266
2019-04-09T11:28:04.711410: Epoch   1 Batch   35/3125   train_loss = 1.142
2019-04-09T11:28:05.124010: Epoch   1 Batch   55/3125   train_loss = 1.165
2019-04-09T11:28:05.539135: Epoch   1 Batch   75/3125   train_loss = 1.079
2019-04-09T11:28:05.955033: Epoch   1 Batch   95/3125   train_loss = 0.929
2019-04-09T11:28:06.374924: Epoch   1 Batch  115/3125   train_loss = 1.166
2019-04-09T11:28:06.784549: Epoch   1 Batch  135/3125   train_loss = 1.015
2019-04-09T11:28:07.202663: Epoch   1 Batch  155/3125   train_loss = 1.129
2019-04-09T11:28:07.622296: Epoch   1 Batch  175/3125   train_loss = 1.051
2019-04-09T11:28:08.044004: Epoch   1 Batch  195/3125   train_loss = 1.215
2019-04-09T11:28:08.464873: Epoch   1 Batch  215/3125   train_loss = 1.127
2019-04-09T11:28:08.882758: Epoch   1 Batch  235/3125   train_loss = 1.092
2019-04-09T11:28:09.302399: Epoch   1 Batch  255/3125   train_loss = 1.211
2019-04-09T11:28:09.718143: Epoch   1 Batch  275/3125   train_loss = 1.005
2019-04-09T11:28:10.135755: Epoch   1 Batch  295/3125   train_loss = 0.973
2019-04-09T11:28:10.556105: Epoch   1 Batch  315/3125   train_loss = 1.039
2019-04-09T11:28:10.968219: Epoch   1 Batch  335/3125   train_loss = 0.990
2019-04-09T11:28:11.382497: Epoch   1 Batch  355/3125   train_loss = 1.110
2019-04-09T11:28:11.792475: Epoch   1 Batch  375/3125   train_loss = 1.187
2019-04-09T11:28:12.203571: Epoch   1 Batch  395/3125   train_loss = 1.056
2019-04-09T11:28:12.616848: Epoch   1 Batch  415/3125   train_loss = 1.314
2019-04-09T11:28:13.031510: Epoch   1 Batch  435/3125   train_loss = 1.136
2019-04-09T11:28:13.442848: Epoch   1 Batch  455/3125   train_loss = 1.054
2019-04-09T11:28:13.860246: Epoch   1 Batch  475/3125   train_loss = 1.144
2019-04-09T11:28:14.274154: Epoch   1 Batch  495/3125   train_loss = 1.056
2019-04-09T11:28:14.692507: Epoch   1 Batch  515/3125   train_loss = 1.161
2019-04-09T11:28:15.109092: Epoch   1 Batch  535/3125   train_loss = 1.140
2019-04-09T11:28:15.524725: Epoch   1 Batch  555/3125   train_loss = 1.257
2019-04-09T11:28:15.938088: Epoch   1 Batch  575/3125   train_loss = 1.070
2019-04-09T11:28:16.350862: Epoch   1 Batch  595/3125   train_loss = 1.285
2019-04-09T11:28:16.761759: Epoch   1 Batch  615/3125   train_loss = 1.101
2019-04-09T11:28:17.182378: Epoch   1 Batch  635/3125   train_loss = 1.138
2019-04-09T11:28:17.599235: Epoch   1 Batch  655/3125   train_loss = 1.057
2019-04-09T11:28:18.019362: Epoch   1 Batch  675/3125   train_loss = 0.876
2019-04-09T11:28:18.438108: Epoch   1 Batch  695/3125   train_loss = 1.045
2019-04-09T11:28:18.849900: Epoch   1 Batch  715/3125   train_loss = 1.098
2019-04-09T11:28:19.261195: Epoch   1 Batch  735/3125   train_loss = 0.914
2019-04-09T11:28:19.812365: Epoch   1 Batch  755/3125   train_loss = 1.162
2019-04-09T11:28:20.222217: Epoch   1 Batch  775/3125   train_loss = 0.998
2019-04-09T11:28:20.645987: Epoch   1 Batch  795/3125   train_loss = 1.218
2019-04-09T11:28:21.064302: Epoch   1 Batch  815/3125   train_loss = 1.102
2019-04-09T11:28:21.482799: Epoch   1 Batch  835/3125   train_loss = 1.071
2019-04-09T11:28:21.907954: Epoch   1 Batch  855/3125   train_loss = 1.297
2019-04-09T11:28:22.327483: Epoch   1 Batch  875/3125   train_loss = 1.248
2019-04-09T11:28:22.741550: Epoch   1 Batch  895/3125   train_loss = 1.080
2019-04-09T11:28:23.157659: Epoch   1 Batch  915/3125   train_loss = 1.059
2019-04-09T11:28:23.571202: Epoch   1 Batch  935/3125   train_loss = 1.163
2019-04-09T11:28:23.984586: Epoch   1 Batch  955/3125   train_loss = 1.102
2019-04-09T11:28:24.396511: Epoch   1 Batch  975/3125   train_loss = 1.100
2019-04-09T11:28:24.824835: Epoch   1 Batch  995/3125   train_loss = 0.890
2019-04-09T11:28:25.242948: Epoch   1 Batch 1015/3125   train_loss = 1.077
2019-04-09T11:28:25.659444: Epoch   1 Batch 1035/3125   train_loss = 1.090
2019-04-09T11:28:26.076601: Epoch   1 Batch 1055/3125   train_loss = 1.154
2019-04-09T11:28:26.489531: Epoch   1 Batch 1075/3125   train_loss = 1.004
2019-04-09T11:28:26.897455: Epoch   1 Batch 1095/3125   train_loss = 1.012
2019-04-09T11:28:27.320553: Epoch   1 Batch 1115/3125   train_loss = 1.165
2019-04-09T11:28:27.739517: Epoch   1 Batch 1135/3125   train_loss = 1.029
2019-04-09T11:28:28.156628: Epoch   1 Batch 1155/3125   train_loss = 1.117
2019-04-09T11:28:28.570595: Epoch   1 Batch 1175/3125   train_loss = 1.103
2019-04-09T11:28:28.980586: Epoch   1 Batch 1195/3125   train_loss = 1.250
2019-04-09T11:28:29.393619: Epoch   1 Batch 1215/3125   train_loss = 0.930
2019-04-09T11:28:29.809238: Epoch   1 Batch 1235/3125   train_loss = 1.077
2019-04-09T11:28:30.219331: Epoch   1 Batch 1255/3125   train_loss = 1.089
2019-04-09T11:28:30.627580: Epoch   1 Batch 1275/3125   train_loss = 1.000
2019-04-09T11:28:31.035136: Epoch   1 Batch 1295/3125   train_loss = 1.006
2019-04-09T11:28:31.448626: Epoch   1 Batch 1315/3125   train_loss = 1.210
2019-04-09T11:28:31.948769: Epoch   1 Batch 1335/3125   train_loss = 1.045
2019-04-09T11:28:32.356933: Epoch   1 Batch 1355/3125   train_loss = 1.058
2019-04-09T11:28:32.771030: Epoch   1 Batch 1375/3125   train_loss = 1.110
2019-04-09T11:28:33.184133: Epoch   1 Batch 1395/3125   train_loss = 1.008
2019-04-09T11:28:33.596132: Epoch   1 Batch 1415/3125   train_loss = 1.086
2019-04-09T11:28:34.007114: Epoch   1 Batch 1435/3125   train_loss = 1.221
2019-04-09T11:28:34.419967: Epoch   1 Batch 1455/3125   train_loss = 1.241
2019-04-09T11:28:34.829988: Epoch   1 Batch 1475/3125   train_loss = 1.154
2019-04-09T11:28:35.241458: Epoch   1 Batch 1495/3125   train_loss = 1.102
2019-04-09T11:28:35.650228: Epoch   1 Batch 1515/3125   train_loss = 0.990
2019-04-09T11:28:36.060708: Epoch   1 Batch 1535/3125   train_loss = 0.907
2019-04-09T11:28:36.472293: Epoch   1 Batch 1555/3125   train_loss = 1.079
2019-04-09T11:28:36.880701: Epoch   1 Batch 1575/3125   train_loss = 0.986
2019-04-09T11:28:37.298235: Epoch   1 Batch 1595/3125   train_loss = 1.052
2019-04-09T11:28:37.710706: Epoch   1 Batch 1615/3125   train_loss = 1.025
2019-04-09T11:28:38.118793: Epoch   1 Batch 1635/3125   train_loss = 1.146
2019-04-09T11:28:38.533452: Epoch   1 Batch 1655/3125   train_loss = 1.123
2019-04-09T11:28:38.948779: Epoch   1 Batch 1675/3125   train_loss = 0.976
2019-04-09T11:28:39.359489: Epoch   1 Batch 1695/3125   train_loss = 1.035
2019-04-09T11:28:39.766989: Epoch   1 Batch 1715/3125   train_loss = 0.945
2019-04-09T11:28:40.179589: Epoch   1 Batch 1735/3125   train_loss = 1.174
2019-04-09T11:28:40.590375: Epoch   1 Batch 1755/3125   train_loss = 1.027
2019-04-09T11:28:40.998865: Epoch   1 Batch 1775/3125   train_loss = 1.026
2019-04-09T11:28:41.408017: Epoch   1 Batch 1795/3125   train_loss = 0.981
2019-04-09T11:28:41.821620: Epoch   1 Batch 1815/3125   train_loss = 0.966
2019-04-09T11:28:42.229169: Epoch   1 Batch 1835/3125   train_loss = 1.074
2019-04-09T11:28:42.642918: Epoch   1 Batch 1855/3125   train_loss = 0.959
2019-04-09T11:28:43.154530: Epoch   1 Batch 1875/3125   train_loss = 1.213
2019-04-09T11:28:43.560385: Epoch   1 Batch 1895/3125   train_loss = 0.935
2019-04-09T11:28:43.974210: Epoch   1 Batch 1915/3125   train_loss = 0.973
2019-04-09T11:28:44.393618: Epoch   1 Batch 1935/3125   train_loss = 1.016
2019-04-09T11:28:44.808725: Epoch   1 Batch 1955/3125   train_loss = 1.006
2019-04-09T11:28:45.224542: Epoch   1 Batch 1975/3125   train_loss = 1.036
2019-04-09T11:28:45.638372: Epoch   1 Batch 1995/3125   train_loss = 1.130
2019-04-09T11:28:46.050876: Epoch   1 Batch 2015/3125   train_loss = 1.092
2019-04-09T11:28:46.466638: Epoch   1 Batch 2035/3125   train_loss = 1.163
2019-04-09T11:28:46.877782: Epoch   1 Batch 2055/3125   train_loss = 0.961
2019-04-09T11:28:47.297977: Epoch   1 Batch 2075/3125   train_loss = 1.154
2019-04-09T11:28:47.707362: Epoch   1 Batch 2095/3125   train_loss = 1.007
2019-04-09T11:28:48.119961: Epoch   1 Batch 2115/3125   train_loss = 1.150
2019-04-09T11:28:48.536958: Epoch   1 Batch 2135/3125   train_loss = 1.026
2019-04-09T11:28:48.955579: Epoch   1 Batch 2155/3125   train_loss = 1.008
2019-04-09T11:28:49.371992: Epoch   1 Batch 2175/3125   train_loss = 1.028
2019-04-09T11:28:49.785513: Epoch   1 Batch 2195/3125   train_loss = 1.013
2019-04-09T11:28:50.199116: Epoch   1 Batch 2215/3125   train_loss = 1.034
2019-04-09T11:28:50.609969: Epoch   1 Batch 2235/3125   train_loss = 1.184
2019-04-09T11:28:51.023581: Epoch   1 Batch 2255/3125   train_loss = 1.135
2019-04-09T11:28:51.436197: Epoch   1 Batch 2275/3125   train_loss = 0.936
2019-04-09T11:28:51.854318: Epoch   1 Batch 2295/3125   train_loss = 1.230
2019-04-09T11:28:52.266593: Epoch   1 Batch 2315/3125   train_loss = 1.180
2019-04-09T11:28:53.027310: Epoch   1 Batch 2335/3125   train_loss = 1.068
2019-04-09T11:28:53.443572: Epoch   1 Batch 2355/3125   train_loss = 1.021
2019-04-09T11:28:53.859233: Epoch   1 Batch 2375/3125   train_loss = 1.241
2019-04-09T11:28:54.268702: Epoch   1 Batch 2395/3125   train_loss = 1.022
2019-04-09T11:28:54.684586: Epoch   1 Batch 2415/3125   train_loss = 1.062
2019-04-09T11:28:55.104188: Epoch   1 Batch 2435/3125   train_loss = 0.978
2019-04-09T11:28:55.517661: Epoch   1 Batch 2455/3125   train_loss = 1.075
2019-04-09T11:28:55.940375: Epoch   1 Batch 2475/3125   train_loss = 0.997
2019-04-09T11:28:56.355446: Epoch   1 Batch 2495/3125   train_loss = 0.991
2019-04-09T11:28:56.767784: Epoch   1 Batch 2515/3125   train_loss = 1.057
2019-04-09T11:28:57.185487: Epoch   1 Batch 2535/3125   train_loss = 1.064
2019-04-09T11:28:57.599402: Epoch   1 Batch 2555/3125   train_loss = 0.883
2019-04-09T11:28:58.012436: Epoch   1 Batch 2575/3125   train_loss = 0.914
2019-04-09T11:28:58.427098: Epoch   1 Batch 2595/3125   train_loss = 0.934
2019-04-09T11:28:58.836389: Epoch   1 Batch 2615/3125   train_loss = 1.151
2019-04-09T11:28:59.262074: Epoch   1 Batch 2635/3125   train_loss = 1.017
2019-04-09T11:28:59.680762: Epoch   1 Batch 2655/3125   train_loss = 1.036
2019-04-09T11:29:00.094884: Epoch   1 Batch 2675/3125   train_loss = 0.960
2019-04-09T11:29:00.510614: Epoch   1 Batch 2695/3125   train_loss = 1.031
2019-04-09T11:29:00.925679: Epoch   1 Batch 2715/3125   train_loss = 1.011
2019-04-09T11:29:01.343105: Epoch   1 Batch 2735/3125   train_loss = 0.876
2019-04-09T11:29:01.762199: Epoch   1 Batch 2755/3125   train_loss = 1.087
2019-04-09T11:29:02.171790: Epoch   1 Batch 2775/3125   train_loss = 1.101
2019-04-09T11:29:02.585480: Epoch   1 Batch 2795/3125   train_loss = 1.064
2019-04-09T11:29:02.995887: Epoch   1 Batch 2815/3125   train_loss = 0.981
2019-04-09T11:29:03.414306: Epoch   1 Batch 2835/3125   train_loss = 1.123
2019-04-09T11:29:03.824405: Epoch   1 Batch 2855/3125   train_loss = 1.069
2019-04-09T11:29:04.236239: Epoch   1 Batch 2875/3125   train_loss = 1.006
2019-04-09T11:29:04.644747: Epoch   1 Batch 2895/3125   train_loss = 1.013
2019-04-09T11:29:05.058545: Epoch   1 Batch 2915/3125   train_loss = 0.985
2019-04-09T11:29:05.473539: Epoch   1 Batch 2935/3125   train_loss = 1.152
2019-04-09T11:29:05.881997: Epoch   1 Batch 2955/3125   train_loss = 1.015
2019-04-09T11:29:06.294405: Epoch   1 Batch 2975/3125   train_loss = 0.977
2019-04-09T11:29:06.707933: Epoch   1 Batch 2995/3125   train_loss = 0.928
2019-04-09T11:29:07.122537: Epoch   1 Batch 3015/3125   train_loss = 1.033
2019-04-09T11:29:07.534921: Epoch   1 Batch 3035/3125   train_loss = 1.097
2019-04-09T11:29:07.945410: Epoch   1 Batch 3055/3125   train_loss = 1.058
2019-04-09T11:29:08.355520: Epoch   1 Batch 3075/3125   train_loss = 1.009
2019-04-09T11:29:08.775390: Epoch   1 Batch 3095/3125   train_loss = 0.946
2019-04-09T11:29:09.190497: Epoch   1 Batch 3115/3125   train_loss = 0.919
2019-04-09T11:29:09.605177: Epoch   1 Batch   19/781   test_loss = 1.005
2019-04-09T11:29:09.737030: Epoch   1 Batch   39/781   test_loss = 0.844
2019-04-09T11:29:09.863600: Epoch   1 Batch   59/781   test_loss = 0.955
2019-04-09T11:29:09.991439: Epoch   1 Batch   79/781   test_loss = 0.980
2019-04-09T11:29:10.118778: Epoch   1 Batch   99/781   test_loss = 0.997
2019-04-09T11:29:10.246117: Epoch   1 Batch  119/781   test_loss = 0.996
2019-04-09T11:29:10.374962: Epoch   1 Batch  139/781   test_loss = 0.988
2019-04-09T11:29:10.503975: Epoch   1 Batch  159/781   test_loss = 0.970
2019-04-09T11:29:10.630812: Epoch   1 Batch  179/781   test_loss = 0.950
2019-04-09T11:29:10.758151: Epoch   1 Batch  199/781   test_loss = 0.939
2019-04-09T11:29:10.885992: Epoch   1 Batch  219/781   test_loss = 0.993
2019-04-09T11:29:11.014332: Epoch   1 Batch  239/781   test_loss = 1.237
2019-04-09T11:29:11.141671: Epoch   1 Batch  259/781   test_loss = 0.976
2019-04-09T11:29:11.270013: Epoch   1 Batch  279/781   test_loss = 1.069
2019-04-09T11:29:11.399713: Epoch   1 Batch  299/781   test_loss = 1.209
2019-04-09T11:29:11.531062: Epoch   1 Batch  319/781   test_loss = 0.913
2019-04-09T11:29:11.661408: Epoch   1 Batch  339/781   test_loss = 0.906
2019-04-09T11:29:11.787744: Epoch   1 Batch  359/781   test_loss = 0.924
2019-04-09T11:29:11.914581: Epoch   1 Batch  379/781   test_loss = 1.030
2019-04-09T11:29:12.043424: Epoch   1 Batch  399/781   test_loss = 0.912
2019-04-09T11:29:12.171264: Epoch   1 Batch  419/781   test_loss = 0.959
2019-04-09T11:29:12.300107: Epoch   1 Batch  439/781   test_loss = 1.026
2019-04-09T11:29:12.428123: Epoch   1 Batch  459/781   test_loss = 1.085
2019-04-09T11:29:12.553965: Epoch   1 Batch  479/781   test_loss = 1.054
2019-04-09T11:29:12.683302: Epoch   1 Batch  499/781   test_loss = 0.919
2019-04-09T11:29:12.810139: Epoch   1 Batch  519/781   test_loss = 1.083
2019-04-09T11:29:12.939483: Epoch   1 Batch  539/781   test_loss = 0.888
2019-04-09T11:29:13.066822: Epoch   1 Batch  559/781   test_loss = 1.165
2019-04-09T11:29:13.195164: Epoch   1 Batch  579/781   test_loss = 1.014
2019-04-09T11:29:13.321500: Epoch   1 Batch  599/781   test_loss = 0.975
2019-04-09T11:29:13.449045: Epoch   1 Batch  619/781   test_loss = 1.152
2019-04-09T11:29:13.578390: Epoch   1 Batch  639/781   test_loss = 0.881
2019-04-09T11:29:13.706229: Epoch   1 Batch  659/781   test_loss = 1.086
2019-04-09T11:29:13.834069: Epoch   1 Batch  679/781   test_loss = 1.149
2019-04-09T11:29:13.964416: Epoch   1 Batch  699/781   test_loss = 0.888
2019-04-09T11:29:14.094763: Epoch   1 Batch  719/781   test_loss = 0.940
2019-04-09T11:29:14.223606: Epoch   1 Batch  739/781   test_loss = 1.001
2019-04-09T11:29:14.350443: Epoch   1 Batch  759/781   test_loss = 0.925
2019-04-09T11:29:14.479091: Epoch   1 Batch  779/781   test_loss = 0.786
2019-04-09T11:29:15.169929: Epoch   2 Batch   10/3125   train_loss = 0.962
2019-04-09T11:29:15.585033: Epoch   2 Batch   30/3125   train_loss = 0.921
2019-04-09T11:29:16.090936: Epoch   2 Batch   50/3125   train_loss = 1.098
2019-04-09T11:29:16.504056: Epoch   2 Batch   70/3125   train_loss = 1.066
2019-04-09T11:29:16.916616: Epoch   2 Batch   90/3125   train_loss = 1.065
2019-04-09T11:29:17.335995: Epoch   2 Batch  110/3125   train_loss = 0.908
2019-04-09T11:29:17.744923: Epoch   2 Batch  130/3125   train_loss = 0.927
2019-04-09T11:29:18.156518: Epoch   2 Batch  150/3125   train_loss = 1.094
2019-04-09T11:29:18.572814: Epoch   2 Batch  170/3125   train_loss = 1.062
2019-04-09T11:29:18.979180: Epoch   2 Batch  190/3125   train_loss = 1.043
2019-04-09T11:29:19.392758: Epoch   2 Batch  210/3125   train_loss = 0.920
2019-04-09T11:29:19.806360: Epoch   2 Batch  230/3125   train_loss = 0.990
2019-04-09T11:29:20.213864: Epoch   2 Batch  250/3125   train_loss = 0.956
2019-04-09T11:29:20.624843: Epoch   2 Batch  270/3125   train_loss = 0.816
2019-04-09T11:29:21.034399: Epoch   2 Batch  290/3125   train_loss = 1.029
2019-04-09T11:29:21.450506: Epoch   2 Batch  310/3125   train_loss = 1.039
2019-04-09T11:29:21.860168: Epoch   2 Batch  330/3125   train_loss = 0.981
2019-04-09T11:29:22.268774: Epoch   2 Batch  350/3125   train_loss = 0.927
2019-04-09T11:29:22.681125: Epoch   2 Batch  370/3125   train_loss = 1.157
2019-04-09T11:29:23.092834: Epoch   2 Batch  390/3125   train_loss = 1.131
2019-04-09T11:29:23.503543: Epoch   2 Batch  410/3125   train_loss = 0.945
2019-04-09T11:29:23.913894: Epoch   2 Batch  430/3125   train_loss = 1.121
2019-04-09T11:29:24.324622: Epoch   2 Batch  450/3125   train_loss = 0.925
2019-04-09T11:29:24.740883: Epoch   2 Batch  470/3125   train_loss = 0.952
2019-04-09T11:29:25.150474: Epoch   2 Batch  490/3125   train_loss = 1.031
2019-04-09T11:29:25.566388: Epoch   2 Batch  510/3125   train_loss = 1.045
2019-04-09T11:29:25.981499: Epoch   2 Batch  530/3125   train_loss = 0.936
2019-04-09T11:29:26.427824: Epoch   2 Batch  550/3125   train_loss = 1.041
2019-04-09T11:29:26.844394: Epoch   2 Batch  570/3125   train_loss = 1.175
2019-04-09T11:29:27.262411: Epoch   2 Batch  590/3125   train_loss = 1.093
2019-04-09T11:29:27.677138: Epoch   2 Batch  610/3125   train_loss = 0.941
2019-04-09T11:29:28.088132: Epoch   2 Batch  630/3125   train_loss = 1.067
2019-04-09T11:29:28.504546: Epoch   2 Batch  650/3125   train_loss = 1.015
2019-04-09T11:29:28.919901: Epoch   2 Batch  670/3125   train_loss = 0.921
2019-04-09T11:29:29.332525: Epoch   2 Batch  690/3125   train_loss = 0.946
2019-04-09T11:29:29.752401: Epoch   2 Batch  710/3125   train_loss = 0.958
2019-04-09T11:29:30.169512: Epoch   2 Batch  730/3125   train_loss = 0.833
2019-04-09T11:29:30.581918: Epoch   2 Batch  750/3125   train_loss = 0.983
2019-04-09T11:29:30.990078: Epoch   2 Batch  770/3125   train_loss = 0.882
2019-04-09T11:29:31.401819: Epoch   2 Batch  790/3125   train_loss = 0.922
2019-04-09T11:29:31.821438: Epoch   2 Batch  810/3125   train_loss = 0.843
2019-04-09T11:29:32.231582: Epoch   2 Batch  830/3125   train_loss = 0.875
2019-04-09T11:29:32.646142: Epoch   2 Batch  850/3125   train_loss = 1.077
2019-04-09T11:29:33.064808: Epoch   2 Batch  870/3125   train_loss = 0.952
2019-04-09T11:29:33.477008: Epoch   2 Batch  890/3125   train_loss = 0.888
2019-04-09T11:29:33.887466: Epoch   2 Batch  910/3125   train_loss = 1.012
2019-04-09T11:29:34.298086: Epoch   2 Batch  930/3125   train_loss = 0.959
2019-04-09T11:29:34.715677: Epoch   2 Batch  950/3125   train_loss = 0.975
2019-04-09T11:29:35.130281: Epoch   2 Batch  970/3125   train_loss = 1.050
2019-04-09T11:29:35.544737: Epoch   2 Batch  990/3125   train_loss = 0.864
2019-04-09T11:29:35.958160: Epoch   2 Batch 1010/3125   train_loss = 1.084
2019-04-09T11:29:36.371777: Epoch   2 Batch 1030/3125   train_loss = 0.946
2019-04-09T11:29:36.780334: Epoch   2 Batch 1050/3125   train_loss = 1.009
2019-04-09T11:29:37.193936: Epoch   2 Batch 1070/3125   train_loss = 0.981
2019-04-09T11:29:37.603917: Epoch   2 Batch 1090/3125   train_loss = 1.081
2019-04-09T11:29:38.014688: Epoch   2 Batch 1110/3125   train_loss = 1.080
2019-04-09T11:29:38.435423: Epoch   2 Batch 1130/3125   train_loss = 0.920
2019-04-09T11:29:38.848851: Epoch   2 Batch 1150/3125   train_loss = 0.949
2019-04-09T11:29:39.260649: Epoch   2 Batch 1170/3125   train_loss = 0.944
2019-04-09T11:29:39.676982: Epoch   2 Batch 1190/3125   train_loss = 1.046
2019-04-09T11:29:40.089421: Epoch   2 Batch 1210/3125   train_loss = 0.873
2019-04-09T11:29:40.501075: Epoch   2 Batch 1230/3125   train_loss = 0.862
2019-04-09T11:29:40.912917: Epoch   2 Batch 1250/3125   train_loss = 0.963
2019-04-09T11:29:41.331306: Epoch   2 Batch 1270/3125   train_loss = 1.041
2019-04-09T11:29:41.745589: Epoch   2 Batch 1290/3125   train_loss = 0.935
2019-04-09T11:29:42.155682: Epoch   2 Batch 1310/3125   train_loss = 1.011
2019-04-09T11:29:42.565230: Epoch   2 Batch 1330/3125   train_loss = 1.089
2019-04-09T11:29:42.972821: Epoch   2 Batch 1350/3125   train_loss = 0.929
2019-04-09T11:29:43.384313: Epoch   2 Batch 1370/3125   train_loss = 0.871
2019-04-09T11:29:43.800679: Epoch   2 Batch 1390/3125   train_loss = 1.056
2019-04-09T11:29:44.212277: Epoch   2 Batch 1410/3125   train_loss = 0.956
2019-04-09T11:29:44.622595: Epoch   2 Batch 1430/3125   train_loss = 0.991
2019-04-09T11:29:45.030926: Epoch   2 Batch 1450/3125   train_loss = 1.019
2019-04-09T11:29:45.446118: Epoch   2 Batch 1470/3125   train_loss = 1.018
2019-04-09T11:29:45.858249: Epoch   2 Batch 1490/3125   train_loss = 1.025
2019-04-09T11:29:46.264877: Epoch   2 Batch 1510/3125   train_loss = 0.987
2019-04-09T11:29:46.680210: Epoch   2 Batch 1530/3125   train_loss = 1.077
2019-04-09T11:29:47.097122: Epoch   2 Batch 1550/3125   train_loss = 0.871
2019-04-09T11:29:47.505701: Epoch   2 Batch 1570/3125   train_loss = 0.963
2019-04-09T11:29:47.915740: Epoch   2 Batch 1590/3125   train_loss = 0.935
2019-04-09T11:29:48.325191: Epoch   2 Batch 1610/3125   train_loss = 1.024
2019-04-09T11:29:48.741050: Epoch   2 Batch 1630/3125   train_loss = 1.033
2019-04-09T11:29:49.303637: Epoch   2 Batch 1650/3125   train_loss = 0.892
2019-04-09T11:29:49.716688: Epoch   2 Batch 1670/3125   train_loss = 0.828
2019-04-09T11:29:50.127782: Epoch   2 Batch 1690/3125   train_loss = 0.886
2019-04-09T11:29:50.541466: Epoch   2 Batch 1710/3125   train_loss = 1.033
2019-04-09T11:29:50.952638: Epoch   2 Batch 1730/3125   train_loss = 0.990
2019-04-09T11:29:51.366254: Epoch   2 Batch 1750/3125   train_loss = 0.851
2019-04-09T11:29:51.779575: Epoch   2 Batch 1770/3125   train_loss = 1.130
2019-04-09T11:29:52.189667: Epoch   2 Batch 1790/3125   train_loss = 0.970
2019-04-09T11:29:52.600989: Epoch   2 Batch 1810/3125   train_loss = 1.004
2019-04-09T11:29:53.010986: Epoch   2 Batch 1830/3125   train_loss = 1.035
2019-04-09T11:29:53.428213: Epoch   2 Batch 1850/3125   train_loss = 0.935
2019-04-09T11:29:53.847839: Epoch   2 Batch 1870/3125   train_loss = 1.039
2019-04-09T11:29:54.260999: Epoch   2 Batch 1890/3125   train_loss = 0.822
2019-04-09T11:29:54.670587: Epoch   2 Batch 1910/3125   train_loss = 0.885
2019-04-09T11:29:55.079904: Epoch   2 Batch 1930/3125   train_loss = 1.038
2019-04-09T11:29:55.492941: Epoch   2 Batch 1950/3125   train_loss = 0.887
2019-04-09T11:29:55.909977: Epoch   2 Batch 1970/3125   train_loss = 0.998
2019-04-09T11:29:56.321669: Epoch   2 Batch 1990/3125   train_loss = 0.864
2019-04-09T11:29:56.731912: Epoch   2 Batch 2010/3125   train_loss = 0.792
2019-04-09T11:29:57.143008: Epoch   2 Batch 2030/3125   train_loss = 0.907
2019-04-09T11:29:57.555451: Epoch   2 Batch 2050/3125   train_loss = 0.952
2019-04-09T11:29:57.967763: Epoch   2 Batch 2070/3125   train_loss = 0.882
2019-04-09T11:29:58.396253: Epoch   2 Batch 2090/3125   train_loss = 0.831
2019-04-09T11:29:58.810290: Epoch   2 Batch 2110/3125   train_loss = 1.050
2019-04-09T11:29:59.220382: Epoch   2 Batch 2130/3125   train_loss = 0.973
2019-04-09T11:29:59.638177: Epoch   2 Batch 2150/3125   train_loss = 1.009
2019-04-09T11:30:00.054094: Epoch   2 Batch 2170/3125   train_loss = 0.862
2019-04-09T11:30:00.465054: Epoch   2 Batch 2190/3125   train_loss = 0.967
2019-04-09T11:30:00.875581: Epoch   2 Batch 2210/3125   train_loss = 0.950
2019-04-09T11:30:01.283669: Epoch   2 Batch 2230/3125   train_loss = 0.843
2019-04-09T11:30:01.702253: Epoch   2 Batch 2250/3125   train_loss = 0.933
2019-04-09T11:30:02.116357: Epoch   2 Batch 2270/3125   train_loss = 0.917
2019-04-09T11:30:02.530943: Epoch   2 Batch 2290/3125   train_loss = 0.856
2019-04-09T11:30:02.942953: Epoch   2 Batch 2310/3125   train_loss = 0.851
2019-04-09T11:30:03.360388: Epoch   2 Batch 2330/3125   train_loss = 1.097
2019-04-09T11:30:03.770799: Epoch   2 Batch 2350/3125   train_loss = 0.989
2019-04-09T11:30:04.189427: Epoch   2 Batch 2370/3125   train_loss = 0.886
2019-04-09T11:30:04.602910: Epoch   2 Batch 2390/3125   train_loss = 1.017
2019-04-09T11:30:05.013193: Epoch   2 Batch 2410/3125   train_loss = 1.025
2019-04-09T11:30:05.426136: Epoch   2 Batch 2430/3125   train_loss = 0.885
2019-04-09T11:30:05.834048: Epoch   2 Batch 2450/3125   train_loss = 0.968
2019-04-09T11:30:06.246209: Epoch   2 Batch 2470/3125   train_loss = 1.042
2019-04-09T11:30:06.661647: Epoch   2 Batch 2490/3125   train_loss = 1.003
2019-04-09T11:30:07.071296: Epoch   2 Batch 2510/3125   train_loss = 1.084
2019-04-09T11:30:07.490192: Epoch   2 Batch 2530/3125   train_loss = 0.793
2019-04-09T11:30:07.904515: Epoch   2 Batch 2550/3125   train_loss = 0.954
2019-04-09T11:30:08.315032: Epoch   2 Batch 2570/3125   train_loss = 0.957
2019-04-09T11:30:08.733158: Epoch   2 Batch 2590/3125   train_loss = 0.984
2019-04-09T11:30:09.146760: Epoch   2 Batch 2610/3125   train_loss = 1.043
2019-04-09T11:30:09.564414: Epoch   2 Batch 2630/3125   train_loss = 0.660
2019-04-09T11:30:09.977708: Epoch   2 Batch 2650/3125   train_loss = 0.913
2019-04-09T11:30:10.392227: Epoch   2 Batch 2670/3125   train_loss = 1.051
2019-04-09T11:30:10.803323: Epoch   2 Batch 2690/3125   train_loss = 0.980
2019-04-09T11:30:11.221892: Epoch   2 Batch 2710/3125   train_loss = 0.845
2019-04-09T11:30:11.636832: Epoch   2 Batch 2730/3125   train_loss = 1.067
2019-04-09T11:30:12.048855: Epoch   2 Batch 2750/3125   train_loss = 1.020
2019-04-09T11:30:12.466622: Epoch   2 Batch 2770/3125   train_loss = 0.894
2019-04-09T11:30:12.877228: Epoch   2 Batch 2790/3125   train_loss = 0.881
2019-04-09T11:30:13.292940: Epoch   2 Batch 2810/3125   train_loss = 0.958
2019-04-09T11:30:13.707370: Epoch   2 Batch 2830/3125   train_loss = 0.816
2019-04-09T11:30:14.115458: Epoch   2 Batch 2850/3125   train_loss = 1.005
2019-04-09T11:30:14.527402: Epoch   2 Batch 2870/3125   train_loss = 0.792
2019-04-09T11:30:14.941006: Epoch   2 Batch 2890/3125   train_loss = 0.779
2019-04-09T11:30:15.351115: Epoch   2 Batch 2910/3125   train_loss = 1.007
2019-04-09T11:30:15.761429: Epoch   2 Batch 2930/3125   train_loss = 0.813
2019-04-09T11:30:16.174529: Epoch   2 Batch 2950/3125   train_loss = 1.069
2019-04-09T11:30:16.592845: Epoch   2 Batch 2970/3125   train_loss = 0.993
2019-04-09T11:30:17.005062: Epoch   2 Batch 2990/3125   train_loss = 0.862
2019-04-09T11:30:17.425470: Epoch   2 Batch 3010/3125   train_loss = 0.936
2019-04-09T11:30:17.837640: Epoch   2 Batch 3030/3125   train_loss = 0.968
2019-04-09T11:30:18.248424: Epoch   2 Batch 3050/3125   train_loss = 0.980
2019-04-09T11:30:18.666115: Epoch   2 Batch 3070/3125   train_loss = 0.896
2019-04-09T11:30:19.074163: Epoch   2 Batch 3090/3125   train_loss = 0.774
2019-04-09T11:30:19.491628: Epoch   2 Batch 3110/3125   train_loss = 0.837
2019-04-09T11:30:19.895275: Epoch   2 Batch   18/781   test_loss = 0.808
2019-04-09T11:30:20.023969: Epoch   2 Batch   38/781   test_loss = 0.915
2019-04-09T11:30:20.152310: Epoch   2 Batch   58/781   test_loss = 0.851
2019-04-09T11:30:20.280151: Epoch   2 Batch   78/781   test_loss = 0.905
2019-04-09T11:30:20.408187: Epoch   2 Batch   98/781   test_loss = 0.903
2019-04-09T11:30:20.536028: Epoch   2 Batch  118/781   test_loss = 0.884
2019-04-09T11:30:20.663366: Epoch   2 Batch  138/781   test_loss = 1.000
2019-04-09T11:30:20.791206: Epoch   2 Batch  158/781   test_loss = 0.904
2019-04-09T11:30:20.918545: Epoch   2 Batch  178/781   test_loss = 0.785
2019-04-09T11:30:21.045884: Epoch   2 Batch  198/781   test_loss = 0.922
2019-04-09T11:30:21.177736: Epoch   2 Batch  218/781   test_loss = 0.997
2019-04-09T11:30:21.310087: Epoch   2 Batch  238/781   test_loss = 0.998
2019-04-09T11:30:21.437625: Epoch   2 Batch  258/781   test_loss = 0.959
2019-04-09T11:30:21.565465: Epoch   2 Batch  278/781   test_loss = 1.074
2019-04-09T11:30:21.692804: Epoch   2 Batch  298/781   test_loss = 0.915
2019-04-09T11:30:21.821646: Epoch   2 Batch  318/781   test_loss = 0.889
2019-04-09T11:30:21.952495: Epoch   2 Batch  338/781   test_loss = 0.941
2019-04-09T11:30:22.081338: Epoch   2 Batch  358/781   test_loss = 0.913
2019-04-09T11:30:22.210686: Epoch   2 Batch  378/781   test_loss = 0.890
2019-04-09T11:30:22.344036: Epoch   2 Batch  398/781   test_loss = 0.833
2019-04-09T11:30:22.471957: Epoch   2 Batch  418/781   test_loss = 0.941
2019-04-09T11:30:22.599296: Epoch   2 Batch  438/781   test_loss = 1.013
2019-04-09T11:30:22.728139: Epoch   2 Batch  458/781   test_loss = 0.919
2019-04-09T11:30:22.855992: Epoch   2 Batch  478/781   test_loss = 0.965
2019-04-09T11:30:22.982816: Epoch   2 Batch  498/781   test_loss = 0.813
2019-04-09T11:30:23.110155: Epoch   2 Batch  518/781   test_loss = 0.919
2019-04-09T11:30:23.238497: Epoch   2 Batch  538/781   test_loss = 0.795
2019-04-09T11:30:23.366838: Epoch   2 Batch  558/781   test_loss = 0.830
2019-04-09T11:30:23.495883: Epoch   2 Batch  578/781   test_loss = 0.915
2019-04-09T11:30:23.623225: Epoch   2 Batch  598/781   test_loss = 1.055
2019-04-09T11:30:23.751062: Epoch   2 Batch  618/781   test_loss = 0.850
2019-04-09T11:30:23.879905: Epoch   2 Batch  638/781   test_loss = 0.845
2019-04-09T11:30:24.007243: Epoch   2 Batch  658/781   test_loss = 1.026
2019-04-09T11:30:24.138091: Epoch   2 Batch  678/781   test_loss = 0.926
2019-04-09T11:30:24.266433: Epoch   2 Batch  698/781   test_loss = 0.875
2019-04-09T11:30:24.395604: Epoch   2 Batch  718/781   test_loss = 1.006
2019-04-09T11:30:24.523445: Epoch   2 Batch  738/781   test_loss = 0.850
2019-04-09T11:30:24.651786: Epoch   2 Batch  758/781   test_loss = 0.892
2019-04-09T11:30:24.779626: Epoch   2 Batch  778/781   test_loss = 0.913
2019-04-09T11:30:25.360700: Epoch   3 Batch    5/3125   train_loss = 0.900
2019-04-09T11:30:25.776594: Epoch   3 Batch   25/3125   train_loss = 0.995
2019-04-09T11:30:26.190195: Epoch   3 Batch   45/3125   train_loss = 0.823
2019-04-09T11:30:26.605221: Epoch   3 Batch   65/3125   train_loss = 0.936
2019-04-09T11:30:27.017575: Epoch   3 Batch   85/3125   train_loss = 0.811
2019-04-09T11:30:27.433325: Epoch   3 Batch  105/3125   train_loss = 0.735
2019-04-09T11:30:27.845489: Epoch   3 Batch  125/3125   train_loss = 0.883
2019-04-09T11:30:28.255902: Epoch   3 Batch  145/3125   train_loss = 0.946
2019-04-09T11:30:28.676186: Epoch   3 Batch  165/3125   train_loss = 0.907
2019-04-09T11:30:29.086028: Epoch   3 Batch  185/3125   train_loss = 0.843
2019-04-09T11:30:29.498049: Epoch   3 Batch  205/3125   train_loss = 0.782
2019-04-09T11:30:29.910137: Epoch   3 Batch  225/3125   train_loss = 0.818
2019-04-09T11:30:30.321717: Epoch   3 Batch  245/3125   train_loss = 1.094
2019-04-09T11:30:30.732822: Epoch   3 Batch  265/3125   train_loss = 0.907
2019-04-09T11:30:31.144919: Epoch   3 Batch  285/3125   train_loss = 0.899
2019-04-09T11:30:31.564878: Epoch   3 Batch  305/3125   train_loss = 0.886
2019-04-09T11:30:31.986450: Epoch   3 Batch  325/3125   train_loss = 0.900
2019-04-09T11:30:32.402943: Epoch   3 Batch  345/3125   train_loss = 0.966
2019-04-09T11:30:32.817756: Epoch   3 Batch  365/3125   train_loss = 0.897
2019-04-09T11:30:33.231358: Epoch   3 Batch  385/3125   train_loss = 0.854
2019-04-09T11:30:33.642523: Epoch   3 Batch  405/3125   train_loss = 0.854
2019-04-09T11:30:34.052009: Epoch   3 Batch  425/3125   train_loss = 0.950
2019-04-09T11:30:34.463651: Epoch   3 Batch  445/3125   train_loss = 0.963
2019-04-09T11:30:34.877612: Epoch   3 Batch  465/3125   train_loss = 0.840
2019-04-09T11:30:35.291041: Epoch   3 Batch  485/3125   train_loss = 1.043
2019-04-09T11:30:35.701510: Epoch   3 Batch  505/3125   train_loss = 0.820
2019-04-09T11:30:36.113107: Epoch   3 Batch  525/3125   train_loss = 0.977
2019-04-09T11:30:36.526067: Epoch   3 Batch  545/3125   train_loss = 0.785
2019-04-09T11:30:36.938504: Epoch   3 Batch  565/3125   train_loss = 1.138
2019-04-09T11:30:37.354627: Epoch   3 Batch  585/3125   train_loss = 0.877
2019-04-09T11:30:37.769480: Epoch   3 Batch  605/3125   train_loss = 0.865
2019-04-09T11:30:38.180576: Epoch   3 Batch  625/3125   train_loss = 0.931
2019-04-09T11:30:38.595414: Epoch   3 Batch  645/3125   train_loss = 1.007
2019-04-09T11:30:39.007112: Epoch   3 Batch  665/3125   train_loss = 0.960
2019-04-09T11:30:39.427161: Epoch   3 Batch  685/3125   train_loss = 0.908
2019-04-09T11:30:39.841768: Epoch   3 Batch  705/3125   train_loss = 1.001
2019-04-09T11:30:40.258352: Epoch   3 Batch  725/3125   train_loss = 0.888
2019-04-09T11:30:40.672977: Epoch   3 Batch  745/3125   train_loss = 0.834
2019-04-09T11:30:41.090307: Epoch   3 Batch  765/3125   train_loss = 0.864
2019-04-09T11:30:41.504196: Epoch   3 Batch  785/3125   train_loss = 1.046
2019-04-09T11:30:41.912423: Epoch   3 Batch  805/3125   train_loss = 0.816
2019-04-09T11:30:42.328090: Epoch   3 Batch  825/3125   train_loss = 0.904
2019-04-09T11:30:42.740677: Epoch   3 Batch  845/3125   train_loss = 0.932
2019-04-09T11:30:43.153777: Epoch   3 Batch  865/3125   train_loss = 1.004
2019-04-09T11:30:43.566946: Epoch   3 Batch  885/3125   train_loss = 0.968
2019-04-09T11:30:43.981050: Epoch   3 Batch  905/3125   train_loss = 0.998
2019-04-09T11:30:44.394270: Epoch   3 Batch  925/3125   train_loss = 0.896
2019-04-09T11:30:44.807669: Epoch   3 Batch  945/3125   train_loss = 0.978
2019-04-09T11:30:45.224278: Epoch   3 Batch  965/3125   train_loss = 0.731
2019-04-09T11:30:45.644716: Epoch   3 Batch  985/3125   train_loss = 1.003
2019-04-09T11:30:46.056218: Epoch   3 Batch 1005/3125   train_loss = 0.794
2019-04-09T11:30:46.465616: Epoch   3 Batch 1025/3125   train_loss = 0.879
2019-04-09T11:30:46.878718: Epoch   3 Batch 1045/3125   train_loss = 1.127
2019-04-09T11:30:47.297579: Epoch   3 Batch 1065/3125   train_loss = 0.875
2019-04-09T11:30:47.709534: Epoch   3 Batch 1085/3125   train_loss = 0.834
2019-04-09T11:30:48.125642: Epoch   3 Batch 1105/3125   train_loss = 0.842
2019-04-09T11:30:48.538103: Epoch   3 Batch 1125/3125   train_loss = 0.859
2019-04-09T11:30:48.952197: Epoch   3 Batch 1145/3125   train_loss = 0.905
2019-04-09T11:30:49.366261: Epoch   3 Batch 1165/3125   train_loss = 0.964
2019-04-09T11:30:49.774853: Epoch   3 Batch 1185/3125   train_loss = 0.869
2019-04-09T11:30:50.190392: Epoch   3 Batch 1205/3125   train_loss = 0.836
2019-04-09T11:30:50.605998: Epoch   3 Batch 1225/3125   train_loss = 1.002
2019-04-09T11:30:51.020181: Epoch   3 Batch 1245/3125   train_loss = 1.006
2019-04-09T11:30:51.434899: Epoch   3 Batch 1265/3125   train_loss = 0.896
2019-04-09T11:30:51.850872: Epoch   3 Batch 1285/3125   train_loss = 0.960
2019-04-09T11:30:52.265731: Epoch   3 Batch 1305/3125   train_loss = 0.802
2019-04-09T11:30:53.236710: Epoch   3 Batch 1325/3125   train_loss = 0.886
2019-04-09T11:30:53.650278: Epoch   3 Batch 1345/3125   train_loss = 0.928
2019-04-09T11:30:54.066153: Epoch   3 Batch 1365/3125   train_loss = 0.761
2019-04-09T11:30:54.481716: Epoch   3 Batch 1385/3125   train_loss = 0.779
2019-04-09T11:30:54.890807: Epoch   3 Batch 1405/3125   train_loss = 0.857
2019-04-09T11:30:55.303205: Epoch   3 Batch 1425/3125   train_loss = 1.106
2019-04-09T11:30:55.713796: Epoch   3 Batch 1445/3125   train_loss = 1.002
2019-04-09T11:30:56.127899: Epoch   3 Batch 1465/3125   train_loss = 0.887
2019-04-09T11:30:56.544126: Epoch   3 Batch 1485/3125   train_loss = 0.920
2019-04-09T11:30:56.952476: Epoch   3 Batch 1505/3125   train_loss = 0.745
2019-04-09T11:30:57.370433: Epoch   3 Batch 1525/3125   train_loss = 0.759
2019-04-09T11:30:57.781531: Epoch   3 Batch 1545/3125   train_loss = 0.843
2019-04-09T11:30:58.194632: Epoch   3 Batch 1565/3125   train_loss = 0.983
2019-04-09T11:30:58.613587: Epoch   3 Batch 1585/3125   train_loss = 0.827
2019-04-09T11:30:59.029585: Epoch   3 Batch 1605/3125   train_loss = 0.971
2019-04-09T11:30:59.443109: Epoch   3 Batch 1625/3125   train_loss = 0.950
2019-04-09T11:30:59.862969: Epoch   3 Batch 1645/3125   train_loss = 0.978
2019-04-09T11:31:00.280054: Epoch   3 Batch 1665/3125   train_loss = 0.916
2019-04-09T11:31:00.697972: Epoch   3 Batch 1685/3125   train_loss = 0.893
2019-04-09T11:31:01.120406: Epoch   3 Batch 1705/3125   train_loss = 0.883
2019-04-09T11:31:01.540523: Epoch   3 Batch 1725/3125   train_loss = 0.834
2019-04-09T11:31:01.957635: Epoch   3 Batch 1745/3125   train_loss = 0.775
2019-04-09T11:31:02.372311: Epoch   3 Batch 1765/3125   train_loss = 0.825
2019-04-09T11:31:02.786676: Epoch   3 Batch 1785/3125   train_loss = 1.015
2019-04-09T11:31:03.204288: Epoch   3 Batch 1805/3125   train_loss = 0.958
2019-04-09T11:31:03.616851: Epoch   3 Batch 1825/3125   train_loss = 1.031
2019-04-09T11:31:04.029497: Epoch   3 Batch 1845/3125   train_loss = 0.922
2019-04-09T11:31:04.442097: Epoch   3 Batch 1865/3125   train_loss = 0.753
2019-04-09T11:31:04.856887: Epoch   3 Batch 1885/3125   train_loss = 0.986
2019-04-09T11:31:05.271825: Epoch   3 Batch 1905/3125   train_loss = 0.799
2019-04-09T11:31:05.688152: Epoch   3 Batch 1925/3125   train_loss = 0.830
2019-04-09T11:31:06.097059: Epoch   3 Batch 1945/3125   train_loss = 0.865
2019-04-09T11:31:06.510931: Epoch   3 Batch 1965/3125   train_loss = 0.867
2019-04-09T11:31:06.924666: Epoch   3 Batch 1985/3125   train_loss = 0.840
2019-04-09T11:31:07.341276: Epoch   3 Batch 2005/3125   train_loss = 0.881
2019-04-09T11:31:07.755738: Epoch   3 Batch 2025/3125   train_loss = 0.951
2019-04-09T11:31:08.168337: Epoch   3 Batch 2045/3125   train_loss = 0.754
2019-04-09T11:31:08.583280: Epoch   3 Batch 2065/3125   train_loss = 0.727
2019-04-09T11:31:08.998421: Epoch   3 Batch 2085/3125   train_loss = 1.058
2019-04-09T11:31:09.415818: Epoch   3 Batch 2105/3125   train_loss = 0.891
2019-04-09T11:31:09.827917: Epoch   3 Batch 2125/3125   train_loss = 0.976
2019-04-09T11:31:10.237408: Epoch   3 Batch 2145/3125   train_loss = 1.002
2019-04-09T11:31:10.652222: Epoch   3 Batch 2165/3125   train_loss = 0.862
2019-04-09T11:31:11.061610: Epoch   3 Batch 2185/3125   train_loss = 0.948
2019-04-09T11:31:11.476691: Epoch   3 Batch 2205/3125   train_loss = 0.958
2019-04-09T11:31:11.893028: Epoch   3 Batch 2225/3125   train_loss = 0.811
2019-04-09T11:31:12.428069: Epoch   3 Batch 2245/3125   train_loss = 0.798
2019-04-09T11:31:12.840171: Epoch   3 Batch 2265/3125   train_loss = 0.896
2019-04-09T11:31:13.254127: Epoch   3 Batch 2285/3125   train_loss = 1.099
2019-04-09T11:31:13.671868: Epoch   3 Batch 2305/3125   train_loss = 0.812
2019-04-09T11:31:14.083559: Epoch   3 Batch 2325/3125   train_loss = 0.788
2019-04-09T11:31:14.499758: Epoch   3 Batch 2345/3125   train_loss = 0.885
2019-04-09T11:31:14.912859: Epoch   3 Batch 2365/3125   train_loss = 0.702
2019-04-09T11:31:15.331776: Epoch   3 Batch 2385/3125   train_loss = 0.915
2019-04-09T11:31:15.749019: Epoch   3 Batch 2405/3125   train_loss = 0.908
2019-04-09T11:31:16.161618: Epoch   3 Batch 2425/3125   train_loss = 0.875
2019-04-09T11:31:16.583581: Epoch   3 Batch 2445/3125   train_loss = 1.002
2019-04-09T11:31:17.000198: Epoch   3 Batch 2465/3125   train_loss = 0.748
2019-04-09T11:31:17.420234: Epoch   3 Batch 2485/3125   train_loss = 0.880
2019-04-09T11:31:17.834288: Epoch   3 Batch 2505/3125   train_loss = 0.852
2019-04-09T11:31:18.247812: Epoch   3 Batch 2525/3125   train_loss = 0.849
2019-04-09T11:31:18.663700: Epoch   3 Batch 2545/3125   train_loss = 1.010
2019-04-09T11:31:19.076134: Epoch   3 Batch 2565/3125   train_loss = 0.851
2019-04-09T11:31:19.490451: Epoch   3 Batch 2585/3125   train_loss = 0.768
2019-04-09T11:31:19.905388: Epoch   3 Batch 2605/3125   train_loss = 0.867
2019-04-09T11:31:20.318355: Epoch   3 Batch 2625/3125   train_loss = 1.004
2019-04-09T11:31:20.732786: Epoch   3 Batch 2645/3125   train_loss = 0.906
2019-04-09T11:31:21.146894: Epoch   3 Batch 2665/3125   train_loss = 0.984
2019-04-09T11:31:21.566102: Epoch   3 Batch 2685/3125   train_loss = 0.920
2019-04-09T11:31:21.981681: Epoch   3 Batch 2705/3125   train_loss = 0.784
2019-04-09T11:31:22.399609: Epoch   3 Batch 2725/3125   train_loss = 0.916
2019-04-09T11:31:22.817940: Epoch   3 Batch 2745/3125   train_loss = 0.925
2019-04-09T11:31:23.266133: Epoch   3 Batch 2765/3125   train_loss = 0.837
2019-04-09T11:31:23.679262: Epoch   3 Batch 2785/3125   train_loss = 0.935
2019-04-09T11:31:24.097862: Epoch   3 Batch 2805/3125   train_loss = 0.839
2019-04-09T11:31:24.511944: Epoch   3 Batch 2825/3125   train_loss = 0.844
2019-04-09T11:31:24.926787: Epoch   3 Batch 2845/3125   train_loss = 0.858
2019-04-09T11:31:25.347381: Epoch   3 Batch 2865/3125   train_loss = 0.853
2019-04-09T11:31:25.764592: Epoch   3 Batch 2885/3125   train_loss = 0.939
2019-04-09T11:31:26.184209: Epoch   3 Batch 2905/3125   train_loss = 0.969
2019-04-09T11:31:26.601925: Epoch   3 Batch 2925/3125   train_loss = 0.868
2019-04-09T11:31:27.016711: Epoch   3 Batch 2945/3125   train_loss = 0.900
2019-04-09T11:31:27.435058: Epoch   3 Batch 2965/3125   train_loss = 0.939
2019-04-09T11:31:27.848061: Epoch   3 Batch 2985/3125   train_loss = 0.843
2019-04-09T11:31:28.261955: Epoch   3 Batch 3005/3125   train_loss = 0.860
2019-04-09T11:31:28.677308: Epoch   3 Batch 3025/3125   train_loss = 0.917
2019-04-09T11:31:29.091668: Epoch   3 Batch 3045/3125   train_loss = 0.883
2019-04-09T11:31:29.505770: Epoch   3 Batch 3065/3125   train_loss = 0.864
2019-04-09T11:31:29.920149: Epoch   3 Batch 3085/3125   train_loss = 0.867
2019-04-09T11:31:30.335191: Epoch   3 Batch 3105/3125   train_loss = 0.929
2019-04-09T11:31:30.978022: Epoch   3 Batch   17/781   test_loss = 0.866
2019-04-09T11:31:31.112380: Epoch   3 Batch   37/781   test_loss = 0.868
2019-04-09T11:31:31.248741: Epoch   3 Batch   57/781   test_loss = 0.894
2019-04-09T11:31:31.387784: Epoch   3 Batch   77/781   test_loss = 0.898
2019-04-09T11:31:31.519144: Epoch   3 Batch   97/781   test_loss = 0.790
2019-04-09T11:31:31.648478: Epoch   3 Batch  117/781   test_loss = 0.950
2019-04-09T11:31:31.787347: Epoch   3 Batch  137/781   test_loss = 0.922
2019-04-09T11:31:31.934742: Epoch   3 Batch  157/781   test_loss = 0.919
2019-04-09T11:31:32.076115: Epoch   3 Batch  177/781   test_loss = 0.873
2019-04-09T11:31:32.206462: Epoch   3 Batch  197/781   test_loss = 0.928
2019-04-09T11:31:32.347500: Epoch   3 Batch  217/781   test_loss = 0.699
2019-04-09T11:31:32.483362: Epoch   3 Batch  237/781   test_loss = 0.752
2019-04-09T11:31:32.612205: Epoch   3 Batch  257/781   test_loss = 1.014
2019-04-09T11:31:32.754584: Epoch   3 Batch  277/781   test_loss = 0.979
2019-04-09T11:31:32.897965: Epoch   3 Batch  297/781   test_loss = 0.961
2019-04-09T11:31:33.031821: Epoch   3 Batch  317/781   test_loss = 1.030
2019-04-09T11:31:33.166680: Epoch   3 Batch  337/781   test_loss = 0.906
2019-04-09T11:31:33.308477: Epoch   3 Batch  357/781   test_loss = 0.883
2019-04-09T11:31:33.450355: Epoch   3 Batch  377/781   test_loss = 0.932
2019-04-09T11:31:33.580701: Epoch   3 Batch  397/781   test_loss = 0.918
2019-04-09T11:31:33.721075: Epoch   3 Batch  417/781   test_loss = 0.842
2019-04-09T11:31:33.859944: Epoch   3 Batch  437/781   test_loss = 0.808
2019-04-09T11:31:33.988286: Epoch   3 Batch  457/781   test_loss = 0.690
2019-04-09T11:31:34.116627: Epoch   3 Batch  477/781   test_loss = 0.923
2019-04-09T11:31:34.256500: Epoch   3 Batch  497/781   test_loss = 0.807
2019-04-09T11:31:34.394868: Epoch   3 Batch  517/781   test_loss = 0.805
2019-04-09T11:31:34.522207: Epoch   3 Batch  537/781   test_loss = 0.802
2019-04-09T11:31:34.650046: Epoch   3 Batch  557/781   test_loss = 1.050
2019-04-09T11:31:34.792425: Epoch   3 Batch  577/781   test_loss = 0.912
2019-04-09T11:31:34.930292: Epoch   3 Batch  597/781   test_loss = 0.875
2019-04-09T11:31:35.058634: Epoch   3 Batch  617/781   test_loss = 0.862
2019-04-09T11:31:35.184973: Epoch   3 Batch  637/781   test_loss = 0.781
2019-04-09T11:31:35.314815: Epoch   3 Batch  657/781   test_loss = 1.008
2019-04-09T11:31:35.444363: Epoch   3 Batch  677/781   test_loss = 0.931
2019-04-09T11:31:35.578721: Epoch   3 Batch  697/781   test_loss = 0.907
2019-04-09T11:31:35.712076: Epoch   3 Batch  717/781   test_loss = 0.812
2019-04-09T11:31:35.841921: Epoch   3 Batch  737/781   test_loss = 0.764
2019-04-09T11:31:35.983800: Epoch   3 Batch  757/781   test_loss = 1.099
2019-04-09T11:31:36.119660: Epoch   3 Batch  777/781   test_loss = 0.960
2019-04-09T11:31:36.666392: Epoch   4 Batch    0/3125   train_loss = 0.960
2019-04-09T11:31:37.108038: Epoch   4 Batch   20/3125   train_loss = 0.848
2019-04-09T11:31:37.523644: Epoch   4 Batch   40/3125   train_loss = 0.929
2019-04-09T11:31:37.940279: Epoch   4 Batch   60/3125   train_loss = 0.729
2019-04-09T11:31:38.360397: Epoch   4 Batch   80/3125   train_loss = 0.870
2019-04-09T11:31:38.783226: Epoch   4 Batch  100/3125   train_loss = 0.972
2019-04-09T11:31:39.208774: Epoch   4 Batch  120/3125   train_loss = 1.008
2019-04-09T11:31:39.670500: Epoch   4 Batch  140/3125   train_loss = 0.932
2019-04-09T11:31:40.130223: Epoch   4 Batch  160/3125   train_loss = 0.786
2019-04-09T11:31:40.578223: Epoch   4 Batch  180/3125   train_loss = 0.829
2019-04-09T11:31:40.994831: Epoch   4 Batch  200/3125   train_loss = 1.105
2019-04-09T11:31:41.423976: Epoch   4 Batch  220/3125   train_loss = 0.862
2019-04-09T11:31:41.847103: Epoch   4 Batch  240/3125   train_loss = 0.981
2019-04-09T11:31:42.273237: Epoch   4 Batch  260/3125   train_loss = 0.926
2019-04-09T11:31:42.696015: Epoch   4 Batch  280/3125   train_loss = 0.991
2019-04-09T11:31:43.118928: Epoch   4 Batch  300/3125   train_loss = 1.056
2019-04-09T11:31:43.543558: Epoch   4 Batch  320/3125   train_loss = 0.991
2019-04-09T11:31:43.963668: Epoch   4 Batch  340/3125   train_loss = 0.723
2019-04-09T11:31:44.405001: Epoch   4 Batch  360/3125   train_loss = 0.811
2019-04-09T11:31:44.837830: Epoch   4 Batch  380/3125   train_loss = 0.903
2019-04-09T11:31:45.256898: Epoch   4 Batch  400/3125   train_loss = 0.788
2019-04-09T11:31:45.684205: Epoch   4 Batch  420/3125   train_loss = 0.845
2019-04-09T11:31:46.114850: Epoch   4 Batch  440/3125   train_loss = 0.845
2019-04-09T11:31:46.554569: Epoch   4 Batch  460/3125   train_loss = 0.917
2019-04-09T11:31:46.990729: Epoch   4 Batch  480/3125   train_loss = 0.982
2019-04-09T11:31:47.417146: Epoch   4 Batch  500/3125   train_loss = 0.671
2019-04-09T11:31:47.851802: Epoch   4 Batch  520/3125   train_loss = 0.905
2019-04-09T11:31:48.283919: Epoch   4 Batch  540/3125   train_loss = 0.806
2019-04-09T11:31:48.718582: Epoch   4 Batch  560/3125   train_loss = 1.032
2019-04-09T11:31:49.138201: Epoch   4 Batch  580/3125   train_loss = 0.989
2019-04-09T11:31:49.559825: Epoch   4 Batch  600/3125   train_loss = 0.909
2019-04-09T11:31:49.989670: Epoch   4 Batch  620/3125   train_loss = 0.941
2019-04-09T11:31:50.406780: Epoch   4 Batch  640/3125   train_loss = 0.862
2019-04-09T11:31:50.859348: Epoch   4 Batch  660/3125   train_loss = 0.912
2019-04-09T11:31:51.275455: Epoch   4 Batch  680/3125   train_loss = 0.932
2019-04-09T11:31:51.691919: Epoch   4 Batch  700/3125   train_loss = 0.911
2019-04-09T11:31:52.107926: Epoch   4 Batch  720/3125   train_loss = 0.782
2019-04-09T11:31:52.527656: Epoch   4 Batch  740/3125   train_loss = 0.911
2019-04-09T11:31:52.969684: Epoch   4 Batch  760/3125   train_loss = 0.782
2019-04-09T11:31:53.409955: Epoch   4 Batch  780/3125   train_loss = 0.905
2019-04-09T11:31:53.832580: Epoch   4 Batch  800/3125   train_loss = 0.798
2019-04-09T11:31:54.247683: Epoch   4 Batch  820/3125   train_loss = 0.871
2019-04-09T11:31:54.668933: Epoch   4 Batch  840/3125   train_loss = 0.808
2019-04-09T11:31:55.088550: Epoch   4 Batch  860/3125   train_loss = 0.828
2019-04-09T11:31:55.506010: Epoch   4 Batch  880/3125   train_loss = 0.811
2019-04-09T11:31:55.953370: Epoch   4 Batch  900/3125   train_loss = 0.888
2019-04-09T11:31:56.475762: Epoch   4 Batch  920/3125   train_loss = 0.953
2019-04-09T11:31:56.895627: Epoch   4 Batch  940/3125   train_loss = 0.898
2019-04-09T11:31:57.314926: Epoch   4 Batch  960/3125   train_loss = 0.927
2019-04-09T11:31:57.736404: Epoch   4 Batch  980/3125   train_loss = 1.019
2019-04-09T11:31:58.155519: Epoch   4 Batch 1000/3125   train_loss = 0.972
2019-04-09T11:31:58.571659: Epoch   4 Batch 1020/3125   train_loss = 0.885
2019-04-09T11:31:58.987239: Epoch   4 Batch 1040/3125   train_loss = 0.766
2019-04-09T11:31:59.407857: Epoch   4 Batch 1060/3125   train_loss = 0.975
2019-04-09T11:31:59.827189: Epoch   4 Batch 1080/3125   train_loss = 0.890
2019-04-09T11:32:00.250485: Epoch   4 Batch 1100/3125   train_loss = 0.794
2019-04-09T11:32:00.665686: Epoch   4 Batch 1120/3125   train_loss = 0.830
2019-04-09T11:32:01.076280: Epoch   4 Batch 1140/3125   train_loss = 0.850
2019-04-09T11:32:01.495207: Epoch   4 Batch 1160/3125   train_loss = 0.826
2019-04-09T11:32:01.909009: Epoch   4 Batch 1180/3125   train_loss = 0.813
2019-04-09T11:32:02.325685: Epoch   4 Batch 1200/3125   train_loss = 1.011
2019-04-09T11:32:02.747689: Epoch   4 Batch 1220/3125   train_loss = 0.964
2019-04-09T11:32:03.171817: Epoch   4 Batch 1240/3125   train_loss = 0.782
2019-04-09T11:32:03.593569: Epoch   4 Batch 1260/3125   train_loss = 0.848
2019-04-09T11:32:04.011798: Epoch   4 Batch 1280/3125   train_loss = 0.908
2019-04-09T11:32:04.430913: Epoch   4 Batch 1300/3125   train_loss = 0.794
2019-04-09T11:32:04.846453: Epoch   4 Batch 1320/3125   train_loss = 0.872
2019-04-09T11:32:05.263562: Epoch   4 Batch 1340/3125   train_loss = 0.716
2019-04-09T11:32:05.679810: Epoch   4 Batch 1360/3125   train_loss = 0.847
2019-04-09T11:32:06.099427: Epoch   4 Batch 1380/3125   train_loss = 0.831
2019-04-09T11:32:06.515033: Epoch   4 Batch 1400/3125   train_loss = 0.932
2019-04-09T11:32:06.932977: Epoch   4 Batch 1420/3125   train_loss = 0.911
2019-04-09T11:32:07.349584: Epoch   4 Batch 1440/3125   train_loss = 0.767
2019-04-09T11:32:07.768391: Epoch   4 Batch 1460/3125   train_loss = 0.885
2019-04-09T11:32:08.186503: Epoch   4 Batch 1480/3125   train_loss = 0.855
2019-04-09T11:32:08.610562: Epoch   4 Batch 1500/3125   train_loss = 0.890
2019-04-09T11:32:09.027935: Epoch   4 Batch 1520/3125   train_loss = 0.807
2019-04-09T11:32:09.448052: Epoch   4 Batch 1540/3125   train_loss = 0.970
2019-04-09T11:32:09.864802: Epoch   4 Batch 1560/3125   train_loss = 0.786
2019-04-09T11:32:10.279906: Epoch   4 Batch 1580/3125   train_loss = 0.913
2019-04-09T11:32:10.694227: Epoch   4 Batch 1600/3125   train_loss = 0.830
2019-04-09T11:32:11.113843: Epoch   4 Batch 1620/3125   train_loss = 0.764
2019-04-09T11:32:11.535264: Epoch   4 Batch 1640/3125   train_loss = 0.948
2019-04-09T11:32:11.951873: Epoch   4 Batch 1660/3125   train_loss = 1.003
2019-04-09T11:32:12.368324: Epoch   4 Batch 1680/3125   train_loss = 0.899
2019-04-09T11:32:12.877578: Epoch   4 Batch 1700/3125   train_loss = 0.787
2019-04-09T11:32:13.293848: Epoch   4 Batch 1720/3125   train_loss = 0.872
2019-04-09T11:32:13.710885: Epoch   4 Batch 1740/3125   train_loss = 0.929
2019-04-09T11:32:14.120976: Epoch   4 Batch 1760/3125   train_loss = 0.887
2019-04-09T11:32:14.538451: Epoch   4 Batch 1780/3125   train_loss = 0.851
2019-04-09T11:32:14.959239: Epoch   4 Batch 1800/3125   train_loss = 0.820
2019-04-09T11:32:15.374844: Epoch   4 Batch 1820/3125   train_loss = 0.807
2019-04-09T11:32:15.787555: Epoch   4 Batch 1840/3125   train_loss = 0.903
2019-04-09T11:32:16.206090: Epoch   4 Batch 1860/3125   train_loss = 0.977
2019-04-09T11:32:16.620547: Epoch   4 Batch 1880/3125   train_loss = 0.887
2019-04-09T11:32:17.036185: Epoch   4 Batch 1900/3125   train_loss = 0.734
2019-04-09T11:32:17.454960: Epoch   4 Batch 1920/3125   train_loss = 0.883
2019-04-09T11:32:17.870896: Epoch   4 Batch 1940/3125   train_loss = 0.792
2019-04-09T11:32:18.287611: Epoch   4 Batch 1960/3125   train_loss = 0.756
2019-04-09T11:32:18.708944: Epoch   4 Batch 1980/3125   train_loss = 0.856
2019-04-09T11:32:19.124550: Epoch   4 Batch 2000/3125   train_loss = 0.989
2019-04-09T11:32:19.539524: Epoch   4 Batch 2020/3125   train_loss = 0.987
2019-04-09T11:32:19.955392: Epoch   4 Batch 2040/3125   train_loss = 0.793
2019-04-09T11:32:20.373002: Epoch   4 Batch 2060/3125   train_loss = 0.851
2019-04-09T11:32:20.788365: Epoch   4 Batch 2080/3125   train_loss = 0.980
2019-04-09T11:32:21.207642: Epoch   4 Batch 2100/3125   train_loss = 0.782
2019-04-09T11:32:21.628621: Epoch   4 Batch 2120/3125   train_loss = 0.808
2019-04-09T11:32:22.042255: Epoch   4 Batch 2140/3125   train_loss = 0.840
2019-04-09T11:32:22.456976: Epoch   4 Batch 2160/3125   train_loss = 0.829
2019-04-09T11:32:22.867969: Epoch   4 Batch 2180/3125   train_loss = 0.917
2019-04-09T11:32:23.281501: Epoch   4 Batch 2200/3125   train_loss = 0.803
2019-04-09T11:32:23.696260: Epoch   4 Batch 2220/3125   train_loss = 0.832
2019-04-09T11:32:24.112367: Epoch   4 Batch 2240/3125   train_loss = 0.797
2019-04-09T11:32:24.528127: Epoch   4 Batch 2260/3125   train_loss = 0.872
2019-04-09T11:32:24.944427: Epoch   4 Batch 2280/3125   train_loss = 0.880
2019-04-09T11:32:25.362539: Epoch   4 Batch 2300/3125   train_loss = 0.847
2019-04-09T11:32:25.776624: Epoch   4 Batch 2320/3125   train_loss = 0.908
2019-04-09T11:32:26.191315: Epoch   4 Batch 2340/3125   train_loss = 0.849
2019-04-09T11:32:26.607493: Epoch   4 Batch 2360/3125   train_loss = 0.881
2019-04-09T11:32:27.021723: Epoch   4 Batch 2380/3125   train_loss = 0.835
2019-04-09T11:32:27.440410: Epoch   4 Batch 2400/3125   train_loss = 0.915
2019-04-09T11:32:27.850694: Epoch   4 Batch 2420/3125   train_loss = 0.794
2019-04-09T11:32:28.265448: Epoch   4 Batch 2440/3125   train_loss = 0.800
2019-04-09T11:32:28.684222: Epoch   4 Batch 2460/3125   train_loss = 0.852
2019-04-09T11:32:29.103336: Epoch   4 Batch 2480/3125   train_loss = 0.954
2019-04-09T11:32:29.520448: Epoch   4 Batch 2500/3125   train_loss = 0.811
2019-04-09T11:32:29.941087: Epoch   4 Batch 2520/3125   train_loss = 0.885
2019-04-09T11:32:30.357195: Epoch   4 Batch 2540/3125   train_loss = 0.845
2019-04-09T11:32:30.780301: Epoch   4 Batch 2560/3125   train_loss = 0.665
2019-04-09T11:32:31.195065: Epoch   4 Batch 2580/3125   train_loss = 0.825
2019-04-09T11:32:31.604654: Epoch   4 Batch 2600/3125   train_loss = 0.868
2019-04-09T11:32:32.018648: Epoch   4 Batch 2620/3125   train_loss = 0.813
2019-04-09T11:32:32.435706: Epoch   4 Batch 2640/3125   train_loss = 0.826
2019-04-09T11:32:32.853230: Epoch   4 Batch 2660/3125   train_loss = 1.017
2019-04-09T11:32:33.270841: Epoch   4 Batch 2680/3125   train_loss = 0.769
2019-04-09T11:32:33.692010: Epoch   4 Batch 2700/3125   train_loss = 0.922
2019-04-09T11:32:34.136192: Epoch   4 Batch 2720/3125   train_loss = 0.796
2019-04-09T11:32:34.551979: Epoch   4 Batch 2740/3125   train_loss = 0.870
2019-04-09T11:32:34.968683: Epoch   4 Batch 2760/3125   train_loss = 0.799
2019-04-09T11:32:35.385795: Epoch   4 Batch 2780/3125   train_loss = 0.842
2019-04-09T11:32:35.803009: Epoch   4 Batch 2800/3125   train_loss = 1.050
2019-04-09T11:32:36.220554: Epoch   4 Batch 2820/3125   train_loss = 1.034
2019-04-09T11:32:36.638668: Epoch   4 Batch 2840/3125   train_loss = 0.822
2019-04-09T11:32:37.057100: Epoch   4 Batch 2860/3125   train_loss = 0.789
2019-04-09T11:32:37.477429: Epoch   4 Batch 2880/3125   train_loss = 0.858
2019-04-09T11:32:37.894122: Epoch   4 Batch 2900/3125   train_loss = 0.833
2019-04-09T11:32:38.309463: Epoch   4 Batch 2920/3125   train_loss = 0.849
2019-04-09T11:32:38.727701: Epoch   4 Batch 2940/3125   train_loss = 0.879
2019-04-09T11:32:39.142808: Epoch   4 Batch 2960/3125   train_loss = 0.877
2019-04-09T11:32:39.560118: Epoch   4 Batch 2980/3125   train_loss = 0.827
2019-04-09T11:32:39.978247: Epoch   4 Batch 3000/3125   train_loss = 0.920
2019-04-09T11:32:40.396863: Epoch   4 Batch 3020/3125   train_loss = 1.001
2019-04-09T11:32:40.812059: Epoch   4 Batch 3040/3125   train_loss = 0.956
2019-04-09T11:32:41.228167: Epoch   4 Batch 3060/3125   train_loss = 0.814
2019-04-09T11:32:41.643774: Epoch   4 Batch 3080/3125   train_loss = 1.017
2019-04-09T11:32:42.059833: Epoch   4 Batch 3100/3125   train_loss = 1.032
2019-04-09T11:32:42.478235: Epoch   4 Batch 3120/3125   train_loss = 0.816
2019-04-09T11:32:42.674176: Epoch   4 Batch   16/781   test_loss = 0.830
2019-04-09T11:32:42.806027: Epoch   4 Batch   36/781   test_loss = 0.903
2019-04-09T11:32:42.936875: Epoch   4 Batch   56/781   test_loss = 0.934
2019-04-09T11:32:43.067222: Epoch   4 Batch   76/781   test_loss = 0.974
2019-04-09T11:32:43.197569: Epoch   4 Batch   96/781   test_loss = 1.000
2019-04-09T11:32:43.326913: Epoch   4 Batch  116/781   test_loss = 0.887
2019-04-09T11:32:43.457535: Epoch   4 Batch  136/781   test_loss = 0.811
2019-04-09T11:32:43.588383: Epoch   4 Batch  156/781   test_loss = 0.876
2019-04-09T11:32:43.716224: Epoch   4 Batch  176/781   test_loss = 0.865
2019-04-09T11:32:43.846583: Epoch   4 Batch  196/781   test_loss = 0.786
2019-04-09T11:32:43.975413: Epoch   4 Batch  216/781   test_loss = 0.974
2019-04-09T11:32:44.105258: Epoch   4 Batch  236/781   test_loss = 0.793
2019-04-09T11:32:44.235605: Epoch   4 Batch  256/781   test_loss = 0.827
2019-04-09T11:32:44.367456: Epoch   4 Batch  276/781   test_loss = 1.097
2019-04-09T11:32:44.496146: Epoch   4 Batch  296/781   test_loss = 0.813
2019-04-09T11:32:44.625489: Epoch   4 Batch  316/781   test_loss = 0.820
2019-04-09T11:32:44.754834: Epoch   4 Batch  336/781   test_loss = 0.760
2019-04-09T11:32:44.884178: Epoch   4 Batch  356/781   test_loss = 0.885
2019-04-09T11:32:45.013021: Epoch   4 Batch  376/781   test_loss = 0.872
2019-04-09T11:32:45.141362: Epoch   4 Batch  396/781   test_loss = 0.807
2019-04-09T11:32:45.273213: Epoch   4 Batch  416/781   test_loss = 0.935
2019-04-09T11:32:45.402058: Epoch   4 Batch  436/781   test_loss = 0.955
2019-04-09T11:32:45.533244: Epoch   4 Batch  456/781   test_loss = 0.735
2019-04-09T11:32:45.666098: Epoch   4 Batch  476/781   test_loss = 0.931
2019-04-09T11:32:45.795442: Epoch   4 Batch  496/781   test_loss = 0.966
2019-04-09T11:32:45.925789: Epoch   4 Batch  516/781   test_loss = 0.760
2019-04-09T11:32:46.054130: Epoch   4 Batch  536/781   test_loss = 0.990
2019-04-09T11:32:46.183474: Epoch   4 Batch  556/781   test_loss = 0.868
2019-04-09T11:32:46.312818: Epoch   4 Batch  576/781   test_loss = 0.940
2019-04-09T11:32:46.441522: Epoch   4 Batch  596/781   test_loss = 0.959
2019-04-09T11:32:46.571869: Epoch   4 Batch  616/781   test_loss = 0.930
2019-04-09T11:32:46.702216: Epoch   4 Batch  636/781   test_loss = 0.809
2019-04-09T11:32:46.831560: Epoch   4 Batch  656/781   test_loss = 0.876
2019-04-09T11:32:46.960904: Epoch   4 Batch  676/781   test_loss = 1.057
2019-04-09T11:32:47.092254: Epoch   4 Batch  696/781   test_loss = 0.856
2019-04-09T11:32:47.222600: Epoch   4 Batch  716/781   test_loss = 0.852
2019-04-09T11:32:47.351944: Epoch   4 Batch  736/781   test_loss = 1.075
2019-04-09T11:32:47.480286: Epoch   4 Batch  756/781   test_loss = 0.809
2019-04-09T11:32:47.610131: Epoch   4 Batch  776/781   test_loss = 0.753
Model Trained and Saved

View the results in TensorBoard

tensorboard --logdir=/PATH_TO_CODE/runs/1513402825/summaries/

Save parameters

Save save_dir for use when generating predictions.

save_params(save_dir)

load_dir = load_params()

Plot the training loss

plt.plot(losses['train'], label='Training loss')
plt.legend()
_ = plt.ylim()

Plot the test loss

With more training iterations, the downward trend would be more apparent.

plt.plot(losses['test'], label='Test loss')
plt.legend()
_ = plt.ylim()
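
The per-batch losses are noisy, so the trend can be hard to read from the raw curves. Below is a minimal sketch of one way to smooth them, assuming losses['train'] and losses['test'] are plain lists of floats as used above. The helper name moving_average and the window size are hypothetical choices, not part of the original code.

```python
import numpy as np

def moving_average(values, window=50):
    """Return the simple moving average of `values` over `window` points."""
    values = np.asarray(values, dtype=float)
    if window < 1 or len(values) < window:
        return values  # too few points to smooth; return them unchanged
    kernel = np.ones(window) / window
    # mode='valid' keeps only positions where the window fully overlaps the data
    return np.convolve(values, kernel, mode='valid')

# Toy example with made-up numbers (not taken from the training log)
smoothed = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
print(smoothed)
```

The smoothed series could then be plotted in place of the raw one, for example with plt.plot(moving_average(losses['test'])).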

Get the tensors

Use get_tensor_by_name() to fetch the tensors from loaded_graph; the recommendation functions below need them.

def get_tensors(loaded_graph):

    uid = loaded_graph.get_tensor_by_name("uid:0")
    user_gender = loaded_graph.get_tensor_by_name("user_gender:0")
    user_age = loaded_graph.get_tensor_by_name("user_age:0")
    user_job = loaded_graph.get_tensor_by_name("user_job:0")
    movie_id = loaded_graph.get_tensor_by_name("movie_id:0")
    movie_categories = loaded_graph.get_tensor_by_name("movie_categories:0")
    movie_titles = loaded_graph.get_tensor_by_name("movie_titles:0")
    targets = loaded_graph.get_tensor_by_name("targets:0")
    dropout_keep_prob = loaded_graph.get_tensor_by_name("dropout_keep_prob:0")
    lr = loaded_graph.get_tensor_by_name("LearningRate:0")
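    # The two schemes for computing the predicted rating expose the inference tensor under different names
    # inference = loaded_graph.get_tensor_by_name("inference/inference/BiasAdd:0")
    inference = loaded_graph.get_tensor_by_name("inference/ExpandDims:0") # Previously MatMul:0; the inference code changed, so the name here changed too. Thanks to reader @清歌 for pointing this out.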
    movie_combine_layer_flat = loaded_graph.get_tensor_by_name("movie_fc/Reshape:0")
    user_combine_layer_flat = loaded_graph.get_tensor_by_name("user_fc/Reshape:0")
    return uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, inference, movie_combine_layer_flat, user_combine_layer_flat

Rate a given user and movie

This part runs a forward pass through the network to compute the predicted rating.

def rating_movie(user_id_val, movie_id_val):
    loaded_graph = tf.Graph()  #
    with tf.Session(graph=loaded_graph) as sess:  #
        # Load saved model
        loader = tf.train.import_meta_graph(load_dir + '.meta')
        loader.restore(sess, load_dir)
    
        # Get Tensors from loaded model
        uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, inference,_, __ = get_tensors(loaded_graph)  #loaded_graph
    
        categories = np.zeros([1, 18])
        categories[0] = movies.values[movieid2idx[movie_id_val]][2]
    
        titles = np.zeros([1, sentences_size])
        titles[0] = movies.values[movieid2idx[movie_id_val]][1]
    
        feed = {
              uid: np.reshape(users.values[user_id_val-1][0], [1, 1]),
              user_gender: np.reshape(users.values[user_id_val-1][1], [1, 1]),
              user_age: np.reshape(users.values[user_id_val-1][2], [1, 1]),
              user_job: np.reshape(users.values[user_id_val-1][3], [1, 1]),
              movie_id: np.reshape(movies.values[movieid2idx[movie_id_val]][0], [1, 1]),
              movie_categories: categories,  #x.take(6,1)
              movie_titles: titles,  #x.take(5,1)
              dropout_keep_prob: 1}
    
        # Get Prediction
        inference_val = sess.run([inference], feed)  
    
        return (inference_val)

rating_movie(234, 1401)

INFO:tensorflow:Restoring parameters from ./save

[array([[3.1157281]], dtype=float32)]
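
The tensor name inference/ExpandDims:0 suggests that the active scheme scores a user and movie pair by multiplying their 200-dimensional features element-wise and summing the result. Here is a rough NumPy sketch under that assumption. The random vectors are stand-ins for real features, and predict_rating is a hypothetical helper, not a function from the project.

```python
import numpy as np

def predict_rating(user_feature, movie_feature):
    """Dot product of a (1, d) user feature and a (1, d) movie feature, shape (1, 1)."""
    score = np.sum(user_feature * movie_feature, axis=1)  # shape (1,)
    return np.expand_dims(score, axis=1)                  # shape (1, 1)

rng = np.random.default_rng(0)
user_feature = rng.normal(size=(1, 200))   # stand-in for one row of the user features
movie_feature = rng.normal(size=(1, 200))  # stand-in for one row of the movie features
rating = predict_rating(user_feature, movie_feature)
print(rating.shape)  # (1, 1)
```

The (1, 1) result matches the nested shape of the array printed by rating_movie above.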

Generate the movie feature matrix

Combine the trained movie features into a movie feature matrix and save it locally.

loaded_graph = tf.Graph()  #
movie_matrics = []
with tf.Session(graph=loaded_graph) as sess:  #
    # Load saved model
    loader = tf.train.import_meta_graph(load_dir + '.meta')
    loader.restore(sess, load_dir)

    # Get Tensors from loaded model
    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, _, movie_combine_layer_flat, __ = get_tensors(loaded_graph)  #loaded_graph

    for item in movies.values:
        categories = np.zeros([1, 18])
        categories[0] = item.take(2)

        titles = np.zeros([1, sentences_size])
        titles[0] = item.take(1)

        feed = {
            movie_id: np.reshape(item.take(0), [1, 1]),
            movie_categories: categories,  #x.take(6,1)
            movie_titles: titles,  #x.take(5,1)
            dropout_keep_prob: 1}

        movie_combine_layer_flat_val = sess.run([movie_combine_layer_flat], feed)  
        movie_matrics.append(movie_combine_layer_flat_val)

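# Stack the per-movie features into an (n_movies, 200) matrix, save it, then reload it
with open('movie_matrics.p', 'wb') as f:
    pickle.dump(np.array(movie_matrics).reshape(-1, 200), f)
with open('movie_matrics.p', 'rb') as f:
    movie_matrics = pickle.load(f)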

INFO:tensorflow:Restoring parameters from ./save

movie_matrics = pickle.load(open('movie_matrics.p', mode='rb'))
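
Each sess.run([movie_combine_layer_flat], feed) call returns a one-element list, so movie_matrics is a nested list before the reshape. The small sketch below shows why reshape(-1, 200) turns it into one row per movie, assuming each fetched value has shape (1, 200). The zero arrays and the count of 3 movies are placeholders.

```python
import numpy as np

n_movies, dim = 3, 200  # placeholder sizes for illustration
# Mimic the loop: one single-element list holding a (1, dim) array per movie
collected = [[np.zeros((1, dim))] for _ in range(n_movies)]

stacked = np.array(collected)
print(stacked.shape)   # (3, 1, 1, 200)

matrix = stacked.reshape(-1, dim)
print(matrix.shape)    # (3, 200)
```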

Generate the user feature matrix

Combine the trained user features into a user feature matrix and save it locally.

loaded_graph = tf.Graph()  #
users_matrics = []
with tf.Session(graph=loaded_graph) as sess:  #
    # Load saved model
    loader = tf.train.import_meta_graph(load_dir + '.meta')
    loader.restore(sess, load_dir)

    # Get Tensors from loaded model
    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, _, __,user_combine_layer_flat = get_tensors(loaded_graph)  #loaded_graph

    for item in users.values:

        feed = {
            uid: np.reshape(item.take(0), [1, 1]),
            user_gender: np.reshape(item.take(1), [1, 1]),
            user_age: np.reshape(item.take(2), [1, 1]),
            user_job: np.reshape(item.take(3), [1, 1]),
            dropout_keep_prob: 1}

        user_combine_layer_flat_val = sess.run([user_combine_layer_flat], feed)  
        users_matrics.append(user_combine_layer_flat_val)

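# Stack the per-user features into an (n_users, 200) matrix, save it, then reload it
with open('users_matrics.p', 'wb') as f:
    pickle.dump(np.array(users_matrics).reshape(-1, 200), f)
with open('users_matrics.p', 'rb') as f:
    users_matrics = pickle.load(f)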

INFO:tensorflow:Restoring parameters from ./save

users_matrics = pickle.load(open('users_matrics.p', mode='rb'))

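Start recommending movies

Use the generated user and movie feature matrices to recommend movies.

People who watched this movie also watched (liked) these movies

First, select the top_k users who like the given movie and get their user feature vectors.
Then compute these users' ratings for all movies.
Pick each user's highest-rated movie as a recommendation.
As before, random selection is also applied.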
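
The steps above can be sketched in plain NumPy, without a TensorFlow session. This is only an illustration under stated assumptions: the random matrices stand in for users_matrics and movie_matrics, the function name recommend_from_similar_users is hypothetical, and the sketch works with 0-based row indices throughout rather than user or movie IDs.

```python
import random
import numpy as np

def recommend_from_similar_users(movie_vec, user_matrix, movie_matrix,
                                 top_k=20, n_results=5):
    """Return up to n_results movie row indices favoured by the top_k best-matching users."""
    # Step 1: score every user against the movie and keep the top_k user rows
    user_scores = user_matrix @ movie_vec              # shape (n_users,)
    top_users = np.argsort(user_scores)[-top_k:]       # 0-based row indices
    # Step 2: score every movie for each selected user
    scores = user_matrix[top_users] @ movie_matrix.T   # shape (top_k, n_movies)
    # Step 3: take each user's highest-scoring movie, dropping repeats
    best = np.unique(np.argmax(scores, axis=1))
    # Step 4: pick a random subset of the distinct candidates
    k = min(n_results, len(best))
    return set(random.sample(best.tolist(), k))

rng = np.random.default_rng(0)
user_matrix = rng.normal(size=(100, 200))   # stand-in for users_matrics
movie_matrix = rng.normal(size=(50, 200))   # stand-in for movie_matrics
picks = recommend_from_similar_users(movie_matrix[7], user_matrix, movie_matrix)
print(picks)
```

Capping the sample size at the number of distinct candidates means the function still returns when the selected users share fewer than five favourite movies.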

import random

def recommend_other_favorite_movie(movie_id_val, top_k = 20):
    loaded_graph = tf.Graph()
    with tf.Session(graph=loaded_graph) as sess:
        # Load saved model
        loader = tf.train.import_meta_graph(load_dir + '.meta')
        loader.restore(sess, load_dir)

        # Score every user against the feature vector of the given movie
        probs_movie_embeddings = (movie_matrics[movieid2idx[movie_id_val]]).reshape([1, 200])
        probs_user_favorite_similarity = tf.matmul(probs_movie_embeddings, tf.transpose(users_matrics))
        # Keep the top_k highest-scoring users
        favorite_user_id = np.argsort(probs_user_favorite_similarity.eval())[0][-top_k:]

        print("The movie you watched is: {}".format(movies_orig[movieid2idx[movie_id_val]]))

        # Caution: np.argsort returns 0-based row indices, so subtracting 1 appears to
        # select the row before each match; kept as in the run that produced the output below.
        print("People who like this movie: {}".format(users_orig[favorite_user_id-1]))
        probs_users_embeddings = (users_matrics[favorite_user_id-1]).reshape([-1, 200])
        # Score every movie for each of the selected users
        probs_similarity = tf.matmul(probs_users_embeddings, tf.transpose(movie_matrics))
        sim = (probs_similarity.eval())

        # Each selected user's highest-scoring movie
        p = np.argmax(sim, 1)
        print("People who like this movie also like:")

        # Randomly pick up to 5 distinct movies; the cap avoids an endless loop
        # when the selected users share fewer than 5 distinct favourites
        results = set()
        n_results = min(5, len(set(p)))
        while len(results) != n_results:
            c = p[random.randrange(top_k)]
            results.add(c)
        for val in (results):
            print(val)
            print(movies_orig[val])

        return results

recommend_other_favorite_movie(1401, 20)

INFO:tensorflow:Restoring parameters from ./save
The movie you watched is: [1401 'Ghosts of Mississippi (1996)' 'Drama']
People who like this movie: [[1568 'F' 1 10]
 [4814 'M' 18 14]
 [5217 'M' 25 17]
 [1745 'M' 45 0]
 [1763 'M' 35 7]
 [5861 'F' 50 1]
 [493 'M' 50 7]
 [3031 'M' 18 4]
 [2144 'M' 18 0]
 [1644 'M' 18 12]
 [3833 'M' 25 1]
 [5678 'M' 35 17]
 [1701 'F' 25 4]
 [3297 'M' 18 4]
 [4800 'M' 18 4]
 [1109 'M' 18 10]
 [2496 'M' 50 1]
 [100 'M' 35 17]
 [2154 'M' 25 12]
 [4085 'F' 25 6]]
People who like this movie also like:
1132
[1148 'Wrong Trousers, The (1993)' 'Animation|Comedy']
1133
[1149 'JLG/JLG - autoportrait de décembre (1994)' 'Documentary|Drama']
847
[858 'Godfather, The (1972)' 'Action|Crime|Drama']
763
[773 'Touki Bouki (Journey of the Hyena) (1973)' 'Drama']
1950
[2019
 'Seven Samurai (The Magnificent Seven) (Shichinin no samurai) (1954)'
 'Action|Drama']

{763, 847, 1132, 1133, 1950}

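Conclusion

These are the common recommendation features implemented here. The network is trained as a regression problem, and the resulting user and movie feature matrices are used to make recommendations.

Further reading

If you are interested in personalized recommendation, the following material is worth a look:

That's all for today. Comments and corrections are welcome!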