library(h2o)
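# Connect to the H2O platform (starts a local cluster if none is running).
# Pass nthreads = 1 to h2o.init() if you want to force a single thread.
h2o.init()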
train_file <- "https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/mnist/train.csv.gz"
test_file <- "https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/mnist/test.csv.gz"
train <- h2o.importFile(train_file)
test <- h2o.importFile(test_file)
# To see a brief summary of the data, run the following command
summary(train)
summary(test)
y <- "C785"
x <- setdiff(names(train), y)
# Encode the response column as categorical so that H2O performs
# multinomial classification
train[,y] <- as.factor(train[,y])
test[,y] <- as.factor(test[,y])
# Train a Deep Learning model and time the run
# (no validation frame or cross-validation is configured here)
system.time(
  model_cv <- h2o.deeplearning(x = x,
                               y = y,
                               training_frame = train,
                               distribution = "multinomial",
                               activation = "Rectifier",
                               hidden = c(32),
                               l1 = 1e-5,
                               epochs = 200)
)

III. The Simplest Example: Deep Learning on the iris Dataset

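This example comes mainly from the h2o.deeplearning example in the official h2o manual and is fairly easy to follow. If you want to inspect the predictions, you can use as.data.frame to convert them into an ordinary data frame that R recognizes.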
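A minimal sketch of this kind of workflow, not necessarily the manual's exact example. The binary target `is_setosa` and the variable names are assumptions made up for illustration, so the numbers it produces will differ from the output shown in this post.

```r
library(h2o)
h2o.init()

# Assumption: a hypothetical two-class target ("is this flower setosa?")
iris_bin <- iris
iris_bin$is_setosa <- as.factor(as.integer(iris_bin$Species == "setosa"))
iris_bin$Species <- NULL

# Upload the data frame to H2O and train on the four measurement columns
iris_hf <- as.h2o(iris_bin)
iris_dl <- h2o.deeplearning(x = 1:4,
                            y = "is_setosa",
                            training_frame = iris_hf,
                            seed = 1)

# Predict on the training frame, then pull the result back into a plain R data frame
predictions <- h2o.predict(iris_dl, iris_hf)
pred_df <- as.data.frame(predictions)
head(pred_df)

# Metrics computed on the training data
performance <- h2o.performance(iris_dl)
print(performance)
```

`predictions` lives in the H2O cluster; only after `as.data.frame` can it be handled with the usual R tools.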


The output looks like this.

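It consists of roughly three parts: model evaluation metrics, a confusion matrix, and the thresholds for several metrics (what are these?? Each row gives the probability cutoff at which that metric reaches its maximum).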

Once you look at the confusion matrix, most of it makes sense.
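To make the threshold idea concrete, here is a small base R sketch. The probabilities and labels are made-up toy values, and `f1_at` is a hypothetical helper, not an h2o function. It shows how a cutoff turns predicted probabilities into classes, and how the F1-optimal threshold and its confusion matrix could be found by trying each candidate cutoff.

```r
# Toy data (assumption): predicted probability of class 1 and the true labels
p1     <- c(0.99, 0.95, 0.80, 0.60, 0.40, 0.20, 0.05, 0.01)
actual <- c(1,    1,    1,    0,    1,    0,    0,    0)

# F1 score when every probability >= threshold is labelled as class 1
f1_at <- function(threshold) {
  predicted <- as.integer(p1 >= threshold)
  tp <- sum(predicted == 1 & actual == 1)
  fp <- sum(predicted == 1 & actual == 0)
  fn <- sum(predicted == 0 & actual == 1)
  if (tp == 0) return(0)
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  2 * precision * recall / (precision + recall)
}

# Try each observed probability as a candidate threshold and keep the best one
thresholds <- sort(unique(p1), decreasing = TRUE)
f1_scores <- sapply(thresholds, f1_at)
best_threshold <- thresholds[which.max(f1_scores)]

# Confusion matrix at the F1-optimal threshold (rows: actual, columns: predicted)
predicted_best <- as.integer(p1 >= best_threshold)
confusion <- table(actual = actual,
                   predicted = factor(predicted_best, levels = c(0, 1)))
print(best_threshold)
print(confusion)
```

The other rows of the "Maximum Metrics" table follow the same pattern with a different metric in place of F1.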


> print(performance)
H2OBinomialMetrics: deeplearning
** Reported on training data. **
Description: Metrics reported on full training frame
MSE: 0.01030833
R^2: 0.9536125
LogLoss: 0.05097025
AUC: 1
Gini: 1
Confusion Matrix for F1-optimal threshold:
         0  1    Error    Rate
0      100  0 0.000000  =0/100
1        0 50 0.000000   =0/50
Totals 100 50 0.000000  =0/150

Maximum Metrics: Maximum metrics at their respective thresholds
                      metric threshold    value idx
1                     max f1  0.983179 1.000000  49
2                     max f2  0.983179 1.000000  49
3               max f0point5  0.983179 1.000000  49
4               max accuracy  0.983179 1.000000  49
5              max precision  0.999915 1.000000   0
6                 max recall  0.983179 1.000000  49
7            max specificity  0.999915 1.000000   0
8           max absolute_MCC  0.983179 1.000000  49
9 max min_per_class_accuracy  0.983179 1.000000  49
Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
