UFLDL Tutorial Study Notes (2)

Course link: http://ufldl.stanford.edu/tutorial/supervised/LogisticRegression/

This section is mainly about the concept of the gradient. In the experiment, the analytic gradient of the earlier linear regression model is compared against the gradient computed from the definition (a numerical finite difference), and the error between the two is measured.
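Concretely, the "gradient from the definition" is the central-difference approximation that the grad_check.m listing below implements:

$$\frac{\partial J(\theta)}{\partial \theta_j} \;\approx\; \frac{J(\theta + \delta e_j) - J(\theta - \delta e_j)}{2\delta},$$

where $e_j$ is the $j$-th standard basis vector and $\delta$ is a small step size (the code uses $\delta = 10^{-3}$).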

Linear regression produces a continuous value, but sometimes we want a prediction of 0 or 1, which is where logistic regression comes in. Since we now want a probability, the earlier hypothesis function is no longer appropriate, and a new function is needed:

$$P(y=1 \mid x) = h_\theta(x) = \frac{1}{1 + \exp(-\theta^\top x)} \equiv \sigma(\theta^\top x), \qquad P(y=0 \mid x) = 1 - h_\theta(x).$$

Our goal is to optimize theta: when an example x has label 1, the predicted probability of class 1 should be as large as possible, and when the label is 0, as small as possible.
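As a side note, the hypothesis above vectorizes to a one-liner in MATLAB. This is just an illustrative sketch of mine, assuming theta is n-by-1 and X stores examples in columns, as in the code below:

h = 1 ./ (1 + exp(-theta' * X));  % 1-by-m row of predicted probabilities P(y=1|x)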

The objective function must, of course, also be replaced (for background on this function, see the NTU Machine Learning Foundations notes: http://beader.me/mlnotebook/section3/logistic-regression.html):

$$J(\theta) = -\sum_{i} \Big( y^{(i)} \log h_\theta(x^{(i)}) + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta(x^{(i)})\big) \Big).$$
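For the homework you also need the gradient of this objective; differentiating gives the same form as the linear-regression gradient, with $h_\theta$ replacing the linear prediction:

$$\frac{\partial J(\theta)}{\partial \theta_j} = \sum_i x_j^{(i)} \left( h_\theta(x^{(i)}) - y^{(i)} \right).$$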

The homework is to train a classifier that distinguishes handwritten 0s and 1s. As before, the main thing to watch is keeping the dimensions of all the variables straight. When I ran it, both the training accuracy and the test accuracy were 100%.
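For reference, here is a minimal sketch of how the TODO in logistic_regression.m could be filled in, written directly from the two formulas above (the function name and signature come from the starter code; the body is my own and not checked against any official solution):

function [f, g] = logistic_regression(theta, X, y)
  % theta: n-by-1 parameters; X: n-by-m, examples in columns; y: 1-by-m labels in {0,1}.
  h = 1 ./ (1 + exp(-theta' * X));                 % 1-by-m predicted P(y=1|x)
  f = -sum(y .* log(h) + (1 - y) .* log(1 - h));   % negative log-likelihood
  g = X * (h - y)';                                % n-by-1 gradient
end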

Reference: http://blog.csdn.net/lingerlanlan/article/details/38390955

I added a few comments to the code.

The first block is adapted from ex1a_linreg.m; its main purpose is to produce the training and test data, along with their labels.

%
%This exercise uses a data from the UCI repository:
% Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository
% http://archive.ics.uci.edu/ml
% Irvine, CA: University of California, School of Information and Computer Science.
%
%Data created by:
% Harrison, D. and Rubinfeld, D.L.
% ''Hedonic prices and the demand for clean air''
% J. Environ. Economics & Management, vol.5, 81-102, 1978.
%
addpath ../common
addpath ../common/minFunc_2012/minFunc
addpath ../common/minFunc_2012/minFunc/compiled

% Load housing data from file.
data = load('housing.data');
data=data'; % put examples in columns

% Include a row of 1s as an additional intercept feature.
data = [ ones(1,size(data,2)); data ];

% Shuffle examples.
data = data(:, randperm(size(data,2))); % randperm gives a random permutation of the column indices, shuffling the examples

% Split into train and test sets: take the training and test data along with their labels.
% The last row of 'data' is the median home price.
train.X = data(1:end-1,1:400);
train.y = data(end,1:400);

test.X = data(1:end-1,401:end);
test.y = data(end,401:end);

m = size(train.X,2); % number of training examples
n = size(train.X,1); % number of features (including the intercept row)

% Initialize the coefficient vector theta to random values.
theta = rand(n,1); % n-by-1 vector of uniform random values in (0,1)

% Run the minFunc optimizer with linear_regression.m as the objective.
%
% TODO:  Implement the linear regression objective and gradient computations
% in linear_regression.m
%
tic;
% options = struct('MaxIter', 200);
% theta = minFunc(@linear_regression, theta, options, train.X, train.y);
% fprintf('Optimization took %f seconds.\n', toc);

grad_check(@linear_regression,theta,200,train.X,train.y)
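The grad_check call above assumes linear_regression.m has been implemented. A minimal sketch, using the half-sum-of-squares objective from the previous exercise (only the signature is from the starter code; the body is my own):

function [f, g] = linear_regression(theta, X, y)
  % theta: n-by-1 parameters; X: n-by-m, examples in columns; y: 1-by-m targets.
  err = theta' * X - y;      % 1-by-m residuals
  f = 0.5 * sum(err .^ 2);   % objective: half the sum of squared residuals
  g = X * err';              % n-by-1 gradient
end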

The second block is the grad_check.m function.


function average_error = grad_check(fun, theta0, num_checks, varargin)

  delta = 1e-3;   % finite-difference step size
  sum_error = 0;

  fprintf(' Iter       i             err');
  fprintf('           g_est               g               f\n');

  for i=1:num_checks
    T = theta0;
    j = randsample(numel(T),1); % pick one parameter index uniformly at random from 1..numel(T)
    T0=T; T0(j) = T0(j)-delta;  % theta with the j-th entry nudged down
    T1=T; T1(j) = T1(j)+delta;  % theta with the j-th entry nudged up

    [f,g] = fun(T, varargin{:}); % fun returns the objective value f and the analytic gradient g at T
    f0 = fun(T0, varargin{:});
    f1 = fun(T1, varargin{:});

    g_est = (f1-f0) / (2*delta);  % central-difference estimate of the j-th partial derivative
    error = abs(g(j) - g_est);    % absolute gap between analytic and numerical gradient

    fprintf('% 5d  % 6d % 15g % 15f % 15f % 15f\n', ...
            i,j,error,g(j),g_est,f);

    sum_error = sum_error + error;
  end

  average_error = sum_error/num_checks;
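If the analytic gradient returned by the objective is correct, the err column printed by grad_check should be tiny relative to g and g_est: the central difference has truncation error on the order of delta squared, so with delta = 1e-3 any err that is not close to zero points to a bug in the gradient code.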
Original post: https://www.cnblogs.com/573177885qq/p/4802541.html