CS224n assignment1 Q1 Softmax

(a) theory

Prove that for any input vector x and any constant c, the output of softmax is unchanged when every component is shifted by c, i.e. softmax(x) = softmax(x + c).

Note: in practice this property is routinely used for numerical stability — subtract the maximum element from every component, so that the largest entry becomes 0.

Proof: expand both sides using the softmax formula. For the i-th component, softmax(x + c)_i = e^(x_i + c) / sum_j e^(x_j + c) = (e^c * e^(x_i)) / (e^c * sum_j e^(x_j)) = e^(x_i) / sum_j e^(x_j) = softmax(x)_i.
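The invariance is also easy to check numerically. A minimal sketch, assuming NumPy; the `ref_softmax` helper here is just an inline reference implementation for this check, not the assignment solution:

```python
import numpy as np

def ref_softmax(v):
    # Plain softmax for a single vector.
    e = np.exp(v)
    return e / e.sum()

x = np.array([1.0, 2.0, 3.0])
c = 100.0  # any constant shift

# The two results agree up to floating-point error.
print(np.allclose(ref_softmax(x), ref_softmax(x + c)))  # → True
```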

(b) coding

import numpy as np


def softmax(x):
    """Compute the softmax function for each row of the input x.

    It is crucial that this function is optimized for speed because
    it will be used frequently in later code. You might find the numpy
    functions np.exp, np.sum, np.reshape, np.max, and numpy
    broadcasting useful for this task.

    Numpy broadcasting documentation:
    http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html

    You should also make sure that your code works for a single
    D-dimensional vector (treat the vector as a single row) and
    for N x D matrices. This may be useful for testing later. Also,
    make sure that the dimensions of the output match the input.

    You must implement the optimization in problem 1(a) of the
    written assignment!

    Arguments:
    x -- A D-dimensional vector or an N x D numpy matrix.

    Return:
    x -- the softmax of x (the input may be modified in-place).
    """
    orig_shape = x.shape

    if len(x.shape) > 1:
        # Matrix: subtract each row's max (the 1(a) trick), then
        # exponentiate and normalize row-wise via broadcasting.
        x -= np.max(x, axis=1, keepdims=True)
        x = np.exp(x) / np.sum(np.exp(x), axis=1, keepdims=True)
    else:
        # Vector: subtract the global max, then normalize.
        x -= np.max(x)
        x = np.exp(x) / np.sum(np.exp(x))

    assert x.shape == orig_shape, "output shape must match input shape"
    return x
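The max-subtraction performed above is what keeps the function from overflowing on large inputs. A self-contained sketch of the difference (the input values are arbitrary, chosen only to force overflow in float64):

```python
import numpy as np

x = np.array([1000.0, 1001.0])

# Naive softmax overflows: np.exp(1000.0) is inf in float64,
# and inf / inf yields nan.
with np.errstate(over='ignore', invalid='ignore'):
    naive = np.exp(x) / np.sum(np.exp(x))
print(naive)  # → [nan nan]

# Stable softmax: shift by the max first, so the largest exponent is 0.
shifted = x - np.max(x)
stable = np.exp(shifted) / np.sum(np.exp(shifted))
print(stable)  # a valid distribution summing to 1
```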
Original post: https://www.cnblogs.com/bernieloveslife/p/10240730.html