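Some PyTorch Classes and Functions

Below are some classes and functions I have come across while using PyTorch. To make them easier to understand and use, I have excerpted parts of the official documentation here.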

torch.nn.modules.conv.Conv1d

Source: https://pytorch.org/docs/stable/_modules/torch/nn/modules/conv.html#Conv1d

class Conv1d(_ConvNd):
    r"""Applies a 1D convolution over an input signal composed of several input planes. """
    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1,
                 bias=True, padding_mode='zeros'):
        kernel_size = _single(kernel_size)
        stride = _single(stride)
        padding = _single(padding)
        dilation = _single(dilation)
        super(Conv1d, self).__init__(
            in_channels, out_channels, kernel_size, stride, padding, dilation,
            False, _single(0), groups, bias, padding_mode)

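Class description: applies a 1D convolution over an input signal composed of several input planes.

The official documentation describes the parameters of the initializer as follows: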

in_channels (int): Number of channels in the input image

out_channels (int): Number of channels produced by the convolution

kernel_size (int or tuple): Size of the convolving kernel

stride (int or tuple, optional): Stride of the convolution. Default: 1

padding (int or tuple, optional): Zero-padding added to both sides of the input. Default: 0

dilation (int or tuple, optional): Spacing between kernel elements. Default: 1

groups (int, optional): Number of blocked connections from input channels to output channels. Default: 1

bias (bool, optional): If True, adds a learnable bias to the output. Default: True

padding_mode (string, optional): Accepted values 'zeros' and 'circular'. Default: 'zeros'
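
To make the roles of stride, padding and dilation concrete, here is a minimal pure-Python sketch of a single-channel 1D cross-correlation. It is not how PyTorch implements Conv1d; the helper names `naive_conv1d` and `conv1d_output_length` are hypothetical, and the length formula is the one documented for Conv1d, floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1). Only zero padding is modeled, and there is no bias.

```python
def conv1d_output_length(l_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length using the formula documented for Conv1d (hypothetical helper)."""
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def naive_conv1d(signal, kernel, stride=1, padding=0, dilation=1):
    """Single-channel 1D cross-correlation on plain lists (illustration only)."""
    # Zero-pad both sides of the input.
    padded = [0.0] * padding + list(signal) + [0.0] * padding
    # Extent of the input covered by the dilated kernel.
    span = dilation * (len(kernel) - 1) + 1
    out = []
    # Slide the kernel over the padded input, moving `stride` positions each step.
    for start in range(0, len(padded) - span + 1, stride):
        out.append(sum(k * padded[start + i * dilation] for i, k in enumerate(kernel)))
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
kernel = [1.0, 0.0, -1.0]

result = naive_conv1d(signal, kernel, stride=2, padding=1)
print(result)  # [-2.0, -2.0, -2.0]
print(len(result) == conv1d_output_length(len(signal), len(kernel), stride=2, padding=1))  # True
```

With dilation=2 the same kernel covers five input positions instead of three, so the output gets shorter for the same input.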

torch.nn.modules.conv.Conv2d

Source: https://pytorch.org/docs/stable/_modules/torch/nn/modules/conv.html#Conv2d

class Conv2d(_ConvNd):
    """Applies a 2D convolution over an input signal composed of several input planes."""
    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1,
                 bias=True, padding_mode='zeros'):
        kernel_size = _pair(kernel_size)
        stride = _pair(stride)
        padding = _pair(padding)
        dilation = _pair(dilation)
        super(Conv2d, self).__init__(
            in_channels, out_channels, kernel_size, stride, padding, dilation,
            False, _pair(0), groups, bias, padding_mode)

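Class description: applies a 2D convolution over an input signal composed of several input planes.

The official documentation describes the parameters of the initializer as follows: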

in_channels (int): Number of channels in the input image

out_channels (int): Number of channels produced by the convolution

kernel_size (int or tuple): Size of the convolving kernel

stride (int or tuple, optional): Stride of the convolution. Default: 1

padding (int or tuple, optional): Zero-padding added to both sides of the input. Default: 0

dilation (int or tuple, optional): Spacing between kernel elements. Default: 1

groups (int, optional): Number of blocked connections from input channels to output channels. Default: 1

bias (bool, optional): If True, adds a learnable bias to the output. Default: True

padding_mode (string, optional): Accepted values 'zeros' and 'circular'. Default: 'zeros'
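
The difference between the two padding_mode values is easiest to see on a small example. The sketch below shows it along a single dimension using plain lists; `pad_1d` is a hypothetical helper written for this note, not a PyTorch function, and it assumes that circular padding wraps values around from the opposite end of the input.

```python
def pad_1d(values, padding, padding_mode="zeros"):
    """Pad a list on both sides, mimicking 'zeros' or 'circular' mode (illustration only)."""
    values = list(values)
    if padding == 0:
        return values
    if padding_mode == "zeros":
        # Fill both sides with zeros.
        return [0] * padding + values + [0] * padding
    if padding_mode == "circular":
        if padding > len(values):
            raise ValueError("padding must not exceed the input length in this sketch")
        # Wrap around: the left side takes values from the end, the right side from the start.
        return values[-padding:] + values + values[:padding]
    raise ValueError("unsupported padding_mode: " + repr(padding_mode))


row = [1, 2, 3, 4]
zeros_padded = pad_1d(row, 2, "zeros")
circular_padded = pad_1d(row, 2, "circular")
print(zeros_padded)     # [0, 0, 1, 2, 3, 4, 0, 0]
print(circular_padded)  # [3, 4, 1, 2, 3, 4, 1, 2]
```

For a 2D input the same idea would be applied to the height and the width separately.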

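The following is a further explanation of what the parameters mean, taken from a blog post.

stride: controls the stride of the cross-correlation; it can be a single int or a tuple of the form (int, int).
padding: controls the amount of padding. With padding=0 nothing is added and the kernel is applied to the original image directly; with padding=1 one row (or column) is added on each of the four sides of the image. The fill values are determined by padding_mode and are usually zeros.
dilation: controls the spacing between the kernel points; this is also known as the "à trous" algorithm. Animations are available on GitHub: Dilated convolution animations.
groups: controls the connections between input channels and output channels. Usually groups=1, but in some cases it can be set to a value between 1 and in_channels; in_channels and out_channels must both be divisible by groups: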

At groups=1, all inputs are convolved to all outputs.

At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated.

At groups=in_channels, each input channel is convolved with its own set of filters (of size $\left\lfloor \frac{\text{out\_channels}}{\text{in\_channels}} \right\rfloor$).
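
One way to see the effect of groups is to look at the shape of the learnable weight. The sketch below assumes the Conv2d weight has shape (out_channels, in_channels // groups, kernel_height, kernel_width); `conv2d_weight_shape` and `count_weights` are hypothetical helpers written for this note, and the channel counts are made up for illustration.

```python
def conv2d_weight_shape(in_channels, out_channels, kernel_size, groups=1):
    """Weight shape of a grouped 2D convolution (illustration only)."""
    if in_channels % groups != 0 or out_channels % groups != 0:
        raise ValueError("in_channels and out_channels must both be divisible by groups")
    k_h, k_w = kernel_size
    # Each output channel only sees in_channels // groups input channels.
    return (out_channels, in_channels // groups, k_h, k_w)


def count_weights(shape):
    """Number of scalar weights in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


for groups in (1, 2, 16):
    shape = conv2d_weight_shape(16, 32, (3, 3), groups=groups)
    print(groups, shape, count_weights(shape))
# 1 (32, 16, 3, 3) 4608
# 2 (32, 8, 3, 3) 2304
# 16 (32, 1, 3, 3) 288
```

With groups=16 (equal to in_channels) every input channel gets its own 32 / 16 = 2 filters, which matches the last case described above.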

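The official documentation describes the input and output shapes of nn.Conv2d as follows.

Input: $(N, C_{in}, H_{in}, W_{in})$

Output: $(N, C_{out}, H_{out}, W_{out})$, where

$$H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1 \right\rfloor$$

$$W_{out} = \left\lfloor \frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1 \right\rfloor$$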

The official documentation also gives the following examples:

>>> import torch
>>> import torch.nn as nn
>>> # With square kernels and equal stride
>>> m = nn.Conv2d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
>>> m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2))
>>> # non-square kernels and unequal stride and with padding and dilation
>>> m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
>>> input = torch.randn(20, 16, 50, 100)
>>> output = m(input)
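
To check what shape `output` should have for each of the three layers above without running PyTorch, the output-size rule documented for nn.Conv2d can be evaluated by hand: per spatial dimension, floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1). The sketch below does this in plain Python; `conv2d_output_shape` is a hypothetical helper written for this note and takes every argument as a (height, width) pair.

```python
def conv2d_output_shape(input_shape, out_channels, kernel_size,
                        stride=(1, 1), padding=(0, 0), dilation=(1, 1)):
    """Expected Conv2d output shape for an (N, C_in, H_in, W_in) input (illustration only)."""
    n, _c_in, h_in, w_in = input_shape

    def out_size(size, k, s, p, d):
        # Documented rule: floor((size + 2p - d(k - 1) - 1) / s + 1)
        return (size + 2 * p - d * (k - 1) - 1) // s + 1

    h_out = out_size(h_in, kernel_size[0], stride[0], padding[0], dilation[0])
    w_out = out_size(w_in, kernel_size[1], stride[1], padding[1], dilation[1])
    return (n, out_channels, h_out, w_out)


input_shape = (20, 16, 50, 100)

# Square kernel, equal stride
shape_a = conv2d_output_shape(input_shape, 33, (3, 3), stride=(2, 2))
# Non-square kernel, unequal stride, with padding
shape_b = conv2d_output_shape(input_shape, 33, (3, 5), stride=(2, 1), padding=(4, 2))
# Non-square kernel, unequal stride, with padding and dilation
shape_c = conv2d_output_shape(input_shape, 33, (3, 5), stride=(2, 1),
                              padding=(4, 2), dilation=(3, 1))

print(shape_a)  # (20, 33, 24, 49)
print(shape_b)  # (20, 33, 28, 100)
print(shape_c)  # (20, 33, 26, 100)
```

The last tuple is the shape `output.shape` would be expected to report for the final layer in the example, since that is the `m` actually applied to `input`.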
