3D Convolution: Spherical CNNs for Panoramic Images (Code)

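Convolutional neural networks (CNNs) handle problems on two-dimensional planar images very well. However, the demand for processing spherical images keeps growing, for example in omnidirectional vision for drones, robots, and self-driving cars, in molecular regression problems, and in global weather and climate modelling.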

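Feeding a planar projection of a spherical signal into a convolutional neural network is too naive an approach and is bound to fail. The great success of CNNs comes from weight sharing across local receptive fields: the multi-layer structure can find the same target in different regions of the image and respond to it. In a spherical image, however, a target is deformed differently depending on where it appears in the picture, so for a CNN to share weights directly, the local receptive fields would have to account for this transformation. As the figure shows, the spatial distortion introduced by the planar projection prevents a CNN from sharing weights.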
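To make the distortion concrete, here is a minimal sketch that assumes the planar projection is an equirectangular (longitude/latitude) map, which the text above does not specify. On such a map, east-west lengths are stretched by 1/cos(latitude), so the same object covers more and more pixels as it moves toward a pole. The function name `horizontal_stretch` is hypothetical and chosen only for illustration.

```python
import math

def horizontal_stretch(latitude_deg):
    """East-west stretch factor of an equirectangular map at a given latitude.

    Assumption: the sphere is unrolled onto a longitude/latitude grid, so a
    circle of latitude with circumference 2*pi*cos(lat) is drawn with the
    same pixel width as the equator.
    """
    return 1.0 / math.cos(math.radians(latitude_deg))

# The same object needs a differently shaped filter at each latitude,
# which is why a single translated planar filter cannot be shared.
for lat in (0, 30, 60, 80):
    print(f"latitude {lat:2d} deg: stretched by a factor of {horizontal_stretch(lat):.3f}")
```

The factor grows without bound near the poles, so no fixed planar filter matches the same spherical pattern everywhere on the map.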

We propose a definition for the spherical cross-correlation that is both expressive and rotation-equivariant. The spherical correlation satisfies a generalized Fourier theorem, which allows us to compute it efficiently using a generalized (non-commutative) Fast Fourier Transform (FFT) algorithm. We demonstrate the computational efficiency, numerical accuracy, and effectiveness of spherical CNNs applied to 3D model recognition and atomization energy regression.

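How can a three-dimensional image be reconstructed from two-dimensional images while resolving the deformation that arises at different positions? The classical FFT approach and the Lie group model serve as the bridge.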

For an account of SO(3) as rigid-body transformations, see: 半闲居士视觉SLAM十四讲笔记(3)三维空间刚体运动 - par..._CSDN博客.

Wow, this outline is even more concise and clear: 高翔《视觉SLAM十四讲》从理论到实践.

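The key is to recognize the subtle difference between the sphere and the plane: the spherical image is treated as a signal on a manifold, the rotations of the sphere form the three-dimensional Lie group SO(3), which is sampled on a discrete grid, and the SO(3) structure is represented in the higher layers of the CNN.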

As shown in Figure 1, there is no good way to use translational convolution or cross-correlation to analyze spherical signals. The most obvious approach, then, is to change the definition of cross-correlation by replacing filter translations by rotations. Doing so, we run into a subtle but important difference between the plane and the sphere: whereas the space of moves for the plane (2D translations) is itself isomorphic to the plane, the space of moves for the sphere (3D rotations) is a different, three-dimensional manifold called SO(3). It follows that the result of a spherical correlation (the output feature map) is to be considered a signal on SO(3), not a signal on the sphere, S^2. For this reason, we deploy SO(3) group correlation in the higher layers of a spherical CNN (Cohen and Welling, 2016).

The implementation of a spherical CNN (S^2-CNN) involves two major challenges. Whereas a square grid of pixels has discrete translation symmetries, no perfectly symmetrical grids for the sphere exist. This means that there is no simple way to define the rotation of a spherical filter by one pixel. Instead, in order to rotate a filter we would need to perform some kind of interpolation. The other challenge is computational efficiency; SO(3) is a three-dimensional manifold, so a naive implementation of SO(3) correlation is O(n^6).

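The two difficulties of spherical CNNs: first, discretizing the sphere, since no perfectly symmetric grid exists, rotating a filter requires interpolation, and the grid resolution determines how accurate the result is; second, the computational cost on the three-dimensional manifold SO(3), where a naive correlation takes O(n^6) time.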
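The O(n^6) figure can be checked with simple counting. The sketch below assumes SO(3) is sampled on an n x n x n grid (one axis per rotation parameter) and that every output value sums over every sample. The helper name `naive_so3_correlation_ops` is hypothetical and is not taken from the paper or its code.

```python
def naive_so3_correlation_ops(n):
    """Count multiply-adds of a naive SO(3) correlation on an n x n x n grid.

    Assumption: both the input signal and the output feature map are
    sampled at n**3 rotations, and each output sums over all inputs.
    """
    outputs = n ** 3            # one output value per sampled rotation
    terms_per_output = n ** 3   # each output is a sum over every sample
    return outputs * terms_per_output

for n in (16, 32, 64):
    print(f"n = {n:2d}: {naive_so3_correlation_ops(n):,} multiply-adds")
```

Doubling the resolution multiplies the cost by 2^6 = 64, which is why a faster, Fourier-based algorithm matters here.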

........................................

Key points:

The G-FFT (generalized FFT) is used to compute the correlation quickly. It is well known that correlations and convolutions can be computed efficiently using the Fast Fourier Transform (FFT). This is a result of the Fourier theorem, which states that $\widehat{f * \psi} = \hat{f} \cdot \hat{\psi}$. Since the FFT can be computed in O(n log n) time and the product has linear complexity, implementing the correlation using FFTs is asymptotically faster than the naive O(n^2) spatial implementation.
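The Fourier theorem quoted above can be tried out in its simplest setting. The sketch below is not the generalized FFT on SO(3) from the paper. It uses the ordinary FFT on a one-dimensional cyclic signal and checks that multiplying the transforms pointwise gives the same result as the naive O(n^2) circular convolution. The function names and test signals are made up for illustration, and the small recursive FFT assumes the length is a power of two.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out

def circular_convolution_naive(f, psi):
    """Direct O(n^2) evaluation of (f * psi)[k] = sum_j f[j] psi[(k - j) mod n]."""
    n = len(f)
    return [sum(f[j] * psi[(k - j) % n] for j in range(n)) for k in range(n)]

def circular_convolution_fft(f, psi):
    """Same result via the Fourier theorem: transform, multiply, transform back."""
    n = len(f)
    product = [a * b for a, b in zip(fft(f), fft(psi))]
    # The inverse transform above is unnormalized, so divide by n here.
    return [value.real / n for value in fft(product, inverse=True)]

f = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, 0.0, 2.0]      # example signal
psi = [0.5, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]     # example filter
naive = circular_convolution_naive(f, psi)
fast = circular_convolution_fft(f, psi)
max_err = max(abs(a - b) for a, b in zip(naive, fast))
print("largest difference between the two methods:", max_err)
```

The two results agree up to floating-point rounding. The spherical case follows the same pattern, with the ordinary FFT replaced by generalized transforms on S^2 and SO(3).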

.................

.......................................

Most importantly, the code is available at: https://github.com/jonas-koehler/s2cnn .

Experimental results:

Results. We evaluate by RMSE and compare our results to Montavon et al. (2012) and Raj et al. (2016) (see Table 3). Our learned representation outperforms all kernel-based approaches and an MLP trained on sorted Coulomb matrices. Superior performance could only be achieved for an MLP trained on randomly permuted Coulomb matrices. However, sufficient sampling of random permutations grows exponentially with N, so this method is unlikely to scale to large molecules.

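The paper defines cross-correlations on S^2 and SO(3), analyzes their properties, and then implements a generalized FFT-based correlation algorithm. The numerical results confirm the stability and accuracy of the algorithm, which remains effective even in deep networks.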

In short, judged on accuracy, scalability, and related aspects together, it is one of the most promising 3D networks overall.

Further improvements:

For intrinsically volumetric tasks like 3D model recognition, we believe that further improvements can be attained by generalizing further beyond SO(3) to the roto-translation group SE(3). The development of Spherical CNNs is an important first step in this direction. Another interesting generalization is the development of a Steerable CNN for the sphere (Cohen and Welling, 2017), which would make it possible to analyze vector fields such as global wind directions, as well as other sections of vector bundles over the sphere.

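Carrying the computation from SO(3) over to SE(3), that is, extending the rotational correlation with translations in the roto-translation group SE(3), might yield a further speed-up.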

Appendix:

Lie groups and Lie algebras

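Three-dimensional rotation matrices form the special orthogonal group SO(3), while transformation matrices (rotation plus translation) form the special Euclidean group SE(3).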
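As a small numerical illustration of these two groups, the sketch below builds a rotation from three angles (a ZYZ Euler parametrization, chosen here as an assumption) and checks the defining properties of SO(3): R^T R = I and det R = 1. It then packs a rotation and a translation into a 4 x 4 homogeneous matrix and checks that composing two such matrices gives another element of SE(3). All helper names and numeric values are hypothetical.

```python
import math

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def se3(rotation, translation):
    """4 x 4 homogeneous matrix [[R, t], [0, 1]] for an SE(3) element."""
    rows = [rotation[i] + [translation[i]] for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

identity3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Three angles give one rotation: SO(3) is a three-dimensional manifold.
R = matmul(matmul(rot_z(0.3), rot_y(1.1)), rot_z(-0.7))
orth_err = max_abs_diff(matmul(transpose(R), R), identity3)
det_R = det3(R)

# Composing two rigid motions gives another rigid motion.
T1 = se3(R, [1.0, 2.0, 3.0])
T2 = se3(rot_z(0.5), [0.0, -1.0, 0.5])
T12 = matmul(T1, T2)
R12 = [row[:3] for row in T12[:3]]
closure_err = max_abs_diff(R12, matmul(R, rot_z(0.5)))

print("orthogonality error:", orth_err)
print("determinant:", det_R)
print("rotation block of T1*T2 matches R1*R2 up to:", closure_err)
```

The rotation block of the product is the product of the two rotations, and the bottom row stays (0, 0, 0, 1), so the result is again a valid rigid-body transformation.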