Paper Notes (1): Re-ranking by Multi-feature Fusion with Diffusion for Image Retrieval

0x00 Preliminaries $\DeclareMathOperator{\vol}{vol}$

Random walks on undirected graphs

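Let $G=(V,E)$ be an undirected graph with edge-weight function $w\colon V \times V \to \mathbb{R}_+$, where $\mathbb{R}_+$ denotes the nonnegative reals.

If $(u,v) \notin E$ then $w(u,v) = w(v,u) = 0$; otherwise $w(u,v), w(v,u) > 0$.

Let $d(u) = \sum_{v\in V} w(v, u)$ be the degree of $u$, and let $\operatorname{vol} V = \sum_{v\in V} d(v)$ be the volume of the graph.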

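Setting aside the details of graph construction (for example, how the edge weights, or edge strengths, of $G_m$ are determined), let us first work through the random walk on $G_m$.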

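The random walk on $G_m$ is simply a random walk on a connected undirected graph.
Once the transition matrix $\mathbf{P}$ is given, the stationary distribution can be computed.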

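We use a row-stochastic transition matrix (each row sums to 1) and write distributions as column vectors $\pi$, so a step of the chain is $\pi^T \mathbf{P}$. This way the entry $p_{ij}$ reads directly as the probability of moving from state $i$ to state $j$.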

Note: treating vectors as column vectors is the usual convention in mathematics.

Definition 1

A probability distribution $\pi$ satisfying
\begin{equation}
\pi^{T} = \pi^{T}\mathbf{P} \label{E:1}
\end{equation}
is called a stationary distribution of the transition matrix $\mathbf{P}$, or of the corresponding HMC.

Define the transition matrix on $G_m$ by

$p_m(v_j | v_i) = e_m(v_i, v_j) / d_m(v_i)$

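Clearly, the transition matrix $\mathbf{P}_m$ defined this way is row-stochastic (each row sums to 1), and one can show that its stationary distribution is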
\begin{equation}
\pi_m(v_i) = d_m(v_i) / \operatorname{vol}_m V
\end{equation}

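Proof: All edge weights are positive, and connectivity implies $\forall v\in V, \quad d(v) > 0$. Write $\pi^T \mathbf{P} = (p_1, p_2, \dots, p_n)$, where $n$ is the number of nodes. Then
\begin{aligned}
p_i &= \sum_{j = 1}^n \pi_j p_{ji} \\
&= \sum_{j = 1}^n \frac{d_j}{\operatorname{vol} V} \frac{w(j,i)}{d_j} \\
&= \frac{1}{\operatorname{vol} V} \sum_{j = 1}^n w(j,i) \\
&= \frac{d_i}{\operatorname{vol} V} \\
&= \pi_i
\end{aligned}
QED.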
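
The proof can be checked numerically on a small example. The following is a minimal sketch in Python using exact fractions; the 4-node weight matrix `W` is a hypothetical example chosen for illustration, not data from the paper.

```python
from fractions import Fraction

# Hypothetical symmetric edge weights of a small connected undirected graph.
W = [
    [0, 2, 1, 0],
    [2, 0, 3, 0],
    [1, 3, 0, 4],
    [0, 0, 4, 0],
]
n = len(W)

# Degree d(u) = sum_v w(v, u) and volume vol V = sum_u d(u).
d = [sum(W[v][u] for v in range(n)) for u in range(n)]
vol = sum(d)

# Row-stochastic transition matrix p_ij = w(i, j) / d(i).
P = [[Fraction(W[i][j], d[i]) for j in range(n)] for i in range(n)]

# Candidate stationary distribution pi_i = d(i) / vol V.
pi = [Fraction(d[i], vol) for i in range(n)]

# One step of the chain: (pi^T P)_i = sum_j pi_j p_ji.
pi_next = [sum(pi[j] * P[j][i] for j in range(n)) for i in range(n)]

print(pi_next == pi)  # True
```

Exact fractions are used instead of floats so that the equality $\pi^T \mathbf{P} = \pi^T$ can be tested without a tolerance.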

Now consider the random walk on the fused graph $G$. The model we propose for this process is

\begin{equation}
p(v_j | v_i) = \sum_m p_m(v_i) p_m(v_j | v_i) \label{E:3}
\end{equation}

where $p_m(v_i)$ is the probability that the walker, when at node $v_i$, switches to graph $G_m$ for its next step. Our intuition is that $p_m(v_i) \propto \pi_m(v_i)$, so we assume

\begin{equation}
p_m(v_i) = k_m(v_i) \pi_m(v_i) \label{E:4}
\end{equation}

where the coefficient $k_m(v_i)$ is unknown. According to

Theorem 1

Let $\mathbf{P}$ be a transition matrix on the countable state space $E$, and
let $\pi$ be some probability distribution on $E$. If for all $i, j \in E$, the detailed balance
equations (6.8) are satisfied, then $\pi$ is a stationary distribution of $\mathbf{P}$.
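
To see how the theorem might be used for the fused walk, here is a minimal Python sketch. The notes leave $k_m(v_i)$ unknown, so the code makes an explicit assumption that is not taken from the notes: $k_m(v_i) = 1 / \sum_{m'} \pi_{m'}(v_i)$, which makes the switching probabilities at each node sum to 1. The two weight matrices `W1` and `W2` and the helper names `walk` and `switch_prob` are hypothetical. Under this assumption, the average of the $\pi_m$ satisfies detailed balance $\pi_i p_{ij} = \pi_j p_{ji}$ with the fused matrix.

```python
from fractions import Fraction

def walk(W):
    """Return (P, pi) for the random walk on a symmetric weight matrix W."""
    n = len(W)
    d = [sum(row) for row in W]          # degrees (W is symmetric)
    vol = sum(d)
    P = [[Fraction(W[i][j], d[i]) for j in range(n)] for i in range(n)]
    pi = [Fraction(d[i], vol) for i in range(n)]
    return P, pi

# Two hypothetical feature graphs G_1, G_2 on the same 3 nodes.
W1 = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
W2 = [[0, 3, 0], [3, 0, 1], [0, 1, 0]]
graphs = [walk(W1), walk(W2)]
n = 3

def switch_prob(m, i):
    """p_m(v_i) under the ASSUMPTION k_m(v_i) = 1 / sum_m' pi_m'(v_i)."""
    return graphs[m][1][i] / sum(pi_m[i] for _, pi_m in graphs)

# Fused transition matrix: p(v_j | v_i) = sum_m p_m(v_i) p_m(v_j | v_i).
P = [[sum(switch_prob(m, i) * graphs[m][0][i][j] for m in range(len(graphs)))
      for j in range(n)] for i in range(n)]

# Candidate stationary distribution: the average of the pi_m.
pi = [sum(pi_m[i] for _, pi_m in graphs) / len(graphs) for i in range(n)]

# Detailed balance: pi_i p_ij == pi_j p_ji for all i, j.
balanced = all(pi[i] * P[i][j] == pi[j] * P[j][i]
               for i in range(n) for j in range(n))
print(balanced)  # True
```

Other choices of $k_m(v_i)$ are possible; this one is only meant to show the theorem at work on a concrete fused chain.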


Dataset

First, we need to be able to extract the features as described in the paper.

2 local features:

  • BOW
  • VLAD

2 global features:

  • GIST
  • HSV

Images are processed with OpenCV.

OpenCV

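Regardless of the color model of an image (cv::Mat), as long as it is a color image (cv::Mat::channels returns 3), cv::imshow assumes that the 3 channels are B, G, R in that order. (That is, B, G, R in alphabetical order. XD)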
Reference 1
Reference 2

HSV

Below, "color space" and "color model" are used interchangeably and mean the same thing.

Trouble 1: How to detect the color model of an image in OpenCV?

Info: One claim I have seen:

When OpenCV loads colored images (i.e. 3 channel) from the disk, camera, or a video file, the image data will be stored in the BGR format.

A similar claim points out that RGB and BGR are two different color models, although they differ only in the order of the channels.

OpenCV has a BGR color space which is used by default. This is similar to the RGB color space except that the B and R channels are physically switched in the image. If the physical channel ordering is important to you, you will need to convert your image with this function: cvCvtColor(defaultBGR, imageRGB, CV_BGR2RGB).
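
The BGR to RGB conversion described in the quote only reorders the channels. A minimal sketch in plain Python (no OpenCV involved; the nested-list "image" and the helper name `bgr_to_rgb` are illustrative assumptions):

```python
# A tiny 1x2 "image" stored as nested lists of (B, G, R) pixels.
image_bgr = [[(255, 0, 0), (0, 128, 255)]]  # pure blue, then an orange pixel

def bgr_to_rgb(image):
    """Reverse the channel order of every pixel: (B, G, R) -> (R, G, B)."""
    return [[pixel[::-1] for pixel in row] for row in image]

image_rgb = bgr_to_rgb(image_bgr)
print(image_rgb)  # [[(0, 0, 255), (255, 128, 0)]]
```

Applying the same reversal twice gives back the original pixels, which is why the two orderings carry exactly the same information.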

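Problem 1: After converting img to HSV with cvtColor(img, img, CV_BGR2HSV);, the image shown by imshow differs from the original. My expectation was that how an image looks should not depend on the color model (except in cases such as converting a color image to grayscale).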
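A: Solved. The likely cause is the behavior noted above: cv::imshow always reads a 3-channel image as B, G, R, so after the conversion the H, S, V values are displayed as if they were B, G, R values. Converting back to BGR before calling imshow shows the original image.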

Some details on how cv::Mat stores image data:

The color-space conversions all use the following conventions: 8-bit images are in the range 0 to 255, 16-bit images are in the range 0 to 65,536, and floating-point numbers are in the range $0.0$ to $1.0$. When grayscale images are converted to color images, all components of the resulting image are taken to be equal; but for the reverse transformation (e.g., RGB or BGR to grayscale), the gray value is computed through the perceptually weighted formula:

\[Y = (0.299)R + (0.587)G + (0.114)B\]
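
As a quick check of the weighted formula, here is a minimal sketch in plain Python. The helper name `bgr_to_gray` and the use of `round` are assumptions for illustration; OpenCV's own rounding may differ slightly.

```python
def bgr_to_gray(b, g, r):
    """Perceptually weighted gray value Y = 0.299 R + 0.587 G + 0.114 B."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

print(bgr_to_gray(255, 255, 255))  # white -> 255
print(bgr_to_gray(0, 0, 255))      # pure red (B=0, G=0, R=255) -> 76
print(bgr_to_gray(255, 0, 0))      # pure blue (B=255, G=0, R=0) -> 29
```

The three weights sum to 1, so white stays at 255, while pure blue maps to a much darker gray than pure red.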

In the case of HSV or HLS representations, hue is normally represented as a value from 0 to 360 (excluding 360, of course). This can cause trouble in 8-bit representations and so when you are converting to HSV, the hue is divided by 2 when the output image is an 8-bit image.
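
The halved hue can be illustrated with Python's standard `colorsys` module. This is a sketch under the convention quoted above (hue in degrees divided by 2, saturation and value scaled to 0..255); the helper name `bgr_to_hsv_8bit` is hypothetical, and OpenCV's own rounding may differ by one unit for some pixels.

```python
import colorsys

def bgr_to_hsv_8bit(b, g, r):
    """Convert an 8-bit BGR pixel to 8-bit HSV with hue stored as degrees / 2."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)  # each in [0, 1]
    hue_degrees = h * 360  # 0 <= hue < 360
    return (round(hue_degrees / 2) % 180, round(s * 255), round(v * 255))

print(bgr_to_hsv_8bit(255, 0, 0))  # pure blue, hue 240 degrees -> (120, 255, 255)
print(bgr_to_hsv_8bit(0, 0, 255))  # pure red, hue 0 degrees -> (0, 255, 255)
```

Note that the result for pure blue, (120, 255, 255), would look pale yellow if it were displayed as a B, G, R triple, which matches the mismatch seen in Problem 1.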

Trouble 2: I do not yet understand the cv::Mat::at method ("method" here is synonymous with "member function").

Original article: https://www.cnblogs.com/Patt/p/8986110.html