Common Sampling Methods

Today I'll briefly go over two common sampling methods: softmax sampling and Gumbel sampling.

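Once the probability distribution over the data is known, we want to draw items according to those probabilities. That calls for a dedicated sampling function.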
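To make the goal concrete, here is a minimal sketch, not taken from the original post, that checks both approaches draw index i with probability proportional to exp(logit_i). The example logits, the seed, the sample count `n`, and the use of NumPy's `default_rng` with its built-in `gumbel` sampler are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)           # assumed seed, only for repeatability
logits = np.array([1.0, 0.0, -1.0])      # hypothetical example scores
n = 200_000                              # assumed number of draws

# Target distribution: softmax of the logits (max subtracted for stability).
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Method 1: draw indices directly from the softmax probabilities.
softmax_idx = rng.choice(len(logits), size=n, p=probs)

# Method 2: add independent Gumbel(0, 1) noise to each logit, take the argmax.
noise = rng.gumbel(size=(n, len(logits)))
gumbel_idx = np.argmax(logits + noise, axis=1)

# Empirical frequency of each index under both methods.
softmax_freq = np.bincount(softmax_idx, minlength=len(logits)) / n
gumbel_freq = np.bincount(gumbel_idx, minlength=len(logits)) / n

print("softmax probs:", np.round(probs, 3))
print("softmax freq :", np.round(softmax_freq, 3))
print("gumbel freq  :", np.round(gumbel_freq, 3))
```

With enough draws, both frequency vectors should sit close to the softmax probabilities. The Gumbel variant never normalizes the logits, which is why it is often used when computing the softmax explicitly is inconvenient.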

The code, in brief:

"""
    Sampling methods
"""
import numpy as np

np.random.seed(1111)    # fix the random seed so every run generates the same data
logits = (np.random.random(10) - 0.5) * 2  # rescale from [0, 1) to [-1, 1)

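def inv_gumble_cdf(y, mu=0, beta=1, eps=1e-20):
    # Inverse CDF of the Gumbel(mu, beta) distribution; eps avoids log(0) when y == 0
    return mu - beta * np.log(-np.log(y+eps))

def sample_gamble(shape):
    # Push uniform samples on [0, 1) through the inverse CDF to get Gumbel noise
    p = np.random.random(shape)
    return inv_gumble_cdf(p)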

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability; the result is unchanged
    max_value = np.max(logits)
    exp = np.exp(logits - max_value)
    exp_sum = np.sum(exp)
    dist = exp / exp_sum
    return dist

def sample_with_softmax(logits, size):
    pros = softmax(logits)
    print(pros)
    return np.random.choice(len(logits), size, p=pros)

def sample_with_gumbel_noise(logits, size):
    # One row of Gumbel noise per sample; the argmax of each noisy row is the sampled index
    noise = sample_gamble((size, len(logits)))
    return np.argmax(logits + noise, axis=1)


print('logits:{}'.format(logits))
pop = 1
softmax_samples = sample_with_softmax(logits, pop)
print('softmax_samples:{}'.format(softmax_samples))
gamble_samples = sample_with_gumbel_noise(logits, pop)
print('gumbel_samples:{}'.format(gamble_samples))

Output: