NumPy tutorial: the random number module numpy.random

http://blog.csdn.net/pipisorry/article/details/39508417

Random number seeds

RandomState

RandomState exposes a number of methods for generating random numbers drawn from a variety of probability distributions.

Usage example

import numpy as np

prng = np.random.RandomState(123456789) # generator with its own local seed
prng.rand(2, 4)

prng.chisquare(1, size=(2, 2)) # chi-square distribution
prng.standard_t(1, size=(2, 3)) # t distribution
prng.poisson(5, size=10) # Poisson distribution
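A minimal sketch of why a local RandomState is useful: two generators created with the same seed produce the same draws, and they are not disturbed by calls to the global np.random functions. The names prng_a and prng_b are hypothetical, chosen only for this illustration.

```python
import numpy as np

# Two independent generators created with the same seed
prng_a = np.random.RandomState(123456789)
prng_b = np.random.RandomState(123456789)

draws_a = prng_a.rand(2, 4)
draws_b = prng_b.rand(2, 4)

# Same seed and same call sequence give identical arrays
same = np.array_equal(draws_a, draws_b)
print(same)

# Drawing from the global generator does not advance the local ones
np.random.rand(10)
next_a = prng_a.poisson(5, size=10)
next_b = prng_b.poisson(5, size=10)
print(np.array_equal(next_a, next_b))
```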

[Probability and statistical analysis]

[class numpy.random.RandomState]

random.seed()

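random.seed(123456789) # different seeds produce different random sequences; seed() sets the seed of the global generator

To get the same random numbers on every run, set the seed: generators initialized with the same seed return exactly the same numbers for the same sequence of calls.

random.seed(1)

After this call, random.randint(0, 6, (4, 5)) produces the same 4x5 random matrix on every run (here random is numpy.random, as in "from numpy import random").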

This method is called when RandomState is initialized. It can be called again to re-seed the generator.
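As a small check of the re-seeding behaviour described here, the sketch below seeds the global generator, draws a matrix, re-seeds with the same value and draws again. The variable names first and second are only for illustration.

```python
import numpy as np

np.random.seed(1)
first = np.random.randint(0, 6, (4, 5))

# Re-seed with the same value: the sequence starts over
np.random.seed(1)
second = np.random.randint(0, 6, (4, 5))

print(np.array_equal(first, second))
```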

[numpy.random.seed]

For an introduction to seeds, see [Java - common functions: the Random function]

皮皮blog

The numpy.random module

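linspace(start, end, num): returns num evenly spaced values from start to end inclusive, e.g. linspace(0, 1, 11) gives [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1];

arange(n): returns a vector from 0 to n-1, e.g. arange(4) gives [0, 1, 2, 3]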
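The two helpers above can be tried directly. This is just a quick sketch; grid and idx are names picked for the example.

```python
import numpy as np

grid = np.linspace(0, 1, 11)  # 11 points, both endpoints included
idx = np.arange(4)            # integers 0..3, the stop value is excluded

print(grid)
print(idx)
```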

Functions for generating simple random data

`rand`(d0, d1, ..., dn)	Random values in a given shape.
`randn`(d0, d1, ..., dn)	Return a sample (or samples) from the “standard normal” distribution.
`randint`(low[, high, size, dtype])	Return random integers from low (inclusive) to high (exclusive).
`random_integers`(low[, high, size])	Random integers of type np.int between low and high, inclusive.
`random_sample`([size])	Return random floats in the half-open interval [0.0, 1.0).
`random`([size])	Return random floats in the half-open interval [0.0, 1.0). Generates a random array, e.g. random.random([2, 3]) returns a 2x3 array of random numbers.
`ranf`([size])	Return random floats in the half-open interval [0.0, 1.0).
`sample`([size])	Return random floats in the half-open interval [0.0, 1.0).
`choice`(a[, size, replace, p])	Generates a random sample from a given 1-D array
`bytes`(length)	Return random bytes.
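A short sketch exercising a few of the functions in the table. The seed value, the candidate list ['a', 'b', 'c'] and the weights are arbitrary choices for illustration, not values from the original post.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

# Floats in the half-open interval [0.0, 1.0), as a 2x3 array
floats = np.random.random_sample((2, 3))

# Four distinct values from 0..9 (sampling without replacement)
picked = np.random.choice(np.arange(10), size=4, replace=False)

# Weighted sampling with replacement: 'a' should appear about 70% of the time
weighted = np.random.choice(['a', 'b', 'c'], size=1000, p=[0.7, 0.2, 0.1])

# Eight random bytes
raw = np.random.bytes(8)

print(floats)
print(picked)
print((weighted == 'a').mean())
print(len(raw))
```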

[Simple random data]

np.random usage examples

np.random.rand(a, b)

from numpy import random

x = random.rand(2, 3)
print(x)
[[ 0.1169922   0.08614147  0.17997144]
 [ 0.5694889   0.43067372  0.62135592]]

x, y = random.rand(2, 3)
print(x)
print(y)
[ 0.60527337  0.78765269  0.71884661]
[ 0.67420571  0.946359    0.7632273 ]

[numpy - basic data types, the multidimensional array ndarray and function operations]

np.random.randint(a, b, size=(c, d))

size : int or tuple of ints, optional

raw_user_item_mat = random.randint(0, 10, size=(3,4))     # specify the range of the random integers and the shape of the generated array
print(raw_user_item_mat)
[[3 6 2 8]
 [3 1 2 4]
 [9 4 5 0]]
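To see that the upper bound of randint is exclusive, one can draw many values and look at the extremes. This is only a sketch; the seed and the sample count are arbitrary.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed for repeatability

# 4000 integers drawn from low=0 (inclusive) to high=10 (exclusive)
samples = np.random.randint(0, 10, size=(1000, 4))

print(samples.shape)
print(samples.min(), samples.max())  # the maximum never reaches 10
```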

[numpy.random.randint]

[Random sampling (numpy.random)]

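Advanced random data functions

Binomial distribution

np.random.binomial(n, p, size=N): each returned value is the number of successes in n trials, where x successes occur with probability C(n, x) * p^x * (1-p)^(n-x).

Toss 9 coins in each round: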

outcome = np.random.binomial(9, 0.5, size=len(cash))
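As a rough check of the formula above, the following sketch compares the empirical frequencies of binomial draws with C(n, x) * p^x * (1-p)^(n-x). The number of trials and the seed are assumptions made for this example (the cash array from the snippet above is not needed here), and math.comb requires Python 3.8 or later.

```python
import numpy as np
from math import comb

np.random.seed(42)  # arbitrary seed for repeatability
n, p, trials = 9, 0.5, 100000

# Each entry is the number of heads in one round of 9 coin tosses
outcome = np.random.binomial(n, p, size=trials)

# Relative frequency of x = 0..9 heads
empirical = np.bincount(outcome, minlength=n + 1) / trials

# Theoretical probabilities from the binomial formula
theory = np.array([comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)])

for x in range(n + 1):
    print(x, round(empirical[x], 4), round(theory[x], 4))
```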

[Binomial distribution]

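Hypergeometric distribution

The hypergeometric distribution is a discrete probability distribution in statistics. It describes the number of times an item of a specified kind is drawn when n items are drawn from a finite population without replacement.
In sampling inspection of product quality without replacement, if M of N products are defective and n products are inspected, the number of defective items found, X, satisfies P(X = k) = C(M, k) · C(N-M, n-k) / C(N, n), where C(a, b) is the number of ways to choose b items from a. The random variable X is then said to follow a hypergeometric distribution.
(1) The model behind the hypergeometric distribution is sampling without replacement.
(2) The parameters of the hypergeometric distribution are M, N and n; the distribution above is written X ~ H(N, n, M).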

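The hypergeometric function in NumPy's random module can simulate this distribution.

outcomes = np.random.hypergeometric(25, 1, 3, size=len(points))
# Use hypergeometric to initialize the result array of the game. The first argument is the number of normal balls in the jar, the second is the number of "unlucky" balls, and the third is the number of balls drawn per sample. The draw is repeated size times, and the function returns size results, each being the number of normal (good) balls in that sample.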
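The sketch below checks the simulation against the combinatorial formula. Note that np.random.hypergeometric counts the "good" items, so the formula is applied with the good balls as the counted kind. The number of trials and the seed are assumptions for this example.

```python
import numpy as np
from math import comb

np.random.seed(7)  # arbitrary seed for repeatability
ngood, nbad, nsample, trials = 25, 1, 3, 100000

# Each entry is the number of normal balls among the 3 drawn
outcomes = np.random.hypergeometric(ngood, nbad, nsample, size=trials)

# P(all 3 drawn balls are normal) = C(25, 3) * C(1, 0) / C(26, 3)
p_all_good = comb(ngood, nsample) * comb(nbad, 0) / comb(ngood + nbad, nsample)

# Fraction of simulated draws with 3 normal balls
empirical = np.mean(outcomes == nsample)

print(p_all_good)
print(empirical)
```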

[Hypergeometric distribution]

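Continuous distributions

A continuous distribution can be described by its PDF (Probability Density Function). The probability that the random variable falls in a given interval equals the area under the PDF curve over that interval.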
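To make the "area under the curve" statement concrete, here is a sketch that integrates the standard normal PDF over [-1, 1] with a hand-written trapezoidal rule and compares it with the fraction of random samples falling in that interval. The interval, grid size, sample count and seed are all arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(3)  # arbitrary seed for repeatability
mu, sigma = 0.0, 1.0
a, b = -1.0, 1.0

# Evaluate the normal PDF on a fine grid over [a, b]
x = np.linspace(a, b, 2001)
pdf = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# Trapezoidal rule: area under the PDF between a and b
dx = x[1] - x[0]
area = np.sum((pdf[:-1] + pdf[1:]) / 2) * dx

# Fraction of normal samples that fall inside [a, b]
samples = np.random.normal(mu, sigma, size=200000)
fraction = np.mean((samples >= a) & (samples <= b))

print(area)
print(fraction)
```

Both numbers should be close to 0.68, the familiar one-sigma probability of the normal distribution.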

NumPy's random module provides a series of functions for continuous distributions: beta, chisquare, exponential, f, gamma, gumbel, laplace, lognormal, logistic, multivariate_normal, noncentral_chisquare, noncentral_f, normal, and others.

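Plotting the normal distribution
Random numbers can be drawn from a normal distribution, and their histogram gives a direct picture of that distribution.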
import numpy as np
import matplotlib.pyplot as plt
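# Use the normal function from NumPy's random module to generate the given number of random values.
N=10000

normal_values = np.random.normal(size=N) # the author usually uses stats.norm.rvs(loc=0, scale=0.1, size=10) to generate Gaussian random numbers [SciPy tutorial - the statistics library scipy.stats]

# Plot the histogram and the theoretical probability density function (normal distribution with mean 0 and variance 1).
# density=True replaces normed=True, which recent Matplotlib versions no longer accept; the bin count must be an integer.
_, bins, _ = plt.hist(normal_values, int(np.sqrt(N)), density=True, lw=1)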
sigma = 1
mu = 0

plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) * np.exp( - (bins -mu)**2 / (2 * sigma**2) ),lw=2) # author's tip: scipy.stats.norm.pdf can also be used to draw the (non-random) Gaussian curve [SciPy tutorial - the statistics library scipy.stats]

plt.show()

Log-normal distribution

np.random.lognormal(size=N)
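A log-normal variable is one whose logarithm is normally distributed. The sketch below checks this with the default parameters (mean=0.0, sigma=1.0) by taking the log of the samples; the sample size and seed are assumptions for this example.

```python
import numpy as np

np.random.seed(5)  # arbitrary seed for repeatability
N = 100000

values = np.random.lognormal(mean=0.0, sigma=1.0, size=N)

# All log-normal samples are strictly positive
print((values > 0).all())

# The log of the samples should look standard normal: mean near 0, std near 1
logs = np.log(values)
print(logs.mean(), logs.std())
```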


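Application examples of random

Randomly select n1 sample points without replacement from an original sample set D of size n, giving the sample set D1 and the remaining sample set D1_left: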

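# Boolean mask over the rows of data_arr: True marks the n1 selected samples
random_index = np.zeros(len(data_arr), dtype=bool)
random_index[np.random.choice(len(data_arr), n1, replace=False)] = True
D1 = data_arr[random_index]        # the n1 selected samples
D1_left = data_arr[~random_index]  # the remaining n - n1 samples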
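An alternative sketch for the same kind of split uses np.random.permutation: shuffle the row indices once, then take the first n1 as D1 and the rest as D1_left. The toy data_arr, the value of n1 and the seed below are hypothetical, made up only so the snippet runs on its own.

```python
import numpy as np

np.random.seed(11)  # arbitrary seed for repeatability

# Hypothetical data: 10 samples with 2 features each
data_arr = np.arange(20).reshape(10, 2)
n1 = 4

# Shuffle the row indices; every row appears exactly once
perm = np.random.permutation(len(data_arr))

D1 = data_arr[perm[:n1]]       # n1 randomly chosen samples
D1_left = data_arr[perm[n1:]]  # the remaining samples

print(D1)
print(D1_left)
```

Unlike a boolean mask, this version returns D1 in shuffled order, which may or may not matter for the use case.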


ref: Python modules: the random number module random