Multithreading and multiprocessing

For vectorized numerical work, a NumPy array runs faster than a Python list in every case I tried.
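
To make the comparison concrete, here is a minimal sketch that times a sum of squares over a Python list against the same computation on a NumPy array. The array size, the function names, and the choice of operation are assumptions made for illustration, and the actual timings will vary by machine.

```python
import timeit

import numpy as np

n = 200_000  # arbitrary size chosen for illustration
values_list = list(range(n))
values_array = np.arange(n, dtype=np.int64)

def sum_squares_list():
    # Pure-Python loop over list elements.
    return sum(v * v for v in values_list)

def sum_squares_array():
    # One vectorized call; the loop runs inside NumPy.
    return int(np.dot(values_array, values_array))

list_time = timeit.timeit(sum_squares_list, number=5)
array_time = timeit.timeit(sum_squares_array, number=5)
print(f"list: {list_time:.4f}s, array: {array_time:.4f}s")
```

The advantage comes from keeping the loop inside NumPy. Iterating over an array element by element in Python code does not get the same benefit.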

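According to what I have read, if a multithreaded program is CPU-bound, multithreading brings little speedup and may even slow things down because of frequent thread switching, so multiprocessing is recommended. If the program is I/O-bound, threads can use the idle time spent blocked on I/O to run other threads, which improves efficiency. So we run experiments to compare the efficiency in different scenarios.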
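
The I/O-bound case can be sketched as follows. In standard CPython builds the global interpreter lock lets only one thread execute Python bytecode at a time, but a thread that is blocked waiting releases it, which is why threads help here. The function fake_io is a hypothetical stand-in for a blocking call (it only sleeps), and the delays are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(seconds):
    """Hypothetical stand-in for a blocking I/O call (network, disk)."""
    time.sleep(seconds)
    return seconds

def run_sequential(delays):
    # Each call waits for the previous one to finish.
    start = time.perf_counter()
    results = [fake_io(d) for d in delays]
    return results, time.perf_counter() - start

def run_threaded(delays):
    # One worker per task, so the waits overlap.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_io, delays))
    return results, time.perf_counter() - start

delays = [0.1] * 4
seq_results, seq_time = run_sequential(delays)
thr_results, thr_time = run_threaded(delays)
print(f"sequential: {seq_time:.2f}s, threaded: {thr_time:.2f}s")
```

For CPU-bound work the usual alternative is a process pool such as concurrent.futures.ProcessPoolExecutor, run under an if __name__ == "__main__": guard.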

format

https://www.cnblogs.com/chunlaipiupiupiu/p/7978669.html

all the input array dimensions except for the concatenation axis must match exactly

result = np.concatenate((result, np.array((question, q2, q2_id, sim, tuple(q2_cut)),dtype=object)),axis=0)
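You cannot concatenate an empty result with a non-empty array here: every dimension other than the concatenation axis has to match.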
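
One way around this, sketched under the assumption that every row has the same known width, is to start from an empty array whose other dimensions already match, or to collect the rows in a list and build the array once. The numeric rows below are hypothetical placeholders, not the (question, q2, q2_id, sim, ...) records from the line above.

```python
import numpy as np

row = np.array([[1.0, 2.0, 3.0]])  # shape (1, 3), placeholder data

# An empty array of shape (1, 0) disagrees with (1, 3) on axis 1, so this fails.
bad_start = np.array([[]])
try:
    np.concatenate((bad_start, row), axis=0)
    failed = False
except ValueError:
    failed = True

# Starting from shape (0, 3) keeps the non-concatenation axis consistent.
result = np.empty((0, 3))
for _ in range(4):
    result = np.concatenate((result, row), axis=0)

# Often simpler: gather the rows in a list and stack them once at the end.
rows = [row[0] for _ in range(4)]
stacked = np.vstack(rows)

print(failed, result.shape, stacked.shape)
```

Building the list first also avoids copying the growing array on every iteration.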

import numpy as np

vec1 = np.array([[1, 2],
                 [1, 1]])
vec2 = np.array([[10, 20],
                 [1, 1]])

# Sum along axis 1, i.e. one sum per row; prints [3 2].
print(np.sum(vec1, axis=1))

vec1 = np.array([1, 2])
# Repeat the 1-D vector 5 times along the rows, giving shape (5, 2).
vec1 = np.tile(vec1, (5, 1))
print(vec1)

[[1 2]
 [1 2]
 [1 2]
 [1 2]
 [1 2]]

# def cal():
#     dist = np.linalg.norm(vec1 - vec2, axis=1)
#     sim = 1.0 / (1.0 + dist)
#     return sim
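NumPy's np.linalg.norm works the same way as SciPy's underneath, but with NumPy you can subtract two 2-D arrays and pass the difference with axis=1, which computes the Euclidean distance for each row of an n-row matrix.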
vec1 = np.array([[1, 2],
                 [2, 3]])

def cal():
    # Norm of each row (axis=1), then map each distance to a similarity.
    dist = np.linalg.norm(vec1, axis=1)
    sim = 1.0 / (1.0 + dist)
    return sim

print(cal())

[0.30901699 0.21712927]
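
The two-array form described above can be sketched like this. The helper name row_similarity and the query vector are assumptions added for illustration. The second call shows that broadcasting handles a single 1-D vector against every row, so np.tile is not needed for that case.

```python
import numpy as np

def row_similarity(a, b):
    """Similarity 1 / (1 + Euclidean distance), computed per row."""
    dist = np.linalg.norm(a - b, axis=1)
    return 1.0 / (1.0 + dist)

vec1 = np.array([[1, 2],
                 [1, 1]])
vec2 = np.array([[10, 20],
                 [1, 1]])

# Row against row: both arrays have shape (2, 2).
pairwise = row_similarity(vec1, vec2)

# Every row against one vector: the shape (2,) query is broadcast
# across the rows of vec1.
query = np.array([1, 1])
against_query = row_similarity(vec1, query)

print(pairwise)
print(against_query)
```

Identical rows have distance 0 and therefore similarity 1.0, which is what the second row of each result shows.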


=====================================
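How to save time when computing Euclidean distances at scale
Compare the speed of a for loop with a single vectorized NumPy call (broadcasting)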

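import time

import numpy as np

def cal(vec):
    # Similarity for a single 1-D vector.
    dist = np.linalg.norm(vec)
    sim = 1.0 / (1.0 + dist)
    return sim

def cal1(vec):
    # Called once on the whole 2-D array. Without axis=1, np.linalg.norm
    # returns a single norm for the entire matrix rather than one per row.
    dist = np.linalg.norm(vec)
    sim = 1.0 / (1.0 + dist)
    return sim

# 10000 identical rows of 50000 random values (about 4 GB as float64).
vec1 = np.random.randn(50000)
vec1 = np.tile(vec1, (10000, 1))

# Loop: one call per row.
time1 = time.time()
for i in vec1:
    cal(i)
time2 = time.time()
print(time2 - time1)

# Single call on the full array.
time1 = time.time()
cal1(vec1)
time2 = time.time()
print(time2 - time1)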

0.39404821395874023
0.12035775184631348
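
For a like-for-like comparison, the single call would need axis=1 so that it returns one similarity per row, as the loop does. The sketch below is one way to set that up. The function names, the random seed, and the much smaller array size are assumptions chosen so it runs quickly, and no timings are quoted because they depend on the machine.

```python
import time

import numpy as np

def similarity(vec):
    """1 / (1 + norm) for a single 1-D vector."""
    return 1.0 / (1.0 + np.linalg.norm(vec))

def similarity_rows(matrix):
    """Same formula for every row of a 2-D array in one call."""
    return 1.0 / (1.0 + np.linalg.norm(matrix, axis=1))

rng = np.random.default_rng(0)
data = rng.standard_normal((2000, 500))  # arbitrary size for illustration

# Python-level loop: one np.linalg.norm call per row.
start = time.perf_counter()
looped = np.array([similarity(row) for row in data])
loop_time = time.perf_counter() - start

# Vectorized: a single call that reduces along axis 1.
start = time.perf_counter()
vectorized = similarity_rows(data)
vector_time = time.perf_counter() - start

print(f"loop: {loop_time:.4f}s, vectorized: {vector_time:.4f}s")
print("same result:", np.allclose(looped, vectorized))
```

Checking that both paths return the same values confirms the timings compare the same computation.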

Original article: https://www.cnblogs.com/yjybupt/p/10270562.html