[Notes] How range() behaves in Python 3, and a NumPy efficiency comparison [Original]

Today I looked at a snippet of NumPy array code whose main purpose is to compare the performance of pure Python against NumPy.

Under Python 2, the code is as follows:

def pysum(n):
    a = range(n)
    b = range(n)
    c = []
    i = 0
    for i in list(range(len(a))):
        a[i] = i ** 2
        b[i] = i ** 3
        c.append(a[i] + b[i])
    return c
c = pysum(10)

Under Python 3 the same code raises the following error:

TypeError: 'range' object does not support item assignment

As this shows, in Python 3 a = range(n) yields a range object, range(0, n), rather than a list, so a has to be converted to a list first: a = list(range(n)). For example, a = list(range(10)) gives a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].
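A minimal sketch of the difference, assuming Python 3. The variable names r and lst are only for illustration:

```python
# In Python 3, range() returns a lazy, immutable sequence, not a list.
r = range(5)
print(type(r).__name__)   # range

# Reading by index works, but assigning to an index does not.
first = r[0]
try:
    r[0] = 10
    assignment_failed = False
except TypeError as exc:
    assignment_failed = True
    print(exc)            # 'range' object does not support item assignment

# Converting to a list gives a mutable copy that supports item assignment.
lst = list(r)
lst[0] = 10
print(lst)                # [10, 1, 2, 3, 4]
```

The range object itself is unchanged by the conversion, so r can still be iterated afterwards.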

The code then becomes:

def pysum(n):
    a = list(range(n))
    b = list(range(n))
    c = []
    for i in range(len(a)):  # iterating a range directly is fine; only assignment needs a list
        a[i] = i ** 2
        b[i] = i ** 3
        c.append(a[i] + b[i])
    return a, b, c  # return a, b and c together to inspect the results
pysum(10)

The result is:

Out[42]: 
([0, 1, 4, 9, 16, 25, 36, 49, 64, 81],
 [0, 1, 8, 27, 64, 125, 216, 343, 512, 729],
 [0, 2, 12, 36, 80, 150, 252, 392, 576, 810])
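Since a and b are only used as scratch space, the same sums can be built without any item assignment, which avoids the range problem altogether. This is a sketch of an alternative, not the original author's code. The name pysum_compact is hypothetical:

```python
def pysum_compact(n):
    """Return [i**2 + i**3 for i in 0..n-1] without intermediate lists."""
    # range(n) is only iterated here, never assigned to,
    # so this runs unchanged on Python 2 and Python 3.
    return [i ** 2 + i ** 3 for i in range(n)]


print(pysum_compact(10))  # [0, 2, 12, 36, 80, 150, 252, 392, 576, 810]
```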

Implemented with NumPy:

import numpy as np
def npsum(n):
    a = np.arange(n) ** 2
    b = np.arange(n) ** 3
    c = a + b
    return c
npsum(10)
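One caveat worth checking: Python integers never overflow, but NumPy integers have a fixed width, and the default width of np.arange depends on the platform and NumPy version. Where it is 32 bits, i ** 3 stops fitting once i reaches 1291. The sketch below pins the dtype to int64 and checks the result against a pure-Python reference. The name npsum64 is hypothetical and not part of the original code:

```python
import numpy as np


def npsum64(n):
    """NumPy version of the sum with an explicit 64-bit integer dtype."""
    idx = np.arange(n, dtype=np.int64)
    return idx ** 2 + idx ** 3


n = 2000
# Pure-Python reference; Python ints are arbitrary precision.
reference = [i ** 2 + i ** 3 for i in range(n)]
matches = np.array_equal(npsum64(n), np.array(reference, dtype=np.int64))
print(matches)  # True
```

For the size of 1000 used in the timing test the values stay below 2**31, so the original npsum gives correct results either way.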

Now compare the efficiency of the two implementations:

# Performance comparison

from datetime import datetime

size = 1000

start = datetime.now()
c = pysum(size)[-1]  # pysum now returns (a, b, c); keep only the list of sums
delta = datetime.now() - start
print("The last 2 elements of the sum", c[-2:])
print("PySum elapsed time in microseconds", delta.microseconds)

start = datetime.now()
c = npsum(size)
delta = datetime.now() - start
print("The last 2 elements of the sum", c[-2:])
print("NPSum elapsed time in microseconds", delta.microseconds)

Output:

# Output printed for pysum()
The last 2 elements of the sum [995007996, 998001000]
PySum elapsed time in microseconds 2000

# Output printed for npsum()
The last 2 elements of the sum [995007996 998001000]
NPSum elapsed time in microseconds 0

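So in this run npsum was far faster than the pure-Python version. The reading of 0 most likely means it finished within the resolution of datetime.now(), not that it took literally no time. This kind of speedup matters a great deal in data analysis, especially for machine learning workloads that are heavy on computation.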
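A single measurement with datetime.now() is coarse, so a reading of 0 says little about the real ratio. A sketch of a finer-grained comparison using the standard-library timeit module follows. The helper best_time and its repeat and number parameters are assumptions for illustration, and the two functions are compact stand-ins for pysum and npsum:

```python
import timeit

import numpy as np


def pysum(n):
    # Pure-Python stand-in: same sums as the original pysum.
    return [i ** 2 + i ** 3 for i in range(n)]


def npsum(n):
    # NumPy stand-in with an explicit 64-bit dtype.
    idx = np.arange(n, dtype=np.int64)
    return idx ** 2 + idx ** 3


def best_time(func, n, repeat=5, number=100):
    """Best average time per call, in seconds, over several repeats."""
    timings = timeit.repeat(lambda: func(n), repeat=repeat, number=number)
    return min(timings) / number


size = 1000
py_t = best_time(pysum, size)
np_t = best_time(npsum, size)
print("PySum seconds per call:", py_t)
print("NPSum seconds per call:", np_t)
print("Ratio (pure Python / NumPy):", py_t / np_t)
```

Taking the minimum over several repeats reduces noise from other processes. The actual numbers depend on the machine, so none are quoted here.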

Original article: https://www.cnblogs.com/yizhenfeng/p/7152652.html