pandas_series04

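  1. How to compute the Euclidean distance between two Series
        import numpy as np
        import pandas as pd

        p = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
        q = pd.Series([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])
        
        # Method 1: square root of the sum of squared differences
        sum((p - q)**2)**.5
        
        # Method 2: L2 norm of the difference vector
        np.linalg.norm(p - q)
       
        #>    18.16590212458495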
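As a cross-check on the two methods above, the following sketch also computes the distance with the vectorized `Series.sum()` in place of Python's built-in `sum`. The variable names `d_builtin`, `d_vectorized`, and `d_norm` are illustrative and not part of the original exercise.

```python
import numpy as np
import pandas as pd

p = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
q = pd.Series([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])

# Built-in sum over the squared differences (Method 1)
d_builtin = sum((p - q) ** 2) ** 0.5
# Same formula using the vectorized Series.sum() and np.sqrt
d_vectorized = np.sqrt(((p - q) ** 2).sum())
# L2 norm of the difference vector (Method 2)
d_norm = np.linalg.norm(p - q)

print(d_builtin, d_vectorized, d_norm)
```

All three values should agree up to floating-point rounding, since each is the square root of the same sum of squared differences (330 for this data).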
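  2. How to find the local maxima (peaks) in a numeric Series
    A local maximum is a point where the first difference changes sign from positive to negative, so the difference of those signs equals -2 there.
        ser = pd.Series([2, 10, 3, 4, 9, 10, 2, 7, 3])
        
        # Discrete "second derivative": difference of the signs of the first difference
        dd = np.diff(np.sign(np.diff(ser)))
        # dd == -2 marks a rise followed by a fall; add 1 to get the peak's position in ser
        peak_locs = np.where(dd == -2)[0] + 1
        peak_locs
        
        #>    array([1, 5, 7], dtype=int64)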
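The same peaks can be found by comparing each value with its two neighbours directly. The sketch below is an alternative formulation, not the method used above; `is_peak` and `peak_positions` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

ser = pd.Series([2, 10, 3, 4, 9, 10, 2, 7, 3])

# A point is a strict local maximum if it is larger than both neighbours.
# shift(1) gives the previous value and shift(-1) the next one; the two
# endpoints are compared against NaN, which evaluates to False, so they
# are never reported as peaks.
is_peak = (ser > ser.shift(1)) & (ser > ser.shift(-1))
peak_positions = np.flatnonzero(is_peak.to_numpy())

print(peak_positions)  # positions 1, 5 and 7
```

A sign difference of -2 requires a strict rise followed immediately by a strict fall, which is the same condition as being greater than both neighbours, so the two approaches should agree.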
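  3. How to replace spaces with the least frequent character
    my_str = 'dbc deb abed gade'
    
    # Approach: split the string into a Series of single characters
    ser = pd.Series(list(my_str))
    # Count how often each character occurs (sorted by descending count)
    freq = ser.value_counts()
    print(freq)
    # The last index label is a least frequent character
    least_freq = freq.dropna().index[-1]
    # Replace every space with that character and join back into a string
    "".join(ser.replace(' ', least_freq))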
    
    #>    d    4
             3
        b    3
        e    3
        a    2
        c    1
        g    1
        dtype: int64
    
    #>    'dbcgdebgabedggade'
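In the output above, `c` and `g` both occur once, and the code picks whichever of them `value_counts()` happens to list last. The sketch below makes the choice explicit by swapping in `collections.Counter` from the standard library. It excludes the space from the count and breaks ties alphabetically; both rules are assumptions of this sketch rather than part of the original exercise, so it selects `c` instead of `g`.

```python
from collections import Counter

my_str = 'dbc deb abed gade'

# Count every character except the space itself
counts = Counter(ch for ch in my_str if ch != ' ')
# Smallest count first; ties are broken alphabetically (assumed rule)
least_freq = min(counts, key=lambda ch: (counts[ch], ch))
result = my_str.replace(' ', least_freq)

print(least_freq, result)
```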

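  4. How to compute the autocorrelations of a numeric Series

    # Random data, so the output below will differ between runs
    ser = pd.Series(np.arange(20) + np.random.normal(1, 10, 20))
    
    # Autocorrelation of the Series at each lag i from 0 to 10
    autocorrelations = [ser.autocorr(i).round(2) for i in range(11)]
    print(autocorrelations[1:])
    # Report the lag (1 to 10) with the largest absolute autocorrelation
    print('Lag having highest correlation: ', np.argmax(np.abs(autocorrelations[1:]))+1)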
    
    #>    [0.33, 0.41, 0.48, 0.01, 0.21, 0.16, -0.11, 0.05, 0.34, -0.24]
    #>    Lag having highest correlation:  3
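To make clearer what `autocorr` measures, the sketch below compares it with the Pearson correlation between the Series and a copy of itself shifted by the same lag. The fixed seed and the names `rng`, `builtin`, and `manual` are choices made for this illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary fixed seed for reproducibility
ser = pd.Series(np.arange(20) + rng.normal(1, 10, 20))

lag = 3
# Built-in lag-N autocorrelation
builtin = ser.autocorr(lag)
# Pearson correlation between the Series and a copy shifted by `lag`;
# the NaN values introduced by the shift are ignored by corr()
manual = ser.corr(ser.shift(lag))

print(round(builtin, 4), round(manual, 4))
```

The two numbers should match, which shows that the lag with the highest autocorrelation is the shift at which the Series most closely resembles itself.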
  5. How to perform arithmetic operations between Series
    # Arithmetic between two Series
    import pandas as pd
    series1 = pd.Series([3,4,4,4], index=['index1','index2','index3','index4'])
    series2 = pd.Series([2,2,2,2], index=['index1','index2','index33','index44'])
    # Addition
    series_add = series1 + series2
    print(series_add)
    # Subtraction
    series_minus = series1 - series2
    # series_minus
    # Multiplication
    series_multi = series1 * series2
    # series_multi
    # Division
    series_div = series1/series2
    series_div
    Series arithmetic is index-aligned: pandas pairs values by index label, and any label that appears in only one of the two Series gets NaN in the result. The results are:
    # Addition:
    index1     5.0
    index2     6.0
    index3     NaN
    index33    NaN
    index4     NaN
    index44    NaN
    dtype: float64
    # Division:
    index1     1.5
    index2     2.0
    index3     NaN
    index33    NaN
    index4     NaN
    index44    NaN
    dtype: float64
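When the NaN values produced by mismatched labels are not wanted, two common options are to supply a fill value or to restrict the operation to the shared labels. The sketch below shows both for addition; the names `filled_add`, `common`, and `shared_add` are introduced here for illustration, and treating a missing value as 0 is an assumption that may not suit every dataset.

```python
import pandas as pd

series1 = pd.Series([3, 4, 4, 4], index=['index1', 'index2', 'index3', 'index4'])
series2 = pd.Series([2, 2, 2, 2], index=['index1', 'index2', 'index33', 'index44'])

# Treat a label missing from one Series as 0 instead of producing NaN
filled_add = series1.add(series2, fill_value=0)

# Alternatively, keep only the labels that both Series share
common = series1.index.intersection(series2.index)
shared_add = series1[common] + series2[common]

print(filled_add)
print(shared_add)
```

`Series.sub`, `Series.mul`, and `Series.div` accept the same `fill_value` argument, although 0 is rarely a sensible fill for a divisor.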
Original article: https://www.cnblogs.com/huaobin/p/15687038.html