[librosa] Audio Feature Extraction


log-spectrogram

librosa has no single ready-made function for the log-scaled spectrogram, so it has to be computed by hand.

Steps:

  • load -> stft -> abs -> power -> log
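import numpy as np
import librosa
sr = 22050  # assumed target sampling rate; librosa.load resamples to it
y, _ = librosa.load('test.wav', sr=sr)  # load returns (samples, sampling rate)
ft = librosa.stft(y, n_fft=512, hop_length=256)  # complex STFT matrix
log_spec = librosa.power_to_db(np.abs(ft)**2)  # magnitude -> power -> dB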

melspectrogram

Computes a mel-scaled spectrogram.

In the source: melspectrogram = _spectrogram + Mel filters [np.dot]
where _spectrogram = stft + abs + power (default 1).
melspectrogram itself, however, defaults to power = 2.

Source reference; usage is as follows:
librosa.feature.melspectrogram(y=None, sr=22050, S=None, n_fft=2048, hop_length=512, win_length=None, window='hann', center=True, pad_mode='reflect', power=2.0, **kwargs)
Returns: S, the mel spectrogram, np.ndarray [shape=(n_mels, t)]

mfcc

Computes Mel-frequency cepstral coefficients (MFCCs).

Basic steps:

  • Preprocessing (pre-emphasis, framing, windowing)
  • Per frame: FFT -> power spectrum -> mel filter bank -> log power -> DCT -> MFCCs

In the source: mfcc = melspectrogram + power_to_db + dct
power_to_db: converts a power spectrogram (amplitude squared) to decibel (dB) units, i.e. 10 * log10(S / ref)

Source reference; usage is as follows:
librosa.feature.mfcc(y=None, sr=22050, S=None, n_mfcc=20, dct_type=2, norm='ortho', lifter=0, **kwargs)
Returns: M, the MFCC sequence, np.ndarray [shape=(n_mfcc, t)]

chroma_stft

Compute a chromagram from a waveform or power spectrogram.

In the source: chroma_stft = _spectrogram + estimate_tuning + chroma(np.dot) + normalize
bins_per_octave = n_chroma

Source reference; usage is as follows:
librosa.feature.chroma_stft(y=None, sr=22050, S=None, norm=inf, n_fft=2048, hop_length=512, win_length=None, window='hann', center=True, pad_mode='reflect', tuning=None, n_chroma=12, **kwargs)
Returns: chromagram, the normalized energy for each chroma bin at each frame, np.ndarray [shape=(n_chroma, t)]

chroma_cqt

Computes a constant-Q chromagram.

In the source: chroma_cqt = cqt + abs + cqt_to_chroma(dot) + normalize

Source reference; usage is as follows:
librosa.feature.chroma_cqt(y=None, sr=22050, C=None, hop_length=512, fmin=None, norm=inf, threshold=0.0, tuning=None, n_chroma=12, n_octaves=7, window=None, bins_per_octave=36, cqt_mode='full')
Returns: chromagram, np.ndarray [shape=(n_chroma, t)]

delta

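Dynamic information of features. The derivatives of features provide information about how the features change over time (roughly, the slope along the time axis).
Compute delta features: a local estimate of the derivative of the input data along the selected axis. Delta features are computed using Savitzky-Golay filtering.
The Savitzky-Golay (SG) filter is a convolution-based low-pass smoothing filter: within a moving window in the time domain, it fits a polynomial of order k to the data points by least squares. It is a refinement of moving-average smoothing. An SG filter smooths a spectrum and reduces the effect of noise.
Python usage and source reference: scipy.signal.savgol_filter
See also: [Savitzky-Golay smoothing and denoising] and [Fast curve smoothing in Python].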

Source reference for delta in librosa: librosa.feature.delta. Usage is as follows:
librosa.feature.delta(data, width=9, order=1, axis=-1, mode='interp', **kwargs)
The default axis along which to compute deltas is -1 (columns).
Returns the delta matrix of data at the specified order: delta_data, np.ndarray [shape=(d, t)]

Original article: https://www.cnblogs.com/ytxwzqin/p/14231148.html