[Translation] Kinect v2 Programming (C++): AudioBeam

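The Microphone Array of Kinect v2 can be used to estimate the direction of a sound source in the horizontal plane (AudioBeam) and for speech recognition (Speech Recognition). This section explains how to obtain the AudioBeam.

The previous section explained how to use the Kinect SDK v2 preview to get data from the Color Camera and Depth sensor of the Kinect v2 preview.

This section explains how to obtain the AudioBeam (an estimate of the sound source direction in the horizontal plane) from the Kinect Microphone Array.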

Microphone Array

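As described in the first section, Kinect carries a Microphone Array in addition to the Color Camera and Depth sensor.

The Microphone Array consists of four microphones and supports, among other things, sound source direction estimation in the horizontal plane (AudioBeam) and speech recognition (Speech Recognition).

This section explains how to obtain the AudioBeam.

Figure 1: The Microphone Array of the Kinect v2 preview

Figure 2: Sample program from the Kinect SDK v2 preview (AudioBasics)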

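Sample program

This sample program obtains the AudioBeam with Kinect SDK v2 and displays the result on the console.

For the Audio features (AudioBeam, Speech Recognition), the data acquisition flow from "Sensor" to "Source" is the same as the flow for Image data (Color, Depth, BodyIndex, Body) covered earlier. Pay attention to the Audio-specific parts that come after it; the description here mainly follows the same data acquisition flow used for Image.

The complete source of this sample program is published in the following GitHub repository.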

https://github.com/UnaNancyOwen/Kinect2Sample

Figure 3: Data acquisition flow of the Kinect SDK v2 preview (repeated)

"Sensor"

Get the "Sensor".

// Sensor
IKinectSensor* pSensor;   // ...1
HRESULT hResult = S_OK;
hResult = GetDefaultKinectSensor( &pSensor );  // ...2
if( FAILED( hResult ) ){
  std::cerr << "Error : GetDefaultKinectSensor" << std::endl;
  return -1;
}
hResult = pSensor->Open();  // ...3
if( FAILED( hResult ) ){
  std::cerr << "Error : IKinectSensor::Open()" << std::endl;
  return -1;
}

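Listing 1.1: The part corresponding to "Sensor" in Figure 3 (repeated)

1 The Sensor interface for handling the Kinect v2 preview.

2 Get the default Sensor.

3 Open the Sensor.

"Source"

Get the "Source" from the "Sensor".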

// Source
IAudioSource* pAudioSource;  // ...1
hResult = pSensor->get_AudioSource( &pAudioSource );  // ...2
if( FAILED( hResult ) ){
  std::cerr << "Error : IKinectSensor::get_AudioSource()" << std::endl;
  return -1;
}

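Listing 1.2: The part corresponding to "Source" in Figure 3

1 The Source interface for the Audio features.

2 Get the Source from the Sensor.

"AudioBeamList" to "OpenAudioBeam"

Get the "AudioBeamList" from the "Source", then open the specified "AudioBeam" from the list.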

// Get Audio Beam List
IAudioBeamList* pAudioBeamList;   // ...1
hResult = pAudioSource->get_AudioBeams( &pAudioBeamList );  // ...2
if( FAILED( hResult ) ){
  std::cerr << "Error : IAudioSource::get_AudioBeams()" << std::endl;
  return -1;
}
// Open Audio Beam
IAudioBeam* pAudioBeam;  // ...3
hResult = pAudioBeamList->OpenAudioBeam( 0, &pAudioBeam );  // ...4
if( FAILED( hResult ) ){
  std::cerr << "Error : IAudioBeamList::OpenAudioBeam()" << std::endl;
  return -1;
}

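Listing 1.3: Getting the Microphone Array and the AudioBeam

1 The AudioBeamList interface.

2 Get the AudioBeamList (the list of AudioBeams for the Microphone Array) from the Source.

3 The AudioBeam interface.

4 Open the AudioBeam at the given index in the list.

Index 0 refers to the first Microphone Array found, which is the default one.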
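The listings above repeat the same pattern after every call: test the HRESULT with FAILED, print an error, and return. As a minimal sketch, that pattern could be folded into one helper. The name checkResult is hypothetical and not part of the Kinect SDK, and the stand-in definitions of HRESULT, S_OK, E_FAIL and FAILED exist only so the sketch builds without the Windows SDK; on Windows the real ones come from Windows.h.

```cpp
#include <cstdint>
#include <iostream>

#ifdef _WIN32
#include <Windows.h>  // real HRESULT, S_OK, E_FAIL, FAILED
#else
// Stand-ins (assumption) so the sketch also builds without the Windows SDK.
using HRESULT = std::int32_t;
constexpr HRESULT S_OK = 0;
constexpr HRESULT E_FAIL = -2147467259;  // same bit pattern as 0x80004005
inline bool FAILED(HRESULT hr) { return hr < 0; }
#endif

// Hypothetical helper: prints "Error : <what>" to stderr when hr indicates
// failure and returns false; returns true otherwise.
inline bool checkResult(HRESULT hr, const char* what)
{
    if (FAILED(hr)) {
        std::cerr << "Error : " << what << std::endl;
        return false;
    }
    return true;
}
```

With such a helper, a step like opening the sensor might shrink to a single line, for example: if( !checkResult( pSensor->Open(), "IKinectSensor::Open()" ) ){ return -1; }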

"Get Angle and Confidence"

Read the Audio data from the "Stream" and get the sound source direction and the confidence of the estimate.

while( 1 ){
  // Get Angle and Confidence
  FLOAT angle = 0.0f;
  FLOAT confidence = 0.0f;
  pAudioBeam->get_BeamAngle( &angle ); // radian [-0.872665f, 0.872665f]  ...1
  pAudioBeam->get_BeamAngleConfidence( &confidence ); // confidence [0.0f, 1.0f]  ...2
  // Show Result
  // Convert from radian to degree : degree = radian * 180 / Pi
  // Note: with MSVC, M_PI needs #define _USE_MATH_DEFINES before including <cmath>
  if( confidence > 0.5f ){
    std::system( "cls" );
    std::cout << "Angle : " << angle * 180.0f / M_PI << ", Confidence : " << confidence << std::endl;  // ...3
  }
  // Input Key ( Exit ESC key )
  if( GetKeyState( VK_ESCAPE ) < 0 ){
    break;
  }
}

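Listing 1.4: Getting the sound source direction and the confidence of the estimate

1 Get the sound source direction.

The angle is in radians.

2 Get the confidence of the sound source direction estimate.

The value ranges from 0.0f to 1.0f; a larger value means higher confidence.

3 Convert the angle from radians to degrees and print it on the console.

Only the current value is shown. If you comment out "std::system( "cls" );", earlier values also stay on the console.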
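The loop above only prints a reading when its confidence is above 0.5f. The same gating could be sketched as a standalone function, which makes the threshold easy to change and to test without a sensor. BeamReading and filterReliable are hypothetical names introduced for illustration and are not part of the Kinect SDK.

```cpp
#include <vector>

// Hypothetical value type for one reading (not part of the Kinect SDK).
struct BeamReading {
    float angle;       // radians, as returned by get_BeamAngle
    float confidence;  // 0.0f to 1.0f, as returned by get_BeamAngleConfidence
};

// Keep only the readings whose confidence is strictly above the threshold.
// The default of 0.5f mirrors the check used in the sample loop.
inline std::vector<BeamReading> filterReliable(const std::vector<BeamReading>& readings,
                                               float threshold = 0.5f)
{
    std::vector<BeamReading> kept;
    for (const BeamReading& r : readings) {
        if (r.confidence > threshold) {
            kept.push_back(r);
        }
    }
    return kept;
}
```

Because the comparison is strict, a reading with a confidence of exactly 0.5f is dropped, the same as in the sample loop.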

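The detectable sound source direction covers a range of +/-50° to the left and right in the horizontal plane, measured from straight ahead of the center of the Kinect v2.

Figure 4: Detection range of the sound source direction (+/-50°)

Because the sound source direction is returned in radians, use Formula 1 to convert it to degrees.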

[Formula 1: Converting between radians and degrees]

radian → degree: degree = radian × 180 ÷ π
degree → radian: radian = degree × π ÷ 180
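Formula 1 can be sketched as two small functions, together with a check against the +/-50° detection range. The names kPi, radianToDegree, degreeToRadian and isInDetectionRange are hypothetical helpers for illustration. The small tolerance in the range check is an assumption, added because the rounded bound 0.872665 used in the sample comment converts to slightly more than 50°.

```cpp
#include <cmath>

// Local constant, so the sketch does not depend on M_PI being available.
constexpr double kPi = 3.14159265358979323846;

// Formula 1: degree = radian * 180 / pi
constexpr double radianToDegree(double radian) { return radian * 180.0 / kPi; }

// Formula 1: radian = degree * pi / 180
constexpr double degreeToRadian(double degree) { return degree * kPi / 180.0; }

// True when an angle in radians lies inside the +/-50 degree detection range.
// The 1e-3 degree tolerance is an assumption to absorb rounding of the bound.
inline bool isInDetectionRange(double radian)
{
    return std::fabs(radianToDegree(radian)) <= 50.0 + 1e-3;
}
```

For example, radianToDegree(0.872665) is approximately 50, which matches the range given in the comment on get_BeamAngle in the sample.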

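Results

When you run this sample program, it displays the angle of the sound source (Angle) and the confidence of the estimate (Confidence), as shown in Figure 5.

If the Microphone Array reacts too sensitively or too sluggishly, you can improve this by adjusting the recording device properties in the operating system (adjust the level in the properties of the "Microphone Array Xbox NUI Sensor" recording device under [Control Panel] - [Sound] - [Recording]).

Figure 5: Run result

The console shows the angle of the sound source direction and the confidence of the estimate.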

Summary

This section explained how to obtain the AudioBeam using the Kinect SDK v2 preview.

Compared with Kinect SDK v1, getting Audio data from the Microphone Array is simpler.