Watson STT Java – different results between WebSockets (Java) and HTTP POST

I am trying to build an application that takes streaming audio input (for example, a line-in from a microphone) and uses IBM Bluemix (Watson) for speech-to-text.

I lightly modified the sample Java code found here. The sample sends a WAV, but I am sending a FLAC … that should be irrelevant.

The results are bad, really bad. This is what I get when using the Java WebSockets code:

{ "result_index": 0, "results": [ { "final": true, "alternatives": [ { "transcript": "it was six weeks ago today the terror ", "confidence": 0.92 } ] } ] } 

Now compare the results above with the ones below. These are the results from sending the same content, but using cURL (HTTP POST):

 { "results": [ { "alternatives": [ { "confidence": 0.945, "transcript": "it was six weeks ago today the terrorists attacked the US consulate in Benghazi Libya now we've obtained email alerts that were put out by the state department as the attack unfolded as you know four Americans were killed including ambassador Christopher Stevens " } ], "final": true }, { "alternatives": [ { "confidence": 0.942, "transcript": "sharyl Attkisson has our story " } ], "final": true } ], "result_index": 0 } 

That is a nearly flawless result.
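
(For reference, the same HTTP POST request can also be issued from Java rather than cURL, through the SDK's synchronous recognize() call. The sketch below is only an illustration and assumes the 3.x SDK, where recognize(File, RecognizeOptions) returns a ServiceCall that is run with execute(); the class name, credentials, and file path are placeholders.)

    import java.io.File;

    import com.ibm.watson.developer_cloud.http.HttpMediaType;
    import com.ibm.watson.developer_cloud.speech_to_text.v1.SpeechToText;
    import com.ibm.watson.developer_cloud.speech_to_text.v1.model.RecognizeOptions;
    import com.ibm.watson.developer_cloud.speech_to_text.v1.model.SpeechResults;

    public class RecognizeOverHttpExample {
      public static void main(String[] args) {
        SpeechToText service = new SpeechToText();
        service.setUsernameAndPassword("<username>", "<password>"); // your service credentials

        RecognizeOptions options = new RecognizeOptions.Builder()
            .continuous(true)
            .contentType(HttpMediaType.AUDIO_FLAC)
            .build();

        // Synchronous HTTP POST against the /recognize endpoint, analogous to the cURL call
        SpeechResults results = service.recognize(new File("path-to-audio-file.flac"), options).execute();
        System.out.println(results);
      }
    }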

Why the difference when using WebSockets?

This problem was fixed in release 3.0.0-RC1.

You can get the new jar from:

  1. Maven

      <dependency>
        <groupId>com.ibm.watson.developer_cloud</groupId>
        <artifactId>java-sdk</artifactId>
        <version>3.0.0-RC1</version>
      </dependency>
  2. Gradle

     'com.ibm.watson.developer_cloud:java-sdk:3.0.0-RC1' 
  3. jar

     Download the jar-with-dependencies (~1.4MB)


Here is an example of how to recognize a FLAC audio file using WebSockets:

    // Imports assume the 3.x package layout of the Watson Java SDK.
    import java.io.FileInputStream;

    import com.ibm.watson.developer_cloud.http.HttpMediaType;
    import com.ibm.watson.developer_cloud.speech_to_text.v1.SpeechToText;
    import com.ibm.watson.developer_cloud.speech_to_text.v1.model.RecognizeOptions;
    import com.ibm.watson.developer_cloud.speech_to_text.v1.model.SpeechResults;
    import com.ibm.watson.developer_cloud.speech_to_text.v1.websocket.BaseRecognizeCallback;

    public class RecognizeFlacWebSocketExample {
      public static void main(String[] args) throws Exception {
        SpeechToText service = new SpeechToText();
        service.setUsernameAndPassword("<username>", "<password>"); // your service credentials

        FileInputStream audio = new FileInputStream("path-to-audio-file.flac");

        RecognizeOptions options = new RecognizeOptions.Builder()
            .continuous(true)                       // keep transcribing across pauses
            .interimResults(true)                   // emit partial transcripts as they arrive
            .contentType(HttpMediaType.AUDIO_FLAC)
            .build();

        // The WebSocket call is asynchronous; a real application should keep the JVM
        // alive (e.g. with a CountDownLatch) until the final transcript has arrived.
        service.recognizeUsingWebSocket(audio, options, new BaseRecognizeCallback() {
          @Override
          public void onTranscription(SpeechResults speechResults) {
            System.out.println(speechResults);
          }
        });
      }
    }

A FLAC file to test with: https://s3.amazonaws.com/mozart-company/tmp/4.flac
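
Since the original goal was streaming audio from a microphone line-in rather than from a file, here is a rough sketch of feeding a javax.sound.sampled capture line into the same recognizeUsingWebSocket call. The 16 kHz, 16-bit mono PCM capture format, the "audio/l16; rate=16000" content type, and the fixed 30-second capture window are assumptions made for illustration, not part of the SDK example above.

    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.DataLine;
    import javax.sound.sampled.TargetDataLine;

    import com.ibm.watson.developer_cloud.speech_to_text.v1.SpeechToText;
    import com.ibm.watson.developer_cloud.speech_to_text.v1.model.RecognizeOptions;
    import com.ibm.watson.developer_cloud.speech_to_text.v1.model.SpeechResults;
    import com.ibm.watson.developer_cloud.speech_to_text.v1.websocket.BaseRecognizeCallback;

    public class RecognizeMicrophoneSketch {
      public static void main(String[] args) throws Exception {
        SpeechToText service = new SpeechToText();
        service.setUsernameAndPassword("<username>", "<password>"); // your service credentials

        // Open the default capture line as 16 kHz, 16-bit, mono, signed, little-endian PCM
        // (an assumed capture format; it must match the content type declared below).
        AudioFormat format = new AudioFormat(16000, 16, 1, true, false);
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
        line.open(format);
        line.start();
        AudioInputStream microphone = new AudioInputStream(line);

        RecognizeOptions options = new RecognizeOptions.Builder()
            .continuous(true)
            .interimResults(true)
            .contentType("audio/l16; rate=16000") // raw PCM must declare its sample rate
            .build();

        // Stream the microphone audio over the WebSocket and print transcripts as they arrive
        service.recognizeUsingWebSocket(microphone, options, new BaseRecognizeCallback() {
          @Override
          public void onTranscription(SpeechResults speechResults) {
            System.out.println(speechResults);
          }
        });

        // Capture for 30 seconds, then close the line to end the stream
        Thread.sleep(30 * 1000);
        line.stop();
        line.close();
      }
    }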


Note: 3.0.0-RC1 is a release candidate. We will do a production release next week (3.0.1).