Can't access microphone when running the Dialog demo in sphinx4-5prealpha
I am trying to run the Dialog demo from sphinx4-5prealpha, but it throws an error.
I am creating a live speech application.
I imported the project with Maven, following this guide on Stack Overflow: https://stackoverflow.com/a/25963020/2653162
The error complains about 16 kHz and the channel being mono, so it is clearly something about the sampling settings of the microphone.
I looked up how to change the microphone settings to 16 kHz and 16 bit, but there is no such option in Windows 7:
The thing is, the HelloWorld and Dialog demos ran fine with sphinx4-1.0beta6, but after I tried the latest version it fails with the following error:
```
Exception in thread "main" java.lang.IllegalStateException: javax.sound.sampled.LineUnavailableException: line with format PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian not supported.
	at edu.cmu.sphinx.api.Microphone.<init>(Microphone.java:38)
	at edu.cmu.sphinx.api.SpeechSourceProvider.getMicrophone(SpeechSourceProvider.java:18)
	at edu.cmu.sphinx.api.LiveSpeechRecognizer.<init>(LiveSpeechRecognizer.java:34)
	at edu.cmu.sphinx.demo.dialog.Dialog.main(Dialog.java:145)
Caused by: javax.sound.sampled.LineUnavailableException: line with format PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian not supported.
	at com.sun.media.sound.DirectAudioDevice$DirectDL.implOpen(DirectAudioDevice.java:513)
	at com.sun.media.sound.AbstractDataLine.open(AbstractDataLine.java:121)
	at com.sun.media.sound.AbstractDataLine.open(AbstractDataLine.java:413)
	at edu.cmu.sphinx.api.Microphone.<init>(Microphone.java:36)
	... 3 more
```
I can't figure out how to fix this.
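One way to narrow this kind of error down is to ask the Java Sound API directly whether any mixer on the machine can open a capture line in the exact format sphinx4 requests (16 kHz, 16-bit, signed, mono, little-endian). This is only a diagnostic sketch using the standard `javax.sound.sampled` API, not part of the demo itself:

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.TargetDataLine;

public class LineFormatCheck {
    public static void main(String[] args) {
        // The exact format from the exception:
        // PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian
        AudioFormat format = new AudioFormat(16000.0f, 16, 1, true, false);
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);

        System.out.println("Default device supports format: "
                + AudioSystem.isLineSupported(info));

        // Check every mixer individually, since capture support varies per device
        for (Mixer.Info mixerInfo : AudioSystem.getMixerInfo()) {
            Mixer mixer = AudioSystem.getMixer(mixerInfo);
            System.out.println(mixerInfo.getName() + ": "
                    + mixer.isLineSupported(info));
        }
    }
}
```

If no mixer reports `true`, the problem is on the OS/driver side rather than in sphinx4.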
If you modify SpeechSourceProvider to return a constant microphone reference, it won't try to create multiple microphone references, which is the root of the problem.

```java
public class SpeechSourceProvider {
    private static final Microphone mic = new Microphone(16000, 16, true, false);

    Microphone getMicrophone() {
        return mic;
    }
}
```

The concern here is that you normally don't want multiple threads accessing a single resource, but for the demos the recognizers are stopped and started as needed, so they never all compete for the microphone at once.
As Nickolay explained on the SourceForge forum (here), the microphone resource needs to be released by the recognizer currently using it before another recognizer can use the microphone. Until the API itself is fixed, I made the following changes to some classes in the sphinx4 API as a temporary workaround. It is probably not the best solution, but it works until a better one comes along.
I created a class called MicrophoneExtention with the same source code as the Microphone class, and added the following method:

```java
public void closeLine() {
    line.close();
}
```
Similarly, a LiveSpeechRecognizerExtention class contains the source code of the LiveSpeechRecognizer class, with the following changes:

- use the MicrophoneExtention class I defined:

```java
private final MicrophoneExtention microphone;
```

- in the constructor:

```java
microphone = new MicrophoneExtention(16000, 16, true, false);
```

- and add the following method:

```java
public void closeRecognitionLine() {
    microphone.closeLine();
}
```
Finally, I edited the main method of the DialogDemo:
```java
Configuration configuration = new Configuration();
configuration.setAcousticModelPath(ACOUSTIC_MODEL);
configuration.setDictionaryPath(DICTIONARY_PATH);
configuration.setGrammarPath(GRAMMAR_PATH);
configuration.setUseGrammar(true);
configuration.setGrammarName("dialog");

LiveSpeechRecognizerExtention recognizer = new LiveSpeechRecognizerExtention(configuration);

recognizer.startRecognition(true);
while (true) {
    System.out.println("Choose menu item:");
    System.out.println("Example: go to the bank account");
    System.out.println("Example: exit the program");
    System.out.println("Example: weather forecast");
    System.out.println("Example: digits\n");

    String utterance = recognizer.getResult().getHypothesis();

    if (utterance.startsWith("exit"))
        break;

    if (utterance.equals("digits")) {
        recognizer.stopRecognition();
        recognizer.closeRecognitionLine();
        configuration.setGrammarName("digits.grxml");
        recognizer = new LiveSpeechRecognizerExtention(configuration);
        recognizeDigits(recognizer);
        recognizer.closeRecognitionLine();
        configuration.setGrammarName("dialog");
        recognizer = new LiveSpeechRecognizerExtention(configuration);
        recognizer.startRecognition(true);
    }

    if (utterance.equals("bank account")) {
        recognizer.stopRecognition();
        recognizerBankAccount(recognizer);
        recognizer.startRecognition(true);
    }

    if (utterance.endsWith("weather forecast")) {
        recognizer.stopRecognition();
        recognizer.closeRecognitionLine();
        configuration.setUseGrammar(false);
        configuration.setLanguageModelPath(LANGUAGE_MODEL);
        recognizer = new LiveSpeechRecognizerExtention(configuration);
        recognizeWeather(recognizer);
        recognizer.closeRecognitionLine();
        configuration.setUseGrammar(true);
        configuration.setGrammarName("dialog");
        recognizer = new LiveSpeechRecognizerExtention(configuration);
        recognizer.startRecognition(true);
    }
}
recognizer.stopRecognition();
```
Obviously, the method signatures in the DialogDemo need to change accordingly... I hope this helps. And one last note: I am not sure whether what I did is legitimate; if I did something wrong, please kindly point out my mistakes :D
The answer by aetherwalker worked for me. In more detail, I overrode the following files with my own implementations, where I only changed which SpeechSourceProvider is used:

The first one is the AbstractSpeechRecognizer:
```java
public class MaxAbstractSpeechRecognizer {
    protected final Context context;
    protected final Recognizer recognizer;

    protected ClusteredDensityFileData clusters;

    protected final MaxSpeechSourceProvider speechSourceProvider;

    /**
     * Constructs recognizer object using provided configuration.
     * @param configuration initial configuration
     * @throws IOException if IO went wrong
     */
    public MaxAbstractSpeechRecognizer(Configuration configuration) throws IOException {
        this(new Context(configuration));
    }

    protected MaxAbstractSpeechRecognizer(Context context) throws IOException {
        this.context = context;
        recognizer = context.getInstance(Recognizer.class);
        speechSourceProvider = new MaxSpeechSourceProvider();
    }
    .......................
```
Then the LiveSpeechRecognizer:
```java
public class MaxLiveSpeechRecognizer extends MaxAbstractSpeechRecognizer {

    private final Microphone microphone;

    /**
     * Constructs new live recognition object.
     *
     * @param configuration common configuration
     * @throws IOException if model IO went wrong
     */
    public MaxLiveSpeechRecognizer(Configuration configuration) throws IOException {
        super(configuration);
        microphone = speechSourceProvider.getMicrophone();
        context.getInstance(StreamDataSource.class)
            .setInputStream(microphone.getStream());
    }......................
```
And last but not least, the SpeechSourceProvider:
```java
import edu.cmu.sphinx.api.Microphone;

public class MaxSpeechSourceProvider {

    private static final Microphone mic = new Microphone(16000, 16, true, false);

    Microphone getMicrophone() {
        return mic;
    }
}
```
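For what it's worth, the effect of that `static final` field can be seen without sphinx4 or any audio hardware at all. In this sketch, `FakeMic` is a hypothetical stand-in for `Microphone`, only there to show that every provider instance hands back the same object:

```java
// FakeMic is a hypothetical stand-in for edu.cmu.sphinx.api.Microphone
class FakeMic {
}

class SharedSourceProvider {
    // One instance for the whole JVM, exactly like the static field above
    private static final FakeMic mic = new FakeMic();

    FakeMic getMicrophone() {
        return mic;
    }
}

public class ProviderDemo {
    public static void main(String[] args) {
        SharedSourceProvider a = new SharedSourceProvider();
        SharedSourceProvider b = new SharedSourceProvider();
        // Both providers return the same reference, so the 16 kHz capture
        // line is opened once instead of once per recognizer
        System.out.println(a.getMicrophone() == b.getMicrophone()); // prints "true"
    }
}
```

That single shared line is exactly why the `LineUnavailableException` stops appearing when the second recognizer is constructed.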