Can't access the microphone when running the Dialog demo in sphinx4-5prealpha

I am trying to run the Dialog demo from sphinx4-5prealpha, but it gives an error.

I am building a live speech application.

I imported the project with Maven, following this guide on Stack Overflow: https://stackoverflow.com/a/25963020/2653162

The error says something about 16 kHz and the channel being mono. It's obviously about the sampling settings, and about the microphone as well.

I looked up how to change the microphone settings to 16 kHz and 16 bit, but there is no such option in Windows 7.

These are the only options available in Windows 7.
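As a sanity check, the Java Sound API can be queried directly for the exact capture format the demo requests. A minimal probe using only standard javax.sound.sampled calls (the class name FormatProbe is just for illustration):

    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.DataLine;
    import javax.sound.sampled.TargetDataLine;

    public class FormatProbe {
        public static void main(String[] args) {
            // The format from the error message: 16 kHz, 16 bit, mono,
            // signed PCM, little-endian.
            AudioFormat format = new AudioFormat(16000f, 16, 1, true, false);
            DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
            System.out.println("16 kHz / 16 bit / mono capture supported: "
                    + AudioSystem.isLineSupported(info));
        }
    }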

The thing is, the HelloWorld and Dialog demos ran fine with sphinx4-1.0beta6, but after I tried the latest version, it gives the following error:

    Exception in thread "main" java.lang.IllegalStateException: javax.sound.sampled.LineUnavailableException: line with format PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian not supported.
        at edu.cmu.sphinx.api.Microphone.<init>(Microphone.java:38)
        at edu.cmu.sphinx.api.SpeechSourceProvider.getMicrophone(SpeechSourceProvider.java:18)
        at edu.cmu.sphinx.api.LiveSpeechRecognizer.<init>(LiveSpeechRecognizer.java:34)
        at edu.cmu.sphinx.demo.dialog.Dialog.main(Dialog.java:145)
    Caused by: javax.sound.sampled.LineUnavailableException: line with format PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian not supported.
        at com.sun.media.sound.DirectAudioDevice$DirectDL.implOpen(DirectAudioDevice.java:513)
        at com.sun.media.sound.AbstractDataLine.open(AbstractDataLine.java:121)
        at com.sun.media.sound.AbstractDataLine.open(AbstractDataLine.java:413)
        at edu.cmu.sphinx.api.Microphone.<init>(Microphone.java:36)
        ... 3 more

I can't figure out how to solve this.

If you modify the SpeechSourceProvider to return a constant microphone reference, it won't try to create several microphone instances, which is the source of the problem.
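For context, the stock SpeechSourceProvider in 5prealpha constructs a new Microphone on every call, which is why each new LiveSpeechRecognizer tries to open the audio line again (reproduced from memory, so it may differ slightly from the actual source):

    public class SpeechSourceProvider {
        Microphone getMicrophone() {
            return new Microphone(16000, 16, true, false);
        }
    }

The modified provider returns one shared instance instead: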

    public class SpeechSourceProvider {
        private static final Microphone mic = new Microphone(16000, 16, true, false);

        Microphone getMicrophone() {
            return mic;
        }
    }

The concern here is that you don't want several threads trying to access a single resource, but in the demos the recognizers are stopped and started as needed, so they never compete for the microphone.

As Nickolay explained on the SourceForge forum (here), the microphone resource needs to be released by the recognizer currently holding it before another recognizer can use the microphone. Until the API is fixed, I made the following changes to some of the classes in the sphinx API as a temporary workaround. It's probably not the best solution, but I guess it will do until a better one comes along.


I created a class named MicrophoneExtention with the same source code as the Microphone class, and added the following method:


     public void closeLine(){
         line.close();
     }
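
For reference, the rest of MicrophoneExtention is just the Microphone source copied over, plus that one method. A sketch of the resulting class, based on the 5prealpha Microphone code (reconstructed from memory, so details may differ):

    import java.io.InputStream;

    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.LineUnavailableException;
    import javax.sound.sampled.TargetDataLine;

    public class MicrophoneExtention {

        private final TargetDataLine line;
        private final InputStream inputStream;

        public MicrophoneExtention(float sampleRate, int sampleSize,
                                   boolean signed, boolean bigEndian) {
            AudioFormat format =
                    new AudioFormat(sampleRate, sampleSize, 1, signed, bigEndian);
            try {
                line = AudioSystem.getTargetDataLine(format);
                line.open();
            } catch (LineUnavailableException e) {
                throw new IllegalStateException(e);
            }
            inputStream = new AudioInputStream(line);
        }

        public void startRecording() {
            line.start();
        }

        public void stopRecording() {
            line.stop();
        }

        public InputStream getStream() {
            return inputStream;
        }

        // the added method: closes the TargetDataLine so the audio device
        // is released for the next recognizer that needs it
        public void closeLine() {
            line.close();
        }
    }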

Similarly, the LiveSpeechRecognizerExtention class contains the source code of the LiveSpeechRecognizer class, with the following changes:

  • Use the MicrophoneExtention class I defined:
    private final MicrophoneExtention microphone;
  • In the constructor,
    microphone = new MicrophoneExtention(16000, 16, true, false);
  • And add the following method:
     public void closeRecognitionLine(){
         microphone.closeLine();
     }
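
Put together, the changed parts of LiveSpeechRecognizerExtention look roughly like this (a sketch; the remaining methods are copied verbatim from LiveSpeechRecognizer, and the 5prealpha details may differ):

    import java.io.IOException;

    import edu.cmu.sphinx.api.AbstractSpeechRecognizer;
    import edu.cmu.sphinx.api.Configuration;
    import edu.cmu.sphinx.frontend.util.StreamDataSource;

    public class LiveSpeechRecognizerExtention extends AbstractSpeechRecognizer {

        // changed to the extended microphone class defined above
        private final MicrophoneExtention microphone;

        public LiveSpeechRecognizerExtention(Configuration configuration) throws IOException {
            super(configuration);
            microphone = new MicrophoneExtention(16000, 16, true, false);
            context.getInstance(StreamDataSource.class)
                    .setInputStream(microphone.getStream());
        }

        // the added method: releases the underlying line so the next
        // recognizer instance can open the audio device
        public void closeRecognitionLine() {
            microphone.closeLine();
        }

        // ... startRecognition/stopRecognition etc. copied from LiveSpeechRecognizer ...
    }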

Finally, I edited the main method of the DialogDemo:

    Configuration configuration = new Configuration();
    configuration.setAcousticModelPath(ACOUSTIC_MODEL);
    configuration.setDictionaryPath(DICTIONARY_PATH);
    configuration.setGrammarPath(GRAMMAR_PATH);
    configuration.setUseGrammar(true);

    configuration.setGrammarName("dialog");

    LiveSpeechRecognizerExtention recognizer =
            new LiveSpeechRecognizerExtention(configuration);

    recognizer.startRecognition(true);
    while (true) {
        System.out.println("Choose menu item:");
        System.out.println("Example: go to the bank account");
        System.out.println("Example: exit the program");
        System.out.println("Example: weather forecast");
        System.out.println("Example: digits\n");

        String utterance = recognizer.getResult().getHypothesis();

        if (utterance.startsWith("exit"))
            break;

        if (utterance.equals("digits")) {
            recognizer.stopRecognition();
            recognizer.closeRecognitionLine();

            configuration.setGrammarName("digits.grxml");
            recognizer = new LiveSpeechRecognizerExtention(configuration);

            recognizeDigits(recognizer);
            recognizer.closeRecognitionLine();

            configuration.setGrammarName("dialog");
            recognizer = new LiveSpeechRecognizerExtention(configuration);
            recognizer.startRecognition(true);
        }

        if (utterance.equals("bank account")) {
            recognizer.stopRecognition();
            recognizerBankAccount(recognizer);
            recognizer.startRecognition(true);
        }

        if (utterance.endsWith("weather forecast")) {
            recognizer.stopRecognition();
            recognizer.closeRecognitionLine();

            configuration.setUseGrammar(false);
            configuration.setLanguageModelPath(LANGUAGE_MODEL);
            recognizer = new LiveSpeechRecognizerExtention(configuration);

            recognizeWeather(recognizer);
            recognizer.closeRecognitionLine();

            configuration.setUseGrammar(true);
            configuration.setGrammarName("dialog");
            recognizer = new LiveSpeechRecognizerExtention(configuration);
            recognizer.startRecognition(true);
        }
    }
    recognizer.stopRecognition();

Obviously, the method signatures in the DialogDemo need to change accordingly... hope this helps. One last thing: I'm not sure whether what I did here is legitimate, so if I did anything wrong, please be kind enough to point out my mistakes :D

aetherwalker's answer worked for me. In more detail, I overrode the following files with my own implementations, changing only which SpeechSourceProvider is used:

The first one is the AbstractSpeechRecognizer:

    public class MaxAbstractSpeechRecognizer {
        protected final Context context;
        protected final Recognizer recognizer;

        protected ClusteredDensityFileData clusters;

        protected final MaxSpeechSourceProvider speechSourceProvider;

        /**
         * Constructs recognizer object using provided configuration.
         * @param configuration initial configuration
         * @throws IOException if IO went wrong
         */
        public MaxAbstractSpeechRecognizer(Configuration configuration)
            throws IOException {
            this(new Context(configuration));
        }

        protected MaxAbstractSpeechRecognizer(Context context) throws IOException {
            this.context = context;
            recognizer = context.getInstance(Recognizer.class);
            speechSourceProvider = new MaxSpeechSourceProvider();
        }
        .......................

Then the LiveSpeechRecognizer:

    public class MaxLiveSpeechRecognizer extends MaxAbstractSpeechRecognizer {

        private final Microphone microphone;

        /**
         * Constructs new live recognition object.
         *
         * @param configuration common configuration
         * @throws IOException if model IO went wrong
         */
        public MaxLiveSpeechRecognizer(Configuration configuration) throws IOException {
            super(configuration);
            microphone = speechSourceProvider.getMicrophone();
            context.getInstance(StreamDataSource.class)
                .setInputStream(microphone.getStream());
        }......................

And last but not least the SpeechSourceProvider:

    import edu.cmu.sphinx.api.Microphone;

    public class MaxSpeechSourceProvider {

        private static final Microphone mic = new Microphone(16000, 16, true, false);

        Microphone getMicrophone() {
            return mic;
        }
    }
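
With those three classes in place, the demos just need to instantiate MaxLiveSpeechRecognizer instead of LiveSpeechRecognizer. A minimal usage sketch (the model paths are the standard sphinx4 demo resources and may need adjusting for your setup; startRecognition/getResult come from the copied sphinx4 sources):

    import edu.cmu.sphinx.api.Configuration;

    public class SharedMicExample {
        public static void main(String[] args) throws Exception {
            Configuration configuration = new Configuration();
            configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
            configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
            configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");

            // Both recognizers receive the same static Microphone from
            // MaxSpeechSourceProvider, so constructing the second one no
            // longer fails with LineUnavailableException.
            MaxLiveSpeechRecognizer first = new MaxLiveSpeechRecognizer(configuration);
            MaxLiveSpeechRecognizer second = new MaxLiveSpeechRecognizer(configuration);

            first.startRecognition(true);
            System.out.println(first.getResult().getHypothesis());
            first.stopRecognition();
        }
    }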