Can't access microphone when running the Dialog demo in sphinx4-5prealpha
I am trying to run the Dialog demo from sphinx4-5prealpha, but it throws an error.
I am creating a live speech application.
I imported the project with Maven, following this guide on Stack Overflow: https://stackoverflow.com/a/25963020/2653162
The error complains about 16 kHz and the channel being mono, so it is clearly something about the sampling settings of the microphone.
I looked up how to change the microphone settings to 16 kHz and 16 bit, but there is no such option in Windows 7:
The thing is, the HelloWorld and Dialog demos ran fine with sphinx4-1.0beta6, but after I tried the latest version it fails with the following error:
```
Exception in thread "main" java.lang.IllegalStateException: javax.sound.sampled.LineUnavailableException: line with format PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian not supported.
	at edu.cmu.sphinx.api.Microphone.<init>(Microphone.java:38)
	at edu.cmu.sphinx.api.SpeechSourceProvider.getMicrophone(SpeechSourceProvider.java:18)
	at edu.cmu.sphinx.api.LiveSpeechRecognizer.<init>(LiveSpeechRecognizer.java:34)
	at edu.cmu.sphinx.demo.dialog.Dialog.main(Dialog.java:145)
Caused by: javax.sound.sampled.LineUnavailableException: line with format PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian not supported.
	at com.sun.media.sound.DirectAudioDevice$DirectDL.implOpen(DirectAudioDevice.java:513)
	at com.sun.media.sound.AbstractDataLine.open(AbstractDataLine.java:121)
	at com.sun.media.sound.AbstractDataLine.open(AbstractDataLine.java:413)
	at edu.cmu.sphinx.api.Microphone.<init>(Microphone.java:36)
	... 3 more
```
I can't figure out how to fix this.
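One way to narrow this kind of error down is to ask the Java Sound API directly whether any mixer on the machine can open a capture line in the exact format sphinx4 requests (16 kHz, 16-bit, signed, mono, little-endian). This is only a diagnostic sketch using the standard `javax.sound.sampled` API, not part of the demo itself:

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.TargetDataLine;

public class LineFormatCheck {
    public static void main(String[] args) {
        // The exact format from the exception:
        // PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian
        AudioFormat format = new AudioFormat(16000.0f, 16, 1, true, false);
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);

        System.out.println("Default device supports format: "
                + AudioSystem.isLineSupported(info));

        // Check every mixer individually, since capture support varies per device
        for (Mixer.Info mixerInfo : AudioSystem.getMixerInfo()) {
            Mixer mixer = AudioSystem.getMixer(mixerInfo);
            System.out.println(mixerInfo.getName() + ": "
                    + mixer.isLineSupported(info));
        }
    }
}
```

If no mixer reports `true`, the problem is on the OS/driver side rather than in sphinx4.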
If you modify SpeechSourceProvider to return a constant microphone reference, it won't try to create multiple microphone references, which is the root of the problem.

```java
public class SpeechSourceProvider {
    private static final Microphone mic = new Microphone(16000, 16, true, false);

    Microphone getMicrophone() {
        return mic;
    }
}
```

The concern here is that you normally don't want multiple threads accessing a single resource, but for the demos the recognizers are stopped and started as needed, so they never all compete for the microphone at once.
As Nickolay explained on the SourceForge forum (here), the microphone resource needs to be released by the recognizer currently using it before another recognizer can use the microphone. Until the API itself is fixed, I made the following changes to some classes in the sphinx4 API as a temporary workaround. It is probably not the best solution, but it works until a better one comes along.
I created a class called MicrophoneExtention with the same source code as the Microphone class, and added the following method:

```java
public void closeLine() {
    line.close();
}
```
Similarly, a LiveSpeechRecognizerExtention class contains the source code of the LiveSpeechRecognizer class, with the following changes:

- use the MicrophoneExtention class I defined:

```java
private final MicrophoneExtention microphone;
```

- in the constructor:

```java
microphone = new MicrophoneExtention(16000, 16, true, false);
```

- and add the following method:

```java
public void closeRecognitionLine() {
    microphone.closeLine();
}
```
Finally, I edited the main method of the DialogDemo:
```java
Configuration configuration = new Configuration();
configuration.setAcousticModelPath(ACOUSTIC_MODEL);
configuration.setDictionaryPath(DICTIONARY_PATH);
configuration.setGrammarPath(GRAMMAR_PATH);
configuration.setUseGrammar(true);
configuration.setGrammarName("dialog");

LiveSpeechRecognizerExtention recognizer = new LiveSpeechRecognizerExtention(configuration);

recognizer.startRecognition(true);
while (true) {
    System.out.println("Choose menu item:");
    System.out.println("Example: go to the bank account");
    System.out.println("Example: exit the program");
    System.out.println("Example: weather forecast");
    System.out.println("Example: digits\n");

    String utterance = recognizer.getResult().getHypothesis();

    if (utterance.startsWith("exit"))
        break;

    if (utterance.equals("digits")) {
        recognizer.stopRecognition();
        recognizer.closeRecognitionLine();
        configuration.setGrammarName("digits.grxml");
        recognizer = new LiveSpeechRecognizerExtention(configuration);
        recognizeDigits(recognizer);
        recognizer.closeRecognitionLine();
        configuration.setGrammarName("dialog");
        recognizer = new LiveSpeechRecognizerExtention(configuration);
        recognizer.startRecognition(true);
    }

    if (utterance.equals("bank account")) {
        recognizer.stopRecognition();
        recognizerBankAccount(recognizer);
        recognizer.startRecognition(true);
    }

    if (utterance.endsWith("weather forecast")) {
        recognizer.stopRecognition();
        recognizer.closeRecognitionLine();
        configuration.setUseGrammar(false);
        configuration.setLanguageModelPath(LANGUAGE_MODEL);
        recognizer = new LiveSpeechRecognizerExtention(configuration);
        recognizeWeather(recognizer);
        recognizer.closeRecognitionLine();
        configuration.setUseGrammar(true);
        configuration.setGrammarName("dialog");
        recognizer = new LiveSpeechRecognizerExtention(configuration);
        recognizer.startRecognition(true);
    }
}
recognizer.stopRecognition();
```
Obviously, the method signatures in the DialogDemo need to change accordingly... I hope this helps. And one last note: I am not sure whether what I did is legitimate; if I did something wrong, please kindly point out my mistakes :D
The answer by aetherwalker worked for me. In more detail, I overrode the following files with my own implementations, where I only changed which SpeechSourceProvider is used:

The first one is the AbstractSpeechRecognizer:
```java
public class MaxAbstractSpeechRecognizer {
    protected final Context context;
    protected final Recognizer recognizer;

    protected ClusteredDensityFileData clusters;

    protected final MaxSpeechSourceProvider speechSourceProvider;

    /**
     * Constructs recognizer object using provided configuration.
     * @param configuration initial configuration
     * @throws IOException if IO went wrong
     */
    public MaxAbstractSpeechRecognizer(Configuration configuration) throws IOException {
        this(new Context(configuration));
    }

    protected MaxAbstractSpeechRecognizer(Context context) throws IOException {
        this.context = context;
        recognizer = context.getInstance(Recognizer.class);
        speechSourceProvider = new MaxSpeechSourceProvider();
    }
    .......................
```
Then the LiveSpeechRecognizer:
```java
public class MaxLiveSpeechRecognizer extends MaxAbstractSpeechRecognizer {

    private final Microphone microphone;

    /**
     * Constructs new live recognition object.
     *
     * @param configuration common configuration
     * @throws IOException if model IO went wrong
     */
    public MaxLiveSpeechRecognizer(Configuration configuration) throws IOException {
        super(configuration);
        microphone = speechSourceProvider.getMicrophone();
        context.getInstance(StreamDataSource.class)
            .setInputStream(microphone.getStream());
    }......................
```
And last but not least, the SpeechSourceProvider:
```java
import edu.cmu.sphinx.api.Microphone;

public class MaxSpeechSourceProvider {

    private static final Microphone mic = new Microphone(16000, 16, true, false);

    Microphone getMicrophone() {
        return mic;
    }
}
```
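For what it's worth, the effect of that `static final` field can be seen without sphinx4 or any audio hardware at all. In this sketch, `FakeMic` is a hypothetical stand-in for `Microphone`, only there to show that every provider instance hands back the same object:

```java
// FakeMic is a hypothetical stand-in for edu.cmu.sphinx.api.Microphone
class FakeMic {
}

class SharedSourceProvider {
    // One instance for the whole JVM, exactly like the static field above
    private static final FakeMic mic = new FakeMic();

    FakeMic getMicrophone() {
        return mic;
    }
}

public class ProviderDemo {
    public static void main(String[] args) {
        SharedSourceProvider a = new SharedSourceProvider();
        SharedSourceProvider b = new SharedSourceProvider();
        // Both providers return the same reference, so the 16 kHz capture
        // line is opened once instead of once per recognizer
        System.out.println(a.getMicrophone() == b.getMicrophone()); // prints "true"
    }
}
```

That single shared line is exactly why the `LineUnavailableException` stops appearing when the second recognizer is constructed.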