Notice: In order to resolve issues more efficiently, please raise issues following the template and fill in the details.
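I am using FunASR to recognize real-time speech. The upstream side pushes the stream as PCM over WebSocket (WS), with each packet 234 in size, and I run recognition with the example funasr_wss_server.py. The VAD and online results are poor. First, the VAD often returns [] as its result, which makes fun_asr_online slow. fun_asr also runs very slowly, so the real-time recognition results are pushed out extremely slowly.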
You could make the packet size larger, for example sending one packet every 100 ms.
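The suggestion above can be sketched as a small buffer that collects the incoming 234-sized packets and only hands audio to the recognizer once about 100 ms has accumulated. This is a minimal sketch, not part of funasr_wss_server.py: the class name `PcmChunker` is hypothetical, and it assumes the packet size is in bytes and that the stream is 16 kHz, 16-bit, mono PCM (so 100 ms is 3200 bytes and a 234-byte packet is roughly 7 ms of audio). Adjust the parameters if the actual stream differs.

```python
from typing import List


class PcmChunker:
    """Accumulate small PCM packets and emit fixed-duration chunks.

    Hypothetical helper; sample rate, sample width and channel count
    are assumptions about the incoming stream.
    """

    def __init__(self, sample_rate: int = 16000, sample_width: int = 2,
                 channels: int = 1, chunk_ms: int = 100) -> None:
        # Number of bytes that correspond to chunk_ms of audio.
        self.chunk_bytes = sample_rate * sample_width * channels * chunk_ms // 1000
        self._buffer = bytearray()

    def feed(self, packet: bytes) -> List[bytes]:
        """Add one packet; return zero or more complete chunks."""
        self._buffer.extend(packet)
        chunks = []
        while len(self._buffer) >= self.chunk_bytes:
            chunks.append(bytes(self._buffer[:self.chunk_bytes]))
            del self._buffer[:self.chunk_bytes]
        return chunks

    def flush(self) -> bytes:
        """Return whatever is left (shorter than one chunk) and reset."""
        rest = bytes(self._buffer)
        self._buffer.clear()
        return rest
```

In a WebSocket handler, each received binary message would go through `feed`, and only the returned chunks would be passed on to the VAD and ASR calls. `flush` is meant for the end of the stream.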
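Thank you for your answer. I now receive about 3 s of audio at a time, a binary payload of a little over 30,000 in size, and then run async_vad on it, but the results still come back very slowly. I looked at the async_vad source: it calls generate on model_vad to get segments_result, and only when that return value has length 1 and the start and end inside it are not 0 does the subsequent online or offline recognition run. In repeated tests, however, I found that this condition (segments_result has length 1, with start and end not 0) is hard to satisfy, or takes a long time to satisfy. For example, when someone speaks continuously, it can take around 1-2 minutes. I also do not understand what the segments_result return value means. Could you please clear up my confusion?
1. What does the segments_result return value mean?
2. Why does it take so long for segments_result to have length 1 with start and end not 0?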
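To make the check described above concrete, here is a minimal sketch of how a one-element segments_result might be turned into a start/end pair. It is not the actual async_vad code: the function name `interpret_segments` is hypothetical, and the input is a plain list standing in for the VAD output. The post describes the "not yet detected" marker as 0. This sketch makes it a parameter (`unset`) that defaults to -1, which is my understanding of how FunASR's streaming VAD marks a missing start or end, and which should be verified against the installed source.

```python
from typing import List, Tuple


def interpret_segments(segments: List[List[int]],
                       unset: int = -1) -> Tuple[int, int]:
    """Map a VAD result for one chunk to (speech_start, speech_end).

    `unset` is the assumed marker for "boundary not detected in this
    chunk"; both its value and the list layout are assumptions.
    """
    speech_start, speech_end = unset, unset
    # Only a result with exactly one segment is interpreted; an empty
    # result or several segments leave both boundaries unset.
    if len(segments) != 1:
        return speech_start, speech_end
    start, end = segments[0]
    if start != unset:
        speech_start = start
    if end != unset:
        speech_end = end
    return speech_start, speech_end
```

Under these assumptions, an empty result would simply mean that no speech boundary fell inside the current chunk. A caller would then keep buffering audio until an end boundary arrives before running offline recognition.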