Real-time speech recognition and VAD perform poorly #1820

Open
liurongjie174 opened this issue Jun 17, 2024 · 2 comments
Labels
question Further information is requested

Comments

@liurongjie174

Notice: In order to resolve issues more efficiently, please raise issues following the template and fill in the details.

❓ Questions and Help

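I am using FunASR to recognize real-time speech. The upstream side pushes PCM over WebSocket (WS) in packets of size 234, and I run recognition with the sample funasr_wss_server.py, but the VAD and online results are poor. First, VAD often returns [] as its result, which makes fun_asr_online slow. fun_asr also runs very slowly, so the real-time recognition results are pushed out very slowly.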

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

Code

What have you tried?

What's your environment?

  • OS (e.g., Linux):
  • FunASR Version (e.g., 1.0.0):
  • ModelScope Version (e.g., 1.11.0):
  • PyTorch Version (e.g., 2.0.0):
  • How you installed funasr (pip, source):
  • Python version:
  • GPU (e.g., V100M32)
  • CUDA/cuDNN version (e.g., cuda11.7):
  • Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
  • Any other relevant information:
@liurongjie174 liurongjie174 added the question Further information is requested label Jun 17, 2024
@LauraGPT
Collaborator

You could make the packet size larger, for example one packet per 100 ms.
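One way to follow this suggestion without changing the upstream sender is to buffer the small packets and forward fixed 100 ms chunks. The sketch below is only an illustration under stated assumptions: the stream is taken to be 16 kHz, 16-bit, mono PCM and the packet size of 234 is taken to be bytes (neither is confirmed in this thread), and `PcmRechunker` is a hypothetical helper, not part of FunASR.

```python
from typing import List


class PcmRechunker:
    """Accumulate small PCM packets and emit fixed-duration chunks.

    Hypothetical helper; the default audio format below is an assumption.
    """

    def __init__(self, sample_rate: int = 16000, sample_width: int = 2,
                 channels: int = 1, chunk_ms: int = 100) -> None:
        # Bytes per chunk: 16000 * 2 * 1 * 100 / 1000 = 3200 with the defaults.
        self.chunk_bytes = sample_rate * sample_width * channels * chunk_ms // 1000
        self._buf = bytearray()

    def feed(self, packet: bytes) -> List[bytes]:
        """Add one incoming packet; return every complete chunk now available."""
        self._buf.extend(packet)
        chunks = []
        while len(self._buf) >= self.chunk_bytes:
            chunks.append(bytes(self._buf[:self.chunk_bytes]))
            del self._buf[:self.chunk_bytes]
        return chunks

    def flush(self) -> bytes:
        """Return whatever is left (shorter than one chunk) at end of stream."""
        rest = bytes(self._buf)
        self._buf.clear()
        return rest
```

In a WebSocket handler, each received binary message would go through `feed()`, and only the returned chunks would be passed on to the VAD and online ASR steps.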

@liurongjie174
Author

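Thank you for your answer. I now receive about 3 s of audio at a time, a packet of roughly 30,000+ bytes, and then run async_vad on it, but results still come back very slowly. I looked at the async_vad source: it calls generate on model_vad to get segments_result, and only when that return value has length 1 and the start and end inside it are not 0 does it go on to the subsequent online or offline recognition. In repeated tests, however, I found that this condition (segments_result of length 1 with nonzero start and end) is rarely met, or takes a long time to be met. For example, for a passage of continuous speech it may take about 1-2 minutes. I also do not understand what the segments_result return value means. Could you please clear up my confusion?
1. What the segments_result return value means;
2. Why it takes so long for segments_result to have length 1 with start and end not equal to 0.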
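As a hedged illustration for question 1, the sketch below classifies a streaming VAD result. It assumes the convention described in the FunASR README for the streaming fsmn-vad model: each entry is a `[beg, end]` pair in milliseconds, `-1` (rather than 0) marks a boundary not yet detected, and `[]` means no boundary fell inside the current chunk. `describe_vad_segments` is a hypothetical helper, not a FunASR function, and the convention should be checked against the installed version.

```python
from typing import List


def describe_vad_segments(segments: List[List[int]]) -> str:
    """Classify one streaming VAD result (assumed convention, see lead-in)."""
    if not segments:
        # Neither a speech start nor a speech end was found in this chunk.
        return "no boundary in this chunk"
    if len(segments) > 1:
        # Several boundaries fell inside a single (long) chunk.
        return "multiple segments in this chunk"
    beg, end = segments[0]
    if beg != -1 and end == -1:
        return "speech started at %d ms" % beg
    if beg == -1 and end != -1:
        return "speech ended at %d ms" % end
    if beg != -1 and end != -1:
        return "complete segment %d-%d ms" % (beg, end)
    return "no boundary in this chunk"


print(describe_vad_segments([[70, -1]]))   # speech started at 70 ms
print(describe_vad_segments([[-1, 2340]])) # speech ended at 2340 ms
```

Under this assumption, `[]` in the middle of continuous speech is expected, because no start or end occurs in those chunks, and a single result carrying both a start and an end only appears when a whole utterance fits inside one chunk.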
