
Why does running finetune.sh with the long-audio version of the paraformer-large model still fail to recognize audio files longer than 20 s? #1843

Closed
lllmd opened this issue Jun 24, 2024 · 1 comment
Labels
question Further information is requested

Comments

lllmd commented Jun 24, 2024

Notice: In order to resolve issues more efficiently, please raise issue following the template.

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

I previously used the paraformer-large model and added the max_token_length parameter in finetune.sh, but it still could not recognize audio files longer than 20 s. After switching to the long-audio version of paraformer-large, the same problem persists.
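For reference, assuming max_token_length is counted in raw 10 ms fbank frames (an assumption; the exact unit depends on how dataset_conf is interpreted in your FunASR version), a 20 s clip sits well below the 30000 limit, which matches the observation that raising it alone does not change behavior:

```python
# Hedged sketch: estimate how many fbank frames a 20 s clip yields,
# assuming the common 25 ms window / 10 ms hop and an LFR stacking
# factor of 6. These values are assumptions, not read from the model
# config of the long-audio paraformer.

def num_fbank_frames(duration_s: float, hop_ms: float = 10.0) -> int:
    """Approximate raw fbank frame count for a clip of the given duration."""
    return int(duration_s * 1000 / hop_ms)

def num_lfr_frames(n_frames: int, lfr_n: int = 6) -> int:
    """Frames remaining after low-frame-rate stacking (ceil division)."""
    return -(-n_frames // lfr_n)

frames_20s = num_fbank_frames(20.0)   # 2000 raw frames for 20 s of audio
lfr_20s = num_lfr_frames(frames_20s)  # 334 frames after LFR-6 stacking
print(frames_20s, lfr_20s)
```

Either way the count is far below 30000, so the length limit itself is unlikely to be what rejects a 20 s sample.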

This is what is printed at runtime:
{'scp_file_list': ['/home/ubuntu1/data/list/train_wav.scp', '/home/ubuntu1/data/list/train_text.txt'], 'data_type_list': ['source', 'target'], 'jsonl_file_out': '/home/ubuntu1/data/list/train.jsonl'}
convert wav.scp text to jsonl, ncpu: 32
cpu: 0: 100%|██████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 5.67it/s]
cpu: 0: 100%|████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 4804.47it/s]
processed 5 samples
{'scp_file_list': ['/home/ubuntu1/data/list/val_wav.scp', '/home/ubuntu1/data/list/val_text.txt'], 'data_type_list': ['source', 'target'], 'jsonl_file_out': '/home/ubuntu1/data/list/val.jsonl'}
convert wav.scp text to jsonl, ncpu: 32
cpu: 0: 100%|██████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.29it/s]
cpu: 0: 100%|████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 3818.21it/s]
processed 2 samples
log_file: ./outputs/log.txt
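For context, the scp-to-jsonl step logged above can be sketched roughly as follows. The record layout (`key`, `source`, `target`) is an assumption based on the logged `data_type_list`; FunASR's own scp2jsonl tool is the authoritative implementation.

```python
# Hedged sketch of the wav.scp + text -> jsonl conversion the log describes.
# Pairs utterance ids from a Kaldi-style wav.scp with transcripts from a
# text file and writes one JSON record per line.
import json

def scp_to_jsonl(wav_scp: str, text_file: str, jsonl_out: str) -> int:
    """Return the number of samples written to jsonl_out."""
    def read_kv(path):
        # Each line is "<utt-id> <value>"; split on first whitespace only.
        with open(path, encoding="utf-8") as f:
            return dict(line.strip().split(maxsplit=1)
                        for line in f if line.strip())

    wavs = read_kv(wav_scp)
    texts = read_kv(text_file)
    n = 0
    with open(jsonl_out, "w", encoding="utf-8") as out:
        for utt, wav_path in wavs.items():
            if utt not in texts:
                continue  # skip utterances that have no transcript
            record = {"key": utt, "source": wav_path, "target": texts[utt]}
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
            n += 1
    return n
```

With 5 train and 2 validation pairs this would report "processed 5 samples" and "processed 2 samples", matching the log above.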

This is what the log file shows:
Model summary:
Class Name: BiCifParaformer
Total Number of model parameters: 225.07 M
Number of trainable parameters: 225.07 M (100.0%)
Type: torch.float32
[2024-06-24 15:32:07,818][root][INFO] - Build optim
[2024-06-24 15:32:07,822][root][INFO] - Build scheduler
[2024-06-24 15:32:07,823][root][INFO] - Build dataloader
[2024-06-24 15:32:07,823][root][INFO] - Build dataloader
[2024-06-24 15:32:07,823][root][INFO] - total_num of samplers: 1, /home/ubuntu1/data/list/train.jsonl
[2024-06-24 15:32:07,823][root][INFO] - total_num of samplers: 2, /home/ubuntu1/data/list/val.jsonl
[2024-06-24 15:32:07,823][root][WARNING] - distributed is not initialized, only single shard
[2024-06-24 15:32:07,853][root][INFO] - Train epoch: 0, rank: 0

What have you tried?

The following has already been added in finetune.sh:
++dataset_conf.max_token_length=30000
but it still has no effect.
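For reference, a minimal sketch of where such an override typically sits in a FunASR finetune script. The training entry point, model id placeholder, and data paths are assumptions and may differ from the actual finetune.sh; only the `++dataset_conf.max_token_length=30000` override is taken from this report. The `++` prefix is Hydra-style syntax for adding or overriding a config key.

```shell
# Hedged sketch of a finetune.sh fragment (paths and model id are
# placeholders, not verified against this repo's finetune.sh).
python -m funasr.bin.train \
  ++model="<long-audio-paraformer-model-id>" \
  ++train_data_set_list="/home/ubuntu1/data/list/train.jsonl" \
  ++valid_data_set_list="/home/ubuntu1/data/list/val.jsonl" \
  ++dataset_conf.max_token_length=30000 \
  ++output_dir="./outputs"
```

If the override is listed after the script's own defaults, as above, it should take precedence; if it still has no effect, the length filtering may be happening elsewhere (e.g., at inference time or in VAD segmentation) rather than in dataset_conf.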

What's your environment?

  • OS (e.g., Linux):
  • FunASR Version (e.g., 1.0.27):
  • ModelScope Version (e.g., 1.15.0):
  • PyTorch Version (e.g., 2.3.0):
  • How you installed funasr (pip, source):
  • Python version:
  • GPU (e.g., Tesla P40)
  • CUDA/cuDNN version (e.g., cuda12.4):
  • Any other relevant information:
lllmd added the question (Further information is requested) label on Jun 24, 2024
@LauraGPT (Collaborator)