Notice: In order to resolve issues more efficiently, please raise issue following the template.
❓ Questions and Help
Before asking:
I was previously using the paraformer-large model. After adding the max_token_length parameter in finetune.sh, audio files longer than 20 s still could not be recognized. After switching to the long-audio version of paraformer-large, the same problem occurs.
This is the runtime output:
{'scp_file_list': ['/home/ubuntu1/data/list/train_wav.scp', '/home/ubuntu1/data/list/train_text.txt'], 'data_type_list': ['source', 'target'], 'jsonl_file_out': '/home/ubuntu1/data/list/train.jsonl'}
convert wav.scp text to jsonl, ncpu: 32
cpu: 0: 100%|██████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 5.67it/s]
cpu: 0: 100%|████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 4804.47it/s]
processed 5 samples
{'scp_file_list': ['/home/ubuntu1/data/list/val_wav.scp', '/home/ubuntu1/data/list/val_text.txt'], 'data_type_list': ['source', 'target'], 'jsonl_file_out': '/home/ubuntu1/data/list/val.jsonl'}
convert wav.scp text to jsonl, ncpu: 32
cpu: 0: 100%|██████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.29it/s]
cpu: 0: 100%|████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 3818.21it/s]
processed 2 samples
log_file: ./outputs/log.txt
This is the content of the log file:
Model summary:
Class Name: BiCifParaformer
Total Number of model parameters: 225.07 M
Number of trainable parameters: 225.07 M (100.0%)
Type: torch.float32
[2024-06-24 15:32:07,818][root][INFO] - Build optim
[2024-06-24 15:32:07,822][root][INFO] - Build scheduler
[2024-06-24 15:32:07,823][root][INFO] - Build dataloader
[2024-06-24 15:32:07,823][root][INFO] - Build dataloader
[2024-06-24 15:32:07,823][root][INFO] - total_num of samplers: 1, /home/ubuntu1/data/list/train.jsonl
[2024-06-24 15:32:07,823][root][INFO] - total_num of samplers: 2, /home/ubuntu1/data/list/val.jsonl
[2024-06-24 15:32:07,823][root][WARNING] - distributed is not initialized, only single shard
[2024-06-24 15:32:07,853][root][INFO] - Train epoch: 0, rank: 0
What have you tried?
I have already added the following in finetune.sh:
++dataset_conf.max_token_length=30000
But it still has no effect.
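One thing worth ruling out is whether the long recordings actually made it into train.jsonl with their full duration. Below is a small hypothetical helper (not part of FunASR) that scans a jsonl file and reports wav files longer than a threshold; it assumes each line carries a "source" key pointing at a PCM wav path, as suggested by the data_type_list=['source', 'target'] conversion output above.

```python
import json
import wave

def overlong_samples(jsonl_path, max_seconds=20.0):
    """Return (path, duration_in_seconds) pairs for wav files longer
    than max_seconds.

    Hypothetical helper: assumes each jsonl line is a JSON object with
    a "source" key holding a path to a PCM wav file.
    """
    hits = []
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            item = json.loads(line)
            path = item["source"]
            # wave (stdlib) handles uncompressed PCM wav files
            with wave.open(path, "rb") as w:
                duration = w.getnframes() / w.getframerate()
            if duration > max_seconds:
                hits.append((path, duration))
    return hits
```

If this reports no samples over 20 s for train.jsonl, the truncation is happening upstream of training (e.g. during data preparation) rather than in dataset_conf.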
What's your environment?
pip