heygen video translation #8

Open
echokk11 opened this issue Mar 7, 2024 · 1 comment

Comments

echokk11 commented Mar 7, 2024

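Suppose we were a bit bolder:

  • whisper handles speech-to-subtitle transcription
  • LLMs or translation services (ChatGPT, Google Translate) handle translation into other languages
  • MockingBird or so-vits-svc-fork is trained on each original speaker's timbre (voiceprint)
  • Using the timeline derived from the transcribed text, ffmpeg splits the video into segments by speaker, while the trained original-speaker voices generate audio tracks from the translated text
  • (Optional) GeneFace++ or Wav2Lip corrects the lip sync to match
  • Finally, everything is merged back together (ffmpeg)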
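
The transcription and splitting steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not how heygen or this project does it: it assumes the transcription result is a dict with a "segments" list whose entries carry "start", "end", and "text" keys (the shape returned by openai-whisper's `transcribe`), and the names `Segment`, `segments_from_whisper`, `ffmpeg_cut_command`, and `plan_cuts` are hypothetical helpers. The code only builds ffmpeg argument lists; it does not run ffmpeg.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    text: str     # transcribed (later: translated) text for this span


def segments_from_whisper(result: Dict) -> List[Segment]:
    """Convert a whisper-style result dict into Segment objects.

    Assumption: result["segments"] is a list of dicts with
    "start", "end", and "text" keys.
    """
    return [
        Segment(float(s["start"]), float(s["end"]), s["text"].strip())
        for s in result["segments"]
    ]


def ffmpeg_cut_command(src: str, seg: Segment, dst: str) -> List[str]:
    """Build an ffmpeg command that extracts [seg.start, seg.end] from src."""
    if seg.end <= seg.start:
        raise ValueError("segment must have positive duration")
    return [
        "ffmpeg", "-y",             # overwrite the output file if it exists
        "-i", src,
        "-ss", f"{seg.start:.3f}",  # start of the cut, in seconds
        "-to", f"{seg.end:.3f}",    # end of the cut, in seconds
        dst,                        # re-encodes by default, so cuts are not limited to keyframes
    ]


def plan_cuts(src: str, segments: List[Segment], prefix: str = "part") -> List[List[str]]:
    """Return one ffmpeg command per segment, with numbered output files."""
    return [
        ffmpeg_cut_command(src, seg, f"{prefix}_{i:04d}.mp4")
        for i, seg in enumerate(segments)
    ]
```

Each returned command could then be executed with `subprocess.run(cmd, check=True)`. Voice cloning, lip sync, and the final merge are left out because they depend on the chosen models.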

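Is this roughly how heygen video translation is implemented? Of course, I'm a rookie, and the real process is surely far more complex than this. The hardest part here is identifying the start and end timestamps of each distinct voice. Along the way there are also many other problems, such as removing background audio and calibrating recognition errors.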
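
One small piece of the speaker-timeline problem can be illustrated in isolation. The sketch below assumes that some separate diarization step (not shown, and not specified anywhere in this thread) has already labeled short spans as `(start, end, speaker)` tuples. It then merges adjacent spans from the same speaker into longer turns. The name `merge_turns` and the `max_gap` threshold of 0.5 seconds are hypothetical choices for illustration.

```python
from typing import List, Tuple

# (start_sec, end_sec, speaker_label); the labels are assumed to come
# from an upstream diarization step that is not part of this sketch.
Turn = Tuple[float, float, str]


def merge_turns(labeled: List[Turn], max_gap: float = 0.5) -> List[Turn]:
    """Merge consecutive spans by the same speaker into a single turn.

    Two spans are merged when they share a speaker label and the silence
    between them is at most max_gap seconds.
    """
    merged: List[Turn] = []
    for start, end, speaker in sorted(labeled):
        if merged:
            prev_start, prev_end, prev_speaker = merged[-1]
            if speaker == prev_speaker and start - prev_end <= max_gap:
                # Extend the previous turn instead of starting a new one.
                merged[-1] = (prev_start, max(prev_end, end), prev_speaker)
                continue
        merged.append((start, end, speaker))
    return merged
```

The merged turns give per-speaker time ranges that could feed the splitting step. Overlapping speech and mislabeled spans, which are likely the harder cases, are not handled here.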

Owner

Chenyme commented Mar 7, 2024

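Hello! Thank you very much for your suggestion. I will try to reproduce this pipeline, but because of academic pressure I have to slow down progress on the project. Also, according to inside information, Jianying (剪映) in China is competing hard on digital humans, and a good open-source solution should be available by the end of the year.
Thanks again for your suggestion~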
