The Real-time and High-resolution Interactive Digital Human

Haiquan Fang; Dian Yu

Supervised by: Ministry of Industry and Information Technology of the People's Republic of China
Sponsored by: Harbin Institute of Technology
Editor-in-chief: Yu Zhou
ISSN: 1005-9113
CN: 23-1378/T

Author Name	Affiliation	Postcode
Haiquan Fang*	School of Public Administration, Zhejiang University of Technology, Hangzhou 310023, China	310023
Dian Yu	School of Public Administration, Zhejiang University of Technology, Hangzhou 310023, China	310023

Abstract:

Synthesizing a real-time, high-resolution, lip-synced digital human is a challenging task. Although the Wav2Lip model represents a remarkable advance in real-time lip-sync, the clarity of its output is still limited. To address this, we enhanced the Wav2Lip model and trained it on a high-resolution video dataset produced in our laboratory. Experimental results indicate that the improved Wav2Lip model produces digital humans with greater clarity than the original model while maintaining real-time performance and accurate lip-sync. We deployed the improved Wav2Lip model in a government interface application to generate a government digital human. Testing showed that this government digital human can interact seamlessly with users in real time, delivering clear visuals and synthesized speech that closely resembles a human voice.

Key words: digital human; lip-sync; video generation; talking face generation

DOI: 10.11916/j.issn.1005-9113.24056

CLC Number: TP39

Fund:
